Hello, this is Jan Lehnardt and you're visiting my blog. Thanks for stopping by.
plok — It reads like a blog, but it sounds harder!
Last weekend, Damien committed the results of a few weeks’ worth of intensive work on CouchDb to the public Subversion repository.
To speedily gain access to your documents in CouchDb, you create views that define a certain structure. Up to now, Damien’s home-grown Fabric language was used to describe the views. This is no longer the case. You now simply use Javascript functions to do it.
For a couple of months now, we have been discussing the merits of switching from XML to JSON for transferring data to and from CouchDb. The only advantage of XML was that it was already in place. JSON, on the other hand, would add so much goodness that we just couldn’t hold it back any longer.
I began working on the transition in June. During August, Damien picked up the work and eventually committed most of it over the weekend, while I spent some time getting the build-process done and looking for ways to integrate Spidermonkey, Mozilla’s Javascript Engine.
CouchDb now internally uses the JSON representation to handle documents. We got rid of a whole lot of XML-related boilerplate code in the process, and less code is always good. JSON then allowed us to get rid of Fabric too, the advantage being that people interested in CouchDb no longer have to learn yet another language. Javascript is pretty well known by now and easy to pick up, too. Finally, we even added another buzzword to the ever-growing list of CouchDb’s features: views are now created using a map-reduce architecture, an approach that has already been proven to be massively scalable.
CouchDb documents are now in fact first class JSON objects:
{
"id":1234,
"rev":1,
"type":"Person",
"name":"John",
"age":24
}
Defining a view is now as easy as sending a Javascript function to CouchDb. It receives one document at a time, you do whatever calculations you need and then return either the data you want in the view or nothing, if the document should not be in the view. It’s this easy:
function(doc)
{
if("Person" == doc.type) {
return doc.name;
}
}
What happens here? You define an anonymous function that expects a single argument, a document. You then test the document: is it of type Person? If it is, its name value goes into the view. If it’s not, nothing happens.
This gets repeated for every document you send in and the views you later query get built up in the process, making retrieving your data an instant operation.
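To make that build-up concrete, here is a minimal sketch in plain Javascript of how a view could be assembled by running a view function over every document. It is an illustration only, not CouchDb's actual internals: the helper `buildView`, the row shape with `id` and `value`, and the sample documents are all made up for this example.

```javascript
// Hypothetical sample documents, shaped like the one shown earlier.
var docs = [
  { id: 1234, rev: 1, type: "Person", name: "John", age: 24 },
  { id: 1235, rev: 1, type: "Invoice", total: 99 },
  { id: 1236, rev: 1, type: "Person", name: "Jane", age: 31 }
];

// The view function from the post: returns a name for Person documents
// and nothing (undefined) for everything else.
var personNames = function(doc) {
  if ("Person" == doc.type) {
    return doc.name;
  }
};

// buildView is an assumed helper, not a CouchDb API. It calls the view
// function once per document and keeps a row only when the function
// returned a value.
function buildView(allDocs, viewFunction) {
  var rows = [];
  for (var i = 0; i < allDocs.length; i++) {
    var result = viewFunction(allDocs[i]);
    if (result !== undefined) {
      rows.push({ id: allDocs[i].id, value: result });
    }
  }
  return rows;
}

var view = buildView(docs, personNames);
console.log(JSON.stringify(view));
// prints [{"id":1234,"value":"John"},{"id":1236,"value":"Jane"}]
```

The Invoice document never shows up in the result because the view function returns nothing for it.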
This allows you not only to get information back in the way of
Give me all the documents from July that have more than 2,500 words in their Text field.
but also
Give me all the documents from July where the Potato field is a two-element array where the first element is another JSON object with 7 fields, one of which has the value 24.
It not only frees you from the Excel-way of thinking in rows and columns, it lets you model your data in free form. And you can still search through it in no time.
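As a rough sketch, the two queries above could be written as view functions in the same style as the earlier example. Only the `Text` and `Potato` field names come from the descriptions above. The `date` field (assumed to be a string such as "2007-07-15"), the helper `isFromJuly`, the function names and the whitespace-based word count are assumptions made for illustration.

```javascript
// Assumption: each document carries a `date` string like "2007-07-15".
function isFromJuly(doc) {
  return typeof doc.date == "string" && doc.date.substring(5, 7) == "07";
}

// Documents from July with more than 2,500 words in their Text field.
var longJulyTexts = function(doc) {
  if (!isFromJuly(doc) || typeof doc.Text != "string") {
    return;
  }
  // Count words by splitting on runs of whitespace.
  var trimmed = doc.Text.replace(/^\s+|\s+$/g, "");
  var words = trimmed == "" ? [] : trimmed.split(/\s+/);
  if (words.length > 2500) {
    return doc;
  }
};

// Documents from July whose Potato field is a two-element array whose
// first element is an object with 7 fields, one of which has the value 24.
var oddPotatoes = function(doc) {
  var potato = doc.Potato;
  if (!isFromJuly(doc) || !(potato instanceof Array) || potato.length != 2) {
    return;
  }
  var first = potato[0];
  if (first === null || typeof first != "object" || first instanceof Array) {
    return;
  }
  var fieldCount = 0;
  var has24 = false;
  for (var key in first) {
    if (Object.prototype.hasOwnProperty.call(first, key)) {
      fieldCount++;
      if (first[key] === 24) {
        has24 = true;
      }
    }
  }
  if (fieldCount == 7 && has24) {
    return doc;
  }
};
```

The shared helper is there only to keep the sketch short; each function could just as well carry its own date check.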
Get your code now (Try the «json» branch to try things out [Update 9/7/07: The JSON branch was merged back to trunk]).
The Javascript engine is hooked up in such a way that you can easily plug in any other language. So you could just as well write your view definitions in your favourite scripting language, or real™ programming language.
Great stuff! It’s probably not too soon to start looking at Tamarin. The NBL marches on…
Looks like relational algebra is making a comeback! :)
@Robert On the contrary! We hope to put it finally to rest. CouchDb is the antagonist of everything relational (sort of :-)
Er, how so? A JSON dict is a "labeled relation" in just about the purest sense of the term, and you’re performing predicate logic on them. How is a CouchDB view not an implementation of the relational-algebraic operators "select" and "project"? Not having joins doesn’t mean it’s not relational—"relation" refers to the relationships within a single entity (or "document", to use your terminology).
My point was that you’re pursuing an algebraic (imperative logic) approach as distinct from a calculus (declarative logic, e.g. SQL).
@Robert Fair enough. But there is probably no need to try to apply this to CouchDb (and probably scare away folks in the process :-)
How is the JSON coming along? Do you have a new OS X build?
We’re preparing a new release pretty soonish. I’ll be providing an OS X build then. No earlier, sorry :) If you don’t fear Terminal.app, you can make one yourself.
Does not work for me on Linux, but I didn’t build it right…
/opt/couchdb# bin/startCouchDb.sh
Erlang (BEAM) emulator version 5.5.5 [source] [async-threads:0] [kernel-poll:false]
couch 0.6.4 (LogLevel=info)
CouchDb is starting
sh: /opt/couchdb: is a directory
sh: line 0: exec: /opt/couchdb: cannot execute: Success
[error] [] Abnormal shutdown of JS process (exit_status: 126).
CouchDb has started. Time to Relax.
Eshell V5.5.5 (abort with ^G)
1>
I wanted to say that I think I missed something in the build process. It just compiled way too fast :-)
I have had no luck building on OS X either.
My journey ended with:
{"init terminating in do_boot",{{badmatch,error},[{buildcouch,compileall,0},{init,start_it,1},{init,start_em,1}]}}
Crash dump was written to: erl_crash.dump
init terminating in do_boot ()
make: *** [couch.boot] Error 1
@Nick I’ll reply to your mail later today. In the meantime you might try the json branch instead of trunk.
Jan, thanks for the tip and the help. I’ll get on it after lunch.
Do this:
install Erlang and ICU
get the code: svn checkout http://couchdb.googlecode.com/svn/branches/json couchdbjson
cd couchdbjson/CouchProjects
./build.sh (if it asks you to set up environment variables, do so)
cd js
make -f Makefile.ref
copy the attached files to ./CouchDb
mkdir CouchDb/conf
cp ./dist/common/conf/mime.types CouchDb/conf
cd CouchDb
erl (and then, in the Erlang shell:)
couch_server:start().
That should do the trick and run couchdb from the source tree.
We’ll get to streamlining the installation soon.
So this was obviously copied from a mail. Here are the attached files
Build worked great now that I am on the json branch, and have those great instructions. It all seems to be running. Can’t do much with it except have my browser tell me that ‘{"error":{"id":"error","reason":"illegaldatabasename"}}’ when I hit: http://localhost:8888/$utils/peek.html
but I am started and that’s what counts.
Thanks for the help thus far!
peek is no longer, as it uses the old XML API.
http://localhost:8888/_utils/couch_tests.html If you see the line that says "tests passed", it ran the tests!
You can also try a command shell at: http://localhost:8888/_utils/shell.html
In general, if there was a ‘$’ in the URL, try a ‘_’ now.
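As a quick sketch of that rule, a helper like the following rewrites an old-style URL. The function name is made up, and it assumes every ‘$’ in the URL should become ‘_’. Pages that went away with the old XML API, such as peek, will still be missing after the rewrite.

```javascript
// Hypothetical helper: swap every '$' in an old CouchDb URL for '_'.
function toNewStyleUrl(oldUrl) {
  return oldUrl.replace(/\$/g, "_");
}

console.log(toNewStyleUrl("http://localhost:8888/$utils/shell.html"));
// prints "http://localhost:8888/_utils/shell.html"
```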
There we go! The problem was I just didn’t know what I was doing….
Thanks for all your help, Jan. I can’t wait to start playing with couchdb.
Hi Jan, thanks for the detailed instructions. I haven’t had a lot of luck messing around with CouchDB so far, though. GET commands work fine, but any other kind of request (PUT/POST/DELETE) returns a 501 ("not implemented") error code. Opening /_utils/couch_tests.html, for example, shows me an error in Firebug:
DELETE http://localhost:8888/test_suite_db/ 501
which of course halts the tests.
I’m on OS X 10.4 running erlang 5.5.5.
@Brian First, note that the json branch is still in development, so things might not work as expected or advertised. For the problem at hand: CouchDb ships with a modified version of inets, and with the instructions above it does not get used. (I hadn’t noticed, since I have modified my system’s inets.) What you can do is compile CouchDb’s inets and copy the files into your system’s Erlang library dir. This is not a scary change to make since it doesn’t break anything; it only adds support for more HTTP verbs. When we have the installer ready, this won’t be an issue anymore.
Thanks! I understand that the branch is still under heavy development. I just wanted to make sure I could get as far as the end of your instructions above before I started exploring the source.
Running build.sh had already built couch_inets, so I just copied those beam files into my system’s Erlang directory and the tests all pass now.
Nice!
the json branch has been merged back to trunk
Some notes, as I took rev156 /branches/json for a spin on debian linux:
cp build/FabricServer /usr/local/couchdb/bin/
cp: cannot stat `build/FabricServer’: No such file or directory
make: *** [install] Error 1
./bin/startCouchDb.sh
sh: /usr/local/couchdb: is a directory
sh: line 0: exec: /usr/local/couchdb: cannot execute: Success
[error] [] Abnormal shutdown of JS process (exit_status: 126).
CouchDb has started. Time to Relax.
Checked out the json branch but it seems to be missing couch.js (you would expect it to be here: http://couchdb.googlecode.com/svn/branches/json/CouchProjects/dist/common/htutils/couch.js), as couch_tests.html references it as being in the same directory.
I’m a little confused how anyone else is running this (the tests) successfully… is this file supposed to be generated somewhere or was it recently removed from svn?
Wow this is so cool
I also think it’s a really good idea to make it easy to use ‘real’ ;) languages to plug in on the map/reduce side. I could envisage optimizations of common or frequent queries in a large-scale web app where performance was paramount. Thus one could use Javascript initially and migrate the most used or intensive queries to something faster like Erlang. In particular, Erlang’s pattern matching would be a godsend. Maybe the API for such an extension could include both map and reduce parts as well as compatible alternatives.
It would be great to have a simple way to add Erlang query modules. It just blows SQL-type queries into last century!!
regards Al
@Al Exactly! :-) If you’ve got Erlang chops, feel free to get on it.
Well, we are in testing this month so I don’t have much time, but I will be cracking open the source and taking a peek.
When I get a chance, I will have a little play. Looking forward to it; somehow I think this is important.
regards Al