Hello, this is Jan Lehnardt and you're visiting my blog. Thanks for stopping by.
plok — It reads like a blog, but it sounds harder!
CouchDb is a document database. It stores information based around the concept of documents rather than tables and rows like traditional relational databases do.
When I first looked at it, it was available for Windows only, so I moved on to other things. But when Damien, the author of CouchDb, wrote about integrating database-driven web applications that run on the client’s computer and even work offline(!), I was all fired up about trying it out. I won’t touch Windows for development work because it slows me down significantly, so I had to make CouchDb build and run on Unix.
CouchDb is free—as in GPL—software and it comes with its source code. I grabbed it, tried to make sense of it and sent Damien a mail detailing my intentions along with a plan. His reply clarified a few points about CouchDb’s architecture and how I should approach the port. Then I went to work.
CouchDb itself is written in Erlang, but it uses a utility programme written in C++. I decided to build the dependencies before approaching CouchDb itself: first because it makes sense, and second because I know a thing or two about compiling code from the C family and had no clue about Erlang.
The code came with project files for VisualStudio and I thought it would be a good idea to look for a tool that converts it to a Makefile. Such a tool exists as part of the ming project, but after Damien’s advice and looking at the project files, I decided against using the converter. However, I used the files to see where the build process got tricky.
Before all that, I had already installed the Erlang/OTP package on my system. It includes an Erlang compiler and a complete runtime environment. We will need that later.
The utility programme CouchDb uses is called FabricServer. It handles all the details needed to support the Fabric language. Fabric is supposed to be a “pragmatic and simplified language for the purpose of querying and processing data in CouchDb.” FabricServer in turn relies on a library called libfabric. libfabric then uses the output from a parser generator called Antlr. Easy!
`CouchDb --calls--> FabricServer --needs--> Fabric --isBuiltUsing--> Antlr`
So first I had to build Antlr. This is straight C++, and a `g++ -c -I../ *.cpp` did a wonderful job.
The result is a bunch of new `.o` files that are pretty useless on their own, but we will deal with them later.
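A minimal sketch of that compile step as a small sh function, compiling one file at a time so the build stops at the first error. The function name `compile_objects` and the `CXX` override are illustrative assumptions, not part of the CouchDb sources; the flags are the ones used above.

```shell
# Compile every .cpp file in the given directory into a matching .o file.
# CXX can be overridden (for example CXX=clang++); it defaults to g++.
compile_objects() {
    # Run in a subshell so the caller's working directory is untouched.
    (
        cd "$1" || exit 1
        for src in *.cpp; do
            # If the glob matched nothing, the literal pattern remains; skip it.
            [ -e "$src" ] || continue
            ${CXX:-g++} -c -I../ "$src" -o "${src%.cpp}.o" || exit 1
        done
    )
}
```

Called as `compile_objects antlr`, this should leave the same `.o` files next to the sources as the one-line command does.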
Next step: Fabric. CouchDb is Unicode-enabled, and that cascades down to Fabric. So icu, IBM’s Open Source Unicode library, is needed. The installation is only the usual `./configure; make; make install` away.
To build fabric then, we need to specify the directory where the icu header files live:
To read and understand a language like Fabric, several steps are needed. Antlr greatly simplifies these steps for the programmer. All they have to do is specify the grammar of the language. The rest is then generated by Antlr. To convert the grammar into usable code, the `antlr.jar` programme must be run. It places the generated files in a subfolder called `GeneratedCode/` where we can pick them up for compilation. `antlr.jar` is written in Java, so we need Java. Luckily, it is already installed. For the Windows build, Damien used a script to trigger `antlr.jar`. I rewrote the few lines in sh, a scripting language that all modern Unices understand.
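The rewritten script itself is not reproduced here, so the following is only a sketch of what such a wrapper could look like. It assumes ANTLR 2.x (whose command-line entry point is `antlr.Tool` and which accepts `-o` for the output directory), an `antlr.jar` in the current directory, and a grammar file called `Fabric.g`. The grammar name, the function name and the `JAVA` and `ANTLR_JAR` overrides are hypothetical; only `GeneratedCode/` comes from the build described above.

```shell
# Regenerate the parser sources from a grammar file.
# Assumptions: ANTLR 2.x, antlr.jar in the current directory, and a
# grammar named Fabric.g (a hypothetical name used for illustration).
generate_parser() {
    grammar=${1:-Fabric.g}
    out_dir=${2:-GeneratedCode}
    mkdir -p "$out_dir" || return 1
    ${JAVA:-java} -cp "${ANTLR_JAR:-antlr.jar}" antlr.Tool -o "$out_dir" "$grammar"
}
```

Running `generate_parser` with no arguments would then write the generated files into `GeneratedCode/`.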
The generated code can be compiled the same way as Antlr itself. Pretty easy.
Actually, it did not go that smoothly all the time. I stumbled over some Windows-specific stuff, like the inclusion of `crtdbg.h`, which is needed (I think) by the VisualStudio debugger. I also found some code that was not standards-compliant and refused to build. VisualStudio is a bit lax when it comes to enforcing standardized code, but Damien was very quick to fix these things.
By now we have a bunch of object files for Fabric and Antlr. To make these available to the FabricServer, they are combined into a static library:
`ar rcs libfabric.a *.o GeneratedCode/*.o antlr/*.o`
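For readers unfamiliar with `ar`: `r` inserts the files into the archive (replacing members of the same name), `c` creates the archive without complaining if it does not exist yet, and `s` writes a symbol index so the linker can find what it needs. One pitfall is that a glob that matches nothing is passed through literally. A small wrapper, sketched here with the hypothetical name `make_static_lib` and an assumed `AR` override, can at least refuse to run without any input:

```shell
# Bundle object files into a static library.
# Usage: make_static_lib <library> <object files...>
make_static_lib() {
    lib=$1
    shift
    # Refuse to run when no object files were passed at all.
    [ "$#" -gt 0 ] || { echo "no object files given" >&2; return 1; }
    ${AR:-ar} rcs "$lib" "$@"
}
```

With it, the command above becomes `make_static_lib libfabric.a *.o GeneratedCode/*.o antlr/*.o`, and `ar t libfabric.a` lists what ended up in the archive.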
On to the FabricServer! We want to build and link it in one step:
`g++ -I./Fabric/ -I./Fabric/GeneratedCode/ -I/usr/include/ -I./Fabric/antlr/ -LFabric/ FabricServer/FabricServer.cpp -o FabricServer -lfabric -licuuc -licui18n -licudata`
`-lfabric` tells g++ to link FabricServer against our `libfabric.a`. We’re halfway through. While building, we found another Windows-only issue. FabricServer uses the standard input and output streams to communicate with CouchDb. On Windows, you have to set them to binary mode. POSIX does not define such a distinction, and all standard I/O is considered binary safe.
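To see the POSIX side of this in action, here is a small sketch (the function name is made up for illustration). It pushes the bytes that Windows text mode would tamper with through a pipe: a CR LF pair, which text mode collapses to LF on input, and 0x1A (Ctrl-Z), which text mode treats as end of file. `cat` stands in for any programme that reads standard input and writes standard output.

```shell
# Send six bytes, including CR LF and 0x1A, through a pipe and count
# how many arrive on the other side.
count_bytes_through_pipe() {
    # tr strips the padding some wc implementations put before the number.
    printf 'a\r\nb\032c' | cat | wc -c | tr -d ' '
}

count_bytes_through_pipe   # prints 6 on a POSIX system
```

All six bytes come out unchanged, which is why the Unix build needs no equivalent of the Windows binary-mode call.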
Before we get to the Erlang part, there is one more C file to compile. It is an Erlang plugin that allows calling C code directly from Erlang. Sounds cool? It is! Erlang plugins are loaded at runtime and the proper file format therefore is a dynamic or shared library. MacOS X (where most of the port was done) and Linux differ a bit in how they handle shared libraries. In the end, you can use them similarly. For MacOS X we use
`gcc -bundle -flat_namespace -undefined suppress -I/usr/local/erlang/include CouchDb/couch_port_driver.c -L/usr/local/lib -o couch_erl_driver.so -licudata -licuuc -licui18n`
and for Linux we use
`gcc -rdynamic -shared -I/usr/local/erlang/include CouchDb/couch_port_driver.c -L/usr/lib -o couch_erl_driver.so -licuuc -licudata -licui18n`
Both result in a `couch_erl_driver.so` file that can be picked up by the Erlang runtime.
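If one script has to serve both platforms, the difference can be hidden behind a lookup on `uname -s`. This is only a sketch: the function name `shared_lib_flags` is hypothetical, and the flags are simply the ones from the two commands above.

```shell
# Print the gcc flags for building a loadable shared library on the
# given platform (defaults to the output of `uname -s`).
shared_lib_flags() {
    platform=${1:-$(uname -s)}
    case $platform in
        Darwin) echo "-bundle -flat_namespace -undefined suppress" ;;
        Linux)  echo "-rdynamic -shared" ;;
        *)      echo "unsupported platform: $platform" >&2; return 1 ;;
    esac
}
```

The build line then starts with `gcc $(shared_lib_flags) -I/usr/local/erlang/include ...`, followed by the `-L` path that fits the platform and the same icu libraries as before.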
Fortunately, compiling Erlang is really straightforward. There is a compiler, `erlc`, and your set of source files with the file extension `.erl`, and all you have to do is call the compiler with your file as an argument. It creates a `.beam` file. `.beam` files are the compiled bytecode that is run by the Erlang interpreter. In concept it is much like Java. After building all necessary files, I took the Windows release and exchanged the binaries for what I had just compiled. CouchDb needs a bunch of helper files and by using the Windows release I made sure everything was in place. After rewriting the startup script in sh, I could try and start CouchDb. And it worked!
I quickly installed the demo and tried a few things, and everything worked as expected. In all, it took Damien and me four or five evenings of fiddling with things to get CouchDb ported to Unix. The latest release of CouchDb, 0.5.0, comes with an automated build system for Unix.
How we created the build system is a story for another entry.