We are hitting a few walls with a CouchDB deployment and both Damien and I are a bit puzzled. This posting tries to attract someone with a clue to help us out. Our problems might result from not understanding the documentation correctly, but with evidently inaccurate material, we stand little chance.
Long story short: We’ve got it all sorted out.
Memory Hogging Spidermonkey
Sam Ruby relayed a hint by “a Mozilla Developer”. Invoking Spidermonkey with the -b parameter and a value of 1000000, we are able to keep the memory footprint constant. We haven’t measured how this impacts performance, though.
Crashing Erlang VM
#erlang on irc.freenode.org helped to clarify how heart is supposed to work. We had interpreted the documentation as saying that heart is a monitoring process that restarts the Erlang VM when it crashes. That is wrong. Since heart is started from the Erlang VM (it is a child process in the process hierarchy), it cannot start a new VM when the old one crashes because the OS wipes out all child processes before they can do anything.
What is heart good for then? Apparently, the Erlang VM can get stuck (tip o’ the hat to noss). I don’t know how often and under what circumstances that happens (I guess it is rare), but it can happen. Heart is designed to check the VM’s health every now and then and, if the VM stops responding, launch a utility programme that takes care of the application restart.
A side note: the minimum timeout that heart allows for the Erlang VM to not respond to health checks is 11 seconds. The heart man page clearly states the fact, but heart behaves unintuitively when you specify, say, 10 seconds because you failed to differentiate between < and <=. Instead of falling back to the lowest possible value (11), it assumes the default value of 60 seconds, which makes testers (me) think nothing happens at all. Now this is clearly a PEBKAC and RTFM type of error, but to be frank, the fine manual is not very approachable and I decided to fall back to heart.c to see how things actually work.
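To make the < versus <= trap concrete, here is a minimal Python sketch of the behaviour described above. It is not a port of heart.c; the function and constant names are made up for illustration, and the upper bound of 65535 is what I recall from the heart man page, so treat it as an assumption.

```python
# Sketch of how heart appears to pick its timeout, as described above.
# Names are hypothetical; this is not code taken from heart.c.

HEART_DEFAULT_TIMEOUT = 60    # seconds, used when the request is out of range
HEART_MIN_TIMEOUT = 11        # valid values must be strictly greater than 10
HEART_MAX_TIMEOUT = 65535     # assumption: upper bound as recalled from the man page


def effective_heart_timeout(requested):
    """Return the timeout heart would actually use for a requested value."""
    if HEART_MIN_TIMEOUT <= requested <= HEART_MAX_TIMEOUT:
        return requested
    # Out-of-range values are not clamped to 11; they fall back to the default.
    return HEART_DEFAULT_TIMEOUT
```

Asking for 10 seconds therefore gives you 60, not 11, which is exactly the surprise I ran into.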
Automatically restarting CouchDB
Noah Slater pimped the script that launches CouchDB so that, if you want it to, CouchDB gets restarted automatically when the Erlang process dies. This is quite nice. Since CouchDB takes almost no time to restart, you have a nearly uninterrupted service. We also have heart configured so that, in case the Erlang VM gets stuck, it kills the VM process and nothing else. The launch script then detects that the process is gone and restarts it. This takes at least 11 seconds, as outlined above. If you need less, you need to hack heart.c.
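The respawn idea itself is simple. The following Python sketch is not Noah's script, just an illustration of the loop under my own assumptions: the function name, the `max_restarts` cap and the `pause` parameter are all hypothetical, and a clean exit (status 0) is treated as an intentional shutdown.

```python
import subprocess
import time


def supervise(command, max_restarts=None, pause=1.0):
    """Run command and relaunch it whenever it exits with a non-zero status.

    Returns the total number of launches. A process killed by a signal
    (for example by heart) shows up as a negative status and is relaunched.
    """
    launches = 0
    while True:
        launches += 1
        status = subprocess.call(command)  # blocks until the process exits
        if status == 0:
            return launches  # clean shutdown: do not respawn
        if max_restarts is not None and launches > max_restarts:
            return launches  # give up after the allowed number of restarts
        time.sleep(pause)  # brief pause so a crash loop does not spin
```

Something like `supervise(["couchdb"])` would then keep the server up, assuming `couchdb` is the command that starts it in the foreground on your system.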
Thanks to all who sent in suggestions and words of help.