The view from down here


November 25, 2008

I haven't blogged in a while, so I thought I'd drop in and talk about the dark side of programming.


It is commonly said that if you're writing software, you're writing bugs. It is practically impossible to foresee all the subtle interactions and conditions. But I don't have to like it.

I had two bugs to deal with today.

First I had to track down why the (struct) cast in my script language wasn't doing the right thing when applied to an uninitialized substructure member. That one was pretty easy. It was an eye-roller. I sort of had two if statements reversed, so it was impossible to get to the right place. The concept was right, but I missed the flow.

This other bug is nasty. I recently added thread control to the server object base class, which complicated the event model that it used to be based on. Now each thread runs its own event system, with the first event system taking control of signals and routing them as events to the other threads.
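A minimal sketch of that arrangement, under stated assumptions: Event, EventQueue, SignalRouter, and demo_route are hypothetical names, not the classes from the real server, and the signal is passed in as a plain integer rather than caught from the OS. In a real server the first thread would presumably block the signals in the other threads and collect them itself, which is not shown here.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical event type: carries only the signal number (e.g. SIGCHLD).
struct Event {
    int signal_number;
};

// A small thread-safe queue standing in for one thread's event system.
class EventQueue {
public:
    void post(const Event& e) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            events_.push_back(e);
        }
        ready_.notify_one();
    }

    // Blocks until an event is available, then removes and returns it.
    Event wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !events_.empty(); });
        assert(!events_.empty());
        Event e = events_.front();
        events_.pop_front();
        return e;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Event> events_;
};

// The "first" event system: takes each signal and fans it out as an
// event to every other thread's queue.
class SignalRouter {
public:
    explicit SignalRouter(std::size_t worker_count) {
        for (std::size_t i = 0; i < worker_count; ++i)
            queues_.push_back(std::make_unique<EventQueue>());
    }

    EventQueue& queue(std::size_t i) { return *queues_.at(i); }

    void route(int signal_number) {
        for (auto& q : queues_) q->post(Event{signal_number});
    }

private:
    std::vector<std::unique_ptr<EventQueue>> queues_;
};

// Starts worker threads, routes one signal, and returns the signal
// number each worker received.
std::vector<int> demo_route(int signal_number, std::size_t worker_count) {
    SignalRouter router(worker_count);
    std::vector<int> seen(worker_count, 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < worker_count; ++i) {
        workers.emplace_back([&router, &seen, i] {
            seen[i] = router.queue(i).wait().signal_number;
        });
    }
    router.route(signal_number);
    for (auto& t : workers) t.join();
    return seen;
}
```

Calling demo_route with a signal number and three workers returns that number three times. Keeping signal intake in one place means only one thread ever has to reason about signal delivery; the rest see ordinary events.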

I already found one error that came from copying my global variables into the new thread_ctrl blocks, a place where I should have used a reference counted pointer but didn't. I still don't, but I at least changed the validate pointer function to check in the right places before dereferencing it.
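For illustration, here is roughly what the reference counted version could look like with std::shared_ptr from modern C++. This is a sketch under assumptions, not the code from the server: GlobalConfig, ThreadCtrl, and make_thread_ctrls are hypothetical stand-ins, since the post does not show the real thread_ctrl layout.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the data that used to live in globals.
struct GlobalConfig {
    std::string server_name;
    int port;
};

// Hypothetical per-thread control block. It holds a counted reference
// to the shared data rather than a raw pointer or a private copy.
struct ThreadCtrl {
    int thread_id;
    std::shared_ptr<const GlobalConfig> config;
};

// Builds one control block per thread, all sharing the same config.
std::vector<ThreadCtrl> make_thread_ctrls(
        std::shared_ptr<const GlobalConfig> config, int count) {
    assert(count >= 0);
    std::vector<ThreadCtrl> blocks;
    for (int i = 0; i < count; ++i)
        blocks.push_back(ThreadCtrl{i, config});
    return blocks;
}
```

With this shape the shared data stays alive until the last control block lets go of it, so no block can be left holding a pointer to memory that another owner has already freed.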

This other bug is new, and may be related to the new signal handling I had to write in the event model after I saw that SIGCHLDs were not getting delivered properly in some cases.

Luckily the system is telling me about a double free error, but the bug is never where it crashes. Some nasty bit of code frees memory or writes where it shouldn't, then some unsuspecting, well-behaved bit of code comes along and blows it all to bits.
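One way to catch a double free at the moment it happens, rather than wherever the heap later falls over, is to route allocations through a small tracker. The sketch below is only an illustration of that idea: AllocationTracker is a hypothetical class, not part of the server or of any memory tool, and it does nothing about stray writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_set>

// Hypothetical debugging aid: remembers which pointers are live so a
// second free of the same pointer is reported instead of executed.
class AllocationTracker {
public:
    void* allocate(std::size_t size) {
        assert(size > 0);
        void* p = std::malloc(size);
        if (p != nullptr) {
            std::lock_guard<std::mutex> lock(mutex_);
            live_.insert(p);
        }
        return p;
    }

    // Frees p only if it is currently live. Returns false for a double
    // free or for a pointer this tracker never handed out.
    bool release(void* p) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (live_.erase(p) == 0) return false;
        std::free(p);
        return true;
    }

    std::size_t live_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return live_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_set<void*> live_;
};
```

The first release of a pointer returns true and the second returns false, which is the place to set a breakpoint or log the caller.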

So I moved everything over to the older Linux system where we still have a license for Purify, which is a bit of code that crawls all up in the other code's nasty business and waits for the land mine to be set. But I wasted a good hour tracking down ghosts because I forgot to compile everything with -g, and it seems to have been confused by the optimized binaries. Nice.

But overall, the system is coming together very well. You could think of it sort of like mod_perl running in Apache, but instead of Perl it's my own script language, and instead of Apache it's my own mini-web server, and there's also a rule/workflow server, a servlet harness, and dedicated search/retrieve/citation servers.

The idea was to move from doing everything in custom C++ and start implementing logic in the script language, with dynamic libraries extending the language to hook into libcurl and ImageMagick and such. But for that to work I need to stop tweaking the base classes and core libraries.

Overall, it's great fun, and we're doing some really good things with it, but deep down in the bowels of the system lies a horrid little bug.

It only takes one segv to stop the whole tangle of threads.

But I'll get it, with the help of Purify, the -g flag, and several more hours squinting at hundreds of lines of cout output.

Is it weird that I'm a little disappointed that Thanksgiving is going to truncate my bug hunt? I mean, I'm not actually saying that I want to forget about turkey and lock myself in the room with emacs, iTunes, and coffee until the bug is dead. I'm not saying that.

Stupid bug.

Fun times.

Copyright 2008 Daniel LaFavers