The Art of the Error Message
Yesterday, my boss and I were beating our heads against the wall trying to figure out why cfengine was refusing to run properly on one system. We googled and googled some more, scanning mailing lists, Usenet posts, and all manner of debugging information, because the error message we got was so… opaque.
# cfagent
cfengine:: Received signal 13 (SIGPIPE) while doing [lock.cfagent_conf.myhost.link._var_cfengine_bin_cfenvd__usr_sbin_cfenvd_136]
cfengine:: Logical start time Mon Mar 27 17:11:55 2006
cfengine:: This sub-task started really at Mon Mar 27 17:11:55 2006
After a couple of hours of heading in the wrong direction yesterday, I came in to work this morning with a clear head (and a very helpful email from my boss about something he found in a mailing list archive) and managed to get it all working. Damn skippy.
The actual problem was that the client was not on the server’s “allowed IP address” list for client connections, so the server was rejecting the connection. A 30-second fix (edit file, commit to repository, update server’s copy) and it’s all working. Now, wouldn’t it have been nicer if the client could have said something like “Hey, I got disconnected from the server unexpectedly, please check it out, kthx!”? But no: passing along the raw error condition from the OS, or rather, letting your entire handling of the error consist of some cleanup and then DYING, is apparently the way to go.
Frustrating, to say the least.
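For what it's worth, here is roughly the kind of handling I was wishing for, as a minimal Python sketch. This is not cfengine's code (cfengine is written in C), and every name in it (`ServerDisconnected`, `describe_disconnect`, `send_request`) is made up for illustration. The idea is simply to catch the broken pipe at the point where the client talks to the server and turn it into a message that says what happened and where to look.

```python
import socket


class ServerDisconnected(Exception):
    """Hypothetical error type: the server dropped the connection on us."""


def describe_disconnect(server, task):
    # Build a message a human can act on, instead of a bare "signal 13".
    return (
        f"Lost connection to server {server} while doing [{task}]. "
        "The server closed the connection unexpectedly; check that this "
        "client is on the server's allowed IP address list."
    )


def send_request(sock, payload, server, task):
    # Python ignores SIGPIPE by default, so writing to a connection the
    # peer has closed raises BrokenPipeError (or ConnectionResetError)
    # rather than killing the process.
    try:
        sock.sendall(payload)
    except (BrokenPipeError, ConnectionResetError) as exc:
        raise ServerDisconnected(describe_disconnect(server, task)) from exc
```

The caller still has to decide what to do next (retry, bail out, whatever), but at least the person reading the log gets a starting point instead of a signal number.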
