Monday 28 April 2008

Machines are predictable, people are not


I suppose we would all agree with that and that's why smart people try to develop processes to make us more predictable. On the other hand nobody likes being constrained by anything and especially a process. Some people call this kind of lack of structure freedom, some call it chaos :). From my experience a bit of process might actually help a lot whereas a complete lack of it leads sooner or later to a disaster. Scrum is one of the approaches that let people develop software in a predictable way and that's the topic of the next MTUG event (29th April) that I'm not going to miss. See you there.

Tags: , ,

Wednesday 16 April 2008

Never ever synchronize threads without specifying a timeout value

Whenever there is more then one thread and more then one shared resource there must be some synchronization in place to make sure that the overall state of the application is consistent. Synchronization is not easy as it very often involves locking which very easily might lead to all sorts of deadlocks and performance bottlenecks. One of the ways of keeping out of trouble is to follow a set of guidelines. I can list at least a few sources of information worth getting familiar with:
And of course :) my two cents or rather lessons I've learnt writing and/or debugging multithreaded code:
  1. Minimize locking  -  Basically lock as little as possible and never execute code that is not related to a given shared resource in its critical section. The most problems I've seen were related to the fact that code in a critical section did more then it was absolutely needed.
  2. Always use timeout - Surprisingly all synchronization primitives tend to encourage developers to use overloads that never time out. One of the drawbacks of this approach is the fact that if there is a problem with a piece of code then an application hangs and nobody has an idea why. The only way to figure that out is to create a dump of a process (if you are lucky enough and the process is still hanging around) and debug it using  Debugging Tools for Windows. I can tell you that this is not the best way of tackling production issues when every minute matters. But if you use only API that lets you specify a timeout then whenever a thread fails to acquire a critical section within a given period of time it can throw an exception and it's immediately obvious what went wrong.

    Monitor.TryEnter(obj, timeout)
    WaitHandle.WaitOne(timeout, context)

    The same logic applies to all classes that derive from WaitHandle: Semaphore, Mutex, AutoResetEvent, ManualResetEvent.
  3. Never call external code when in a critical section - Calling a piece of code that was passed to a critical section handler from outside is a big risk because there is always a good chance that at the time the code was designed nobody even thought that it might be run in a critical section. Such code might try to execute a long running task or to acquire another critical section. If you do something like that you simply ask for trouble :)
I suppose it's easy to figure out which one has bitten me the most :).