Whenever there is more then one thread and more then one shared resource there must be some synchronization in place to make sure that the overall state of the application is consistent. Synchronization is not easy as it very often involves locking which very easily might lead to all sorts of deadlocks and performance bottlenecks. One of the ways of keeping out of trouble is to follow a set of guidelines. I can list at least a few sources of information worth getting familiar with:
And of course :) my two cents or rather lessons I've learnt writing and/or debugging multithreaded code:
- Minimize locking - Basically lock as little as possible and never execute code that is not related to a given shared resource in its critical section. The most problems I've seen were related to the fact that code in a critical section did more then it was absolutely needed.
- Always use timeout - Surprisingly all synchronization primitives tend to encourage developers to use overloads that never time out. One of the drawbacks of this approach is the fact that if there is a problem with a piece of code then an application hangs and nobody has an idea why. The only way to figure that out is to create a dump of a process (if you are lucky enough and the process is still hanging around) and debug it using Debugging Tools for Windows. I can tell you that this is not the best way of tackling production issues when every minute matters. But if you use only API that lets you specify a timeout then whenever a thread fails to acquire a critical section within a given period of time it can throw an exception and it's immediately obvious what went wrong.
The same logic applies to all classes that derive from WaitHandle: Semaphore, Mutex, AutoResetEvent, ManualResetEvent.
- Never call external code when in a critical section - Calling a piece of code that was passed to a critical section handler from outside is a big risk because there is always a good chance that at the time the code was designed nobody even thought that it might be run in a critical section. Such code might try to execute a long running task or to acquire another critical section. If you do something like that you simply ask for trouble :)
I suppose it's easy to figure out which one has bitten me the most :).