Thursday, 29 May 2008

WeakEvent - you wish it was here

At first glance .NET events are an easy and harmless way to decouple components. The former is true but the latter is not. Whenever an instance of a class subscribes to an event published by another object, a strong link between the two is established: the subscriber (listener) won't be garbage-collected as long as the publisher is alive. The only way to break that link is to unsubscribe from the event, which is easy to forget because the link is not explicit. There are also cases when explicit (deterministic) cancellation of a subscription is impossible. If, on top of that, the publisher is a long-living object then we might face a memory leak. In an ideal world there would be a way of specifying that a subscription is weak, meaning that if the subscription is the only link to an object then the object can be garbage-collected and the subscription deleted. .NET does not provide that facility out of the box but fortunately it provides building blocks that in most cases let us build a good enough solution. The idea is to intercept the calls that add subscribers to and remove them from an event and create a weak link between subscribers and publisher instead of the default, strong one.
When you define an event you don't have to write the add/remove methods on your own because the C# compiler generates them automatically. Basically the following code snippet:

public event EventHandler<EventArgs> MyEvent;

is just "syntactic sugar" that the C# compiler transforms into a much more verbose form. You can find a detailed description of this process in CLR via C# by Jeffrey Richter. From our perspective the most important thing is that we can override the default behavior of the compiler and inject our own implementation of the add and remove methods in a way that is completely transparent to subscribers. The SomeClass and Subscriber classes show how it can be done. Don't worry about the WeakEvent<T> class as it will be explained later.

public class SomeClass
{
    // The field has to be initialized; otherwise add/remove would throw
    // a NullReferenceException on the first subscription.
    private readonly WeakEvent<EventArgs> myWeakEvent = new WeakEvent<EventArgs>();

    public event EventHandler<EventArgs> MyWeakEvent
    {
        add
        {
            myWeakEvent.Add(value);
        }

        remove
        {
            myWeakEvent.Remove(value);
        }
    }

    private void SomeMethodThatNeedsToRaiseMyWeakEvent()
    {
        OnMyWeakEvent(new EventArgs());
    }

    protected void OnMyWeakEvent(EventArgs args)
    {
        myWeakEvent.Invoke(args);
    }
}

public class Subscriber
{
    private SomeClass someClass;

    public Subscriber()
    {
        someClass = new SomeClass();
        // Looks like an ordinary subscription; the weak link is invisible here.
        someClass.MyWeakEvent += Method;
    }

    private void Method(object sender, EventArgs e)
    {
    }
}


The Add and Remove methods take a delegate as the input parameter. Every .NET delegate is an object with two properties. One of them is a reference to the target of the delegate (the object the delegate will be called on) and the second one is a description of the method, provided as an instance of the System.Reflection.MethodInfo class. Delegates bound to static methods have the target property set to null. The target is the root of all evil as it keeps the subscriber alive (it is a strong reference to the object the delegate will be called on). Fortunately the .NET Framework provides a class that can act as a man in the middle between the method and its target, which lets us break the direct link between them.
The class that makes it possible is called (no surprise) System.WeakReference. An instance of System.WeakReference keeps a weak reference (instead of a strong one) to the object that is passed to its constructor. The weak reference can be turned back into a strong reference by reading its Target property and storing the value in an ordinary variable. As long as that variable is in use, the object cannot be garbage-collected. If the object has already been garbage-collected then the property returns null. All the aforementioned functionality is encapsulated in a custom class that I called WeakDelegate.

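using System;
using System.Reflection;

internal class WeakDelegate
{
    // Weak link to the subscriber; wraps null for static methods.
    private WeakReference target;
    private MethodInfo method;

    public object Target
    {
        get
        {
            // Returns null once the subscriber has been garbage-collected.
            return target.Target;
        }
        set
        {
            target = new WeakReference(value);
        }
    }

    public MethodInfo Method
    {
        get { return method; }
        set { method = value; }
    }
}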
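
To see the two building blocks in isolation, here is a minimal sketch of taking a delegate apart into a weakly held target plus a MethodInfo and putting it back together just before the call. The names Listener, DelegateAnatomySketch, Weaken and TryInvoke are made up for this illustration and are not part of the classes in this post.

```csharp
using System;
using System.Reflection;

// Hypothetical subscriber used only for this illustration.
public class Listener
{
    public int Calls;

    public void OnSomething(object sender, EventArgs e)
    {
        Calls++;
    }
}

public static class DelegateAnatomySketch
{
    // Splits a delegate into a weakly referenced target and its method.
    public static WeakReference Weaken(EventHandler<EventArgs> handler, out MethodInfo method)
    {
        method = handler.Method;
        return new WeakReference(handler.Target);
    }

    // Rebuilds the delegate and calls it, unless the target is already gone.
    public static bool TryInvoke(WeakReference weakTarget, MethodInfo method,
                                 object sender, EventArgs args)
    {
        // The local variable is a strong reference: the target cannot be
        // collected between the null check and the call.
        object target = weakTarget.Target;

        if (target == null && !method.IsStatic)
        {
            return false;
        }

        EventHandler<EventArgs> handler = (EventHandler<EventArgs>)
            Delegate.CreateDelegate(typeof(EventHandler<EventArgs>), target, method);
        handler(sender, args);
        return true;
    }

    public static void Main()
    {
        Listener listener = new Listener();
        EventHandler<EventArgs> original = listener.OnSomething;

        // The delegate itself holds a strong reference to the listener.
        Console.WriteLine(ReferenceEquals(original.Target, listener)); // True
        Console.WriteLine(original.Method.Name);                       // OnSomething

        MethodInfo method;
        WeakReference weakTarget = Weaken(original, out method);

        Console.WriteLine(TryInvoke(weakTarget, method, null, EventArgs.Empty)); // True
        Console.WriteLine(listener.Calls);                                       // 1
    }
}
```

The sketch deliberately does not try to demonstrate the garbage-collected case, because when the collector actually reclaims an object is not deterministic.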

WeakEvent<T> is a class that takes advantage of the WeakDelegate class to solve the problem outlined in the first paragraph. Its implementation below is rather straightforward but two pieces of code might need some explanation. The first one is inside the Invoke method. Internally we store instances of the WeakDelegate class, which means that we cannot invoke them directly; every time one of them needs to be executed we have to assemble an instance of System.Delegate. I don't know if the way the code creates delegates is the fastest one but I measured the execution time of that statement and the average time was 0.005384 ms per delegate, which is fast enough for me. The second one is related to the fact that the locking is done in a way that prevents threads from waiting forever. If a thread can't enter the critical section within 15 seconds then it throws an exception. The rationale behind that approach is explained in my earlier post, "Never ever synchronize threads without specifying a timeout value".

using System;
using System.Collections.Generic;
using System.Reflection;
using System.Threading;

public class WeakEvent<T> where T : EventArgs
{
    // Parameterless callback that ExecuteExclusively runs while holding eventLock.
    private delegate void Operation();

    private readonly List<WeakDelegate> eventHandlers;
    private readonly object eventLock;

    public WeakEvent()
    {
        eventHandlers = new List<WeakDelegate>();
        eventLock = new object();
    }

    public void Invoke(T args)
    {
        ExecuteExclusively(delegate
        {
            for (int i = 0; i < eventHandlers.Count; i++)
            {
                WeakDelegate weakDelegate = eventHandlers[i];
                // Don't move this line to the ELSE block: the local variable is
                // a strong reference that keeps the target alive between the
                // validity check and the call.
                Object target = weakDelegate.Target;

                if (IsWeakDelegateInvalid(target, weakDelegate.Method))
                {
                    // The subscriber has been garbage-collected.
                    eventHandlers.RemoveAt(i);
                    i--;
                }
                else
                {
                    Delegate realDelegate = Delegate.CreateDelegate(typeof(EventHandler<T>),
                        target, weakDelegate.Method);
                    EventHandler<T> eventHandler = (EventHandler<T>)realDelegate;
                    eventHandler(this, args);
                }
            }
        });
    }

    public void Remove(EventHandler<T> value)
    {
        ExecuteExclusively(delegate
        {
            for (int i = 0; i < eventHandlers.Count; i++)
            {
                WeakDelegate weakDelegate = eventHandlers[i];
                Object target = weakDelegate.Target;

                if (IsWeakDelegateInvalid(target, weakDelegate.Method))
                {
                    eventHandlers.RemoveAt(i);
                    i--;
                }
                else
                {
                    if (value.Target == target && value.Method == weakDelegate.Method)
                    {
                        eventHandlers.RemoveAt(i);
                        i--;
                    }
                }
            }
        });
    }

    public void Add(EventHandler<T> value)
    {
        ExecuteExclusively(delegate
        {
            RemoveInvalidDelegates();

            WeakDelegate weakDelegate = new WeakDelegate();
            weakDelegate.Target = value.Target;
            weakDelegate.Method = value.Method;

            eventHandlers.Add(weakDelegate);
        });
    }

    private void RemoveInvalidDelegates()
    {
        for (int i = 0; i < eventHandlers.Count; i++)
        {
            WeakDelegate weakDelegate = eventHandlers[i];

            if (IsWeakDelegateInvalid(weakDelegate))
            {
                eventHandlers.RemoveAt(i);
                i--;
            }
        }
    }

    private void ExecuteExclusively(Operation operation)
    {
        // Never wait for the lock forever; give up after 15 seconds.
        bool result = Monitor.TryEnter(eventLock, TimeSpan.FromSeconds(15));

        if (!result)
        {
            throw new TimeoutException("Couldn't acquire a lock");
        }

        try
        {
            operation();
        }
        finally
        {
            Monitor.Exit(eventLock);
        }
    }

    private bool IsWeakDelegateInvalid(WeakDelegate weakDelegate)
    {
        return IsWeakDelegateInvalid(weakDelegate.Target, weakDelegate.Method);
    }

    // A null target is only a problem for instance methods;
    // static methods never have a target.
    private bool IsWeakDelegateInvalid(object target, MethodInfo method)
    {
        return target == null && !method.IsStatic;
    }
}



You might have noticed that there is some housekeeping going on whenever one of the Add, Remove or Invoke methods is called. The reason is that WeakEvent<T> keeps a collection of WeakDelegate objects that might contain methods bound to objects (targets) that have been garbage-collected. In other words we need to take care of getting rid of invalid delegates on our own. Solutions to this problem can vary from very simple to very sophisticated. The one that works in my case basically scans the collection of delegates and removes invalid ones every time a delegate is added or removed or the event is invoked. It might sound like overkill but it works fine for events that have around 1000-5000 subscribers and it's very simple. You might want to have a background thread that checks the collection every X seconds but then you need to figure out the right value of X for your case. You can go even further and make the value adaptive but then your solution gets even more complicated. In my case the simplest solution works perfectly fine.
Hopefully this post will save someone an evening or two :).

Monday, 28 April 2008

Machines are predictable, people are not

 

I suppose we would all agree with that and that's why smart people try to develop processes to make us more predictable. On the other hand nobody likes being constrained by anything and especially a process. Some people call this kind of lack of structure freedom, some call it chaos :). From my experience a bit of process might actually help a lot whereas a complete lack of it leads sooner or later to a disaster. Scrum is one of the approaches that let people develop software in a predictable way and that's the topic of the next MTUG event (29th April) that I'm not going to miss. See you there.


Wednesday, 16 April 2008

Never ever synchronize threads without specifying a timeout value

Whenever there is more than one thread and more than one shared resource there must be some synchronization in place to make sure that the overall state of the application is consistent. Synchronization is not easy as it very often involves locking, which can very easily lead to all sorts of deadlocks and performance bottlenecks. One of the ways of keeping out of trouble is to follow a set of guidelines. I can list at least a few sources of information worth getting familiar with:
And of course :) my two cents or rather lessons I've learnt writing and/or debugging multithreaded code:
  1. Minimize locking - Basically lock as little as possible and never execute code that is not related to a given shared resource in its critical section. Most of the problems I've seen were caused by code in a critical section doing more than was absolutely needed.
  2. Always use timeout - Surprisingly all synchronization primitives tend to encourage developers to use overloads that never time out. One of the drawbacks of this approach is that if there is a problem with a piece of code then the application hangs and nobody has any idea why. The only way to figure that out is to create a dump of the process (if you are lucky enough and the process is still hanging around) and debug it using Debugging Tools for Windows. I can tell you that this is not the best way of tackling production issues when every minute matters. But if you only use APIs that let you specify a timeout then whenever a thread fails to enter a critical section within a given period of time it can throw an exception and it's immediately obvious what went wrong.

    Default                 Preferred
    Monitor.Enter(obj)      Monitor.TryEnter(obj, timeout)
    WaitHandle.WaitOne()    WaitHandle.WaitOne(timeout, context)

    The same logic applies to all classes that derive from WaitHandle: Semaphore, Mutex, AutoResetEvent, ManualResetEvent.
  3. Never call external code when in a critical section - Calling a piece of code that was passed to a critical section handler from outside is a big risk because there is always a good chance that at the time the code was designed nobody even thought that it might be run in a critical section. Such code might try to execute a long running task or to acquire another critical section. If you do something like that you simply ask for trouble :)
I suppose it's easy to figure out which one has bitten me the most :).
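
The "always use timeout" rule can be wrapped in a small helper so that nobody is tempted to reach for Monitor.Enter. What follows is only a sketch: TimedLock, Run and TimedLockDemo are names I made up for this example, and the 5 second and 100 ms timeouts are arbitrary.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs a piece of code under a lock, but never waits forever.
public static class TimedLock
{
    public static void Run(object gate, TimeSpan timeout, Action body)
    {
        if (!Monitor.TryEnter(gate, timeout))
        {
            throw new TimeoutException("Couldn't acquire the lock within " + timeout);
        }

        try
        {
            body();
        }
        finally
        {
            Monitor.Exit(gate);
        }
    }
}

public static class TimedLockDemo
{
    // Returns true if a second thread gives up with a TimeoutException
    // while another thread is holding the lock.
    public static bool TimesOutUnderContention()
    {
        object gate = new object();
        TimeSpan patience = TimeSpan.FromSeconds(5);

        using (ManualResetEvent held = new ManualResetEvent(false))
        using (ManualResetEvent release = new ManualResetEvent(false))
        {
            // The holder takes the lock and keeps it until it is told to let go.
            Thread holder = new Thread(() => TimedLock.Run(gate, patience, () =>
            {
                held.Set();
                release.WaitOne(patience);
            }));
            holder.Start();

            if (!held.WaitOne(patience))
            {
                throw new TimeoutException("The holder thread never took the lock");
            }

            bool timedOut = false;
            try
            {
                TimedLock.Run(gate, TimeSpan.FromMilliseconds(100), () => { });
            }
            catch (TimeoutException)
            {
                timedOut = true;
            }

            release.Set();
            holder.Join(patience);
            return timedOut;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TimesOutUnderContention()); // True
    }
}
```

Note that even the waits on the events and the Join call use timeouts. The helper still runs caller-supplied code inside the critical section, so rule 3 applies to whatever is passed as the body.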

Wednesday, 26 March 2008

MIX summary in Dublin

It looks like there will be a micro MIX-like event in Dublin in May - http://visitmix.com/2008/worldwide/. It might be interesting.


Sunday, 24 February 2008

There is no perfect job

I suppose we all know that there are always some "ifs" and "buts". Edge Pereira wrote a blog post about a few of them that are related to human-human interaction. If I had to choose a single sentence from his post I would go for this one: "if an employee does not know the reason of his daily work, he will never wear the company's jersey". Needless to say I totally agree with the whole post.


Friday, 15 February 2008

ReSharper 4 - nightly builds available at last

At this stage I nearly refuse to write code without ReSharper. I know it's bad but that's not the worst addiction ever :). Fortunately, JetBrains decided to release nightly builds of ReSharper 4 to the public. Sweet.

Tuesday, 12 February 2008

C# generics - parameter variance, its constraints and how it affects WCF

CLR generics are great and there is no doubt about that. Unfortunately, C# doesn't expose the whole beauty of them because all generic type parameters in C# are nonvariant, even though from the CLR's point of view they can be marked as nonvariant, covariant or contravariant. You can find more details about that topic here and here. In short, "nonvariant" means that even though type B is a subtype of type A, SomeType<B> is not a subtype of SomeType<A>, and therefore the following code won't compile:

List<String> stringList = null;
List<object> objectList = stringList; // this line causes a compilation error

Error 1 Cannot implicitly convert type 'System.Collections.Generic.List<string>' to 'System.Collections.Generic.List<object>'. An explicit conversion exists (are you missing a cast?)

Generics are all over the place in WCF and you would think that this is always beneficial to all of us. Well, it depends. One of the problems I noticed is that you cannot easily handle generic types in a generic way. I know it does not sound good :) but that's what I wanted to say. The best example is ClientBase<T>, the base class for auto-generated proxies. VS.NET generates a proxy type per contract (interface), which might lead to a situation where you need to manage quite a few different proxies. Let's assume that we use username and password as our authentication method and we want to have a single place where the credentials are set. The method might look like the one below:

public void ConfigureProxy(ClientBase<Object> proxy)
{
    proxy.ClientCredentials.UserName.UserName = "u";
    proxy.ClientCredentials.UserName.Password = "p";
}

Unfortunately we can't pass a proxy of type ClientBase<IMyContract> to that method because of the nonvariant nature of C# generics. I can see at least two ways of getting around that issue. The first one requires you to clutter the method with a generic parameter despite the fact that there is no use for it.

public void ConfigureProxy<T>(ClientBase<T> proxy) where T : class
{
    proxy.ClientCredentials.UserName.UserName = "u";
    proxy.ClientCredentials.UserName.Password = "p";
}

You can imagine I'm not a big fan of this solution. The second one is based on the idea that the non-generic part of the public interface of the ClientBase class is exposed as either a non-generic ClientBase class or a non-generic interface IClientBase. Approach based on a non-generic class:

public abstract class ClientBase : ICommunicationObject, IDisposable
{
    public ClientCredentials ClientCredentials
    {
        //some code goes here
    }
}

public abstract class ClientBase<T> : ClientBase where T : class
{
}

Approach based on a non-generic interface:

public interface IClientBase : ICommunicationObject, IDisposable
{
    ClientCredentials ClientCredentials { get; }
}

public abstract class ClientBase<T> : IClientBase where T : class
{
}

Having that hierarchy in place we could define our method in the following way:


public void ConfigureProxy(ClientBase/IClientBase proxy)
{
    proxy.ClientCredentials.UserName.UserName = "u";
    proxy.ClientCredentials.UserName.Password = "p";
}

Unfortunately the WCF architects haven't thought about that and a non-generic ClientBase/IClientBase class/interface doesn't exist. The interesting part of this story is that the FaultException<T> class does not suffer from the same problem because there is a non-generic FaultException class that exposes all the non-generic members. The FaultException<T> class basically adds a single property that returns the detail of the fault and that's it. I can find more classes that are implemented in exactly the same way as FaultException<T>. It looks like ClientBase<T> is the only widely used class that breaks that rule. I would love to see this inconvenience fixed as an extension of C# parameter variance.
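
The non-generic base class idea can be tried out without WCF. The sketch below uses made-up stand-ins: Credentials, ProxyBase, ProxyBase<TChannel>, IOrderService, IBillingService and ProxyConfigurator are all hypothetical and none of them exists in WCF. It only shows that once the non-generic members live in a non-generic base, a single method can configure proxies for any contract.

```csharp
using System;

// Hypothetical stand-in for the credentials object exposed by a proxy.
public class Credentials
{
    public string UserName;
    public string Password;
}

// Hypothetical non-generic base: everything that doesn't depend on T lives here.
public abstract class ProxyBase
{
    private readonly Credentials credentials = new Credentials();

    public Credentials Credentials
    {
        get { return credentials; }
    }
}

// Hypothetical generic layer: only contract-specific members would be added here.
public class ProxyBase<TChannel> : ProxyBase where TChannel : class
{
}

// Two unrelated, made-up contracts.
public interface IOrderService { }
public interface IBillingService { }

public static class ProxyConfigurator
{
    // No generic parameter needed: any ProxyBase<T> is also a ProxyBase.
    public static void ConfigureProxy(ProxyBase proxy)
    {
        proxy.Credentials.UserName = "u";
        proxy.Credentials.Password = "p";
    }

    public static void Main()
    {
        ProxyBase<IOrderService> orders = new ProxyBase<IOrderService>();
        ProxyBase<IBillingService> billing = new ProxyBase<IBillingService>();

        ConfigureProxy(orders);
        ConfigureProxy(billing);

        Console.WriteLine(orders.Credentials.UserName);  // u
        Console.WriteLine(billing.Credentials.Password); // p
    }
}
```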