Showing posts with label CLR. Show all posts
Showing posts with label CLR. Show all posts

Monday, 10 August 2009

Quick look at STM.NET performance

STM as an idea looks interesting and definitely has a lot of advantages over regular locking. It helps prevent deadlocks, makes easier to compose components that implement some kind of synchronization, helps eliminate nested locks, supports retries and so on. You can find the full list of goodness here.
When I found out that STM.NET(based on .NET 4.0) is out my first question was related to its performance. I put together a simple test and I have to say that in most scenarios STM.NET is an overkill. Sure, it’s a research project and it hasn’t been productized which means it’s not optimized and it’s not ready for prime time. But still, on average it’s 20x slower than the regular lock statement. That’s a lot taking into account that in most cases developers use very simple locking strategies and try to avoid nested locks. If Microsoft can bring it down to 2x then it is a completely different story.
The results:

The code of the benchmark:

using System;
using System.Threading;
using System.Collections;
using System.TransactionalMemory;
using System.Collections.Generic;
using System.Diagnostics;
class Program
{
public delegate void CustomAction();
[AtomicNotSupported]
static void Main(string[] args)
{
var iterations = 1000000;
Console.WriteLine("Number of Threads | STM.NET [ms] |" +
"Regular locking [ms] | Times slower");
for (var i =1 ; i < 15; i++)
{
Test(i, iterations);
}

Console.ReadLine();
}
private static void Test(int numberOfThreads, int iterations)
{
var storage = new Storage();
CustomAction action = () =>
{
Atomic.Do(() =>
{
storage.UpdateByOne();
});
};
var stm = Execute(numberOfThreads, iterations, action, storage);
storage = new Storage();
CustomAction action1 = () =>
{
lock (storage)
{
storage.UpdateByOne();
}
};
var locking = Execute(numberOfThreads, iterations, action1, storage);
int slower = (int)(stm.ElapsedMilliseconds / locking.ElapsedMilliseconds);
var message = String.Format("{0}\t\t\t{1}\t\t\t{2}\t\t\t{3}", numberOfThreads,
stm.ElapsedMilliseconds,
locking.ElapsedMilliseconds, slower);
Console.WriteLine(message);
}
private static Stopwatch Execute(int numberOfThreads, int iterations,
CustomAction action, Storage storage)
{
var stopwatch = Stopwatch.StartNew();
var threads = new List<Thread>();
for (var i = 0; i < numberOfThreads; i++)
{
var thread = new Thread(() =>
{
for (var j = 0; j < iterations; j++)
{
action();
}
});
threads.Add(thread);
}
foreach (var thread in threads)
{
thread.Start();
}
foreach (var thread in threads)
{
thread.Join();
}
stopwatch.Stop();
if (numberOfThreads * iterations != storage.Result)
{
var message = String.Format("Incorrect value. Expected: {0} Actual: {1}",
numberOfThreads*iterations, storage.Result);
throw new ApplicationException(message);
}
return stopwatch;
}
}
public class Storage
{
private long value;
public void UpdateByOne()
{
value = value + 1;
}
public long Result
{
get { return value; }
}
}

Wednesday, 22 July 2009

How many times can you load an assembly into the same AppDomain?

Actually, as many as you want, just call

Assembly.Load(File.ReadAllBytes(@"SomeFolder\MyLib.dll"));

in a loop. All copies will have their code base set to the currently executing assembly. Code base specifies the physical location of an assembly. By executing the following line

Assembly.LoadFrom(@"SomeOtherFolder\MyLib.dll");

we load another copy but this time from completely different location. Let’s add one more:

Assembly.Load("MyLib");

In this way we end up with several copies and 3 different code bases inside of the same AppDomain. This happens because each assembly is loaded into different load context. You can read about that here. Why is it like that? To be honest I’ve found no clear explanation. Some Microsoft bloggers mention that this makes order independent loading possible. But that’s it. From my perspective that’s not much to justify the fact that multiple copies of the same assembly can be loaded into the same AppDomain which is very often dangerous.
The main problem is that the runtime treats types in all copies as completely different types. This means that if you create an object of type MyClass using the type definition from the assembly loaded by Load method you won’t be able to pass it to a method that takes MyClass as a parameter but is defined in the assembly loaded by LoadFrom method. If something like that happens you get InvalidCastException which is hard to believe in because VS debugger clearly shows that the type of the parameter is correct :). It’s only when you dig deeper and check the assembly where the type comes from when you find out that they are from 2 different assemblies. This can be very confusing and hard to debug. Even if there is a valid reason behind this feature it would be great if I could run loader/runtime in some kind of strict mode where duplicates are not allowed and a meaningful exception is thrown when an assembly is requested to be loaded more than once. I can see more and more code that is loaded dynamically at runtime and this feature would be very useful.

Thursday, 29 May 2008

WeakEvent - you wish it was here

At the first glance .NET events are an easy and harmless way to decouple components. The former statement is true but the latter is not. The reason is that whenever an instance of a class subscribes to an event published by another class a strong link between these two is established. By strong link I mean that the subscriber(listener) won't get garbage-collected as long as the publisher is alive. The only way to break that link is to unsubscribe from the event which might be easily omitted as the link is not explicit. Additionally there are cases when explicit(deterministic) cancellation of subscription is impossible. If additionally the publisher is a long living object than we might face a memory leak. In ideal world there would be a way of specifying that a subscription is weak which would mean that if the subscription is the only link to an object then the object can be garbage-collected and the subscription can be deleted. .NET does not provide that facility out of the box but fortunately it provides building blocks that in most cases let us build a good enough solution. The idea is to intercept calls that add and remove subscribers from/to an event and create a weak link between subscribers and publisher instead of the default, strong one.
When you define an event you don't have to write add/remove methods on your own because C# compiler generates them automatically. Basically the following code snippet:

public event EventHandler <EventArgs> MyEvent;

is just "syntactic sugar" that C# compiler transforms to much more verbose form. You can find detailed description of this process in CLR via C#  by Jeffrey Richter. From our perspective the most important thing is that we can overwrite the default behavior of the compiler and inject our own implementation of Add and Remove methods in a way that is completely transparent to subscribers. SomeClass and Subscriber classes show how it can be done. Don't worry about WeakEvent<T> class as it will be explained later.

public class SomeClass
{
private WeakEvent <EventArgs> myWeakEvent;
public event EventHandler <EventArgs> MyWeakEvent
{
add
{
myWeakEvent.Add(value);
}

remove
{
myWeakEvent.Remove(value);
}
}
private void SomeMethodThatNeedsToRiseMyWeakEvent()
{
OnMyWeakEvent(new EventArgs());
}

protected void OnMyWeakEvent(EventArgs args)
{
myWeakEvent.Invoke(args);
}
}
public class Subscriber
{
private SomeClass someClass;
public Subscriber()
{
someClass = new SomeClass();
someClass.MyWeakEvent += Method;
}

private void Method(object sender, EventArgs e)
{
}
}


Add and Remove methods take a delegate as the input parameter. Every .NET delegate is an object with 2 properties. One of them is a reference to the target of the delegate(the object the delegate will be called on) and the second one is a description of the method which is provided as an instance of System.Reflection.MethodInfo class. Static delegates have the target property set to null. The target field is the root of all evil as it keeps the subscriber alive(it is a strong reference to the object the delegate will be called on). Fortunately .NET framework provides a class that can act as man in the middle between the method and its target which lets us break the direct link between them.
The class that makes it possible is called (no surprise) System.WeakReference. An instance of System.WeakReference class keeps a weak reference(instead of strong) to the object that is passed to its constructor. The weak reference can be transformed into the strong reference by accessing its Target property and storing its value in an ordinary variable. In this way we resurrect the object. If the object is already garbage-collected then the property returns null. All aforementioned functionality is encapsulated in a custom class that I called WeakDelegate.

internal class WeakDelegate
{
private WeakReference target;
private MethodInfo method;

public object Target
{
get
{
return target.Target;
}
set
{
target = new WeakReference(value);
}
}

public MethodInfo Method
{
get { return method; }
set { method = value; }
}
}

WeakEvent<T> is a class that takes advantage of WeakDelegate class to solve the problem outlined in the first paragraph. Its below implementation is rather straightforward but 2 pieces of code might need some explanation. The first one is inside Invoke method. Internally we store instances of WeakDelegate class which means that we can not invoke them directly and every time one of them needs to be executed we have to assemble an instance of  System.Delegate class. I don't know if the way the code creates delegates is the fastest one but I measured the execution time of that statement and the average time was 0.005384 ms per delegate which is fast enough for me. The second one is related to the fact that the locking is done in a way that prevents threads from waiting forever. If a thread can't enter the critical section within 15 seconds then it throws an exception. The rationale behind that approach is explained here.

public class WeakEvent <T> where T : EventArgs
{
private readonly List <WeakDelegate> eventHandlers;
private readonly object eventLock;

public WeakEvent()
{
eventHandlers = new List <WeakDelegate>();
eventLock = new object();
}

public void Invoke(T args)
{
ExecuteExclusively(delegate
{
for (int i = 0; i < eventHandlers.Count; i++)
{
WeakDelegate weakDelegate = eventHandlers[i];
// don't move this line to the ELSE block
//as the object needs to be resurrected
Object target = weakDelegate.Target;

if (IsWeakDelegateInvalid(target, weakDelegate.Method))
{
eventHandlers.RemoveAt(i);
i--;
}
else
{
Delegate realDelegate = Delegate.CreateDelegate(typeof(EventHandler <T>),
target, weakDelegate.Method);
EventHandler <T> eventHandler = (EventHandler <T>)realDelegate;
eventHandler(this, args);
}
}
});
}

public void Remove(EventHandler <T> value)
{
ExecuteExclusively(delegate
{
for (int i = 0; i < eventHandlers.Count; i++)
{
WeakDelegate weakDelegate = eventHandlers[i];
Object target = weakDelegate.Target;

if (IsWeakDelegateInvalid(target, weakDelegate.Method))
{
eventHandlers.RemoveAt(i);
i--;
}
else
{
if (value.Target == target && value.Method == weakDelegate.Method)
{
eventHandlers.RemoveAt(i);
i--;
}
}
}
});
}

public void Add(EventHandler <T> value)
{
ExecuteExclusively(delegate
{
RemoveInvalidDelegates();

WeakDelegate weakDelegate = new WeakDelegate();
weakDelegate.Target = value.Target;
weakDelegate.Method = value.Method;

eventHandlers.Add(weakDelegate);
});
}

private void RemoveInvalidDelegates()
{
for (int i = 0; i < eventHandlers.Count; i++)
{
WeakDelegate weakDelegate = eventHandlers[i];

if (IsWeakDelegateInvalid(weakDelegate))
{
eventHandlers.RemoveAt(i);
i--;
}
}
}

private void ExecuteExclusively(Operation operation)
{
bool result = Monitor.TryEnter(eventLock, TimeSpan.FromSeconds(15));

if (!result)
{
throw new TimeoutException("Couldn't acquire a lock");
}

try
{
operation();
}
finally
{
Monitor.Exit(eventLock);
}
}

private bool IsWeakDelegateInvalid(WeakDelegate weakDelegate)
{
return IsWeakDelegateInvalid(weakDelegate.Target, weakDelegate.Method);
}

private bool IsWeakDelegateInvalid(object target, MethodInfo method)
{
return target == null && !method.IsStatic;
}
}



You might have noticed that there is some housekeeping going on whenever one of Add, Remove or Invoke methods is called. The reason why we need to do this is that WeakEvent<T> keeps a collection of WeakDelegate objects that might contain methods bound to objects(targets) that have been garbage-collected. In other words we need to take care of getting rid of invalid delegates on our own. Solutions to this problem can vary from very simple to very sophisticated. The one that works in my case basically scans the collection of delegates and removes invalid ones every time a delegate is added, removed or the event is invoked. It might sound like overkill but it works fine for events that have around 1000-5000 subscribers and it's very simple. You might want to have a background thread that checks the collection every X seconds but then you need to figure out what is the value of X in your case. You can go even further and keep the value adaptive but then your solution gets even more complicated. In my case the simplest solutions works perfectly fine.
Hopefully this post will save someone an evening or two :).

Wednesday, 16 April 2008

Never ever synchronize threads without specifying a timeout value

Whenever there is more then one thread and more then one shared resource there must be some synchronization in place to make sure that the overall state of the application is consistent. Synchronization is not easy as it very often involves locking which very easily might lead to all sorts of deadlocks and performance bottlenecks. One of the ways of keeping out of trouble is to follow a set of guidelines. I can list at least a few sources of information worth getting familiar with:
And of course :) my two cents or rather lessons I've learnt writing and/or debugging multithreaded code:
  1. Minimize locking  -  Basically lock as little as possible and never execute code that is not related to a given shared resource in its critical section. The most problems I've seen were related to the fact that code in a critical section did more then it was absolutely needed.
  2. Always use timeout - Surprisingly all synchronization primitives tend to encourage developers to use overloads that never time out. One of the drawbacks of this approach is the fact that if there is a problem with a piece of code then an application hangs and nobody has an idea why. The only way to figure that out is to create a dump of a process (if you are lucky enough and the process is still hanging around) and debug it using  Debugging Tools for Windows. I can tell you that this is not the best way of tackling production issues when every minute matters. But if you use only API that lets you specify a timeout then whenever a thread fails to acquire a critical section within a given period of time it can throw an exception and it's immediately obvious what went wrong.

    Default
    Preferred
    Monitor.Enter(obj)
    Monitor.TryEnter(obj, timeout)
    WaitHandle.WaitOne()
    WaitHandle.WaitOne(timeout, context)

    The same logic applies to all classes that derive from WaitHandle: Semaphore, Mutex, AutoResetEvent, ManualResetEvent.
  3. Never call external code when in a critical section - Calling a piece of code that was passed to a critical section handler from outside is a big risk because there is always a good chance that at the time the code was designed nobody even thought that it might be run in a critical section. Such code might try to execute a long running task or to acquire another critical section. If you do something like that you simply ask for trouble :)
I suppose it's easy to figure out which one has bitten me the most :).

Sunday, 8 July 2007

Great book: C# via CLR

As I mentioned earlier I always wanted to read C# via CLR by Jeffrey Richter. Finally I got it a few months ago and while I was sick I read it. I think it's just brilliant because:
  • I like the way Jeffery explains problems.He is strict and precise whenever it's needed but no more.
  • As far as I know he is not a Microsoft employee which lets him express criticism of everything that deserves it.
  • It reveals lots of things that you will never be aware of unless you start thinking in an illogical way. Unfortunately CLR and/or C# not always behave in a predictable way.
  • The books touches nearly all the .NET internals that you can come across during your everyday job as long as you don't work on compilers and runtimes :).