Lock statement patterns

27/02/2024

A few weeks back, I wrote an article, "A new lock type in .NET 9", where I showcased the new Lock type. Nothing fancy - but at least it was more expressive. Now the .NET team has gone a step further!

Lock statement patterns

Beginning with .NET 9, the compiler treats the lock keyword a bit differently when the locked object is of type System.Threading.Lock.

Here is a simple example:

var lockObject = new Lock();

lock (lockObject)
{
    Console.WriteLine("I am locked in");
}

Console.WriteLine("Hello, World!");

This will compile to:

Lock.Scope scope = new Lock().EnterScope();
try
{
    Console.WriteLine("I am locked in");
}
finally
{
    scope.Dispose();
}
Console.WriteLine("Hello, World!");

That looks different from the usual lock statement. The Lock class doesn't use a Monitor under the hood, while the "old" approach does. So this code:

var lockObj = new object();

lock (lockObj)
{
    Console.WriteLine("I am locked in");
}

will compile to:

object obj = new object();
bool lockTaken = false;
try
{
    Monitor.Enter(obj, ref lockTaken);
    Console.WriteLine("I am locked in");
}
finally
{
    if (lockTaken)
    {
        Monitor.Exit(obj);
    }
}

The obvious reason for this change is to optimize code paths that use locking. Let's benchmark the two variants.

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Benchi>();

[MemoryDiagnoser]
public class Benchi
{
    private static readonly object lockObject = new object();
    private static readonly Lock lockLock = new Lock();

    [Benchmark(Baseline = true)]
    public async Task<int> CountTo1000WithLock()
    {
        var count = 0;
        var tasks = new Task[10];
        for (var t = 0; t < tasks.Length; t++)
        {
            tasks[t] = Task.Run(() =>
            {
                for (var i = 0; i < 100; i++) // Each task counts to 100, summing up to 1000
                {
                    lock (lockObject)
                    {
                        count++;
                    }
                }
            });
        }

        await Task.WhenAll(tasks);
        return count;
    }
    
    [Benchmark]
    public async Task<int> CountTo1000WithLockClass()
    {
        var count = 0;
        var tasks = new Task[10];
        for (var t = 0; t < tasks.Length; t++)
        {
            tasks[t] = Task.Run(() =>
            {
                for (var i = 0; i < 100; i++) // Each task counts to 100, summing up to 1000
                {
                    lock (lockLock)
                    {
                        count++;
                    }
                }
            });
        }

        await Task.WhenAll(tasks);
        return count;
    }
}

Results:

BenchmarkDotNet v0.13.12, macOS Sonoma 14.3.1 (23D60) [Darwin 23.3.0]
Apple M2 Pro, 1 CPU, 12 logical and 12 physical cores
.NET SDK 9.0.100-preview.2.24121.2
  [Host]     : .NET 9.0.0 (9.0.24.12011), Arm64 RyuJIT AdvSIMD
  DefaultJob : .NET 9.0.0 (9.0.24.12011), Arm64 RyuJIT AdvSIMD


| Method                   | Mean      | Error    | StdDev   | Ratio | Gen0   | Allocated | Alloc Ratio |
|------------------------- |----------:|---------:|---------:|------:|-------:|----------:|------------:|
| CountTo1000WithLock      | 107.22 us | 1.561 us | 1.460 us |  1.00 | 0.1221 |   1.06 KB |        1.00 |
| CountTo1000WithLockClass |  75.73 us | 0.884 us | 0.827 us |  0.71 | 0.1221 |   1.05 KB |        0.99 |

That is a roughly 29% improvement (the mean ratio is 0.71). In the future, we can expect other patterns to be optimized as well. This is a great step forward for the .NET runtime and the C# language.

Differences

Besides performance, I want to talk more about the differences between the Monitor-based lock and the System.Threading.Lock type. A comment by @manuel-ornato made me think I should shed more light on this.

Currently, you have to enable this as a preview feature in your csproj to make it work - which should tell you that there are still rough edges the team wants to address. Lock and Monitor can also behave a bit differently: while performance is the main driver, there are scenarios where you may be worse off. Here is a comment from the current state of the runtime repository:
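For reference, here is a minimal sketch of the csproj setup I would expect at the time of writing, assuming a .NET 9 preview SDK is installed; opting into the preview language version is the switch that gates the new lowering:

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net9.0</TargetFramework>
    <!-- Required while the new lock lowering is still a preview feature -->
    <LangVersion>preview</LangVersion>
  </PropertyGroup>
</Project>
```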

// The lock is mostly fair to release waiters in a typically FIFO order (though the order is not guaranteed).
// However, it allows non-waiters to acquire the lock if it's available to avoid lock convoys.
//
// Lock convoys can be detrimental to performance in scenarios where work is being done on multiple threads and
// the work involves periodically taking a particular lock for a short time to access shared resources. With a
// lock convoy, once there is a waiter for the lock (which is not uncommon in such scenarios), a worker thread
// would be forced to context-switch on the subsequent attempt to acquire the lock, often long before the worker
// thread exhausts its time slice. This process repeats as long as the lock has a waiter, forcing every worker
// to context-switch on each attempt to acquire the lock, killing performance and creating a positive feedback
// loop that makes it more likely for the lock to have waiters. To avoid the lock convoy, each worker needs to
// be allowed to acquire the lock multiple times in sequence despite there being a waiter for the lock in order
// to have the worker continue working efficiently during its time slice as long as the lock is not contended.
//
// This scheme has the possibility to starve waiters. Waiter starvation is mitigated by other means, see
// TryLockBeforeSpinLoop() and references to ShouldNotPreemptWaiters.

In the implementation, you will also find that a spin loop is sometimes used. Also think of scenarios where you want to explicitly inform waiting threads:

lock (_lock)
{
    // Action here...
    Monitor.Pulse(_lock);
}

To my knowledge, you can't do that with the current state of the System.Threading.Lock type.
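To make that limitation concrete, here is a minimal sketch of a blocking queue built on Monitor.Wait and Monitor.Pulse - the coordination pattern that currently has no equivalent on System.Threading.Lock. The class and member names are mine, purely for illustration:

```csharp
using System.Collections.Generic;
using System.Threading;

public class BlockingQueue<T>
{
    private readonly Queue<T> _queue = new();
    private readonly object _lock = new();

    public void Enqueue(T item)
    {
        lock (_lock)
        {
            _queue.Enqueue(item);
            // Wake up one thread that is blocked inside Dequeue.
            Monitor.Pulse(_lock);
        }
    }

    public T Dequeue()
    {
        lock (_lock)
        {
            // Wait releases the lock, blocks until pulsed,
            // then reacquires the lock before continuing.
            while (_queue.Count == 0)
            {
                Monitor.Wait(_lock);
            }
            return _queue.Dequeue();
        }
    }
}
```

Because Monitor.Wait and Monitor.Pulse require the caller to hold a Monitor on the given object, this only works with the classic object-based lock, not with a Lock instance.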
