Pattern matching and the compiler can be surprising

01/04/2024
C#.NET

Pattern matching is a powerful feature in C#. It allows you to match a value against a pattern and extract information from the value. The compiler does the magic for you, and sometimes it struggles with that.

On a mailing list, there was this interesting bit of code:

Console.WriteLine(GetHashcode(new Derived()));

int GetHashcode(BaseBase baseBase)
{
    return baseBase switch {
        Derived { Foo: { } f, Bar: { } b } => HashCode.Combine(f, b),
        Base { Foo: { } f } => f.GetHashCode(),
        _ => 0,
    };
}

public class BaseBase;

public class Base : BaseBase
{
    public object? Foo { get; set; }
}

public class Derived : Base
{ 
    public object? Bar { get; set; }
}

The interesting bit is the GetHashcode function that takes a BaseBase and returns a hash code. It uses pattern matching to check the type and compute the hash code. The first arm, Derived { Foo: { } f, Bar: { } b }, only matches when both Foo and Bar are not null. The second arm checks only against the Base type (so we go from most specific to least specific). What is the surprise here? Well, if you compile that code you will get a warning:

Program.cs(9,32): Warning CS8602 : Dereference of a possibly null reference.

Line 9 is this part: Base { Foo: { } f } => f.GetHashCode(),. That is surprising, isn't it? We explicitly check that Foo is not null, so why does the compiler complain about it? For that we have to look at the lowered code the compiler generates for GetHashcode:

internal static int GetHashcode(BaseBase baseBase)
{
    Derived derived = baseBase as Derived;
    object foo2;
    if (derived != null)
    {
        object foo = derived.Foo;
        if (foo != null)
        {
            object bar = derived.Bar;
            if (bar == null)
            {
                Base @base = (Base)baseBase;
                foo2 = @base.Foo;
                goto IL_005c;
            }
            return HashCode.Combine(foo, bar);
        }
    }
    else
    {
        Base @base = baseBase as Base;
        if (@base != null)
        {
            foo2 = @base.Foo;
            if (foo2 != null)
            {
                goto IL_005c;
            }
        }
    }
    return 0;
    IL_005c:
    return foo2.GetHashCode();
}

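The part the compiler complains about in the lowered code is: return foo2.GetHashCode();. Why? If we only have a Base and not a Derived, the cast to Derived fails and we go into the else branch of the function, where foo2 is checked for null before the jump. The other way to reach the label is a Derived whose Foo is not null but whose Bar is null: there the code reads Foo a second time into foo2 and jumps via goto without checking for null again. So the problem seems to be that goto path, which is where the static analysis gets into trouble.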

So if you have that exact pattern in your code, you might want to be aware of that. There are super rare edge cases, like the following one, where the warning is justified and the code actually throws a NullReferenceException:

public class Base : BaseBase
{
    private object? foo;

    public object? Foo
    {
        get
        {
            Console.WriteLine("I was Called");
            var r = foo;
            foo = null;
            return r;
        }

        set => foo = value;
    }
}

In this case, you will receive a NullReferenceException when Foo is set and Bar is null. The reason is that your property getter is called twice (you can see "I was Called" two times on the console before the exception).
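To make the double read easier to see in isolation, here is a minimal hand-written sketch of the unchecked path from the lowered code. It is not compiler output. ConsumingFoo, Demo and ReadTwice are hypothetical names introduced only for illustration, and the getter counts its calls instead of writing to the console.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for the Base class above: the getter hands out
// the stored value once and returns null on every later call.
public class ConsumingFoo
{
    private object? foo = new object();

    public int Calls { get; private set; }

    public object? Foo
    {
        get
        {
            Calls++;
            var r = foo;
            foo = null;
            return r;
        }
    }
}

public static class Demo
{
    // Mirrors the shape of the lowered code: the first read is the one that
    // gets null-checked, the second read is the one that is used afterwards.
    public static (bool firstWasNull, bool secondWasNull, int calls) ReadTwice(ConsumingFoo c)
    {
        object? first = c.Foo;   // corresponds to "foo = derived.Foo"
        object? second = c.Foo;  // corresponds to "foo2 = @base.Foo"
        return (first is null, second is null, c.Calls);
    }

    public static void Main()
    {
        var (firstWasNull, secondWasNull, calls) = ReadTwice(new ConsumingFoo());
        // prints "first null: False, second null: True, calls: 2"
        Console.WriteLine($"first null: {firstWasNull}, second null: {secondWasNull}, calls: {calls}");
    }
}
```

The first read passes the null check, but the value that is dereferenced comes from the second read, which is exactly the access the warning points at.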

I was unaware that pattern matching can read a property twice (and that it doesn't check for null again on the second access).
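If the double read is a concern, one possible workaround is to read Foo into a local once and branch on that local. The following is only a sketch under the assumption that the class hierarchy matches the first listing; Hashing and GetHashcodeSingleRead are hypothetical names, and other rewrites may work just as well.

```csharp
#nullable enable
using System;

public class BaseBase { }

public class Base : BaseBase
{
    public object? Foo { get; set; }
}

public class Derived : Base
{
    public object? Bar { get; set; }
}

public static class Hashing
{
    public static int GetHashcodeSingleRead(BaseBase baseBase)
    {
        // Not a Base at all: same result as the discard arm of the switch.
        if (baseBase is not Base b)
        {
            return 0;
        }

        // Read Foo exactly once; every later use goes through this local.
        object? foo = b.Foo;
        if (foo is null)
        {
            return 0;
        }

        // Only Derived has Bar; fall back to Foo's hash when Bar is null.
        return b is Derived { Bar: { } bar }
            ? HashCode.Combine(foo, bar)
            : foo.GetHashCode();
    }

    public static void Main()
    {
        Console.WriteLine(GetHashcodeSingleRead(new BaseBase())); // prints 0
        Console.WriteLine(GetHashcodeSingleRead(new Derived()));  // prints 0, Foo is null
    }
}
```

Because the null check and the dereference use the same local, a getter with side effects is called only once per invocation.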
