Inlining and structs in C#

10/20/2025
5 minute read

In this - somewhat technical and barely usable - blog post, we will have a look at inlining and structs in C#, and at how the two together can improve performance in some interesting ways.

Inlining

Inlining is a compiler optimization that replaces a method call with the method's body. So if you have the following code:

public int Add(int a, int b) => a + b;

public int CalculateSum(int x, int y)
{
    return Add(x, y);
}

The JIT compiler might effectively turn it into:

public int CalculateSum(int x, int y)
{
    return x + y;
}

The obvious advantage here is that we avoid the overhead of a method call. The downside is that it can increase the size of the code, because the body of the method is copied to every place it is called, and that can have its own performance implications. The attribute [MethodImpl(MethodImplOptions.AggressiveInlining)] can be used to suggest to the JIT that a method should be inlined, even if it would not do so by default. That is just a hint, and the JIT can still choose to ignore it. In the other direction, [MethodImpl(MethodImplOptions.NoInlining)] prevents a method from being inlined.
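As a minimal sketch of where these two attributes go (the class and method names are made up for illustration), both come from the System.Runtime.CompilerServices namespace:

```csharp
using System.Runtime.CompilerServices;

// Hypothetical helper class, only meant to show how the attributes are applied.
public static class InliningHints
{
    // Asks the JIT to inline this method where it can; the JIT may still decline.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int AddInlined(int a, int b) => a + b;

    // Keeps this method as a real call at every call site.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AddNotInlined(int a, int b) => a + b;

    // Calls both variants; the result is the same regardless of what the JIT decides.
    public static int CalculateSum(int x, int y)
        => AddInlined(x, y) + AddNotInlined(x, y);
}
```

The attributes only influence the machine code the JIT produces. The observable behavior of both methods stays the same.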

structs

An integral part of structs is that they are, normally, passed by value. This means that when you pass a struct to a method, a copy of the struct is made. So:

public struct Point
{
    public int X;
    public int Y;
}

public void MovePoint(Point p)
{
    p.X += 10;
    p.Y += 10;
}

Point myPoint = new Point { X = 0, Y = 0 };
MovePoint(myPoint);
// myPoint is still { X = 0, Y = 0 }: MovePoint only changed its own copy
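The "normally" matters: with the ref keyword, a struct is passed by reference instead and no copy is made. The following is a small sketch of the difference; PointMover, MoveCopy, MoveInPlace and Demo are hypothetical names used only for this example:

```csharp
public struct Point
{
    public int X;
    public int Y;
}

// Hypothetical helper, only meant to contrast the two ways of passing a struct.
public static class PointMover
{
    // By value: p is a copy, so the caller's Point is untouched.
    public static void MoveCopy(Point p) { p.X += 10; p.Y += 10; }

    // By reference: p aliases the caller's variable, so the change is visible.
    public static void MoveInPlace(ref Point p) { p.X += 10; p.Y += 10; }

    // Returns the X value observed by the caller after each kind of call.
    public static (int AfterCopy, int AfterRef) Demo()
    {
        var point = new Point { X = 0, Y = 0 };

        MoveCopy(point);
        int afterCopy = point.X;   // unchanged, the copy was modified

        MoveInPlace(ref point);
        int afterRef = point.X;    // changed, the original was modified

        return (afterCopy, afterRef);
    }
}
```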

Ideally, you keep structs immutable for exactly that reason. I even had a blog post about that: "Mutable value types are evil! Sort of...". You also try to keep them small to avoid the overhead of copying large amounts of data. And that is where the two topics collide:

Inlining struct methods

And here is the "beauty": if we inline a method, we "erase" the need to copy the struct onto a new stack frame. The call is replaced with the body of the method, so there is no callee that needs its own copy of the argument. Inlining can therefore make passing structs cheaper.
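Conceptually, the transformation looks like the sketch below. Big and InliningSketch are hypothetical names, and the struct is kept small to keep the example short. AfterInlining is hand-written to mirror the idea and is not actual JIT output:

```csharp
// Hypothetical struct, a smaller stand-in for a "big" value type.
public struct Big
{
    public int A, B, C, D, E, F;
}

public static class InliningSketch
{
    // Takes the struct by value: a real call needs its own copy of s.
    public static int GetF(Big s) => s.F;

    // What we write: two calls, each passing the struct by value.
    public static int Written(Big big) => GetF(big) + GetF(big);

    // Roughly what remains once GetF is inlined: the field is read
    // directly from the caller's struct, so no copy has to be made.
    public static int AfterInlining(Big big) => big.F + big.F;
}
```

Both methods return the same value. Only the amount of work needed to get there differs.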

Let's have a look at the following benchmark:

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;

public class InlineVsNonInlineBenchmark
{
    private SomeBigStruct _someBigStruct;
    
    [Benchmark]
    public int NonInline() => GetFNonInline(_someBigStruct) + GetFNonInline(_someBigStruct);
    
    [Benchmark]
    public int Inline() => GetFInline(_someBigStruct) + GetFInline(_someBigStruct);
    
    [MethodImpl(MethodImplOptions.NoInlining)]
    private int GetFNonInline(SomeBigStruct s) => s.F;
    
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private int GetFInline(SomeBigStruct s) => s.F;
}

public struct SomeBigStruct
{
    public int A { get; set; }
    public int B { get; set; }
    public int C { get; set; }
    public int D { get; set; }
    public int E { get; set; }
    public int F { get; set; }
    public int G { get; set; }
    public int H { get; set; }
    public int I { get; set; }
    public int J { get; set; }
    public int K { get; set; }
    public int L { get; set; }
    public int M { get; set; }
    public int N { get; set; }
    public int O { get; set; }
    public int P { get; set; }
    public int Q { get; set; }
    public int R { get; set; }
    public int S { get; set; }
    public int T { get; set; }
    public int U { get; set; }
    public int V { get; set; }
    public int W { get; set; }
    public int X { get; set; }
    public int Y { get; set; }
    public int Z { get; set; }
}

Results:

| Method    | Mean      | Error     | StdDev    |
|---------- |----------:|----------:|----------:|
| NonInline | 3.3646 ns | 0.0130 ns | 0.0109 ns |
| Inline    | 0.0000 ns | 0.0000 ns | 0.0000 ns |

Of course, each benchmark does almost no work, so it is not ideal and should be taken with a grain of salt. A mean of 0.0000 ns does not mean the method is free; it means the work is too small to be distinguished from the measurement overhead. Impressive anyway! Now, how do we know the difference comes from avoiding the copy rather than just from avoiding the call? Well - we can increase or decrease the number of properties in SomeBigStruct and see how that affects the results. If we remove some of the properties (so we only have A to N), we get:

| Method    | Mean      | Error     | StdDev    |
|---------- |----------:|----------:|----------:|
| NonInline | 2.2966 ns | 0.0575 ns | 0.0449 ns |
| Inline    | 0.0000 ns | 0.0000 ns | 0.0000 ns |
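To put rough numbers on how much data each call has to copy, we can ask the runtime for the size of both variants. This is a sketch under assumptions: Ints26 and Ints14 are hypothetical stand-ins for the two versions of SomeBigStruct, written with plain fields instead of auto-properties to keep it short, and it assumes a runtime where System.Runtime.CompilerServices.Unsafe is available:

```csharp
using System.Runtime.CompilerServices;

// Hypothetical stand-in for SomeBigStruct with 26 ints (A to Z).
public struct Ints26
{
    public int A, B, C, D, E, F, G, H, I, J, K, L, M;
    public int N, O, P, Q, R, S, T, U, V, W, X, Y, Z;
}

// Hypothetical stand-in for the reduced version with 14 ints (A to N).
public struct Ints14
{
    public int A, B, C, D, E, F, G, H, I, J, K, L, M, N;
}

public static class StructSizes
{
    // 26 ints of 4 bytes each.
    public static int Size26() => Unsafe.SizeOf<Ints26>();

    // 14 ints of 4 bytes each.
    public static int Size14() => Unsafe.SizeOf<Ints14>();
}
```

Size26() gives 104 bytes and Size14() gives 56 bytes, so the reduced struct roughly halves the amount of data that has to be copied for every non-inlined call.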

Even without a benchmark, we can check sharplab.io:

InlineVsNonInlineBenchmark.NonInline()
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push edi
    L0004: push esi
    L0005: push ebx
    L0006: mov esi, ecx
    L0008: lea edi, [esi+4]
    L000b: sub esp, 0x38
    L000e: vmovdqu xmm0, [edi]
    L0012: vmovdqu [esp], xmm0
    L0017: vmovdqu xmm0, [edi+0x10]
    L001c: vmovdqu [esp+0x10], xmm0
    L0022: vmovdqu xmm0, [edi+0x20]
    L0027: vmovdqu [esp+0x20], xmm0
    L002d: vmovq xmm0, [edi+0x30]
    L0032: vmovq [esp+0x30], xmm0
    L0038: mov ecx, esi
    L003a: call 0x2b1b0018
    L003f: mov ebx, eax
    L0041: sub esp, 0x38
    L0044: vmovdqu xmm0, [edi]
    L0048: vmovdqu [esp], xmm0
    L004d: vmovdqu xmm0, [edi+0x10]
    L0052: vmovdqu [esp+0x10], xmm0
    L0058: vmovdqu xmm0, [edi+0x20]
    L005d: vmovdqu [esp+0x20], xmm0
    L0063: vmovq xmm0, [edi+0x30]
    L0068: vmovq [esp+0x30], xmm0
    L006e: mov ecx, esi
    L0070: call 0x2b1b0018
    L0075: add eax, ebx
    L0077: pop ebx
    L0078: pop esi
    L0079: pop edi
    L007a: pop ebp
    L007b: ret

InlineVsNonInlineBenchmark.Inline()
    L0000: mov eax, [ecx+0x18]
    L0003: add eax, eax
    L0005: ret

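You don't have to understand the JIT's assembly output in detail. The sheer amount of code already suggests that Inline is faster. NonInline reserves 0x38 (56) bytes on the stack before each call, which matches the 14 int properties of the reduced struct, and fills them with a series of vmovdqu and vmovq instructions. Inline just reads F directly from the field, adds it to itself and returns.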
