Const strings are not so const after all

1/10/2024

We all know the const keyword for strings, but are const strings really constant? Or can we use some tricks to modify them? Let's see.

The const keyword

The const keyword tells the compiler that the value is known at compile time. The compiler stores the string literal in the compiled assembly and replaces every reference to the constant with the literal itself. So code like this:

const string foo = "bar";

Console.WriteLine(foo);
Console.WriteLine(foo);

will be lowered by the compiler to this:

Console.WriteLine("bar");
Console.WriteLine("bar");

There is no occurrence of the foo variable in the compiled code! The value is "burned" into the code at every place it is used. This is why you can't initialize a const string from a variable or anything else that isn't known at compile time. But can we modify them? Let's see.
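To make the "known at compile time" rule concrete, here is a minimal sketch. The names Prefix and Word are made up for illustration; the commented-out line shows the kind of initializer the compiler rejects.

```csharp
using System;

// A const may be initialized from literals and from other constants.
const string Prefix = "b";
const string Word = Prefix + "ar";   // folded to the literal "bar" by the compiler

// Not allowed, because the value is only known at run time:
// const string Now = DateTime.Now.ToString();   // error CS0133: expression must be constant

Console.WriteLine(Word);   // prints "bar"
```

Since Prefix + "ar" consists only of constants, the compiler evaluates it itself and the compiled code contains just the literal "bar".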

String interning

Sure, the string has to be stored somewhere in memory. The mechanism in place here is called string interning. The compiler emits each distinct string literal only once into the assembly, and at run time the CLR keeps a single string object per literal value in a special table, the intern pool. If you have two const strings with the same value, they share the same table entry. Take this code:

const string foo = "bar";
const string other_foo = "bar";

Console.WriteLine(foo);
Console.WriteLine(other_foo);

Console.WriteLine(ReferenceEquals(foo, other_foo));

It will be lowered to this:

Console.WriteLine("bar");
Console.WriteLine("bar");

Console.WriteLine((object)"bar" == "bar"); // This will print "True"

So the compiler replaces all references to foo and other_foo with the literal "bar", and ReferenceEquals then compares two references to that literal. Since the literal is interned, both references point to the same string object, and the comparison is true.
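The following sketch shows the effect in isolation. The variable names are made up for illustration, and the results describe the usual behavior of a JIT-compiled .NET application; string.Intern is the real API for looking up (or adding) an entry in the intern pool.

```csharp
using System;

string literal = "bar";
string sameLiteral = "bar";
// Built from characters at run time, so it is a separate object outside the intern pool.
string built = new string(new[] { 'b', 'a', 'r' });

bool literalsShareInstance = ReferenceEquals(literal, sameLiteral);
bool builtSharesInstance = ReferenceEquals(literal, built);
// string.Intern returns the pooled instance that has the same value.
bool internedSharesInstance = ReferenceEquals(literal, string.Intern(built));

Console.WriteLine(literalsShareInstance);    // True
Console.WriteLine(builtSharesInstance);      // False
Console.WriteLine(internedSharesInstance);   // True
```

All three strings are equal by value; only the references differ.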

Okay, we are getting closer to changing strings. The last thing we have to do is to get hold of exactly that table entry.

Unsafe code to the rescue

The basic premise is that:

const string foo = "bar";

ModifyString(foo);
Console.WriteLine(foo);

After ModifyString is called, Console.WriteLine(foo) should no longer print bar. So how do we do that? Well, let's get the memory address and change what is stored there:

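// Requires compiling with <AllowUnsafeBlocks>true</AllowUnsafeBlocks>.
unsafe void ModifyString(string foo)
{
    // Pin the string so the GC can't move it, then write to its characters directly.
    fixed (char* f = foo)
    {
        f[0] = 'F';
        f[1] = 'o';
        f[2] = 'o';
    }
}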

With that we get the following output: Foo. The funny part is that it also works with var foo = "bar";, as the variable then simply refers to the same interned literal. So the basic principle is that we can't change the string in the sense of its address, but we can change the content stored at that address.

What if we write past the end of the string? A .NET string stores its characters followed by a terminating null character, so the buffer holds the length of the string + 1 characters. For "bar", index 3 is the slot of that null terminator. So if we have this code:

unsafe void ModifyString(string foo)
{
    fixed(char* f = foo)
    {
        f[0] = 'F';
        f[1] = 'o';
        f[2] = 'o';
        f[3] = 'o'; // index 3 is past the last character of "bar"
    }
}

It still prints Foo, as the string object stores its length separately, so our string is still 3 characters long. You might think: wait, what about the memory behind that, can I overwrite stuff there? Short answer, no: that memory does not belong to the string. Writing there either corrupts whatever happens to live there or gets you a System.AccessViolationException: 'Attempted to read or write protected memory. This is often an indication that other memory is corrupt.' exception.

Before we call it a day, one last thing regarding string interning: If we "construct" the string like this:

Console.WriteLine(string.Concat("b", "ar"));

What do you think will be printed to the console after ModifyString has run: Foo or bar? It is bar, as "dynamic" strings are not interned. In our case, the string created by string.Concat lives at a different address in memory, so it never saw our modification.
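The "different address" point can also be shown from the other direction: change a string created at run time and check that the literal stays intact. This is only a sketch under a few assumptions. It swaps the fixed block from above for MemoryMarshal (so it compiles without the unsafe switch), it builds the run-time string from a char array instead of string.Concat, and the variable names are made up. It is exactly as unsupported as the pointer version, so don't use it in real code.

```csharp
using System;
using System.Runtime.InteropServices;

string literal = "bar";
// A separate string object with the same value, created at run time.
string built = new string(new[] { 'b', 'a', 'r' });

// Get a writable view over the characters of the run-time string only.
Span<char> writable = MemoryMarshal.CreateSpan(
    ref MemoryMarshal.GetReference(built.AsSpan()), built.Length);
"Foo".AsSpan().CopyTo(writable);

Console.WriteLine(built);     // Foo
Console.WriteLine(literal);   // bar
```

Only the object behind built was touched, so every use of the literal "bar" still prints bar.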

Conclusion

We saw that const strings are not so const after all (well, only with many tricks, but still). You most probably (I hope) have no use for this, but understanding the internals can be interesting and helpful! Here is a sharplab.io link, if you want to directly play around with the code.

