ChatGPT - Current state for .NET - and in general

ChatGPT is going viral right now. Besides having funny conversations, you can also use the tool to generate code for you. But does it really live up to its reputation, and is it a serious pairing partner? Let's see.

What is ChatGPT?

ChatGPT is an artificial intelligence [citation needed] chatbot created by OpenAI (the team that also made DALL-E). It is based on the GPT-3.5 model. Now you may ask: what is GPT-3? As this is a chatbot, we can ask it directly. So I asked ChatGPT: "What is GPT-3?"

GPT-3 (short for "Generative Pretrained Transformer 3") is a state-of-the-art language processing model developed by OpenAI. It has 175 billion parameters, making it one of the largest language models currently available. GPT-3 can perform a wide range of language-related tasks, such as translation, summarization, question answering, and text generation. It has been trained on a large dataset of text and has been designed to generate human-like text that is coherent and natural-sounding. GPT-3 is a powerful tool that is being used in a variety of applications, including natural language processing, machine learning, and artificial intelligence.

Besides answering such questions, much like Google would, you can also ask programming-related questions like: "Write a poem in C#."

using System;

namespace PoemInCSharp
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("In C# so sleek and slim,");
            Console.WriteLine("A language built for coding whim.");
            Console.WriteLine("With features new and vast,");
            Console.WriteLine("It makes coding a breeze at last.");
            Console.WriteLine("Object-oriented and strong,");
            Console.WriteLine("It helps us code all day long.");
            Console.WriteLine("Thank you, C#, for all you do,");
            Console.WriteLine("Helping us create and make our dreams come true.");
        }
    }
}

Well, nice. So let's use it everywhere then? Hold on a minute. As you might guess, this blog post is about exactly that: can we use the tool in the near future? And of course, this is more or less related to similar tools like GitHub Copilot.

To get an initial idea of where I want to go with this, let's have a look at the article StackOverflow published recently, "Use of ChatGPT generated text for content on Stack Overflow is temporarily banned":

Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers.

This quote might not come as a big surprise. The big problem with ChatGPT is that it often fails to produce useful or even compilable output, and StackOverflow is currently being bombarded with it. Of course, it seems pretty easy: you see a question, maybe even in a language or stack you don't know, and might think: "Hey, that GPT-3 thingy can answer those questions. So let's put it in there and get some easy reputation." Well, of course it does not work like that.

The problem with hyping the tool is that it seems like a valuable tool for juniors in particular. But it is not. It is literally the opposite. I will show you some examples so you know what I mean by that.

The first thing I discovered were a few situations where the suggested code did not even compile. That is great, because everyone notices that it isn't working. But there are also cases, especially when dealing with specialized topics, where everything compiles but the suggested code does not work. Let's look at an example. I asked: "Get the largest number in an array via SIMD in C#." Without going into detail, here is the answer from the "AI":

public static int GetLargestNumber(int[] values)
{
    if (values == null || values.Length == 0)
    {
        throw new ArgumentException("Array must not be null or empty.", nameof(values));
    }

    if (Vector.IsHardwareAccelerated)
    {
        // Use SIMD instructions if hardware acceleration is available
        Vector<int> largest = new Vector<int>(values[0]);
        for (int i = 0; i < values.Length; i += Vector<int>.Count)
        {
            Vector<int> current = new Vector<int>(values, i);
            largest = Vector.Max(largest, current);
        }

        int[] result = new int[Vector<int>.Count];
        largest.CopyTo(result);
        return result.Max();
    }
    else
    {
        // Fall back to regular LINQ Max method if hardware acceleration is not available
        return values.Max();
    }
}

Wow, beautiful. The code checks whether your machine supports a certain feature and uses a fallback if it does not. It looks solid at first glance. And ChatGPT even gives you example code showing how to call it:

int[] values = { 1, 2, 3, 4, 5 };
int largest = GetLargestNumber(values);
Console.WriteLine(largest);  // outputs 5

Now does that thing print "5"? No, it does not. That function will throw a runtime exception. The problem is in this line: for (int i = 0; i < values.Length; i += Vector<int>.Count). Vector<int>.Count depends on your hardware; on my M1 it is 4. So on the second iteration, i is 4, and as our array has only 5 elements, a full vector of 4 elements no longer fits starting at that index. The Vector<int> constructor throws a System.ArgumentOutOfRangeException.

Okay fine, at least we get an exception at some point. Sure, but we can make things worse. We can ask ChatGPT to create a hashing function for passwords (entered text: "Given a password as string in C#, write a hashing function that returns the hashed string."). The result can look like this:

using System;
using System.Security.Cryptography;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            // The password to hash
            string password = "mypassword";

            // Create a new instance of the SHA256 class
            SHA256 sha256 = SHA256.Create();

            // Convert the password string to a byte array
            byte[] passwordBytes = System.Text.Encoding.UTF8.GetBytes(password);

            // Compute the hash of the password
            byte[] hash = sha256.ComputeHash(passwordBytes);

            // Convert the hash to a string
            string hashedPassword = Convert.ToBase64String(hash);

            Console.WriteLine(hashedPassword);
        }
    }
}

Do you know if that code is good? Do you think a graduate who has just joined the programming world knows whether this piece of code does what it should? Short answer: a plain SHA256 hash is a poor option for storing passwords, especially because no salt was used. I re-ran this question multiple times, and the example shown is more or less the most common answer.
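For comparison, here is a minimal sketch of what a salted, deliberately slow password hash could look like. It uses PBKDF2 via Rfc2898DeriveBytes.Pbkdf2 and RandomNumberGenerator.GetBytes, so it assumes .NET 6 or later. The PasswordHasher class, the "iterations.salt.hash" storage format, and the parameter values are my own assumptions for illustration, not something ChatGPT or a standard prescribes. Treat it as a starting point, not a vetted implementation.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordHasher
{
    // Illustrative parameters (assumptions); tune them for your environment.
    private const int SaltSize = 16;       // bytes
    private const int HashSize = 32;       // bytes
    private const int Iterations = 100_000;

    // Returns "iterations.salt.hash" with salt and hash Base64-encoded.
    public static string Hash(string password)
    {
        // A fresh random salt per password makes equal passwords hash differently.
        byte[] salt = RandomNumberGenerator.GetBytes(SaltSize);
        byte[] hash = Rfc2898DeriveBytes.Pbkdf2(
            password, salt, Iterations, HashAlgorithmName.SHA256, HashSize);

        return $"{Iterations}.{Convert.ToBase64String(salt)}.{Convert.ToBase64String(hash)}";
    }

    // Recomputes the hash with the stored salt and iteration count and compares.
    // Throws FormatException if the Base64 parts are malformed.
    public static bool Verify(string password, string stored)
    {
        string[] parts = stored.Split('.');
        if (parts.Length != 3 || !int.TryParse(parts[0], out int iterations) || iterations <= 0)
        {
            return false;
        }

        byte[] salt = Convert.FromBase64String(parts[1]);
        byte[] expected = Convert.FromBase64String(parts[2]);
        byte[] actual = Rfc2898DeriveBytes.Pbkdf2(
            password, salt, iterations, HashAlgorithmName.SHA256, expected.Length);

        // Constant-time comparison to avoid leaking how many bytes matched.
        return CryptographicOperations.FixedTimeEquals(actual, expected);
    }
}
```

Because of the random salt, calling PasswordHasher.Hash twice with the same password gives two different strings, and PasswordHasher.Verify accepts both.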

By the way, you can also ask it to write a unit test for your method. But be very careful here as well. I played a small trick on the bot. This was my question: "Write a unit test for the following method:

public class Calc
{
 public int Add(int a, int b) => 1 + a + b;
}

"

Notice the 1 + a + b. The answer from the bot: "Here is an example unit test for the Add method:

public void Add_ShouldReturnSumOfInputs()
{
    // Arrange
    var calc = new Calc();

    // Act
    var result = calc.Add(1, 2);

    // Assert
    Assert.AreEqual(3, result);
}

"

Well thank you, but no 😉. That test would not pass.
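Spelled out: with the deliberately wrong implementation, Add(1, 2) evaluates to 1 + 1 + 2 = 4, so an assertion expecting 3 fails. Here is a small self-contained sketch without a test framework (the CalcDemo class exists only for illustration):

```csharp
using System;

public class Calc
{
    // Deliberately wrong: adds an extra 1.
    public int Add(int a, int b) => 1 + a + b;
}

public static class CalcDemo
{
    public static void Main()
    {
        var calc = new Calc();
        int result = calc.Add(1, 2);     // 1 + 1 + 2

        Console.WriteLine(result);       // prints 4
        Console.WriteLine(result == 3);  // prints False, so Assert.AreEqual(3, result) would fail
    }
}
```

The generated test follows the method's name, not its actual behavior. In this case that happens to expose the bug, but only if you actually run the test instead of trusting it.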

Explaining code

Another feature of ChatGPT is the ability to explain code. That sounds very good on the surface: "Hey, I don't know what that piece of code does, can you explain it to me?" Of course that sounds tempting. Afterward, you understand what that piece of legacy code does, right? Well, as shown earlier, you can't really rely on the given output. So how can you rely on the explanation? You have to trust 100% that the AI understood the code in the right context.

Now is it all bad? No, of course not. If you do have a rough understanding of what's going on, the tool can really help you. If we copy&paste the GetLargestNumber code into ChatGPT and ask it to explain it, it will give the following answer:

This code is a method that takes in an array of integers as input and returns the largest number in the array. The method first checks if the input array is null or empty, and if it is, it throws an ArgumentException. Otherwise, it checks if hardware acceleration is available, and if it is, it uses SIMD instructions to find the largest number in the array and returns it. If hardware acceleration is not available, it falls back to using the regular LINQ Max method to find the largest number in the array and returns it.

The answer looks solid. Again, you need to have a rough understanding of what the code does to judge whether or not that is the case. And that is the exact point of this article. In my experience with ChatGPT (which is fairly short and limited, I guess), the bot is better at explaining code than at writing it. I even tried some more "exotic" constructs like stackalloc and friends (basically, I let the AI explain my ValueStringBuilder), and I was positively surprised, as the explanation was correct almost every time.
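To make that judgment concrete for GetLargestNumber: the explanation describes the intent, but says nothing about the loop reading past the end of the array when the length is not a multiple of Vector<int>.Count. Here is a minimal sketch of one way to keep the SIMD loop in bounds and handle the leftover elements with a scalar loop. The SimdMax class name and the overall structure are my own assumptions, and I have not benchmarked it.

```csharp
using System;
using System.Numerics;

public static class SimdMax
{
    public static int GetLargestNumber(int[] values)
    {
        if (values == null || values.Length == 0)
        {
            throw new ArgumentException("Array must not be null or empty.", nameof(values));
        }

        int width = Vector<int>.Count;
        int largest = values[0];
        int i = 0;

        if (Vector.IsHardwareAccelerated && values.Length >= width)
        {
            // Start with the first full vector, then only visit
            // positions where another full vector still fits.
            var max = new Vector<int>(values, 0);
            for (i = width; i <= values.Length - width; i += width)
            {
                max = Vector.Max(max, new Vector<int>(values, i));
            }

            // Reduce the vector lanes to a single value.
            for (int lane = 0; lane < width; lane++)
            {
                largest = Math.Max(largest, max[lane]);
            }
        }

        // Scalar loop for the remaining tail
        // (or for the whole array if the SIMD path was skipped).
        for (; i < values.Length; i++)
        {
            largest = Math.Max(largest, values[i]);
        }

        return largest;
    }
}
```

With this version, SimdMax.GetLargestNumber(new[] { 1, 2, 3, 4, 5 }) returns 5 regardless of the vector width: either the first four elements go through the vector path and the fifth through the tail loop, or the whole array is handled by the scalar loop.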

The model - does it have a bias?

One big problem when you train your model on such huge amounts of data is: how can you ensure that the AI is not opinionated? That is a really tricky thing to do. And, as you might have sensed, ChatGPT has a lot of bias. The simplest example is this: "Write a function in C# that takes the gender and race as input and the expected average salary as output."

That is the result:

public decimal GetAverageSalary(string gender, string race)
{
    // define a default average salary
    decimal averageSalary = 50000m;

    // adjust the average salary based on the gender and race inputs
    if (gender == "male")
    {
        averageSalary += 10000m;
    }

    if (race == "African American")
    {
        averageSalary -= 5000m;
    }

    // return the calculated average salary
    return averageSalary;
}

Ufff, that went south pretty quickly. At least decimal was used for money. You can re-run this question over and over again, and you will see a common theme.

I also went fishing for some API keys. Remember when GitHub Copilot came out and it suggested random API keys? That doesn't seem to happen with ChatGPT. If you explicitly ask for API keys, it tells you that they should be unique and so on, and in code samples it tries to get by with placeholders like "API_KEY". So they learned from the earlier mistakes of similar products.

Conclusion

ChatGPT has some potential, but as with all such tools (GitHub Copilot, StackOverflow, ...) or copy/pasting code in general, you have to understand the code it produces. Pasting code blindly is a dangerous path, but this is nothing new. Let's see where the future of AI-assisted coding is going.
