December 22, 2024

Using the new Span type in .NET Core

.NET Core 2.1 introduced two new types that let you work with contiguous blocks of memory, such as arrays, in a safe, managed way:

  • Span<T>
  • Memory<T>

The Span type can be seen as a window onto a contiguous block of memory that can originate from a string, an array on the heap, a stack-allocated array, or a pointer. You can get a span over a string simply by calling the AsSpan() method, which returns a ReadOnlySpan<char> because strings are immutable.
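To make the different origins concrete, here is a minimal sketch that builds spans over a string, a heap array, and stack-allocated memory. The class name SpanSources and the Sum helper are made up for this illustration; only AsSpan(), the implicit array conversion, and stackalloc come from the framework and the language (C# 7.2 or later is assumed).

```csharp
using System;

public static class SpanSources
{
    // Hypothetical helper: sums the elements without caring where the memory lives.
    public static int Sum(ReadOnlySpan<int> values)
    {
        var total = 0;
        foreach (var value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        // From a string: AsSpan() yields a read-only view over the characters.
        ReadOnlySpan<char> text = "Hello, Span!".AsSpan();

        // From an array on the heap: arrays convert implicitly to Span<T>.
        int[] heapArray = { 1, 2, 3, 4 };
        Span<int> heapSpan = heapArray;

        // From stack-allocated memory: no unsafe block is needed when the target is a Span<T>.
        Span<int> stackSpan = stackalloc int[3];
        stackSpan[0] = 10;
        stackSpan[1] = 20;
        stackSpan[2] = 30;

        Console.WriteLine(text.Length);
        Console.WriteLine(Sum(heapSpan));
        Console.WriteLine(Sum(stackSpan));
    }
}
```

The same Sum method accepts both spans because Span<int> converts implicitly to ReadOnlySpan<int>, which is the main appeal: one API, regardless of where the memory comes from.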

In practice, you may not need these types very often. In my opinion, they fit edge-case scenarios such as working with unmanaged types, implementing networking protocols, or any code that performs lots of string operations.

Next, I will demonstrate how Span can help you write more performant code, using two examples:

  1. A Substring() function alternative that’s more efficient
  2. An optimized Split() mechanism that requires almost no additional memory

Benchmark

We are going to benchmark this code to see the memory usage of the process. We define these functions:

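// Requires: using System; using System.Diagnostics;

private static long GetUserMemory()
{
    // Private memory of the current process, in MB (integer division truncates)
    using (var process = Process.GetCurrentProcess())
    {
        return process.PrivateMemorySize64 / 1024 / 1024;
    }
}

private static void ExecuteMemoryOperation(string operationName, Action operation)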
{
    var memoryBefore = GetUserMemory();
    operation();
    var memoryAfter = GetUserMemory();

    Console.WriteLine($"Operation {operationName} needed {memoryAfter - memoryBefore}MB of memory!");
}

The GetUserMemory() method returns the memory currently used by the process, in MB. The second method, ExecuteMemoryOperation(), executes an action and calculates the difference between the process memory before and after the call. This gives us a rough measure of the memory the action consumed.
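Process-level numbers are coarse: they are truncated to whole megabytes and can be affected by whatever else the runtime is doing. As a possible cross-check, and not the method used for the figures in this post, the sketch below counts the bytes allocated on the managed heap by the current thread with GC.GetAllocatedBytesForCurrentThread(). The AllocationMeter class and MeasureAllocatedBytes name are hypothetical.

```csharp
using System;

public static class AllocationMeter
{
    // Hypothetical helper: returns the number of bytes the current thread
    // allocated on the managed heap while the operation ran.
    public static long MeasureAllocatedBytes(Action operation)
    {
        var before = GC.GetAllocatedBytesForCurrentThread();
        operation();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        var text = new string('A', 1000000);

        // Substring copies half a million characters into a new string.
        var substringBytes = MeasureAllocatedBytes(() =>
        {
            var half = text.Substring(0, text.Length / 2);
        });

        // Slice only creates a new view over the existing characters.
        var sliceBytes = MeasureAllocatedBytes(() =>
        {
            var half = text.AsSpan().Slice(0, text.Length / 2);
        });

        Console.WriteLine($"Substring allocated about {substringBytes} bytes");
        Console.WriteLine($"Slice allocated about {sliceBytes} bytes");
    }
}
```

This counter only sees managed allocations made on the calling thread, so it complements the process-level measurement rather than replacing it.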

The Substring() alternative

In order to demonstrate the Substring() alternative method, we will need a method that generates a big string. In our sample, we will generate a string with a length of 1 million:

// Requires: using System.Linq; using System.Text;
private static string GenerateBigString(int length)
{
    var sb = new StringBuilder(length);

    foreach (var i in Enumerable.Range(0, length))
    {
        sb.Append((char)('A' + i % 26));
    }

    return sb.ToString();
}
...

var bigString = GenerateBigString(1000000);

Now let’s call the unoptimized code:

ExecuteMemoryOperation("Substring NOT optimized", () =>
{
    var substring1 = bigString.Substring(0, bigString.Length / 2);
    var substring2 = bigString.Substring(bigString.Length / 2, bigString.Length / 2);
    var substring3 = substring1.Substring(0, substring1.Length / 2);
    var substring4 = substring1.Substring(substring1.Length / 2, substring1.Length / 2);
    var substring5 = substring2.Substring(0, substring2.Length / 2);
    var substring6 = substring2.Substring(substring2.Length / 2, substring2.Length / 2);

    // Perform some operations ...
});

This code takes the big string and creates two substrings for its first and second halves, then splits each of those halves in two again. The output is something like this:

Operation Substring NOT optimized needed 3MB of memory!
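A back-of-the-envelope estimate suggests that this figure is plausible. The sketch below assumes 2 bytes per character (UTF-16) and ignores per-object overhead; the SubstringEstimate class and PayloadBytes helper are hypothetical names used only for this calculation.

```csharp
using System;

public static class SubstringEstimate
{
    // Approximate payload of a .NET string: 2 bytes per UTF-16 char.
    // Object header and length field are ignored in this estimate.
    public static long PayloadBytes(int charCount) => (long)charCount * sizeof(char);

    public static void Main()
    {
        const int length = 1000000;

        // Two halves of 500,000 chars plus four quarters of 250,000 chars.
        long halves = 2 * PayloadBytes(length / 2);
        long quarters = 4 * PayloadBytes(length / 4);

        Console.WriteLine($"Halves: {halves} bytes, quarters: {quarters} bytes");
        Console.WriteLine($"Total: {(halves + quarters) / 1024.0 / 1024.0:F1} MB");
    }
}
```

The six copies add up to roughly 4,000,000 bytes of character data, a little under 4MB, which is in the same range as the measured value once the integer truncation of the measurement is taken into account.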

Although 3MB is not a big deal nowadays, in performance-critical applications every drop matters. Winning back 3MB from any piece of code is not something you should ignore. Let’s now move on to the optimized version. It looks like this:

ExecuteMemoryOperation("Substring optimized", () =>
{
    var bigSpan = bigString.AsSpan();

    var subspan1 = bigSpan.Slice(0, bigSpan.Length / 2);
    var subspan2 = bigSpan.Slice(bigSpan.Length / 2, bigSpan.Length / 2);
    var subspan3 = subspan1.Slice(0, subspan1.Length / 2);
    var subspan4 = subspan1.Slice(subspan1.Length / 2, subspan1.Length / 2);
    var subspan5 = subspan2.Slice(0, subspan2.Length / 2);
    var subspan6 = subspan2.Slice(subspan2.Length / 2, subspan2.Length / 2);

    // Perform some operations ...
});

The difference is that we called the AsSpan() method, which returns a ReadOnlySpan<char>. You still cannot modify a string through a span, but it makes operations like Substring() much cheaper. Notice the Slice() method, which works in a very similar way to Substring(). Although it returns a new span, that span points at the same memory that backs the original string, unlike Substring(), which allocates memory for a new string. You’ll also find helper methods on ReadOnlySpan<char> that let you work with the span in much the same way as with a string.
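The following sketch illustrates both points under simple assumptions: a slice of an array-backed Span<int> writes through to the original array, and a few of the string-like helpers on ReadOnlySpan<char> (Trim, IndexOf, SequenceEqual, StartsWith) work without allocating a new string. The SliceDemo class, the ValueOf helper and the "key=value" input are invented for the example.

```csharp
using System;

public static class SliceDemo
{
    // Hypothetical helper: returns the text after the first '=' in a line.
    // Only the final ToString() call allocates a new string.
    public static string ValueOf(string line)
    {
        ReadOnlySpan<char> text = line.AsSpan().Trim();
        int separator = text.IndexOf('=');
        return text.Slice(separator + 1).ToString();
    }

    public static void Main()
    {
        // A slice is a view: writing through it changes the original array.
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        Span<int> secondHalf = numbers.AsSpan().Slice(3, 3);
        secondHalf[0] = 40;
        Console.WriteLine(numbers[3]); // prints 40

        // String-like helpers that operate directly on the span.
        ReadOnlySpan<char> entry = "  key=value  ".AsSpan().Trim();
        ReadOnlySpan<char> key = entry.Slice(0, entry.IndexOf('='));

        Console.WriteLine(key.SequenceEqual("key".AsSpan()));
        Console.WriteLine(entry.StartsWith("key".AsSpan(), StringComparison.Ordinal));
        Console.WriteLine(ValueOf("  key=value  "));
    }
}
```

Converting back with ToString() is the point where a new string is allocated, so it is best left until the result is actually needed as a string.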

Running this optimized version shows that it uses almost no new memory:

Operation Substring optimized needed 0MB of memory!

Split tokens scenario

Now let’s suppose we have a big string of numbers separated by commas and we want to parse these numbers. As above, I’ll show you the out-of-the-box solution and then the optimized one. I agree that the optimized one is a bit more complicated, but writing it without Span would make it even more complex. First we need a function that can generate a big string of comma-separated numbers:

private static string GenerateCommaSeparatedNumbers(int length)
{
    var sb = new StringBuilder(length);

    foreach (var i in Enumerable.Range(0, length))
    {
        sb.Append(i);

        if (i < length - 1)
        {
             sb.Append(',');
        }
    }

    return sb.ToString();
}

...
var commaSeparatedNumbers = GenerateCommaSeparatedNumbers(1000000);

Now the obvious solution to parse those numbers:

ExecuteMemoryOperation("Split NOT optimized", () =>
{
    foreach (var token in commaSeparatedNumbers.Split(','))
    {
        int.Parse(token);
    }
});

Nothing special in this one but as you’d expect it requires some additional memory:

Operation Split NOT optimized needed 44MB of memory!

Well, 44MB is quite something! If we could save this memory, how brilliant it’d be! And of course we can save some memory if we optimize this properly. The optimized version looks like this:

ExecuteMemoryOperation("Split optimized", () =>
{
    var span = commaSeparatedNumbers.AsSpan();
    var index = -1;

    do
    {
        index = span.IndexOf(',');
        if (index >= 0)
        {
            var slicedToken = span.Slice(0, index);
            int.Parse(slicedToken);
            span = span.Slice(index + 1);
        }
        else
        {
            int.Parse(span);
        }
    }
    while (index >= 0);
});

Sure, it’s more complicated, but the result shows that the optimization pays off, as no measurable additional memory is needed:

Operation Split optimized needed 0MB of memory!
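If you need this pattern in more than one place, the loop can be wrapped in a small reusable method. The version below is a sketch, not the code used for the measurement above: the SpanTokens class and SumCommaSeparated name are hypothetical, and it sums the parsed values so that the result is observable. It relies on the int.Parse overload that accepts a ReadOnlySpan<char>, available since .NET Core 2.1.

```csharp
using System;

public static class SpanTokens
{
    // Parses every comma-separated integer in the input and returns the sum,
    // slicing the original characters instead of creating a string per token.
    // An empty or malformed token makes int.Parse throw a FormatException.
    public static long SumCommaSeparated(ReadOnlySpan<char> input)
    {
        long sum = 0;

        while (true)
        {
            var index = input.IndexOf(',');
            var token = index >= 0 ? input.Slice(0, index) : input;
            sum += int.Parse(token);

            if (index < 0)
            {
                // No separator left: the token just parsed was the last one.
                return sum;
            }

            // Continue with everything after the comma.
            input = input.Slice(index + 1);
        }
    }

    public static void Main()
    {
        Console.WriteLine(SumCommaSeparated("0,1,2,3,4".AsSpan())); // prints 10
    }
}
```

Taking a ReadOnlySpan<char> parameter rather than a string means callers can pass a whole string, a slice of one, or a char buffer without any copying.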

Conclusion

So we saved some memory out there! The thing with high-level programming languages is that many coders nowadays are not aware of the memory implications of their code, and end up writing code that performs poorly in terms of memory allocation and processor usage. I believe all developers should pay attention to these kinds of issues.

As for the Span and Memory types, they are a welcome addition to the .NET library. I think that many programmers out there won’t need them very often. It’s nevertheless important to be aware of them, as we never know when they might be useful. You can read more about them here. I’m still trying to understand Memory completely; I guess I need to use it in practice.

You can also find the code for this post on Github!

afivan

