.NET and C# have evolved a lot over the past few years. .NET has become fully cross-platform and has received astonishing performance improvements. This article shows off new features in .NET that are used to write very performant C#.
As the holiday season approaches, many of us at Wolfpack are getting into the festive spirit by participating in the Advent of Code. For those who are not familiar, Advent of Code is an annual online programming challenge that consists of a series of small coding puzzles released (like a true advent calendar) every day from the 1st of December until the 25th.
Participating in Advent of Code is a great way for us at Wolfpack to practice our problem-solving, coding and math skills, and to have some fun at the same time. We compare our approaches, discuss solutions and motivate each other to participate. Some developers try to see how far they can get, only programming in Excel spreadsheets, and we also have several project managers joining us each year!
Not every solution of an Advent of Code puzzle needs to be focused on performance. The goal is to learn and have fun. For day 6 this year I did focus on performance. In this blog post I’d like to highlight three ways to implement the puzzle, and explore their performance characteristics….
Day 06 of 2022
Day 6 of this year’s Advent of Code is about finding a “marker” in a string of characters. This marker is a substring of a certain size which contains unique characters (no duplicates). In part one of the puzzle, we are tasked to find a marker of four characters. Part two requires us to find a marker of fourteen distinct characters.
For example, in the following “message”:
zccnvctfwqggmdngplrshbvqnoimdavb
The marker of part one starts at index 3 (nvct) and the marker for part two starts at index 15 (gpl…imd).
If you’d like to solve this for yourself, do not read the solution below just yet!
The naive sliding window approach
The simple approach would be a sliding window. You would start at index 0 and create substrings of the required marker length. For every substring, check whether it contains duplicates.
This could be achieved using a HashSet in the following way:
public static int FindStartOfMarkerUsingHashSet(string input, int markerLength) { for (int i = 0; i < input.Length - markerLength; i++) if (new HashSet<char>(input.Substring(i, markerLength)).Count == markerLength) return i; throw new InvalidOperationException("The input has no valid marker."); }
This snippet is short and gets the job done, however, for every possible marker it checks, it allocates a new substring and HashSet and is quite slow. Running this code for part two of my actual input results in 1.58 MB of allocations and an average total time of 636.17 microseconds. (1 millisecond is 1000 microseconds)
Using Spans for no allocations
We can avoid allocating strings for each window we check by using the Span<T> feature which was added in C#7.2 and .NET Core 2.1.
A Span is nothing more than a reference to a particular memory address plus some length. In essence the Span is our sliding window. It tells us where in the string we are currently looking.
A faster and allocation-free implementation of this puzzle would be:.
public static int FindStartOfMarkerUsingSpan(string input, int markerLength) { ReadOnlySpan<char> span = input.AsSpan(); for (int index = 0; index < input.Length - markerLength; index++) { ReadOnlySpan<char> subSpan = span[index..(index + markerLength)]; if (!ContainsDuplicates(subSpan)) return index; } throw new InvalidOperationException("The input has no valid marker."); static bool ContainsDuplicates(ReadOnlySpan<char> input) { for (int i = 0; i < input.Length; i++) for (int j = i + 1; j < input.Length; j++) if (input[i] == input[j]) return true; return false; } }
We can also benchmark this approach. The benchmark tells us that there are indeed no memory allocations happening inside our code. The time it takes for my part two input is now 19.26 microseconds! Almost 43 times faster! If this is a so-called “hot path” of a program (it is often executed), time saved can really add up and make processes run minutes if not hours faster.
Using jumps and stackalloc for optimal performance
Of course we are still not satisfied with this 43 times speedup, we can do even better.
The following solution has a different approach to the sliding window. We keep track of the last occurrence of every unique character. If we notice this character again inside our sliding window, we will slide our window to immediately after the previous occurrence of the character.
For example, if we have previously seen a h after 8 characters and we notice another h after 11 characters, we can skip 9 spaces because the next 8 windows will all contain the duplicate h. This information can be saved in an array of size 26. Each index corresponds to the letter of the alphabet in its position.
To keep the method allocation-free, we use the stackalloc keyword to allocate an array on the stack.
Here is the full snippet for the jump & stackalloc implementation:
public static int FindStartOfMarkerUsingSkipForward(string input, int payloadSize) { ReadOnlySpan<char> span = input.AsSpan(); Span<int> charOccurredAt = stackalloc int['z' - 'a' + 1]; charOccurredAt.Fill(-1); int markerBase = 0; for (int i = payloadSize - 1; i >= 0; i--) { int currCharIndex = markerBase + i; char currChar = span[currCharIndex]; int currCharAlreadyOccuredAt = charOccurredAt[currChar - 'a']; if (currCharAlreadyOccuredAt >= markerBase && currCharIndex < currCharAlreadyOccuredAt) { markerBase = currCharIndex + 1; i = payloadSize; continue; } charOccurredAt[currChar - 'a'] = currCharIndex; } return markerBase; }
Benchmarking this snippet reveals it runs at an average of 1.85 microseconds, which is just over ten times faster than our previous optimized version and in total 450 times faster than the HashSet approach! I’d say we have conquered today’s Advent of Code challenge.
Benchmark
There is a great NuGet package to benchmark C# code: Benchmark.net . It yields the following results:
Method | Length | Mean | Error | StdDev | Median | Ratio | Allocated |
UsingHashSet | 4 | 82.223 μs | 1.6154 μs | 1.6154 μs | 81.112 μ | 22.24 | 239360 B |
UsingSpan | 4 | 3,885 μs | 0.0694 μ | 0.0616 μs | 3,890 μs | 1.00 | – |
UsingJumps | 4 | 1,784 μs | 0.0342 μs | 0.0320 μs | 1,780 μs | 0.46 | – |
UsingHashSet | 14 | 836.168 μs | 16.4871 μs | 27.9964 μs | 834,988 μs | 42.80 | 1582833 B |
UsingSpan | 14 | 19,255 μs | 0.2163 μs | 0.2023 μs | 19,252 μs | 1.00 | – |
UsingJumps | 14 | 1,845 μs | 0.1820 μs | 0.5365 μs | 1,538 μs | 0.08 | – |
Tbl 1. The results of the benchmark
This table can also be displayed as a graph. Note that the duration in microseconds is logarithmic.
Fig 1. A plot chart of the benchmark depicting the duration and memory usage (created using ChartBenchmark.net )
Advent of Code at Wolfpack
In conclusion, participating in Advent of Code with colleagues at Wolfpack is a fun and rewarding experience. Not only do these challenges improve our problem-solving, coding and math skills, we also get to know coworkers we don’t usually interact with. We highly recommend all software developers to consider participating in this annual event and experience the benefits for themselves. Happy coding!
Are you interested in joining us in Advent of Code next year? See our vacancies page .