Have you ever wondered what happens behind the scenes when you use range expressions in C#? Range expressions are a convenient way to access a slice of an array or a span, using the syntax data[start..end]. For example, data[..1024] means the first 1024 elements of data, and data[1024..] means the rest of the elements.
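As a quick illustration of the syntax described above, here is a minimal sketch. The array size and contents are hypothetical and chosen only to make the boundaries easy to see.

```csharp
using System;

// Hypothetical sample data; not the buffers used in the benchmarks.
int[] data = new int[4096];
for (int i = 0; i < data.Length; i++) data[i] = i;

// data[..1024] covers indices 0 through 1023 (the end of a range is exclusive).
int[] head = data[..1024];

// data[1024..] covers index 1024 through the last element.
int[] tail = data[1024..];

Console.WriteLine($"head: {head.Length} items, first = {head[0]}, last = {head[^1]}");
Console.WriteLine($"tail: {tail.Length} items, first = {tail[0]}, last = {tail[^1]}");
```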
Let’s take a look at a few examples and run benchmarks. The results might be surprising.
Benchmark setup
We focus on a few typical approaches to passing a section of data to processing code.
⚡ The data processing code is intentionally oversimplified
Benchmark results
Method | Mean | Ratio | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|
Offsets | 1.771 μs | 1.00 | - | - | - | - |
Span | 1.941 μs | 1.10 | - | - | - | - |
ArraySegment | 2.883 μs | 1.63 | - | - | - | - |
SpanRanges | 1,086.565 μs | 622.66 | - | - | - | 1 B |
ArrayRanges | 3,632.562 μs | 2,050.90 | 332.0313 | 332.0313 | 332.0313 | 4194506 B |
Analysis
Scenario 1 - Data access using an offset and length
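A minimal sketch of this approach, not the benchmark code itself. The Sum routine and the buffer size are hypothetical stand-ins for the data processing step.

```csharp
using System;

// Hypothetical processing routine: sums `length` bytes starting at `offset`.
// The caller is responsible for passing a valid window into `data`.
static long Sum(byte[] data, int offset, int length)
{
    long total = 0;
    int end = offset + length;
    for (int i = offset; i < end; i++)
    {
        total += data[i];
    }
    return total;
}

byte[] buffer = new byte[2048];
Array.Fill(buffer, (byte)1);

// Process the first 1024 bytes and then the remainder, without copying anything.
long first = Sum(buffer, 0, 1024);
long rest = Sum(buffer, 1024, buffer.Length - 1024);
Console.WriteLine($"{first} + {rest} = {first + rest}");
```

Every call site has to carry the offset and the length separately, which is where the verbosity and the off-by-one risk come from.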
Pros:
- ⚡ The fastest way to access the data
- 👍 No compiler magic happens behind the scenes
- 👍 No pressure on garbage collection system - Zero allocations
Cons:
- 👎 Low-level approach, hence it tends to be more verbose and error-prone
- 👎 Non-declarative and more difficult to maintain
- 👎 No abstraction of sequential memory
Method | Mean | Ratio | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|
Offsets | 1.771 μs | 1.00 | - | - | - | - |
Scenario 2 - Span<T> over a section of an array
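A minimal sketch of this approach, assuming the section is created with AsSpan. The Sum routine and the buffer size are hypothetical and do not come from the benchmark.

```csharp
using System;

// Hypothetical processing routine: the span already carries its own bounds.
static long Sum(ReadOnlySpan<byte> data)
{
    long total = 0;
    foreach (byte value in data) total += value;
    return total;
}

byte[] buffer = new byte[2048];
Array.Fill(buffer, (byte)1);

// AsSpan creates a view over part of the array; no bytes are copied.
ReadOnlySpan<byte> firstHalf = buffer.AsSpan(0, 1024);
ReadOnlySpan<byte> secondHalf = buffer.AsSpan(1024);

// A write to the array is visible through the span because both share memory.
buffer[0] = 5;
Console.WriteLine($"firstHalf[0] = {firstHalf[0]}");
Console.WriteLine($"sums: {Sum(firstHalf)} and {Sum(secondHalf)}");
```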
Pros:
- 👍 Fast! Only about 10% slower than raw access to the array in this benchmark
- 👍 No compiler magic happens behind the scenes
- 👍 No pressure on garbage collection system - Zero allocations
- 👍 Abstracts a contiguous, randomly accessible chunk of memory
- 👍 Supports fast and ergonomic code on the processing side
- 👍 Reduces code verbosity and improves consistency.
Cons:
- 👎 Requires .NET Standard 2.1
Method | Mean | Ratio | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|
Span | 1.941 μs | 1.10 | - | - | - | - |
Scenario 3 - ArraySegment<T> over a section of an array
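A minimal sketch of this approach. The Sum routine and the buffer size are hypothetical, and the benchmark code may access the segment differently.

```csharp
using System;

// Hypothetical processing routine working directly on the segment's backing array.
static long Sum(ArraySegment<byte> segment)
{
    long total = 0;
    // Array is null for default(ArraySegment<byte>), hence the null-forgiving operator.
    byte[] array = segment.Array!;
    int end = segment.Offset + segment.Count;
    for (int i = segment.Offset; i < end; i++)
    {
        total += array[i];
    }
    return total;
}

byte[] buffer = new byte[2048];
Array.Fill(buffer, (byte)1);

var firstHalf = new ArraySegment<byte>(buffer, 0, 1024);
var secondHalf = new ArraySegment<byte>(buffer, 1024, buffer.Length - 1024);

Console.WriteLine($"sums: {Sum(firstHalf)} and {Sum(secondHalf)}");

// The segment exposes the whole underlying array, not just its own window.
Console.WriteLine(ReferenceEquals(secondHalf.Array, buffer));
```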
Pros:
- 👍 Fast! About 1.6x the baseline time in this benchmark, still in the same order of magnitude as raw access to the array
- 👍 No compiler magic happens behind the scenes
- 👍 No pressure on garbage collection system - Zero allocations
- 👍 Supports all versions of .NET Standard
- 👍 Supports fast and ergonomic code on the processing side
- 👍 Reduces code verbosity and improves consistency.
Cons:
- 👎 Leaky abstraction over the underlying array
- 👎 The API predates nullable reference types, so it might require the null-forgiving operator (!) in your code
Method | Mean | Ratio | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|
ArraySegment | 2.883 μs | 1.63 | - | - | - | - |
Scenario 4 - Span<T> and a range expression
💥 Surprise 💥 The benchmark shows a 622.66x slowdown relative to the baseline.
Method | Mean | Ratio | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|
SpanRanges | 1,086.565 μs | 622.66 | - | - | - | 1 B |
Analysis:
The compiler translates the syntactic sugar above into the following code.
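A minimal sketch of that translation, assuming the range is built from two int variables. The buffer and the boundaries are hypothetical, and the compiler's actual output may use temporaries that are omitted here.

```csharp
using System;

byte[] buffer = new byte[2048];
for (int i = 0; i < buffer.Length; i++) buffer[i] = (byte)(i % 256);

Span<byte> span = buffer;
int start = 512, end = 1536;

// Range syntax on a span...
Span<byte> viaRange = span[start..end];

// ...and the explicit call it corresponds to: Slice(start, length).
Span<byte> viaSlice = span.Slice(start, end - start);

// Span equality compares the memory location and length, not the contents.
Console.WriteLine(viaRange == viaSlice);

// Slice validates its arguments, so an out-of-range request throws.
bool threw = false;
try
{
    span.Slice(2000, 100);
}
catch (ArgumentOutOfRangeException)
{
    threw = true;
}
Console.WriteLine($"out-of-range slice threw: {threw}");
```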
As you can see below, this code does nothing except a bounds check before the internal _reference field is accessed. I speculate that this is the main reason for the slowdown. You can read more information about this method here
Scenario 5 - range expression over arrays
⚠️ Warning: This is the slowest and least efficient way to access the data
This syntactic sugar is designed to make the data access pattern easier to read and express. But without proper support from the collection, it leads to a performance disaster. It is expanded into a call to the method public static T[] GetSubArray<T>(T[] array, Range range) of the class System.Runtime.CompilerServices.RuntimeHelpers.
Essentially, the snippet above is equivalent to the following:
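A minimal sketch of that equivalence, using a small hypothetical array instead of the benchmark data. It also contrasts the copy with a span range over the same array, which is only a view.

```csharp
using System;
using System.Runtime.CompilerServices;

int[] data = { 0, 1, 2, 3, 4, 5, 6, 7 };

// Range syntax on an array...
int[] viaRange = data[2..6];

// ...and the helper call it corresponds to.
int[] viaHelper = RuntimeHelpers.GetSubArray(data, 2..6);

// Each result is a fresh array, so writing to it leaves the source untouched.
viaRange[0] = 42;
Console.WriteLine(data[2]);                              // 2
Console.WriteLine(ReferenceEquals(viaRange, viaHelper)); // False

// A span range over the same array is a view, so the write goes through.
Span<int> view = data.AsSpan()[2..6];
view[0] = 42;
Console.WriteLine(data[2]);                              // 42
```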
As you can see, the range expressions are replaced by calls to GetSubArray with the same range arguments. This means that every range expression over an array allocates a new array and copies the selected elements into it, which costs both time and memory. To avoid this overhead, use spans instead of arrays: a span is a lightweight view over a contiguous memory region, and it supports range expressions natively without creating new objects.
The memory allocation pattern is visible in the decompiled source code below.
Pros:
- 👍 It makes reading or expressing the data access pattern easier.
- 👍 Supports fast and ergonomic code on the processing side
Cons:
- 👎 Creates GC pressure!
- 👎 💥 Might not be suitable for high-performance scenarios.
- 👎 Since the memory segments were larger than 85,000 bytes (the default large object heap threshold), they were allocated on the large object heap and were subject to Gen2 garbage collection. In other words, this is a well-known antipattern.
- 👎 💥 Three orders of magnitude slower than the baseline approach
Method | Mean | Ratio | Gen0 | Gen1 | Gen2 | Allocated |
---|---|---|---|---|---|---|
ArrayRanges | 3,632.562 μs | 2,050.90 | 332.0313 | 332.0313 | 332.0313 | 4194506 B |
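The large object heap effect mentioned above can be observed with a short sketch. The buffer sizes are hypothetical and were not taken from the benchmark; the sketch assumes the default threshold of 85,000 bytes.

```csharp
using System;

// Hypothetical sizes: a 4 MiB source and a 1 MiB slice, both far above 85,000 bytes.
byte[] source = new byte[4 * 1024 * 1024];

// The array range allocates a new 1 MiB array and copies the bytes into it.
byte[] slice = source[..(1024 * 1024)];

// Objects on the large object heap are reported as belonging to the oldest generation.
bool sliceOnOldestGeneration = GC.GetGeneration(slice) == GC.MaxGeneration;
Console.WriteLine($"slice length: {slice.Length}");
Console.WriteLine($"slice reported in Gen{GC.MaxGeneration}: {sliceOnOldestGeneration}");
```

Because such arrays are only reclaimed by a Gen2 collection, repeatedly slicing large arrays this way keeps triggering the most expensive kind of collection.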
Conclusion
In conclusion, range expressions are a useful feature of C# that lets you access slices of arrays or spans with a concise and readable syntax. However, when applied to arrays they rely on a helper method that allocates a new array under the hood, which can hurt performance badly. To optimize your code, use spans instead of arrays when possible, or avoid range expressions in performance-critical scenarios.