Dec 28, 2022

A small C# 10 programming puzzle - Answer (Part III)

In the previous posts I posed a challenge: change the output of the following program so that it shows only the lines with even numbers (go read parts I & II if you haven't done so yet).

I also said that if you looked carefully at the code you would find some clues pointing toward the solution, in the form of code that was not strictly required in that version of the program. There are basically two such pieces:
  1. the using System.Runtime.CompilerServices directive, which is not required as we are not referencing any types declared in that namespace.

  2. the if on line 5 of the Foo() method, which is clearly not necessary as the program only passes non-null, non-empty strings to that method.

As I hinted, the solution I came up with exploits C# interpolated string handlers, a feature introduced in C# 10.

With that information, one way to achieve our goal is simply to replace the type of the msg parameter with a custom interpolated string handler (InterpolatedHandlerTrick in the code below) as follows:

If you run the above program, you'll see that it does indeed work, i.e., you should see only the lines with even numbers in the output. Before diving into the details, let's take a look at what is needed to implement a custom interpolated string handler:
  1. Declare a type and add the InterpolatedStringHandlerAttribute to it (line #14)
  2. Implement a constructor taking at least two integers (literalLength and formattedCount). Our implementation declares some extra parameters to take advantage of more advanced features, which will be discussed later.
  3. Implement the methods
    1. void AppendLiteral(string msg)
    2. void AppendFormatted<T>(T value)
and that is it.
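
Put together, those steps could look like the minimal sketch below. It is not the original listing: the shape of Foo() (an int i parameter forwarded to the handler), the loop bounds, the StringBuilder backing store and the Puzzle class name are all assumptions made for illustration.

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;
using System.Text;

[InterpolatedStringHandler]
public ref struct InterpolatedHandlerTrick
{
    // Stays null when the handler decides the string is not needed.
    private readonly StringBuilder? _builder;

    // literalLength and formattedCount are supplied by the compiler.
    // i is forwarded from Foo() (assumed signature, see below).
    // shouldAppend tells the caller whether to run the Append* calls at all.
    public InterpolatedHandlerTrick(int literalLength, int formattedCount, int i, out bool shouldAppend)
    {
        shouldAppend = i % 2 == 0;
        _builder = shouldAppend ? new StringBuilder(literalLength) : null;
    }

    public void AppendLiteral(string s) => _builder?.Append(s);

    public void AppendFormatted<T>(T value) => _builder?.Append(value);

    public override string ToString() => _builder?.ToString() ?? string.Empty;
}

public static class Puzzle
{
    // Assumed shape of Foo(): the "i" argument is handed to the handler's constructor.
    public static void Foo(int i, [InterpolatedStringHandlerArgument("i")] InterpolatedHandlerTrick msg)
    {
        var text = msg.ToString();
        if (!string.IsNullOrEmpty(text))
            Console.WriteLine(text);
    }

    public static void Main()
    {
        for (int i = 0; i < 6; i++)
            Foo(i, $"Test {i}"); // prints "Test 0", "Test 2" and "Test 4"
    }
}
```

Backing the handler with a StringBuilder keeps the sketch short; a production handler would more likely write into a pooled or stack-allocated buffer.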
 
That is all nice and cool but the interesting question is: why does it work?

If you paste that code into sharplab.io and select C# from the Results drop-down, you can see that the C# compiler rewrote your program (or, to use the technical term for this transformation, lowered the code). More specifically, the call to the Foo() method becomes something like:

The actual code will be a little different (basically, the for loop is converted into a while loop and some extra local variables are introduced), but I decided to ignore that and keep it as close to the original as possible, which makes it simpler to reason about.

Notice that the compiler introduced a local variable typed as our interpolated string handler (line 4) and made some method calls on it, passing the various parts of the interpolated string. Basically, it split the string on `{...}` boundaries and, for each part, called handler.AppendLiteral(part) (line 7) followed by handler.AppendFormatted(contents of {...}) (line 8).

Notice also that those calls are wrapped in an if (line 5) controlled by a variable that was initialized by the constructor of our interpolated string handler. And voilà, we have all the required pieces. The method taking our string handler specifies, through the InterpolatedStringHandlerArgumentAttribute, that one or more of its parameters (in this case only one) must be passed to the handler's constructor. The constructor uses that value to decide whether the string should be processed, and reports its decision through an out bool parameter (declared as the last constructor parameter). When that flag is false, the lowered code skips the calls to the AppendX() methods, so the handler instance passed to Foo() produces an empty string, which fails the condition in Foo() and skips the call to Console.WriteLine()!
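
To make the lowering concrete, here is a hand-written approximation of what the compiler generates for a call such as Foo(i, $"Test {i}"). It is a sketch under assumptions, not the real compiler output: the handler is a compact stand-in, the Foo() signature and the LoweredSketch class name are hypothetical, and the for loop is kept as is.

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;
using System.Text;

// Compact stand-in for the handler (assumed shape: receives i, keeps even numbers only).
[InterpolatedStringHandler]
public ref struct InterpolatedHandlerTrick
{
    private readonly StringBuilder? _sb;

    public InterpolatedHandlerTrick(int literalLength, int formattedCount, int i, out bool shouldAppend)
    {
        shouldAppend = i % 2 == 0;
        _sb = shouldAppend ? new StringBuilder(literalLength) : null;
    }

    public void AppendLiteral(string s) => _sb?.Append(s);
    public void AppendFormatted<T>(T value) => _sb?.Append(value);
    public override string ToString() => _sb?.ToString() ?? string.Empty;
}

public static class LoweredSketch
{
    public static void Foo(int i, [InterpolatedStringHandlerArgument("i")] InterpolatedHandlerTrick msg)
    {
        var text = msg.ToString();
        if (!string.IsNullOrEmpty(text))
            Console.WriteLine(text);
    }

    public static void Main()
    {
        for (int i = 0; i < 6; i++)
        {
            // Roughly what Foo(i, $"Test {i}") turns into:
            // 5 is the length of the literal "Test ", 1 is the number of {...} holes.
            var handler = new InterpolatedHandlerTrick(5, 1, i, out bool shouldAppend);
            if (shouldAppend)
            {
                handler.AppendLiteral("Test ");
                handler.AppendFormatted(i);
            }
            Foo(i, handler);
        }
    }
}
```

Calling the constructor and the Append methods by hand like this behaves the same as passing the interpolated string directly, which is a handy way to convince yourself of what the compiler is doing.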

Cool, but before I leave you, here are some performance improvements/considerations this feature brings to the table:

  1. DefaultInterpolatedStringHandler is, as its name implies, used whenever an interpolated string is passed to a method that takes a string parameter, so simply recompiling your C# 9 code with C# 10 (targeting .NET 6 or later, where that type is available) should bring you at least some performance and allocation improvements.

  2. Since DefaultInterpolatedStringHandler is declared as a ref struct, its instantiation will not cause a heap allocation (in general, all interpolated string handlers should be ref structs to avoid such allocations).

  3. Since String.Format() is not being used anymore, no argument arrays are allocated.

  4. Allocations may be completely avoided when the argument specified through InterpolatedStringHandlerArgumentAttribute indicates that the resulting string will not be used (very useful in logging and assertion code).

  5. Be aware that if the interpolated string handler reports that the resulting string will not be used (see the point above), any side effects of the expressions in the interpolated string will not be observed. For instance, if we change the interpolated value in the puzzle (line 10) from $"Test {i}" to $"Test {SomeMethod(i)}", SomeMethod() will not be invoked when i is odd, which may not be obvious from inspecting the call to Foo() alone.

  6. To minimize allocations even further, consider implementing ISpanFormattable on your own types if they may be used as expressions in interpolated strings (various types from the BCL implement it).
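
Point 5 is easy to overlook, so here is a small sketch that makes the skipped side effect visible in a logging-style API (the scenario mentioned in point 4). Everything in it is hypothetical and exists only for illustration: the LogHandler type, the Log() method with its enabled flag, and the SideEffectDemo class.

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;
using System.Text;

[InterpolatedStringHandler]
public ref struct LogHandler // hypothetical handler, not a BCL type
{
    private readonly StringBuilder? _builder;

    // enabled is forwarded from Log(); when it is false nothing is allocated
    // and the caller is told to skip every Append* call.
    public LogHandler(int literalLength, int formattedCount, bool enabled, out bool shouldAppend)
    {
        shouldAppend = enabled;
        _builder = enabled ? new StringBuilder(literalLength) : null;
    }

    public void AppendLiteral(string s) => _builder?.Append(s);
    public void AppendFormatted<T>(T value) => _builder?.Append(value);
    public override string ToString() => _builder?.ToString() ?? string.Empty;
}

public static class SideEffectDemo
{
    public static int Calls; // how many times SomeMethod() actually ran

    public static int SomeMethod(int i)
    {
        Calls++;
        return i * 10;
    }

    public static void Log(bool enabled, [InterpolatedStringHandlerArgument("enabled")] LogHandler msg)
    {
        if (enabled)
            Console.WriteLine(msg.ToString());
    }

    public static void Main()
    {
        for (int i = 0; i < 4; i++)
            Log(i % 2 == 0, $"Test {SomeMethod(i)}"); // SomeMethod runs only when i is even

        Console.WriteLine($"SomeMethod ran {Calls} times for 4 calls to Log"); // Calls is 2
    }
}
```

If SomeMethod() must always run, evaluate it into a local variable before the call and interpolate the local instead.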

Finally, if you want to learn more about this feature, I highly recommend watching this video and reading this MS tutorial.

Have fun!

Adriano
