Nov 22, 2023

Structs in C# are fun - Part 4/9: Constructors and struct behavior

  1. Structs in C# are fun.
  2. Brief introduction to Value Types vs Reference Types.
  3. Field initialization in structs.
  4. Constructors and struct behavior (this post).
  5. Other scenarios in which struct constructor behavior may surprise you.
  6. Struct with default argument values in constructors, a.k.a., are you not confused yet?
  7. `required` feature from C# 11 will not save your a** job.
  8. Struct used as default argument values.
  9. Bonus: Struct evolution in C#.

In the previous post of our series on Value Types we presented the code below:

which prints 0 (as opposed to 32, as some would expect), and we also explored the basics of how field initialization is handled in C#. This post expands on that discussion, exploring some key aspects that contribute to the disparity between expected and observed behavior.
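A minimal sketch of a program with this behavior is shown here for reference. The field name `v` and the constructor taking an `int` are assumptions made for illustration (only the struct name `S2`, the expression `new S2()` and the values 0 and 32 come from the text), and it assumes C# 10 or later, where struct field initializers are allowed:

```csharp
// Sketch only: the field name `v` and the `int` constructor are assumptions.
var s2 = new S2();                 // S2 declares no parameterless constructor, so none runs
System.Console.WriteLine(s2.v);    // prints 0

var other = new S2(1);             // this constructor does run, and the field initializer with it
System.Console.WriteLine(other.v); // prints 32

struct S2
{
    public int v = 32;             // initializer is moved into the declared constructor(s)

    public S2(int i) { }           // the only declared constructor; it takes a parameter
}
```

The second pair of lines is only there to show that the initializer is not lost: it runs whenever a declared constructor actually runs.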

Let's start with the following statement from the previous post:

In an oversimplified way, whenever the C# compiler finds a field initialization it will simply move the initialization code to the constructors, i.e., initializing a field is equivalent to setting its value in the constructors (static fields are initialized in static constructors)...

If that is true (spoiler: it is) and the code is instantiating a new struct (line 1), why is the field not being initialized? The short answer is that, despite the `new` expression, no constructor is being run, which can be easily verified by looking at the generated IL below:

Note that the compiler translated the expression `new S2()` into the IL instruction `initobj S2` (IL_0002 in method '<Main>$', highlighted in the above screenshot). The documentation for that instruction states¹:

Initializes each field of the value type at a specified address to a null reference or a 0 of the appropriate primitive type.

Unlike Newobj, initobj does not call the constructor method. Initobj is intended for initializing value types, while newobj is used to allocate and initialize objects.

This leads us to the next natural question: why did the compiler emit an `initobj` instruction instead of a `newobj`?

If you pay close attention to the struct declaration (lines 5~9) you'll notice that no parameterless constructor is actually declared, so the answer to that question becomes clear: the compiler cannot invoke a nonexistent constructor, and for value types it can resort to `initobj` instead!

To prove that point you can simply change the code, introducing a parameterless constructor in S2²:

and observe that now IL_0002 contains the instruction `call instance void S2::.ctor()`, effectively running the parameterless constructor on s2 and, hence, running the field initialization and causing the program to output 32.
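As a sketch of that change, under the same assumptions as before (the field name `v` and the `int` constructor are hypothetical; C# 10 or later, where a parameterless struct constructor must be `public`):

```csharp
// Sketch only: same hypothetical S2, now with an explicit parameterless constructor.
var s2 = new S2();                 // now resolves to the declared S2() constructor
System.Console.WriteLine(s2.v);    // prints 32

struct S2
{
    public int v = 32;             // initializer is moved into every declared constructor

    public S2() { }                // the newly introduced parameterless constructor

    public S2(int i) { }
}
```

The constructor body is empty on purpose: the assignment of 32 comes from the field initializer that the compiler places in the constructor, not from any code written inside it.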

So, in summary:

  1. Field initializers are injected into the constructors of the type in which the field is declared;
  2. Since `new S2()` does not execute any constructor in the original program, all struct fields are simply zeroed out, which explains why the program at the top prints 0 instead of 32.

Last, but not least, note that the behavior for classes is different: simply changing the declaration from struct to class (in the original program) leads to a compilation error, since the compiler cannot use the `initobj` instruction to initialize a reference type and requires a constructor³ to be available.
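A rough sketch of the class case, using a hypothetical class `C2` that mirrors the struct (the names `C2` and `v` and the `int` constructor are assumptions). The offending line is left commented out so the rest still compiles:

```csharp
// Sketch only: C2 is a hypothetical class mirroring the struct discussed above.

// var broken = new C2();        // does not compile: C2 declares no parameterless constructor

var c = new C2(1);               // a declared constructor must be called for a class
System.Console.WriteLine(c.v);   // prints 32

class C2
{
    public int v = 32;           // initializer runs as part of the C2(int) constructor

    public C2(int i) { }         // declaring this suppresses the implicit parameterless constructor
}
```

Since every way of creating a `C2` goes through a declared constructor, the field initializer always runs and the zeroed-out state seen with the struct cannot be observed here.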

As always, all feedback is welcome.
