Showing posts with label cecil. Show all posts

Mar 14, 2020

Mind the Gap: dealing with offset issues in Mono.Cecil


Read this article in French
Read this post in Portuguese


Mind the Gap

If there's one thing I've learned in my IT career, it is that sooner or later (more often than not the former) a deeper knowledge of the technologies abstracted by a library/framework will be required in order to use it efficiently and/or solve issues (in the context of this post, Mono.Cecil abstracts various CIL aspects).

Last week, while I was working on a Cecilifier feature, I started getting invalid assemblies out of the blue, without changing any code in Cecilifier itself. After some head scratching I figured it out: changes to the test classes had pushed some branch instructions more than 127 bytes away from their targets.

IL (intermediate language) has two families of branch instructions: short forms and long forms. In short (pun intended), short-form instructions use one byte for the offset to the branch target (ranging from -128 to 127), whereas long-form instructions use 4 bytes (a much wider range). Cecilifier was emitting short-form branches irrespective of the offset between the branch instruction and its target.
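
To make the two ranges concrete, here is a minimal Python sketch (not Cecil code) that picks the smallest encoding for an unconditional branch. It assumes `offset` is the operand value, measured from the first byte after the branch instruction; the helper names are made up for illustration, while `0x2B` and `0x38` are the opcode bytes of `br.s` and `br`.

```python
import struct

BR_S = 0x2B  # br.s: opcode byte followed by a 1-byte signed offset
BR = 0x38    # br:   opcode byte followed by a 4-byte signed offset


def fits_short_form(offset: int) -> bool:
    """Return True if the offset can be stored in a signed byte."""
    return -128 <= offset <= 127


def encode_unconditional_branch(offset: int) -> bytes:
    """Encode a branch with the given operand, using the short form when possible.

    Hypothetical helper: `offset` is assumed to be relative to the end of the
    encoded instruction, so the caller has already accounted for its size.
    """
    if fits_short_form(offset):
        return struct.pack("<Bb", BR_S, offset)  # 2 bytes in total
    return struct.pack("<Bi", BR, offset)        # 5 bytes in total


print(encode_unconditional_branch(127).hex())  # 2b7f
print(encode_unconditional_branch(128).hex())  # 3880000000
```

A target that is only one byte further away (128 instead of 127) already forces the 5-byte encoding, which is why a small edit to a method body can be enough to break a short branch.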

To make it more concrete, the following program simulates this scenario by naively adding a bunch of nop instructions inside the if statement, causing the offset of the branch (to the end of the if) to overflow:

When executed, it saves a modified version of itself (to your temp folder, i.e., %temp% on Windows or /tmp on Linux) which throws an exception as soon as the affected method is JIT-compiled (in the example, when we try to execute the Foo() method).

You can build the program above and:
  1. Execute it passing `modify 1000` as its command line arguments (you can try different values for the number of nops):
    `Test.exe modify 1000`
  2. After it finishes, run the modified version passing `run` as the command line argument (on Windows):
    `%temp%\Output.exe run`

Luckily, Mono.Cecil provides a relatively easy way to ensure that branch instructions compatible with the offsets are used: the SimplifyMacros() extension method (defined in Mono.Cecil.Rocks.MethodBodyRocks). It goes over the instructions of a method body and replaces the ones encoded with the short form of an opcode with the respective long form. This means that branch instructions such as br.s, beq.s, etc. are replaced with their long-form counterparts (br, beq, etc.), in the same way that opcodes like ldarg.x are replaced with ldarg x and so on. Since the long-form instructions use offsets that are 4 bytes long, it is very unlikely (almost impossible) that a target will be outside the valid range.

After you finish modifying a method body, you can call OptimizeMacros() to ensure all instructions use the most space-efficient encoding possible: it takes the long form of instructions and replaces them with the short form whenever possible.

Armed with these methods we can change our program to (note lines #51 & #58):
Now the assembly produced is no longer invalid!
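
The simplify / modify / optimize workflow can also be illustrated with a toy Python model. This is a sketch of the idea only, not Mono.Cecil's implementation and not the program from this post: `Instr`, `simplify`, `optimize` and the other names are hypothetical, only three branch opcodes are modelled, and every non-branch instruction is assumed to be one byte long.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity-based equality, so instructions can be dict keys
class Instr:
    name: str                         # e.g. "nop", "ret", "brfalse.s"
    target: Optional["Instr"] = None  # branch target, if this is a branch


SHORT_TO_LONG = {"br.s": "br", "brtrue.s": "brtrue", "brfalse.s": "brfalse"}
LONG_TO_SHORT = {long: short for short, long in SHORT_TO_LONG.items()}


def size(instr):
    if instr.name in SHORT_TO_LONG:
        return 2  # opcode + 1-byte offset
    if instr.name in LONG_TO_SHORT:
        return 5  # opcode + 4-byte offset
    return 1      # nop, ret, ... (toy assumption)


def branch_offsets(body):
    """Operand of each branch: target address minus the address after the branch."""
    addr, pos = {}, 0
    for instr in body:
        addr[instr] = pos
        pos += size(instr)
    return {i: addr[i.target] - (addr[i] + size(i)) for i in body if i.target}


def simplify(body):
    """Expand every short branch into its long form."""
    for instr in body:
        instr.name = SHORT_TO_LONG.get(instr.name, instr.name)


def optimize(body):
    """Shrink branches whose offset fits in a signed byte.

    Offsets are computed before anything shrinks; shrinking only moves
    instructions closer together, so this single pass is conservative but safe.
    """
    offsets = branch_offsets(body)
    for instr, offset in offsets.items():
        if -128 <= offset <= 127:
            instr.name = LONG_TO_SHORT.get(instr.name, instr.name)


def is_valid(body):
    """True if every short branch can actually reach its target."""
    offsets = branch_offsets(body)
    return all(-128 <= offsets[i] <= 127 for i in body if i.name in SHORT_TO_LONG)


def build():
    ret = Instr("ret")
    return [Instr("brfalse.s", target=ret), Instr("nop"), ret]


def insert_nops(body, count):
    body[1:1] = [Instr("nop") for _ in range(count)]


naive = build()
insert_nops(naive, 200)          # short branch left as-is
print(is_valid(naive))           # False

safe = build()
simplify(safe)                   # brfalse.s -> brfalse
insert_nops(safe, 200)
optimize(safe)                   # offset is 201, so the long form is kept
print(is_valid(safe), safe[0].name)  # True brfalse
```

With only a handful of nops inserted, `optimize` would turn the branch back into `brfalse.s`, so the extra bytes are paid only when they are needed.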




As a last note, keep in mind that, in order to minimize assembly size, compilers generally emit instructions that take the least space possible (which means the short form for offsets smaller than 128). That is nice and cool, but it also means that in the vast majority of cases the short form of branch instructions will be used, and inserting a single IL instruction into a method body may push the target of a short branch out of the valid offset range, leading to exceptions at runtime.

Have fun.

#monocecil #cecilifier

Aug 15, 2015

Invalid offsets in IL instructions after modifying assembly with Mono.Cecil

Hi

Since 2008 I've been using Mono.Cecil, an amazing piece of software that exposes the contents of .NET assemblies in a relatively easy and intuitive way, so one can inspect and/or change them.

Unfortunately, in order to use Mono.Cecil effectively you need a fair amount of knowledge about how MSIL works, and the Mono.Cecil documentation is kind of sparse (to say the least).

Some time ago I was using it to change some assembly IL and, to my surprise, after applying the changes peverify complained that some IL instructions had invalid offsets.

After some head scratching I figured out the issue: the target of a (short) branch instruction had crossed the threshold beyond which a normal branch is required (i.e., one that uses 32-bit offsets instead of the 8-bit offsets of the short version).

So now the issue was that I'd be forced to scan every single IL instruction in the method body and check/fix the target offset of branches; fortunately, Mono.Cecil has two methods, MethodBody.SimplifyMacros() and MethodBody.OptimizeMacros(), that can be used to achieve my goal.

Basically, before you start changing the method body's IL you call SimplifyMacros(), and when you've finished with your changes to that method body you call OptimizeMacros(); Cecil will take care of adjusting branches accordingly. Nice!


Thanks to everybody who helped develop Mono.Cecil! It's a really handy library! :)

(Read this post in Portuguese)

Instructions with invalid IL offsets after modifying an assembly with Mono.Cecil

Since 2008 I have been using Mono.Cecil (a very good library, by the way), which lets you both read and modify the contents of .NET assemblies in a relatively simple way (once you understand how to use it; unfortunately, using the library effectively requires a good knowledge of how MSIL works, and the Mono.Cecil documentation leaves something to be desired).

Some time ago, one of the tests of one of my applications (which uses this library) started to fail; to be more precise, peverify started reporting some instructions with invalid offsets.

After investigating for a while I concluded that the problem was in the offset (operand) used by some branch instructions, which was exceeding the limit of one byte (-128 to 127). This happened for two reasons: i) the IL code used the short branch form, that is, the branch instruction accepted a single byte as its offset, and ii) my application added instructions between the branch instruction and its target, effectively requiring an offset greater than 127.

Armed with this new information, all I had to do was check, instruction by instruction (only the branches, of course), whether the operand was within the valid range (-128 to 127) and fix any that were not. Of course I wanted to avoid this at all costs, since it would add more code (and possible bugs) to my application.

Fortunately, Mono.Cecil has two methods (MethodBody.SimplifyMacros() and MethodBody.OptimizeMacros()) that, when used, take care of ensuring that the operands (offsets) of branch instructions are within the valid ranges and, when they are not, replace the instruction with a branch instruction that supports the offset in question (in this case, one that uses 4 bytes).

Basically, before starting any modification to a method's IL instructions you call SimplifyMacros(), and when you have finished your modifications you call OptimizeMacros(); Mono.Cecil will take care of adjusting the branch instructions (if necessary).


Mono.Cecil developers: thank you very much! :)

Have fun

(Read this post in English)