.NET IL: some explorations

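Two tools are handy for viewing IL. One is the well-known Reflector, which has since become a paid product; the other is the open-source ILSpy, which is powerful, easy to use, and handles LINQ and lambda expressions well. There are also plenty of books on the subject, for example Microsoft.NET IL汇编语言程序设计 (a Chinese-language book on programming in IL assembly). The key to reading IL is this: IL is a stack-based language, similar to assembly. Each instruction pops the operands it needs (if any) from the top of the stack, then pushes its return value (if any) back onto the stack. Keeping this in mind makes IL much easier to follow.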
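The pop-then-push rule can be imitated in a few lines of C#. The following is only a minimal sketch: the class StackDemo and the method Evaluate are made-up names, and a Stack<int> stands in for the real evaluation stack. It computes (a + b) * c the way a stack machine would.

```csharp
using System;
using System.Collections.Generic;

static class StackDemo
{
    // Evaluates (a + b) * c using only push and pop, as a stack machine would.
    public static int Evaluate(int a, int b, int c)
    {
        var stack = new Stack<int>();
        stack.Push(a);                          // like ldarg/ldloc: push a
        stack.Push(b);                          // push b
        stack.Push(stack.Pop() + stack.Pop());  // like add: pop two operands, push the sum
        stack.Push(c);                          // push c
        stack.Push(stack.Pop() * stack.Pop());  // like mul: pop two operands, push the product
        return stack.Pop();                     // like ret: the result sits on top of the stack
    }
}
```

StackDemo.Evaluate(1, 2, 3) returns 9. The stack is never deeper than two items here, which is the kind of count that .maxstack records for a real method.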

1. First, what is this good for?

Take the old question about i++: what does the following C# program print?

int i = 0;
for(int j=0;j<3;j++){
  i = i++;
  Console.Write("i={0} ", i);
}

Just as in Java, this prints i=0 i=0 i=0. Why? The IL makes it obvious at a glance:

.locals init (
  [0] int32 i,
  [1] int32 j,
  [2] bool CS$4$0000
)

//The IL for the for loop is omitted; the key part, for `i=i++`, is:
IL_0008: ldloc.0
IL_0009: dup //duplicates the value of i on the stack
IL_000a: ldc.i4.1
IL_000b: add
IL_000c: stloc.0 //stores the incremented value into i
IL_000d: stloc.0 //stores the duplicated original value back into i

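In other words, the compiler executes i=i++ as tmp=i; i=i+1; i=tmp; so the output stays at i=0. If you switch to i=++i, the compiler turns it into tmp=i+1; tmp1=tmp; i=tmp1; i=tmp; and the output is the expected i=1 i=2 i=3. IL is also very instructive for questions such as the performance of string+string versus StringBuilder, or contention over shared variables between threads (what looks like a single C# statement is actually several machine-level instructions, and the CPU may also reorder them as an optimization). I won't go into those here.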
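The two instruction sequences can be replayed on an explicit stack to check the reasoning. This is a sketch only: PostIncrementDemo and its methods are hypothetical names, and the instruction order shown for i=++i is inferred from the tmp description above, not copied from a compiler listing.

```csharp
using System;
using System.Collections.Generic;

static class PostIncrementDemo
{
    // Replays the six IL instructions shown for `i = i++`.
    public static int AssignPostIncrement(int i)
    {
        var stack = new Stack<int>();
        stack.Push(i);                          // ldloc.0
        stack.Push(stack.Peek());               // dup: copy of the original value
        stack.Push(1);                          // ldc.i4.1
        stack.Push(stack.Pop() + stack.Pop());  // add
        i = stack.Pop();                        // stloc.0: i = original + 1
        i = stack.Pop();                        // stloc.0: i = original value again
        return i;
    }

    // Assumed order for `i = ++i`: add first, then dup, then two stores.
    public static int AssignPreIncrement(int i)
    {
        var stack = new Stack<int>();
        stack.Push(i);                          // ldloc.0
        stack.Push(1);                          // ldc.i4.1
        stack.Push(stack.Pop() + stack.Pop());  // add
        stack.Push(stack.Peek());               // dup: copy of the incremented value
        i = stack.Pop();                        // stloc.0
        i = stack.Pop();                        // stloc.0: same incremented value
        return i;
    }
}
```

AssignPostIncrement(0) returns 0 and AssignPreIncrement(0) returns 1, matching the two outputs described above. The only difference between the two is whether dup runs before or after add.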

2. Now a slightly more complex piece of C# source, to look at IL in detail

This code performs AES encryption:

public static string Encrypt(string strEncrypt, string strKey){
  try{
    byte[] keyArray = Encoding.UTF8.GetBytes(FormsAuthentication.HashPasswordForStoringInConfigFile(strKey, "md5"));
    byte[] strEncryptArray = Encoding.UTF8.GetBytes(strEncrypt);
    byte[] resultArray = null;
    using(RijndaelManaged rDel = new RijndaelManaged()){
      rDel.Key = keyArray;
      rDel.Mode = CipherMode.ECB;
      rDel.Padding = PaddingMode.PKCS7;
      resultArray = rDel.CreateEncryptor().TransformFinalBlock(strEncryptArray, 0, strEncryptArray.Length);
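    }
    return Convert.ToBase64String(resultArray); //the original listing omits the return; Base64 encoding of the result is an assumption
  }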
  catch {
    return null;
  }
}

3. The corresponding IL, with explanations

.method public hidebysig static string Encrypt(string strEncrypt, string strKey) cil managed{
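//hidebysig means the method hides base-class methods with the same name and signature (hide-by-signature) rather than every method of the same name. It is only a flag for compilers and tools; the runtime ignores it. The C# compiler emits it on every method, so it is not specific to methods declared with the new keyword
//cil is a default flag; it says the body is implemented in Common Intermediate Language
//managed says this is managed code; it is also a default flag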
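  //Code size 137 (0x89)
  .maxstack 4 //the maximum evaluation-stack depth this method uses. The 4 counts slots, not bytes: each item takes one slot regardless of its size. After walking through the method we can count whether the deepest point really is 4
  .locals init ([0] uint8[] keyArray, //local variables are declared in order; uint8 corresponds to byte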
    [1] uint8[] strEncryptArray,
    [2] uint8[] resultArray,
    [3] class [mscorlib]System.Security.Cryptography.RijndaelManaged rDel,
    [4] class [mscorlib]System.Security.Cryptography.ICryptoTransform cTransform, //intermediate variable holding the result of rDel.CreateEncryptor()
    [5] string CS$1$0000, //temporary for the return value
    [6] bool CS$4$0001)   //temporary for deciding whether the rDel created by using must be disposed
  IL_0000: nop
  .try {
    IL_0001: nop
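    IL_0002: call class [mscorlib]System.Text.Encoding [mscorlib]System.Text.Encoding::get_UTF8() //corresponds to Encoding.UTF8; it also shows that a property is really a method
    IL_0007: ldarg.1 //pushes arg.1, i.e. strKey, the method's second argument, so the call below can pop it
    IL_0008: ldstr "md5" //likewise pushes the string "md5"; the stack depth is now 3 (UTF8, arg.1, "md5")
    IL_000d: call string [System.Web]System.Web.Security.FormsAuthentication::HashPasswordForStoringInConfigFile(string, string) //calls the method: per its signature it pops 2 slots, the top two values, and implicitly pushes the return value when it finishes
    IL_0012: callvirt instance uint8[] [mscorlib]System.Text.Encoding::GetBytes(string) //pops the string pushed by the previous call, calls the virtual method, and pushes the resulting byte[]. The calls below all follow the same pop-then-push order and are not described one by one
    IL_0017: stloc.0 //pops the byte[] returned above and stores it in loc.0, i.e. keyArray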
    ……
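    IL_0026: newobj instance void [mscorlib]System.Security.Cryptography.RijndaelManaged::.ctor() //newobj creates the instance
    .try { //using is compiler shorthand for try/finally, as this shows clearly
    IL_002c: nop //in debug builds these nop instructions are emitted so that a breakpoint can be set on the "{"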
    ……
    IL_0036: ldc.i4.2 //pushes the constant 2, because in rDel.Mode=CipherMode.ECB; ECB has the value 2 in the CipherMode enum. An enum in IL is just a numeric constant
    ……
    IL_005a: nop
    IL_005b: leave.s IL_006f //exits the try block; the finally block runs on the way out
  } //end .try
  finally {} //generated by the compiler for using: checks whether rDel is null and, if it is not, calls its Dispose() method
    …… 
    IL_0086: ldloc.s CS$1$0000
    IL_0088: ret //returns, with the return value already pushed onto the stack
} //end of method Encrypt
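
The try/finally that the compiler generates for using can be written out by hand in C#. The sketch below is an illustration under assumptions, not the compiler's exact output: TrackedResource is a hypothetical stand-in for RijndaelManaged that only records whether Dispose ran, and RunLikeUsing is a made-up helper name.

```csharp
using System;

// Hypothetical stand-in for RijndaelManaged: records whether Dispose was called.
sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

static class UsingExpansionDemo
{
    // Hand-written equivalent of: using (resource) { work(resource); }
    public static void RunLikeUsing(TrackedResource resource, Action<TrackedResource> work)
    {
        try
        {
            work(resource);
        }
        finally
        {
            // Same shape as the generated finally block: null check, then Dispose().
            if (resource != null)
            {
                ((IDisposable)resource).Dispose();
            }
        }
    }
}
```

If work throws, the finally block still runs, so Disposed is true by the time the exception reaches the caller. That is the same guarantee leave.s and the finally block give rDel in the IL above.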