RIP-Relative Addressing in 64-bit Mode and the x86 Instruction Format (Opcode and Instruction Columns Explained)

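While searching for the byte signature of a symbol on a 64-bit system, I found that its MOV instruction used a relative address. I had never heard of MOV using relative addressing in 32-bit code, so I looked it up in the Intel instruction manual.
The description of the MOV instruction contains the following: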

In 64-bit mode, the instruction’s default operation size is 32 bits. Use of the REX.R prefix permits access to additional
registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the
beginning of this section for encoding data and limits.

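In other words, the default operand size in 64-bit mode is still 32 bits. REX.R extends the register field so that R8-R15 can be reached, and REX.W promotes the operand size to 64 bits.
Structure of the REX prefix: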

[Figure: REX prefix layout, a single byte of the form 0100WRXB (image not preserved)]
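
To make the REX layout concrete, here is a minimal sketch in C that splits a REX byte into its four flag bits. The names rex_bits, is_rex, and rex_decode are my own for illustration and do not come from the manual or any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the four flag bits of a REX prefix (0100WRXB). */
typedef struct {
    bool w; /* 1 = operand size promoted to 64 bits */
    bool r; /* extends the ModR/M reg field */
    bool x; /* extends the SIB index field */
    bool b; /* extends ModR/M r/m, SIB base, or the opcode register field */
} rex_bits;

/* In 64-bit mode a byte is a REX prefix when its high nibble is 0100b. */
static bool is_rex(uint8_t byte) {
    return (byte & 0xF0) == 0x40;
}

/* Split a REX byte into its W, R, X, and B bits. */
static rex_bits rex_decode(uint8_t byte) {
    assert(is_rex(byte));
    rex_bits bits = {
        .w = (byte >> 3) & 1,
        .r = (byte >> 2) & 1,
        .x = (byte >> 1) & 1,
        .b = byte & 1,
    };
    return bits;
}
```

For the prefix 4c, rex_decode reports W and R set and X and B clear.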

On RIP-relative addressing, the manual says:

2.2.1.6 RIP-Relative Addressing
A new addressing form, RIP-relative (relative instruction-pointer) addressing, is implemented in 64-bit mode. An
effective address is formed by adding displacement to the 64-bit RIP of the next instruction.
In IA-32 architecture and compatibility mode, addressing relative to the instruction pointer is available only with
control-transfer instructions. In 64-bit mode, instructions that use ModR/M addressing can use RIP-relative
addressing. Without RIP-relative addressing, all ModR/M modes address memory relative to zero.
RIP-relative addressing allows specific ModR/M modes to address memory relative to the 64-bit RIP using a signed
32-bit displacement. This provides an offset range of ±2GB from the RIP. Table 2-7 shows the ModR/M and SIB
encodings for RIP-relative addressing. Redundant forms of 32-bit displacement-addressing exist in the current
ModR/M and SIB encodings. There is one ModR/M encoding and there are several SIB encodings. RIP-relative
addressing is encoded using a redundant form.
In 64-bit mode, the ModR/M Disp32 (32-bit displacement) encoding is re-defined to be RIP+Disp32 rather than
displacement-only. See Table 2-7.
[Table 2-7: RIP-relative addressing encodings (image not preserved)]
The ModR/M encoding for RIP-relative addressing does not depend on using a prefix. Specifically, the r/m bit field
encoding of 101B (used to select RIP-relative addressing) is not affected by the REX prefix. For example, selecting
R13 (REX.B = 1, r/m = 101B) with mod = 00B still results in RIP-relative addressing. The 4-bit r/m field of REX.B
combined with ModR/M is not fully decoded. In order to address R13 with no displacement, software must encode
R13 + 0 using a 1-byte displacement of zero.
RIP-relative addressing is enabled by 64-bit mode, not by a 64-bit address-size. The use of the address-size prefix
does not disable RIP-relative addressing. The effect of the address-size prefix is to truncate and zero-extend the
computed effective address to 32 bits.

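RIP-relative addressing is new in 64-bit mode, where an instruction selects it through a specific ModR/M encoding. The displacement is a signed 32-bit value, so the reachable range is ±2GB. The address is computed relative to the instruction that follows the current one, that is, target address = address of the next instruction + displacement. The ModR/M encoding of RIP-relative addressing does not depend on prefixes: even if REX.B together with r/m = 101B would name R13, with mod = 00B the instruction still uses RIP rather than R13.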

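As an example, take the raw instruction bytes 4c 8b 2d ed d9 ea ff.
Here 4c is the REX prefix with W and R set, so REX.R combined with the ModR/M reg field selects r13 as the destination. 8b is the MOV r64, r/m64 opcode. 2d is the ModR/M byte 00101101 (mod = 00, reg = 101, r/m = 101), which means no SIB byte and RIP-relative addressing. It is followed by the 32-bit displacement ed d9 ea ff, which is 0xffead9ed, a negative value.
The address referenced by such a MOV can be computed like this: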

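    // Compute the address of ObpLookupObjectByName
    ULONG_PTR ObpLookupObjectByName = (ULONG_PTR)((PUCHAR)tg1_addr + 0x301 + 5 + offset);

    // At offset 0x62C in ObpLookupObjectByName is the instruction MOV R13, ObRootDirectoryObject.
    // It is 7 bytes long, so adding 7 gives the address of the next instruction.
    ULONG_PTR next_code = (ULONG_PTR)((PUCHAR)ObpLookupObjectByName + 0x62C + 7);

    // Read the displacement, which follows the 3 bytes of REX + opcode + ModR/M.
    // It is signed, so keep it in an INT32: a UINT32 would be zero-extended in the
    // addition below and a negative displacement would give a wrong address.
    INT32 rip = *(PINT32)((PUCHAR)ObpLookupObjectByName + 0x62C + 3);

    // Next-instruction address + displacement gives the target address
    POBJECT_DIRECTORY ObRootDirectoryObject = (POBJECT_DIRECTORY)((LONG_PTR)next_code + rip);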
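
The same calculation can be tried outside the kernel. The sketch below is plain user-mode C, not the driver code above: it reads the signed displacement from the example bytes and adds it to the address of the next instruction. The helper names and the load address 0x200000 are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian signed 32-bit displacement from code bytes. */
static int32_t read_disp32(const uint8_t *p) {
    uint32_t raw = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                   ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t disp;
    memcpy(&disp, &raw, sizeof disp); /* reinterpret the bits as signed */
    return disp;
}

/*
 * Resolve a RIP-relative operand.
 * insn        - bytes of the instruction
 * insn_addr   - address the instruction is (assumed to be) loaded at
 * disp_offset - offset of the disp32 inside the instruction
 * insn_len    - total instruction length; RIP points just past it
 */
static uint64_t rip_relative_target(const uint8_t *insn, uint64_t insn_addr,
                                    size_t disp_offset, size_t insn_len) {
    assert(disp_offset + 4 <= insn_len);
    int32_t disp = read_disp32(insn + disp_offset);
    uint64_t next_insn = insn_addr + insn_len;
    /* Sign-extend to 64 bits first; unsigned addition then wraps correctly. */
    return next_insn + (uint64_t)(int64_t)disp;
}
```

With the bytes 4c 8b 2d ed d9 ea ff placed at the made-up address 0x200000, the displacement is -0x152613 and the target works out to 0x200007 - 0x152613 = 0xAD9F4.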


IMMEDIATE: immediate operand

REGISTER: register operand

MEMORY: memory operand

REGISTER_RIP: RIP-relative operand (addressed relative to the instruction pointer)

AT&T assembly syntax is similar to Intel's; the gas manual is the reference.

The differences are as follows (quoted from the gas manual):

AT&T Syntax versus Intel Syntax

-------------------------------

In order to maintain compatibility with the output of gcc, as supports AT&T System V/386 assembler syntax. This is quite different from Intel syntax. We mention these differences because almost all 80386 documents used only Intel syntax. Notable differences between the two syntaxes are:

(Here "as" is the GNU assembler, and the gcc output in question is the assembly listing produced by gcc -S source_file.)

1> Immediate operands

AT&T immediate operands are preceded by `$'; Intel immediate operands are undelimited (Intel `push 4' is AT&T `pushl $4').

2> Register operands

AT&T register operands are preceded by `%'; Intel register operands are undelimited.

3> Absolute jump/call operands

AT&T absolute (as opposed to PC relative) jump/call operands are prefixed by `*'; they are undelimited in Intel syntax.

4> Source and destination order

AT&T and Intel syntax use the opposite order for source and destination operands. Intel `add eax, 4' is `addl $4, %eax'. That is, Intel writes "opcode destination, source" and AT&T writes "opcode source, destination". The `source, dest' convention is maintained for compatibility with previous Unix assemblers.

5> Size of memory operands (b, w, l)

In AT&T syntax the size of memory operands is determined from the last character of the opcode name. Opcode suffixes of `b', `w', and `l' specify byte (8-bit), word (16-bit), and long (32-bit) memory references. Intel syntax accomplishes this by prefixing memory operands (not the opcodes themselves) with `byte ptr', `word ptr', and `dword ptr'. Thus, Intel `mov al, byte ptr foo' is `movb foo, %al' in AT&T syntax.

6> Long jumps/calls and long returns

Immediate form long jumps and calls are `lcall/ljmp $section, $offset' in AT&T syntax; the Intel syntax is `call/jmp far section:offset'. Also, the far return instruction is `lret $stack-adjust' in AT&T syntax; Intel syntax is `ret far stack-adjust'.

7> Other: multiple sections

The AT&T assembler does not provide support for multiple section programs. Unix style systems expect all programs to be single sections.

8> References

 <1>. Sun's x86 assembly language reference manual: http://oldlinux.org/download/805-4693.pdf

Supplement:

Brennan's Guide to Inline Assembly

by Brennan "Bas" Underwood

Document version 1.1.2.2

Ok. This is meant to be an introduction to inline assembly under DJGPP. DJGPP is based on GCC, so it uses the AT&T/UNIX syntax and has a somewhat unique method of inline assembly. I spent many hours figuring some of this stuff out and told Info that I hate it, many times.

Hopefully if you already know Intel syntax, the examples will be helpful to you. I've put variable names, register names and other literals in bold type.

The Syntax

So, DJGPP uses the AT&T assembly syntax. What does that mean to you?

  • Register naming:
    Register names are prefixed with "%". To reference eax:
    AT&T:  %eax
    Intel: eax
    
  • Source/Destination Ordering:
    In AT&T syntax (which is the UNIX standard, BTW) the source is always on the left, and the destination is always on the right.
    So let's load ebx with the value in eax:
    AT&T:  movl %eax, %ebx
    Intel: mov ebx, eax
    
  • Constant value/immediate value format:
    You must prefix all constant/immediate values with "$".
    Let's load eax with the address of the "C" variable booga, which is static.
    AT&T:  movl $_booga, %eax
    Intel: mov eax, _booga
    
    Now let's load ebx with 0xd00d:
    AT&T:  movl $0xd00d, %ebx
    Intel: mov ebx, d00dh
    
  • Operator size specification:
    You must suffix the instruction with one of b, w, or l to specify the width of the destination register as a byte, word or longword. If you omit this, GAS (GNU assembler) will attempt to guess. You don't want GAS to guess, and guess wrong! Don't forget it.
    AT&T:  movw %ax, %bx
    Intel: mov bx, ax
    
    The equivalent forms for Intel is byte ptr, word ptr, and dword ptr, but that is for when you are...
  • Referencing memory:
    DJGPP uses 386-protected mode, so you can forget all that real-mode addressing junk, including the restrictions on which register has what default segment, which registers can be base or index pointers. Now, we just get 6 general purpose registers. (7 if you use ebp, but be sure to restore it yourself or compile with -fomit-frame-pointer.)
    Here is the canonical format for 32-bit addressing:
    AT&T:  immed32(basepointer,indexpointer,indexscale)
    Intel: [basepointer + indexpointer*indexscale + immed32]
    
    You could think of the formula to calculate the address as:
      immed32 + basepointer + indexpointer * indexscale
    
    You don't have to use all those fields, but you do have to have at least 1 of immed32, basepointer and you MUST add the size suffix to the operator!
    Let's see some simple forms of memory addressing:
    • Addressing a particular C variable:
      AT&T:  _booga
      Intel: [_booga]
      
      Note: the underscore ("_") is how you get at static (global) C variables from assembler. This only works with global variables. Otherwise, you can use extended asm to have variables preloaded into registers for you. I address that farther down.
    • Addressing what a register points to:
      AT&T:  (%eax)
      Intel: [eax]
      
    • Addressing a variable offset by a value in a register:
      AT&T: _variable(%eax)
      Intel: [eax + _variable]
      
    • Addressing a value in an array of integers (scaling up by 4):
      AT&T:  _array(,%eax,4)
      Intel: [eax*4 + _array]
      
    • You can also do offsets with the immediate value:
      C code: *(p+1) where p is a char *
      AT&T:  1(%eax) where eax has the value of p
      Intel: [eax + 1]
      
    • You can do some simple math on the immediate value:
      AT&T: _struct_pointer+8
      
      I assume you can do that with Intel format as well.
    • Addressing a particular char in an array of 8-character records:
      eax holds the number of the record desired. ebx has the wanted char's offset within the record.
      AT&T:  _array(%ebx,%eax,8)
      Intel: [ebx + eax*8 + _array]
      
    Whew. Hopefully that covers all the addressing you'll need to do. As a note, you can put esp into the address, but only as the base register.

Basic inline assembly

The format for basic inline assembly is very simple, and much like Borland's method.

asm ("statements");

Pretty simple, no? So

asm ("nop");

will do nothing of course, and

asm ("cli");

will stop interrupts, with

asm ("sti");

of course enabling them. You can use  __asm__  instead of  asm  if the keyword  asm  conflicts with something in your program.

When it comes to simple stuff like this, basic inline assembly is fine. You can even push your registers onto the stack, use them, and put them back.

asm ("pushl %eax\n\t"
     "movl $0, %eax\n\t"
     "popl %eax");

(The \n's and \t's are there so the  .s  file that GCC generates and hands to GAS comes out right when you've got multiple statements per  asm .)
It's really meant for issuing instructions for which there is no equivalent in C and don't touch the registers.

But if you do touch the registers, and don't fix things at the end of your asm statement, like so:

asm ("movl %eax, %ebx");
asm ("xorl %ebx, %edx");
asm ("movl $0, _booga");

then your program will probably blow things to hell. This is because GCC hasn't been told that your  asm  statement clobbered  ebx  and  edx  and  booga , which it might have been keeping in a register, and might plan on using later. For that, you need:

Extended inline assembly

The basic format of the inline assembly stays much the same, but now gets Watcom-like extensions to allow input arguments and output arguments.

Here is the basic format:

asm ( "statements" : output_registers : input_registers : clobbered_registers);

Let's just jump straight to a nifty example, which I'll then explain:

asm ("cld\n\t"
     "rep\n\t"
     "stosl"
     : /* no output registers */
     : "c" (count), "a" (fill_value), "D" (dest)
     : "%ecx", "%edi" );

The above stores the value in  fill_value   count  times to the pointer  dest .

Let's look at this bit by bit.

asm ("cld\n\t"

We are clearing the direction bit of the  flags  register. You never know what this is going to be left at, and it costs you all of 1 or 2 cycles.

     "rep\n\t"
     "stosl"

Notice that GAS requires the  rep  prefix to occupy a line of it's own. Notice also that  stos  has the  l  suffix to make it move  longwords .

     : /* no output registers */

Well, there aren't any in this function.

     : "c" (count), "a" (fill_value), "D" (dest)

Here we load  ecx  with  count ,  eax  with  fill_value , and  edi  with  dest . Why make GCC do it instead of doing it ourselves? Because GCC, in its register allocating, might be able to arrange for, say,  fill_value  to already be in  eax . If this is in a loop, it might be able to preserve  eax  thru the loop, and save a  movl  once per loop.

     : "%ecx", "%edi" );

And here's where we specify to GCC, "you can no longer count on the values you loaded into  ecx  or  edi  to be valid." This doesn't mean they will be reloaded for certain. This is the clobberlist.

Seem funky? Well, it really helps when optimizing, when GCC can know exactly what you're doing with the registers before and after. It folds your assembly code into the code it's generates (whose rules for generation look remarkably like the above) and then optimizes. It's even smart enough to know that if you tell it to put (x+1) in a register, then if you don't clobber it, and later C code refers to (x+1), and it was able to keep that register free, it will reuse the computation. Whew.

Here's the list of register loading codes that you'll be likely to use:

a        eax
b        ebx
c        ecx
d        edx
S        esi
D        edi
I        constant value (0 to 31)
q,r      dynamically allocated register (see below)
g        eax, ebx, ecx, edx or variable in memory
A        eax and edx combined into a 64-bit integer (use long longs)

Note that you can't directly refer to the byte registers ( ah ,  al , etc.) or the word registers ( ax ,  bx , etc.) when you're loading this way. Once you've got it in there, though, you can specify  ax  or whatever all you like.

The codes have to be in quotes, and the expressions to load in have to be in parentheses.

When you do the clobber list, you specify the registers as above with the %. If you write to a variable, you must include "memory" as one of The Clobbered. This is in case you wrote to a variable that GCC thought it had in a register. This is the same as clobbering all registers. While I've never run into a problem with it, you might also want to add "cc" as a clobber if you change the condition codes (the bits in the flags register the jnz, je, etc. operators look at.)

Now, that's all fine and good for loading specific registers. But what if you specify, say, ebx, and ecx, and GCC can't arrange for the values to be in those registers without having to stash the previous values. It's possible to let GCC pick the register(s). You do this:

asm ("leal (%1,%1,4), %0"
     : "=r" (x)
     : "0" (x) );

The above example multiplies x by 5 really quickly (1 cycle on the Pentium). Now, we could have specified, say  eax . But unless we really need a specific register (like when using  rep movsl  or  rep stosl , which are hardcoded to use  ecx ,  edi , and  esi ), why not let GCC pick an available one? So when GCC generates the output code for GAS, %0 will be replaced by the register it picked.

And where did "q" and "r" come from? Well, "q" causes GCC to allocate from eax, ebx, ecx, and edx. "r" lets GCC also consider esi and edi. So make sure, if you use "r"that it would be possible to use esi or edi in that instruction. If not, use "q".

Now, you might wonder, how to determine how the %n tokens get allocated to the arguments. It's a straightforward first-come-first-served, left-to-right thing, mapping to the"q"'s and "r"'s. But if you want to reuse a register allocated with a "q" or "r", you use "0", "1", "2"... etc.

You don't need to put a GCC-allocated register on the clobberlist as GCC knows that you're messing with it.

Now for output registers.

asm ("leal (%1,%1,4), %0"
     : "=r" (x_times_5)
     : "r" (x) );

Note the use of  =  to specify an output register. You just have to do it that way. If you want 1 variable to stay in 1 register for both in and out, you have to respecify the register allocated to it on the way in with the  "0"  type codes as mentioned above.

asm ("leal (%0,%0,4), %0"
     : "=r" (x)
     : "0" (x) );

This also works, by the way:

asm ("leal (%%ebx,%%ebx,4), %%ebx"
     : "=b" (x)
     : "b" (x) );

2 things here:

  • Note that we don't have to put ebx on the clobberlist, GCC knows it goes into x. Therefore, since it can know the value of ebx, it isn't considered clobbered.
  • Notice that in extended asm, you must prefix registers with %% instead of just %. Why, you ask? Because as GCC parses along for %0's and %1's and so on, it would interpret %edx as a %e parameter, see that that's non-existent, and ignore it. Then it would bitch about finding a symbol named dx, which isn't valid because it's not prefixed with % and it's not the one you meant anyway.

Important note:  If your assembly statement  must  execute where you put it, (i.e. must not be moved out of a loop as an optimization), put the keyword  volatile  after  asm  and before the ()'s. To be ultra-careful, use

__asm__ __volatile__ (...whatever...);

However, I would like to point out that if your assembly's only purpose is to calculate the output registers, with no other side effects, you should leave off the  volatile keyword so your statement will be processed into GCC's common subexpression elimination optimization.

Some useful examples

#define disable() __asm__ __volatile__ ("cli");

#define enable() __asm__ __volatile__ ("sti");

Of course,  libc  has these defined too.

#define times3(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,2),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

#define times5(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,4),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

#define times9(arg1, arg2) \
__asm__ ( \
  "leal (%0,%0,8),%0" \
  : "=r" (arg2) \
  : "0" (arg1) );

These multiply arg1 by 3, 5, or 9 and put them in arg2. You should be ok to do:

times5(x,x);

as well.

#define rep_movsl(src, dest, numwords) \
__asm__ __volatile__ ( \
  "cld\n\t" \
  "rep\n\t" \
  "movsl" \
  : : "S" (src), "D" (dest), "c" (numwords) \
  : "%ecx", "%esi", "%edi" )

Helpful Hint: If you say  memcpy()  with a constant length parameter, GCC will inline it to a  rep movsl  like above. But if you need a variable length version that inlines and you're always moving dwords, there ya go.

#define rep_stosl(value, dest, numwords) \
__asm__ __volatile__ ( \
  "cld\n\t" \
  "rep\n\t" \
  "stosl" \
  : : "a" (value), "D" (dest), "c" (numwords) \
  : "%ecx", "%edi" )

Same as above but for  memset() , which doesn't get inlined no matter what (for now.)

#define RDTSC(llptr) ({ \
__asm__ __volatile__ ( \
        ".byte 0x0f; .byte 0x31" \
        : "=A" (llptr) \
        : : "eax", "edx"); })

Reads the TimeStampCounter on the Pentium and puts the 64 bit result into llptr.

The End

"The End"?! Yah, I guess so.

If you're wondering, I personally am a big fan of AT&T/UNIX syntax now. (It might have helped that I cut my teeth on SPARC assembly. Of course, that machine actually had a decent number of general registers.) It might seem weird to you at first, but it's really more logical than Intel format, and has no ambiguities.

If I still haven't answered a question of yours, look in the Info pages for more information, particularly on the input/output registers. You can do some funky stuff like use"A" to allocate two registers at once for 64-bit math or "m" for static memory locations, and a bunch more that aren't really used as much as "q" and "r".

Alternately, mail me, and I'll see what I can do. (If you find any errors in the above, please, e-mail me and tell me about it! It's frustrating enough to learn without buggy docs!) Or heck, mail me to say "boogabooga."

It's the least you can do.



Thanks to Eric J. Korpela <korpela@ssl.Berkeley.EDU> for some corrections.


Page written and provided by Brennan Underwood.
Copyright © 1996 Brennan Underwood. Share and enjoy!
Page created with  vi , God's own editor.  

Original link for the guide above: http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html

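Opcode column

1. The primary opcode is 1, 2, or 3 bytes long. Two-byte and three-byte opcodes both begin with the escape byte 0F. A two-byte SIMD opcode, however, is encoded as a mandatory prefix + 0Fh + one opcode byte.

One-byte opcode examples:

Opcode    Instruction    Description
98        CBW            AX ← sign-extend of AL
FF /4     JMP r/m32      Jump near, absolute indirect, address given in r/m32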

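2. Where possible, the codes are given as hexadecimal bytes in the order in which they appear in memory. Entries other than hexadecimal bytes are defined below, using CALL as the running example:

Opcode    Instruction      Description
E8 cw     CALL rel16       Call near, relative, displacement relative to next instruction
E8 cd     CALL rel32       Call near, relative, displacement relative to next instruction
FF /2     CALL r/m16       Call near, absolute indirect, address given in r/m16
FF /2     CALL r/m32       Call near, absolute indirect, address given in r/m32
9A cd     CALL ptr16:16    Call far, absolute, address given in operand
9A cp     CALL ptr16:32    Call far, absolute, address given in operand
FF /3     CALL m16:16      Call far, absolute indirect, address given in m16:16
FF /3     CALL m16:32      Call far, absolute indirect, address given in m16:32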

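/digit

A digit between 0 and 7. It indicates that the ModR/M byte of the instruction uses only the r/m field as an operand, while the reg field is part of the opcode and holds the value given by the digit (the /digit shown in the Opcode column).

A word on the ModR/M byte (for a diagram, see http://hgy413.com/3288.html):

mod field: combined with the r/m field it forms 32 possible values: 8 registers and 24 addressing modes.
reg/opcode field: specifies either one of 8 registers or 3 additional bits of opcode. Which of the two it means is determined by the primary opcode.
r/m field: specifies a register as an operand, or is combined with the mod field to encode an addressing mode. Sometimes it is used together with the mod field to carry extra information for certain instructions.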

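An instruction often needs to refer to a value in memory. MOV is the typical case:

MOV eax, dword ptr [123456]/// address given as an immediate value
MOV eax, dword ptr [esi]/// address held in a register
MOV eax, ebx/// the register itself

Here 123456 and esi are the memory addresses that MOV refers to, and what MOV cares about is the content at that address. The instruction therefore needs some way to state the kind of operand: an address given as an immediate value, an address held in a register, or the register itself.

The byte that makes this distinction is ModR/M, or more precisely 5 of its bits: the mod and r/m fields. The remaining 3 bits may serve as extra opcode bits. IA-32 has far more than the 256 instructions that a single byte can express, so some instructions share the same first byte and are told apart by the reg/opcode field of ModR/M.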
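
The three fields are plain bit slices of one byte, which a short C sketch can show. The type and function names here (modrm_fields, modrm_split, modrm_make) are hypothetical helpers written for this post, not part of any real library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the three fields of a ModR/M byte. */
typedef struct {
    uint8_t mod; /* bits 7..6: addressing mode */
    uint8_t reg; /* bits 5..3: register number or /digit opcode extension */
    uint8_t rm;  /* bits 2..0: register or memory form */
} modrm_fields;

/* Split a ModR/M byte into mod, reg, and r/m. */
static modrm_fields modrm_split(uint8_t byte) {
    modrm_fields f = {
        (uint8_t)(byte >> 6),
        (uint8_t)((byte >> 3) & 7),
        (uint8_t)(byte & 7),
    };
    return f;
}

/* Assemble a ModR/M byte from its three fields. */
static uint8_t modrm_make(uint8_t mod, uint8_t reg, uint8_t rm) {
    assert(mod < 4 && reg < 8 && rm < 8);
    return (uint8_t)((mod << 6) | (reg << 3) | rm);
}
```

For example, modrm_split(0x2D) yields mod = 0, reg = 5, r/m = 5, and modrm_make(0, 2, 0) yields 0x10, the byte that follows FF in CALL dword ptr [eax].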

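The CALL notation FF /2 means 0xFF followed by something described by /digit: 0xFF must be followed by a ModR/M byte whose reg/opcode field is 2. There are 32 byte values with reg/opcode = 2, and as explained for ModR/M, these 32 values stand for 32 different addressing forms. Which ones? The manual has a table for this (the image is not preserved here).

It is a very complicated table, so let us see how to read it. SIB is ignored for now.

First, the columns. Because the reg/opcode field can hold either an opcode extension or a register, the same value can mean different things in different instructions. In the table this shows up as several alternative headings on every column. The one we care about is the /digit (Opcode) heading: the column where it equals 2 (circled in red in the original figure) contains the bytes that must follow 0xFF for the CALL instruction.

The rows are the addressing modes. As the manual says, the mod and r/m fields, 5 bits in total, define 32 addressing modes.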

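0x10-0x17 correspond to register-indirect addressing. Take CALL dword ptr [eax]: the [eax] form corresponds to 0x10, so the machine code is FF 10. Likewise CALL dword ptr [ebx] is FF 13 and CALL dword ptr [esi] is FF 16; each of these is 2 bytes long. The exceptions in this group are 0x14, which requires a SIB byte, and 0x15, which is disp32 only. What about CALL word ptr [eax]? It cannot be encoded as plain FF 10 in 32-bit code; it would need the operand-size prefix 66 and is hardly ever wanted.

The column also contains a disp32 entry, which is FF 15 followed by 32 bits of data:

00020000 ff1510203040    call    dword ptr ds:[40302010h]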

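0x50-0x57 must be followed by a disp8, an 8-bit displacement, that is, one byte. This is base + 8-bit displacement addressing. For example, CALL dword ptr [eax+10h] is FF 50 10. Although the table writes this form as [eax] + disp8, it does not mean "take the value at the address in eax and then add disp8"; disp8 is added to eax first and the sum is used as the address, so writing [eax+disp8] is less likely to mislead. The same holds for disp32 below. Instructions of this kind are 3 bytes long.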

00020000 ff5130          call    dword ptr [ecx+30h]

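0x90-0x97 must be followed by a disp32, a 4-byte displacement. This is base + 32-bit displacement addressing. The displacement is stored low byte first, so CALL dword ptr [eax+12345h] is FF 90 45 23 01 00. Interestingly, CALL dword ptr [eax+10h] can also be written as FF 90 10 00 00 00; which binary form is produced is the assembler's choice. Instructions of this kind are 6 bytes long.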

00020000 ff9210203040    call    dword ptr [edx+40302010h]
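
The byte order of the displacement is easy to get wrong, so here is a small C sketch that emits the 6-byte base + disp32 form of CALL. The function name encode_call_base_disp32 is made up for this example, and the sketch deliberately rejects ESP as the base because that case needs a SIB byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Emit CALL dword ptr [base + disp32] as FF /2 with mod = 10b.
 * base_reg is the 3-bit register number (EAX = 0, ECX = 1, EDX = 2, ...).
 * Returns the number of bytes written, which is always 6.
 */
static size_t encode_call_base_disp32(uint8_t base_reg, uint32_t disp,
                                      uint8_t out[6]) {
    assert(base_reg < 8);
    assert(base_reg != 4); /* r/m = 100b selects a SIB byte, not ESP */
    out[0] = 0xFF;                                      /* opcode */
    out[1] = (uint8_t)((2u << 6) | (2u << 3) | base_reg); /* mod=10, reg=/2 */
    out[2] = (uint8_t)(disp & 0xFF);                    /* low byte first */
    out[3] = (uint8_t)((disp >> 8) & 0xFF);
    out[4] = (uint8_t)((disp >> 16) & 0xFF);
    out[5] = (uint8_t)((disp >> 24) & 0xFF);
    return 6;
}
```

Called with EDX (2) and 0x40302010 it produces FF 92 10 20 30 40, matching the listing above.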

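0xD0-0xD7 are the registers themselves. This group can name several classes of register, but CALL can only refer to general-purpose registers, so CALL eax is FF D0 and the infamous CALL esp is FF D4. Note that CALL eax and CALL [eax] are not the same.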

00020000 ffd0            call    eax
00020002 ff10            call    dword ptr [eax]

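You will have noticed by now that 0x14, 0x54, and 0x94 are the most complicated entries: here ModR/M alone is not enough to specify the addressing form and one more byte is needed. That byte is the SIB byte, which immediately follows ModR/M. The manual describes it as follows:

Certain ModR/M encodings require a second addressing byte, the SIB byte. The base-plus-index and scale-plus-index forms of 32-bit addressing require the SIB byte. The scale field specifies the scale factor, the index field specifies the register number of the index register, and the base field specifies the register number of the base register.

0x14, 0x54, and 0x94 are the "certain ModR/M encodings" meant here. The SIB byte that follows them expresses a more complex addressing form, typically seen in virtual function calls:

CALL dword ptr [ecx+4*eax]

This calls the eax-th virtual function in the vtable that ecx points to. Since the instruction has no displacement, the byte after FF is 0x14, and [ecx+4*eax] has to be expressed by the SIB byte. The addressing form defined by SIB is [Base + Index*Scale + disp].

In this instruction ecx is the Base, 4 is the Scale, and eax is the Index.

How are Base, Scale, and Index encoded? The manual has another (equally large) table for that (the image is not preserved here).

The columns are the Base and the rows are Index*Scale; for example, [ecx+4*eax] is 0x81.

According to this table, CALL dword ptr [ecx+4*eax] is FF 14 81. So for the 0x14 family the CALL instruction is 3 bytes long.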

00020000 ff1481          call    dword ptr [ecx+eax*4]
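
Instead of looking the value up in the table, the SIB byte can be computed from its three fields (scale in bits 7..6, index in bits 5..3, base in bits 2..0). The following C sketch shows this; sib_make is a name I made up, and the special cases (no index, base = 101b with mod = 00b) are only noted in comments, not handled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a SIB byte for [base + index*scale].
 * scale must be 1, 2, 4, or 8; index and base are 3-bit register numbers
 * (EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7).
 */
static uint8_t sib_make(unsigned scale, uint8_t index, uint8_t base) {
    uint8_t ss;
    switch (scale) {
    case 1: ss = 0; break;
    case 2: ss = 1; break;
    case 4: ss = 2; break;
    case 8: ss = 3; break;
    default: assert(!"scale must be 1, 2, 4, or 8"); return 0;
    }
    /* index = 100b means "no index register", so ESP cannot be an index. */
    assert(index < 8 && index != 4);
    /* Note: base = 101b combined with mod = 00b means disp32 with no base;
       this sketch does not model that case. */
    assert(base < 8);
    return (uint8_t)((ss << 6) | (index << 3) | base);
}
```

For [ecx+4*eax], sib_make(4, 0, 1) gives 10 000 001b = 0x81, which is the third byte of FF 14 81.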

0x54 carries an 8-bit displacement and corresponds to CALL dword ptr [ecx+4*eax+xx], encoded as FF 54 81 xx, 4 bytes in total.

00020000 ff548120        call    dword ptr [ecx+eax*4+20h]

Likewise, 0x94 carries a 32-bit displacement and corresponds to CALL dword ptr [ecx+4*eax+xxxxxxxx], encoded as FF 94 81 xx xx xx xx, 7 bytes in total.

00020000 ff948120304000  call    dword ptr [ecx+eax*4+403020h]

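/r

Indicates that the ModR/M byte of the instruction contains both a register operand and an r/m operand.

Opcode    Instruction      Description
89 /r     MOV r/m32,r32    Move r32 to r/m32

For example: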

00020000 8933            mov     dword ptr [ebx],esi

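cb, cw, cd, cp

A 1-byte (cb), 2-byte (cw), 4-byte (cd), or 6-byte (cp) value following the opcode. It specifies a code offset and possibly a new value for the code segment register. This is what a "call label" written in assembly usually turns into.

E8 cw means: the byte 0xE8 is followed by a 2-byte operand giving the offset of the call target from the address of the next instruction.
E8 cd means: the byte 0xE8 is followed by a 4-byte operand giving the offset of the call target from the address of the next instruction.
9A cp means: the byte 0x9A is followed by a 6-byte operand giving the target offset and the new code segment register value.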

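ib, iw, id

A 1-byte (ib), 2-byte (iw), or 4-byte (id) immediate operand of the instruction, following the opcode, ModR/M byte, or SIB byte. The opcode determines whether the operand is a signed value. All words and doublewords are given with the low-order byte first.

Opcode    Instruction      Description
14 ib     ADC AL,imm8      Add with carry imm8 to AL
15 iw     ADC AX,imm16     Add with carry imm16 to AX
15 id     ADC EAX,imm32    Add with carry imm32 to EAX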

00020000 1510203040      adc     eax,40302010h

+rb, +rw, +rd

A register code from 0 through 7, added to the hexadecimal byte given to the left of the plus sign to form a single opcode byte. For example:

Opcode    Instruction
FF /6     PUSH r/m32
50+rw     PUSH r16
50+rd     PUSH r32

The register codes are:

rb        rw        rd
AL = 0    AX = 0    EAX = 0
CL = 1    CX = 1    ECX = 1
DL = 2    DX = 2    EDX = 2
BL = 3    BX = 3    EBX = 3
AH = 4    SP = 4    ESP = 4
CH = 5    BP = 5    EBP = 5
DH = 6    SI = 6    ESI = 6
BH = 7    DI = 7    EDI = 7

00020000 50              push    eax
00020001 51              push    ecx
00020002 52              push    edx
00020003 53              push    ebx
00020004 54              push    esp
00020005 55              push    ebp
00020006 56              push    esi
00020007 57              push    edi
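
The +rd rule is a single addition, as this small C sketch shows. The enum reg32 and the function opcode_plus_reg are names invented for illustration; the register numbers follow the table above.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit register codes in encoding order (EAX = 0 ... EDI = 7). */
enum reg32 { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

/*
 * Form a "+rd" opcode byte: the register code is added to the base opcode.
 * The base opcode is expected to have its low 3 bits clear (e.g. 0x50).
 */
static uint8_t opcode_plus_reg(uint8_t base_opcode, unsigned reg) {
    assert(reg < 8);
    assert((base_opcode & 7) == 0);
    return (uint8_t)(base_opcode + reg);
}
```

opcode_plus_reg(0x50, ESP) gives 0x54, the push esp byte in the listing above.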

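+i

A number used in floating-point instructions when one of the operands is ST(i) from the FPU register stack. The number i (from 0 through 7) is added to the hexadecimal byte given to the left of the plus sign to form a single opcode byte.

Instruction column

rel: relative (rel8, rel16, rel32)

rel8: a relative address in the range from 128 bytes before the end of the instruction to 127 bytes after the end of the instruction.

rel16 and rel32: a relative address within the same code segment as the assembled instruction. The rel16 symbol applies to instructions with an operand-size attribute of 16 bits; the rel32 symbol applies to instructions with an operand-size attribute of 32 bits.

Opcode         Instruction    Description
77 cb          JA rel8        Jump short if above (CF=0 and ZF=0)
0F 8C cw/cd    JL rel16/32    Jump near if less (SF<>OF)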

00020000 7710            ja      00020012//20002+10 = 20012
00020002 0f8c10200057    jl      57022018//20008+57002010=57022018
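
The arithmetic in the two comments above is always the same: target = address of the instruction + its length + the signed displacement. A minimal C sketch, assuming a flat 32-bit address space (rel_target32 is a made-up name):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the target of a relative jump or call in 32-bit code.
 * insn_addr - address of the first byte of the instruction
 * insn_len  - length of the whole instruction in bytes
 * rel       - the signed rel8/rel32 displacement (rel8 already sign-extended)
 */
static uint32_t rel_target32(uint32_t insn_addr, uint32_t insn_len,
                             int32_t rel) {
    assert(insn_len > 0);
    /* Unsigned arithmetic wraps modulo 2^32, like the 32-bit EIP does. */
    return insn_addr + insn_len + (uint32_t)rel;
}
```

For the ja at 0x20000 (2 bytes, rel8 = 0x10) this gives 0x20012, and for the jl at 0x20002 (6 bytes, rel32 = 0x57002010) it gives 0x57022018. A displacement of -2 on a 2-byte jump targets the jump itself.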

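ptr16:16 and ptr16:32

A far pointer, typically to a code segment different from that of the instruction. The 16:16 notation indicates that the pointer value has two parts. The value to the left of the colon is a 16-bit selector, or the value destined for the code segment register. The value to the right of the colon is the offset within the destination segment. ptr16:16 is used when the operand-size attribute of the instruction is 16 bits; ptr16:32 is used when it is 32 bits.

Opcode    Instruction      Description
EA cd     JMP ptr16:16     Jump far, absolute, address given in operand
EA cp     JMP ptr16:32     Jump far, absolute, address given in operand
FF /5     JMP m16:16       Jump far, absolute indirect, address given in m16:16
FF /5     JMP m16:32       Jump far, absolute indirect, address given in m16:32

For comparison, the listing below is the near indirect form FF /4 (the ModR/M byte 25h has reg = 100B), not a far jump:

00020000 ff25d4924100    jmp     dword ptr ds:[4192D4h]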

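r (register)

r8 - One of the byte general-purpose registers AL, CL, DL, BL, AH, CH, DH, or BH.

r16 - One of the word general-purpose registers AX, CX, DX, BX, SP, BP, SI, or DI.

r32 - One of the doubleword general-purpose registers EAX, ECX, EDX, EBX, ESP, EBP, ESI, or EDI.

imm (immediate)

imm8 - An immediate byte value. The imm8 symbol is a signed number between -128 and +127 inclusive. For instructions in which imm8 is combined with a word or doubleword operand, the immediate value is sign-extended to form a word or doubleword: the upper bytes are filled with the topmost bit of the immediate value.

imm16 - An immediate word value used for instructions whose operand-size attribute is 16 bits. This is a number between -32,768 and +32,767 inclusive.

imm32 - An immediate doubleword value used for instructions whose operand-size attribute is 32 bits. It allows the use of a number between -2,147,483,648 and +2,147,483,647 inclusive.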
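
The sign extension of imm8 described above can be sketched in a few lines of C. The function name sign_extend_imm8 is mine; the XOR-and-subtract form is one portable way to do it and is not taken from the manual.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sign-extend an 8-bit immediate to 32 bits, as the processor does when an
 * imm8 is combined with a doubleword operand.
 * Flipping bit 7 and subtracting 0x80 maps 0x80..0xFF to -128..-1 and
 * leaves 0x00..0x7F unchanged, without relying on implementation-defined
 * narrowing conversions.
 */
static int32_t sign_extend_imm8(uint8_t imm8) {
    return (int32_t)(imm8 ^ 0x80) - 0x80;
}
```

So an imm8 byte of FF becomes -1 (FFFFFFFFh as a doubleword), while 7F stays 127.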

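r/m

r/m8 - A byte operand that is either the contents of a byte general-purpose register (AL, BL, CL, DL, AH, BH, CH, or DH) or a byte from memory.

r/m16 - A word general-purpose register or memory operand used for instructions whose operand-size attribute is 16 bits. The word general-purpose registers are AX, BX, CX, DX, SP, BP, SI, and DI. The contents of memory are found at the address provided by the effective address computation.

r/m32 - A doubleword general-purpose register or memory operand used for instructions whose operand-size attribute is 32 bits. The doubleword general-purpose registers are EAX, EBX, ECX, EDX, ESP, EBP, ESI, and EDI. The contents of memory are found at the address provided by the effective address computation.

One point deserves special attention:

Opcode    Instruction      Description
89 /r     MOV r/m32,r32    Move r32 to r/m32
8B /r     MOV r32,r/m32    Move r/m32 to r32

Take, for example:

MOV ecx,edx

There are two ways to read it:

With 89, r32 is the source operand: r32 = edx, r/m32 = ecx.

With 8B, r32 is the destination operand: r32 = ecx, r/m32 = edx.

So both of the following encodings are valid: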

00020000 89d1            mov     ecx,edx
00020002 8bca            mov     ecx,edx
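
Both encodings follow mechanically from where the two registers land in the ModR/M byte. The C sketch below produces the pair for any register-to-register MOV; encode_mov_r32_r32 is a hypothetical helper written for this post.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode MOV dst, src (both 32-bit registers) in its two equivalent forms.
 * dst and src are 3-bit register numbers (EAX = 0, ECX = 1, EDX = 2, ...).
 * via_89: opcode 89 /r, MOV r/m32,r32 -> reg = src, r/m = dst
 * via_8b: opcode 8B /r, MOV r32,r/m32 -> reg = dst, r/m = src
 */
static void encode_mov_r32_r32(uint8_t dst, uint8_t src,
                               uint8_t via_89[2], uint8_t via_8b[2]) {
    assert(dst < 8 && src < 8);
    via_89[0] = 0x89;
    via_89[1] = (uint8_t)(0xC0 | (src << 3) | dst); /* mod = 11b */
    via_8b[0] = 0x8B;
    via_8b[1] = (uint8_t)(0xC0 | (dst << 3) | src); /* mod = 11b */
}
```

With dst = ECX (1) and src = EDX (2) it yields 89 D1 and 8B CA, the two lines in the listing above.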

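m (memory operand)

m - A 16- or 32-bit operand in memory.

m8 - A byte operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. This nomenclature is used only with the string instructions and the XLAT instruction.

m16 - A word operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. This nomenclature is used only with the string instructions.

m32 - A doubleword operand in memory, usually expressed as a variable or array name, but pointed to by the DS:(E)SI or ES:(E)DI registers. This nomenclature is used only with the string instructions.

m64 - A quadword operand in memory. This nomenclature is used only with the CMPXCHG8B instruction.

m128 - A double quadword operand in memory. This nomenclature is used only with the Streaming SIMD Extensions.

m16:16, m16:32 - A memory operand containing a far pointer composed of two numbers. The number to the left of the colon corresponds to the pointer's segment selector. The number to the right corresponds to its offset.

m16&32, m16&16, m32&32 - A memory operand consisting of data item pairs whose sizes are indicated on the left and the right side of the ampersand (&). All memory addressing modes are allowed. The m16&16 and m32&32 operands are used by the BOUND instruction to provide an operand containing the upper and lower bounds for array indices. The m16&32 operand is used by LIDT and LGDT to provide a word with which to load the limit field, and a doubleword with which to load the base field of the corresponding GDTR and IDTR registers.

moffs8, moffs16, moffs32 - A simple memory variable (memory offset) of type byte, word, or doubleword used by some variants of the MOV instruction. The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. The number shown with moffs indicates its size, which is determined by the address-size attribute of the instruction.

sreg (segment register)

The segment register bit assignments are ES=0, CS=1, SS=2, DS=3, FS=4, and GS=5.

Other

m32real, m64real, m80real - A single-, double-, and extended-real floating-point operand in memory, respectively.

m16int, m32int, m64int - A word, short, and long integer operand in memory, respectively (integers used as operands by the x87 FPU integer instructions).

ST or ST(0) - The top element of the FPU register stack.

ST(i) - The i-th element from the top of the FPU register stack (i = 0 through 7).

mm - An MMX™ technology register. The 64-bit MMX registers are MM0 through MM7.

mm/m32 - The low-order 32 bits of an MMX register, or a 32-bit memory operand. The 64-bit MMX registers are MM0 through MM7. The contents of memory are found at the address provided by the effective address computation.

mm/m64 - An MMX register, or a 64-bit memory operand. The 64-bit MMX registers are MM0 through MM7. The contents of memory are found at the address provided by the effective address computation.

xmm - An XMM register. The 128-bit XMM registers are XMM0 through XMM7.

xmm/m32 - An XMM register, or a 32-bit memory operand. The 128-bit XMM registers are XMM0 through XMM7. The contents of memory are found at the address provided by the effective address computation.

xmm/m64 - An XMM register, or a 64-bit memory operand. The 128-bit SIMD floating-point registers are XMM0 through XMM7. The contents of memory are found at the address provided by the effective address computation.

xmm/m128 - An XMM register, or a 128-bit memory operand. The 128-bit XMM registers are XMM0 through XMM7. The contents of memory are found at the address provided by the effective address computation.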

References

How many ways are there to write the CALL instruction

https://cloud.tencent.com/developer/article/1647526

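What the .ncb, .clw, and .aps files do in VC++

The .clw file records class information. If a class disappears from ClassView, regenerate this file: delete it, open ClassWizard, and enter the project name when prompted.

The .ncb file records the member-completion information for classes. If the prompts for a class's member functions and variables disappear, regenerate this file in the same way.

The .aps file records resource information. Reusing existing resources means changing three files: the .rc file, Resource.h, and the .aps file. The .aps file can simply be deleted; VC regenerates it automatically when the project is opened.

What are the *.clw, *.ncb, *.opt, and *.aps files in VS for?

After VS creates a project, files with the suffixes *.clw, *.ncb, *.opt, and *.aps usually appear.

.CLW is the VC Class Wizard information file; it stores the Class Wizard data.
.NCB is the parser information file, generated automatically by the system.
.OPT is the IDE's options file.
.APS is the binary version of the resource file.

Some others:
.bsc  browser information file
.dsp  project file
.dsw  workspace file
.mak  external makefile
.plg  build log file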

Original article: https://www.cnblogs.com/Chary/p/15540376.html