Building Your Own Function Call Frame

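This article starts from the simplest C program, one that prints "hello world!", and rewrites it in assembly (calling C library functions from assembly) to show how to build your own function call frame. It then replaces the call instruction with jmp to handle both the transfer into the function and the return from it. The Linux kernel uses this technique heavily, and the article closes with two examples from the kernel.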

First, the C program below is the example that tutorials for most languages open with:

//helloworld1.c

#include <stdio.h>

int main()
{
        printf("hello world!\n");
        return 0;
}

Compiling it with gcc and running the resulting executable gives:

[guohl@guohl]$ gcc -o helloworld1 helloworld1.c
[guohl@guohl]$ ./helloworld1
hello world!

Next we rewrite this C program in assembly. The printf and exit functions still come from the C library, so the result mixes assembly and C:

#helloworld2.s

.section .data
output:
        .asciz "hello world!\n"

.section .text
.globl _start
_start:
        pushl $output	# pass the argument on the stack
        call printf	# call printf from the C library
        addl $4, %esp	# restore the stack pointer
        pushl $0	# this line and the next perform exit(0)
        call exit

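Here we set up the stack frames used by the calls to printf and exit ourselves. Both functions take their arguments on the stack, so we push the arguments before each call. After a function returns, we restore the stack to the state it had before the call (the addl $4, %esp after printf; exit never returns, so nothing needs restoring there).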

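Assemble with as, link with ld, and run. The commands below assume a 32-bit x86 toolchain and C library; on an x86-64 host, as --32 and ld -m elf_i386 would be needed as well: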

[guohl@guohl]$ as -o helloworld2.o helloworld2.s
[guohl@guohl]$ ld -dynamic-linker /lib/ld-linux.so.2 -o helloworld2 -lc helloworld2.o
[guohl@guohl]$ ./helloworld2
hello world!

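In this program the call instruction actually does two things: it pushes the address of the next instruction onto the stack, and it sets the instruction pointer to the entry of the called function. Execution then continues inside the called function. When the function returns, its ret instruction pops the saved address back into the instruction pointer.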
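The two steps can be modelled in a few lines of C. This is only a sketch of the bookkeeping, not of a real CPU: the struct cpu type, the helper names, and the addresses used are hypothetical, and CALL_LEN assumes the 5-byte near relative encoding of call (opcode E8 plus a 32-bit displacement).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16
#define CALL_LEN 5u   /* assumed: near relative call, E8 + rel32 */

/* Hypothetical model of the state that call/jmp/ret touch. */
struct cpu {
    uint32_t eip;                 /* address of the instruction being executed */
    int sp;                       /* index of the top of stack; grows downward */
    uint32_t stack[STACK_WORDS];
};

static void cpu_init(struct cpu *c, uint32_t eip)
{
    c->eip = eip;
    c->sp = STACK_WORDS;          /* empty stack */
}

static void push(struct cpu *c, uint32_t value)
{
    assert(c->sp > 0);            /* guard against overflow in the model */
    c->stack[--c->sp] = value;
}

static uint32_t pop(struct cpu *c)
{
    assert(c->sp < STACK_WORDS);  /* guard against popping an empty stack */
    return c->stack[c->sp++];
}

/* call: push the address of the following instruction, then jump. */
static void do_call(struct cpu *c, uint32_t target)
{
    push(c, c->eip + CALL_LEN);
    c->eip = target;
}

/* jmp: only the instruction pointer changes. */
static void do_jmp(struct cpu *c, uint32_t target)
{
    c->eip = target;
}

/* ret: pop the top of the stack into the instruction pointer. */
static void do_ret(struct cpu *c)
{
    c->eip = pop(c);
}

/* Where execution resumes after "call callee" at call_site, once callee rets. */
uint32_t resume_after_call(uint32_t call_site, uint32_t callee)
{
    struct cpu c;
    cpu_init(&c, call_site);
    do_call(&c, callee);
    do_ret(&c);
    return c.eip;
}

/* Where execution resumes after "push return_addr; jmp callee", once callee rets. */
uint32_t resume_after_push_jmp(uint32_t return_addr, uint32_t callee)
{
    struct cpu c;
    cpu_init(&c, 0);
    push(&c, return_addr);
    do_jmp(&c, callee);
    do_ret(&c);
    return c.eip;
}
```

In this model, resume_after_call(0x1000, 0x2000) and resume_after_push_jmp(0x1005, 0x2000) both yield 0x1005: pushing the address by hand and jumping leaves ret with exactly the same value on top of the stack as call does.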

Now let us spell out by hand the work that call does:

#helloworld3.s

.section .data
output:
        .asciz "hello world!\n"

.section .text
.globl _start
_start:
        pushl $output
        pushl $1f	# push the address of label 1
        jmp printf	# jmp, not call, to the entry of printf
1:
        addl $4, %esp
        pushl $0
        call exit

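Because this version uses jmp rather than call, the pushl $1f instruction is essential. Without it, when printf returns, ret would pop the top of the stack into the instruction pointer (eip). In this experiment that value would be the address of output, so execution would jump into the data section and the program would die with a segmentation fault. We therefore have to push the address the function should return to ourselves, which in this program is the address of label 1.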

Assemble, link, and run it the same way as in the previous experiment:

[guohl@guohl]$ as -o helloworld3.o helloworld3.s
[guohl@guohl]$ ld -dynamic-linker /lib/ld-linux.so.2 -o helloworld3 -lc helloworld3.o
[guohl@guohl]$ ./helloworld3
hello world!

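You may wonder why we would use two instructions, push and jmp, for something a single call handles. Suppose, though, that we do not want the function to return to the instruction after the call, but to a piece of code of our own choosing. How can that be done? Push the address of that code first, then enter the function with jmp instead of call. When the function returns, execution lands in the code we chose.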
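To make the difference concrete, here is a toy interpreter in C that runs two tiny programs, one entering a function with call and one with push plus jmp. It is a sketch under stated assumptions: the instruction set (OP_PUSH, OP_JMP, OP_CALL, OP_RET, OP_MARK, OP_HALT), the program layouts, and the function last_mark are all invented for illustration, and addresses are simply array indices.

```c
#include <assert.h>

/* Hypothetical instruction set; addresses are indices into the program. */
enum op { OP_PUSH, OP_JMP, OP_CALL, OP_RET, OP_MARK, OP_HALT };

struct insn {
    enum op op;
    int arg;
};

#define MAX_STACK 8

/* Same function body at index 5 in both programs; handler at index 3. */
const struct insn with_call[] = {
    [0] = { OP_CALL, 5 },   /* call func */
    [1] = { OP_MARK, 1 },   /* the instruction after the call */
    [2] = { OP_HALT, 0 },
    [3] = { OP_MARK, 2 },   /* handler: not reached in this program */
    [4] = { OP_HALT, 0 },
    [5] = { OP_MARK, 9 },   /* func body */
    [6] = { OP_RET, 0 },
};

const struct insn with_push_jmp[] = {
    [0] = { OP_PUSH, 3 },   /* push the address of the handler */
    [1] = { OP_JMP, 5 },    /* jmp func, not call */
    [2] = { OP_MARK, 1 },   /* not reached: nothing returns here */
    [3] = { OP_MARK, 2 },   /* handler: func's ret lands here */
    [4] = { OP_HALT, 0 },
    [5] = { OP_MARK, 9 },   /* func body */
    [6] = { OP_RET, 0 },
};

/* Runs prog from index 0 until OP_HALT and returns the argument of the
 * last OP_MARK executed (-1 if none was executed). */
int last_mark(const struct insn *prog, int len)
{
    int stack[MAX_STACK];
    int sp = 0;     /* number of values on the stack */
    int pc = 0;
    int last = -1;

    for (;;) {
        assert(pc >= 0 && pc < len);
        struct insn i = prog[pc];
        switch (i.op) {
        case OP_PUSH:
            assert(sp < MAX_STACK);
            stack[sp++] = i.arg;
            pc++;
            break;
        case OP_JMP:
            pc = i.arg;
            break;
        case OP_CALL:               /* push the return address, then jump */
            assert(sp < MAX_STACK);
            stack[sp++] = pc + 1;
            pc = i.arg;
            break;
        case OP_RET:                /* pop whatever is on top into pc */
            assert(sp > 0);
            pc = stack[--sp];
            break;
        case OP_MARK:
            last = i.arg;
            pc++;
            break;
        case OP_HALT:
            return last;
        }
    }
}
```

With these programs, last_mark(with_call, 7) is 1 (execution came back to the instruction after the call), while last_mark(with_push_jmp, 7) is 2 (the same ret landed in the handler). The function body is identical in both; only what the caller left on the stack differs.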

Here is an example from the kernel that uses exactly this technique:

#define switch_to(prev, next, last)					\
do {									\
	/*								\
	 * Context-switching clobbers all registers, so we clobber	\
	 * them explicitly, via unused output variables.		\
	 * (EAX and EBP is not listed because EBP is saved/restored	\
	 * explicitly for wchan access and EAX is the return value of	\
	 * __switch_to())						\
	 */								\
	unsigned long ebx, ecx, edx, esi, edi;				\
									\
	asm volatile("pushfl\n\t"		/* save    flags */	\
		     "pushl %%ebp\n\t"		/* save    EBP   */	\
		     "movl %%esp,%[prev_sp]\n\t"	/* save    ESP   */ \
		     "movl %[next_sp],%%esp\n\t"	/* restore ESP   */ \
		     "movl $1f,%[prev_ip]\n\t"	/* save    EIP   */	\
		     "pushl %[next_ip]\n\t"	/* restore EIP   */	\
		     __switch_canary					\
		     "jmp __switch_to\n"	/* regparm call  */	\
		     "1:\t"						\
		     "popl %%ebp\n\t"		/* restore EBP   */	\
		     "popfl\n"			/* restore flags */	\
									\
		     /* output parameters */				\
		     : [prev_sp] "=m" (prev->thread.sp),		\
		       [prev_ip] "=m" (prev->thread.ip),		\
		       "=a" (last),					\
									\
		       /* clobbered output registers: */		\
		       "=b" (ebx), "=c" (ecx), "=d" (edx),		\
		       "=S" (esi), "=D" (edi)				\
		       							\
		       __switch_canary_oparam				\
									\
		       /* input parameters: */				\
		     : [next_sp]  "m" (next->thread.sp),		\
		       [next_ip]  "m" (next->thread.ip),		\
		       							\
		       /* regparm parameters for __switch_to(): */	\
		       [prev]     "a" (prev),				\
		       [next]     "d" (next)				\
									\
		       __switch_canary_iparam				\
									\
		     : /* reloaded segment registers */			\
			"memory");					\
} while (0)

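This is the core code of a process switch. The switch itself is not covered in detail here; see my slides at http://wenku.baidu.com/view/f9a17542b307e87101f6968d.html?st=1. The lines to focus on are lines 17 and 19 of the macro, pushl %[next_ip] and jmp __switch_to. Line 17 pushes the ip at which the incoming process next should resume. That ip was saved on line 16 (movl $1f,%[prev_ip]) when next itself ran switch_to just before being switched out, so it is the address of label 1. Line 19 jumps to __switch_to, and when __switch_to returns, next resumes from where it was last switched out, namely label 1. In that scenario, lines 17 and 19 could simply be replaced by a single "call __switch_to".

The crucial case is a process newly created by fork and scheduled for the first time. It has never executed switch_to, so switching to it must not resume at label 1; the process should instead return from the sys_fork system call. That address is assigned in copy_thread, reached through sys_fork->do_fork->copy_process->copy_thread: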

p->thread.ip = (unsigned long) ret_from_fork;

So when switching to a new process, line 17 pushes not the address of label 1 but the address of ret_from_fork. The design is very clever, and it is also necessary.
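The way the saved ip decides where a task resumes can be illustrated with a small C model. This is not kernel code: struct task, the resume_point enum, and the model_ functions are hypothetical stand-ins that track only the ip field and ignore stacks, registers, and everything else switch_to deals with.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two addresses a task can resume at. */
enum resume_point {
    RESUME_LABEL_1,         /* label 1 inside switch_to */
    RESUME_RET_FROM_FORK    /* ret_from_fork */
};

/* Only the field that matters for this illustration. */
struct task {
    enum resume_point ip;   /* models thread.ip */
};

/* Models the assignment in copy_thread(): a new task has never run
 * switch_to, so its first resume point is ret_from_fork. */
void model_copy_thread(struct task *child)
{
    child->ip = RESUME_RET_FROM_FORK;
}

/* Models the ip handling in switch_to and returns where next resumes. */
enum resume_point model_switch_to(struct task *prev, struct task *next)
{
    prev->ip = RESUME_LABEL_1;  /* movl $1f,%[prev_ip] */
    return next->ip;            /* pushl %[next_ip]; jmp __switch_to; its ret */
}

/* Switches back and forth between a parent and a freshly forked child and
 * returns where the child resumes the n-th time it is scheduled (n >= 1). */
enum resume_point child_resume_point(int n)
{
    struct task parent = { RESUME_LABEL_1 };
    struct task child;
    enum resume_point where = RESUME_LABEL_1;

    assert(n >= 1);
    model_copy_thread(&child);
    for (int i = 0; i < n; i++) {
        where = model_switch_to(&parent, &child);   /* parent -> child */
        (void)model_switch_to(&child, &parent);     /* child -> parent */
    }
    return where;
}
```

In this model, child_resume_point(1) is RESUME_RET_FROM_FORK and child_resume_point(2) is RESUME_LABEL_1: the first switch into the child consumes the value written by copy_thread, and every later switch consumes the label 1 address that the child saved when it was switched out. A plain call could only ever produce the second behaviour.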