How ADBI, an Android hook framework, hooks functions in shared libraries (.so)

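The previous post, "Android 5 HOOK technique research: the ADBI project, part 02", analyzed hijack.c. That file compiles into the executable hijack, which injects a shared library into a target process. This post continues through the rest of the adbi source and addresses the remaining problem: actually replacing a function inside the target process.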

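Before starting, look at an example that ships with adbi. It replaces the target process's epoll_wait with a custom implementation:

First, the example's custom epoll_wait implementations. There are two of them: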

int my_epoll_wait_arm(int epfd, struct epoll_event *events, int maxevents, int timeout)
{
        return my_epoll_wait(epfd, events, maxevents, timeout);
}
int my_epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout)
{
        int (*orig_epoll_wait)(int epfd, struct epoll_event *events, int maxevents, int timeout);
        orig_epoll_wait = (void*)eph.orig;

        hook_precall(&eph);
        int res = orig_epoll_wait(epfd, events, maxevents, timeout);
        if (counter) {
                hook_postcall(&eph);
                log("epoll_wait() called\n");
                counter--;
                if (!counter)
                        log("removing hook for epoll_wait()\n");
        }

        return res;
}
hook(&eph, getpid(), "libc.", "epoll_wait", my_epoll_wait_arm, my_epoll_wait);

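my_epoll_wait_arm is the hook function used when the target is ARM-mode code,
my_epoll_wait is the hook function used when the target is Thumb-mode code.
The hook() call replaces libc's epoll_wait with one of these two functions and records the key information in the eph structure.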

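my_epoll_wait performs three main steps: hook_precall, orig_epoll_wait, hook_postcall.
hook_precall writes the original epoll_wait instructions saved in eph back over the patched function entry, orig_epoll_wait then calls the restored epoll_wait, and finally hook_postcall overwrites the entry of epoll_wait with the jump to my_epoll_wait again. Inline hooks generally follow this pattern: it lets the hook function call the original function, so extra work can be layered on top of it. In this example hook_postcall only runs while counter is non-zero, so once counter has dropped to 0 the next call leaves the original function restored.
Here eph is a struct hook_t, which holds the information about one hook: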

struct hook_t {
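        unsigned int jump[3]; // ARM, 12 bytes: instructions that jump to the hook function
        unsigned int store[3]; // ARM, 12 bytes: saved original instructions, written back to restore the target function
        unsigned char jumpt[20]; // Thumb, 20 bytes: instructions that jump to the hook function
        unsigned char storet[20]; // Thumb, 20 bytes: saved original instructions, written back to restore the target function
        unsigned int orig; // address of the target function; the code at this address is, at any moment, one of the four buffers above
        unsigned int patch; // address of the hook function
        unsigned char thumb; // 1 means the target is Thumb code, 0 means ARM
        unsigned char name[128]; // name of the hooked target function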
        void *data;
};

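The rest of this post walks through the hijacking of a target-process function step by step. The first problem is finding the address of the target function inside the target process.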

int hook(struct hook_t *h, int pid, char *libname, char *funcname, void *hook_arm, void *hook_thumb)
{
        unsigned long int addr;
        int i;

        if (find_name(pid, funcname, libname, &addr) < 0) {
                log("can't find: %s\n", funcname)
                return 0;
        }
        ...
}

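This problem already came up in part 02 when analyzing how the .so is injected into the target process (which required finding dlopen and mprotect inside the target process). hijack.c solves it by parsing the maps files of both the current process and the target process to obtain the load base of libdl.so in each. It then calls dlsym to get the address of dlopen in the current process and subtracts the local libdl.so base to obtain an offset. Adding that offset to the libdl.so base in the target process gives the real address of dlopen there. This relies on the tracer and the tracee loading the same libdl.so, so a given symbol sits at the same offset inside the library in both.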
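The offset arithmetic described above can be sketched in a few lines. This is only an illustration, not code from hijack.c: maps_line_base and remote_symbol_addr are hypothetical helper names, and the maps line is assumed to have the usual "start-end perms ... path" layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Extract the start address from one /proc/<pid>/maps line whose text
 * contains `lib`. Returns 0 on success, -1 if the line does not match. */
static int maps_line_base(const char *line, const char *lib, uintptr_t *base)
{
    unsigned long start, end;

    if (strstr(line, lib) == NULL)
        return -1;
    if (sscanf(line, "%lx-%lx", &start, &end) != 2)
        return -1;
    *base = (uintptr_t)start;
    return 0;
}

/* Both processes map the same library file, so a symbol keeps the same
 * offset from the load base; only the base itself differs. */
static uintptr_t remote_symbol_addr(uintptr_t local_base, uintptr_t local_sym,
                                    uintptr_t remote_base)
{
    assert(local_sym >= local_base);
    return remote_base + (local_sym - local_base);
}
```

In a real tracer, local_sym would come from dlsym and the two bases from the first matching maps line of each process.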

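What if the function to hook is not in a shared library but statically linked into the tracee? The approach above no longer works, so how is the target address obtained then?

The adbi author provides another approach that can handle this. find_name as written still resolves functions inside shared libraries, but with small changes the same implementation can also resolve statically linked functions: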

int find_name(pid_t pid, char *name, char *libn, unsigned long *addr)
{
        struct mm mm[1000];
        unsigned long libcaddr;
        int nmm;
        char libc[1024];
        symtab_t s;

        if (0 > load_memmap(pid, mm, &nmm)) { // load_memmap parses the maps file into an array of memory regions
                log("cannot read memory map\n")
                return -1;
        }
        if (0 > find_libname(libn, libc, sizeof(libc), &libcaddr, mm, nmm)) { // find the load base of the .so in that array by name
                log("cannot find lib: %s\n", libn)
                return -1;
        }
        //log("lib: >%s<\n", libc)
        s = load_symtab(libc); // load the .so file and parse out its symbol table
        if (!s) {
                log("cannot read symbol table\n");
                return -1;
        }
        if (0 > lookup_func_sym(s, name, addr)) { // symbol-table value of the target function, i.e. its offset from the .so load base
                log("cannot find function: %s\n", name);
                return -1;
        }
        *addr += libcaddr; // .so base from maps + symbol value from the .so = address in the target process
        return 0;
}

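load_symtab and lookup_func_sym did not appear in part 02. load_symtab parses the .so file as ELF (on Linux, executables, shared libraries and .o files are all ELF files, and Android sits on a Linux kernel) and returns the file's symbol table. lookup_func_sym walks the symbol table by function name and returns the address recorded for that symbol (function). For a .so this address is computed as if the library were loaded at address 0, so it is really an offset. Adding the load base of the .so from the target process's maps file gives the function's address in the target process.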
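The lookup-then-rebase step can be sketched as follows. This is not adbi's load_symtab or lookup_func_sym: struct sym32 is a hypothetical, cut-down stand-in for an ELF symbol entry (only the name offset and the value), and the symbol and string tables are assumed to be already loaded into memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified symbol entry: the two fields of an ELF symbol
 * that matter for this lookup. */
struct sym32 {
    uint32_t st_name;   /* offset of the name inside the string table */
    uint32_t st_value;  /* for a .so: offset from the load base */
};

/* Linear search by name. Returns 0 and fills *value on success, -1 if
 * the name is not in the table. */
static int lookup_sym(const struct sym32 *syms, size_t nsyms,
                      const char *strtab, const char *name, uint32_t *value)
{
    assert(syms != NULL && strtab != NULL && name != NULL && value != NULL);
    for (size_t i = 0; i < nsyms; i++) {
        if (strcmp(strtab + syms[i].st_name, name) == 0) {
            *value = syms[i].st_value;
            return 0;
        }
    }
    return -1;
}

/* Shared library: runtime address = load base + symbol value.
 * For an image linked at a fixed address, pass base = 0. */
static uint32_t runtime_addr(uint32_t base, uint32_t st_value)
{
    return base + st_value;
}
```

The same two functions cover both cases discussed here: the only thing that changes between a .so and a fixed-address executable is the base that gets added.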

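If the function to hook is not in a .so, use load_symtab to load the target process's executable and read its symbol table. The address returned by lookup_func_sym is then already the real address (for an executable linked at a fixed address; a position-independent executable needs its load base added, just like a .so).

If the target's executable, or a .so it loads, has had its symbols stripped (so the ELF file no longer contains the symbol table), how can the function address be obtained?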

==========

Once the address of the function to hook is known, hook fills in the hook_t structure and overwrites the code at the original address:

int hook(struct hook_t *h, int pid, char *libname, char *funcname, void *hook_arm, void *hook_thumb)
{
        ...

        log("hooking:   %s = 0x%x ", funcname, addr)
        strncpy(h->name, funcname, sizeof(h->name)-1); // name of the hooked function

        if (addr % 4 == 0) { // ARM mode
                log("ARM using 0x%x\n", hook_arm)
                h->thumb = 0;
                h->patch = (unsigned int)hook_arm; // address of the hook function goes into h->patch
                h->orig = addr; // address of the hooked function in the target process goes into h->orig
                h->jump[0] = 0xe59ff000; // LDR pc, [pc, #0]; h->jump holds the hook instructions
                h->jump[1] = h->patch; // padding word (bytes 4..7), never loaded
                h->jump[2] = h->patch; // hook function address (bytes 8..11), loaded into pc
                for (i = 0; i < 3; i++)
                        h->store[i] = ((int*)h->orig)[i]; // the hook overwrites 12 bytes, so save the original 12 bytes in h->store for unhook
                for (i = 0; i < 3; i++)
                        ((int*)h->orig)[i] = h->jump[i]; // overwrite the old instructions with the 12 new bytes
        }
        else {
                if ((unsigned long int)hook_thumb % 4 == 0) // Thumb mode: a Thumb function pointer is expected to have bit 0 set
                        log("warning hook is not thumb 0x%x\n", hook_thumb)
                h->thumb = 1;
                log("THUMB using 0x%x\n", hook_thumb)
                h->patch = (unsigned int)hook_thumb; // address of the hook function goes into h->patch
                h->orig = addr; // original address goes into h->orig
                h->jumpt[1] = 0xb4; // jumpt holds the 20 bytes that implement the hook
                h->jumpt[0] = 0x30; // push {r4,r5}
                h->jumpt[3] = 0xa5;
                h->jumpt[2] = 0x03; // add r5, pc, #12
                h->jumpt[5] = 0x68;
                h->jumpt[4] = 0x2d; // ldr r5, [r5]
                h->jumpt[7] = 0xb0;
                h->jumpt[6] = 0x02; // add sp,sp,#8
                h->jumpt[9] = 0xb4;
                h->jumpt[8] = 0x20; // push {r5}
                h->jumpt[11] = 0xb0;
                h->jumpt[10] = 0x81; // sub sp,sp,#4
                h->jumpt[13] = 0xbd;
                h->jumpt[12] = 0x20; // pop {r5, pc}
                h->jumpt[15] = 0x46;
                h->jumpt[14] = 0xaf; // mov pc, r5 ; just to pad to 4 byte boundary
                memcpy(&h->jumpt[16], (unsigned char*)&h->patch, sizeof(unsigned int)); // hook function address goes into bytes 16..19 of jumpt
                unsigned int orig = addr - 1; // sub 1 to get real address
                for (i = 0; i < 20; i++) {
                        h->storet[i] = ((unsigned char*)orig)[i]; // save the same number (20) of original bytes in h->storet
                        //log("%0.2x ", h->storet[i])
                }
                //log("\n")
                for (i = 0; i < 20; i++) {
                        ((unsigned char*)orig)[i] = h->jumpt[i]; // overwrite the old instructions with the new ones
                        //log("%0.2x ", ((unsigned char*)orig)[i])
                }
        }
        hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig+sizeof(h->jumpt)); // flush the cache
        return 1;
}

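As the comments show, hook has two branches, one for ARM code and one for Thumb code. The ARM branch replaces 12 bytes of instructions and the Thumb branch replaces 20. The original instructions are saved in the store/storet field of hook_t, the newly built instructions are kept in jump/jumpt, and orig holds the address of the original function. Installing the hook overwrites the memory at orig with jump/jumpt for the corresponding length, so a call to the function at orig runs the jump/jumpt instructions and lands in the hook function. Removing the hook overwrites the same memory with store/storet, so the code there is identical to the original and calling it runs the original function.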

void unhook(struct hook_t *h)
{
        log("unhooking %s = %x  hook = %x ", h->name, h->orig, h->patch)
        hook_precall(h);
}

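unhook simply calls hook_precall. As seen in the analysis of my_epoll_wait, the hook function also calls hook_precall and hook_postcall so that it can call the original function from inside the hook. Their implementations are: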

void hook_precall(struct hook_t *h)
{
        int i;

        if (h->thumb) {
                unsigned int orig = h->orig - 1;
                for (i = 0; i < 20; i++) {
                        ((unsigned char*)orig)[i] = h->storet[i];
                }
        }
        else {
                for (i = 0; i < 3; i++)
                        ((int*)h->orig)[i] = h->store[i];
        }
        hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig+sizeof(h->jumpt));
}

void hook_postcall(struct hook_t *h)
{
        int i;

        if (h->thumb) {
                unsigned int orig = h->orig - 1;
                for (i = 0; i < 20; i++)
                        ((unsigned char*)orig)[i] = h->jumpt[i];
        }
        else {
                for (i = 0; i < 3; i++)
                        ((int*)h->orig)[i] = h->jump[i];
        }
        hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig+sizeof(h->jumpt));
}

The implementations are straightforward and follow the description above, so they are not repeated here.
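The restore / re-patch cycle can be tried out on an ordinary byte buffer, without touching real code. The sketch below is a host-side simulation under that assumption: struct sim_hook and the sim_* functions are hypothetical stand-ins for hook_t, hook, hook_precall and hook_postcall, only the 12-byte ARM case is modeled, and the cache flush is left out because nothing here is executed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PATCH_LEN 12  /* ARM case: three 32-bit words */

/* Hypothetical stand-in for hook_t: the target plus the two byte images. */
struct sim_hook {
    uint8_t *target;            /* start of the "function" being patched */
    uint8_t jump[PATCH_LEN];    /* bytes that divert to the hook */
    uint8_t store[PATCH_LEN];   /* saved original bytes */
};

/* Like hook(): save the original bytes, then overwrite them. */
static void sim_install(struct sim_hook *h, uint8_t *target, const uint8_t *jump)
{
    assert(h != NULL && target != NULL && jump != NULL);
    h->target = target;
    memcpy(h->jump, jump, PATCH_LEN);
    memcpy(h->store, target, PATCH_LEN);
    memcpy(target, h->jump, PATCH_LEN);
}

/* Like hook_precall(): put the original bytes back. */
static void sim_precall(struct sim_hook *h)
{
    memcpy(h->target, h->store, PATCH_LEN);
}

/* Like hook_postcall(): write the diverting bytes again. */
static void sim_postcall(struct sim_hook *h)
{
    memcpy(h->target, h->jump, PATCH_LEN);
}

static int sim_is_hooked(const struct sim_hook *h)
{
    return memcmp(h->target, h->jump, PATCH_LEN) == 0;
}
```

Between sim_precall and sim_postcall the buffer holds the original bytes again. In the real framework this is the window in which the original function can be called, and also a window in which a call from another thread would presumably run unhooked.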

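The hooking above is essentially a memory store (((unsigned char*)orig)[i] = h->jumpt[i];) whose destination orig lies in the code region. That the store has completed does not mean a call into the code at orig immediately executes the newly written instructions: the CPU fetches instructions through its instruction cache, so the new bytes may still sit in the data cache while a stale copy remains in the instruction cache. adbi therefore adds a cache flush after every such store to the target address, which forces the instructions to be synchronized: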

void inline hook_cacheflush(unsigned int begin, unsigned int end)
{
        const int syscall = 0xf0002;
        __asm __volatile (
                "mov     r0, %0\n"
                "mov     r1, %1\n"
                "mov     r7, %2\n"
                "mov     r2, #0x0\n"
                "svc     0x00000000\n"
                :
                :       "r" (begin), "r" (end), "r" (syscall)
                :       "r0", "r1", "r2", "r7" /* r2 is written by the asm too, so it is listed as clobbered */
                );
}

#define __ARM_NR_BASE (__NR_SYSCALL_BASE+0x0f0000)
#define __ARM_NR_breakpoint (__ARM_NR_BASE+1)
#define __ARM_NR_cacheflush (__ARM_NR_BASE+2)
#define __ARM_NR_usr26 (__ARM_NR_BASE+3)
#define __ARM_NR_usr32 (__ARM_NR_BASE+4)
#define __ARM_NR_set_tls (__ARM_NR_BASE+5)

The implementation uses inline assembly to issue the system call __ARM_NR_cacheflush (0xf0002).

Reference: http://blog.csdn.net/roland_sun/article/details/36049307
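For comparison, GCC and Clang provide the builtin __builtin___clear_cache for the same purpose, which avoids hand-written assembly. The sketch below is an alternative to adbi's hook_cacheflush, not what adbi does. patch_and_flush is a hypothetical helper name, and the expectation that on 32-bit ARM Linux the builtin ends up in the same cacheflush system call is an assumption about the toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `len` bytes over `dst`, then ask the compiler runtime to make the
 * new bytes visible to instruction fetch for the range [dst, dst + len). */
static void patch_and_flush(unsigned char *dst, const unsigned char *src,
                            size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

The destination still has to be writable, so the page protection must have been changed beforehand, exactly as with the hand-written version.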

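The ARM-mode hook instructions: 12 bytes.

The first 4 bytes are LDR pc, [pc, #0]. Reading pc yields the address of the current instruction + 8, so this instruction loads the word stored at jump[2] into pc. After it executes, the program counter holds the value of jump[2], i.e. h->patch, i.e. the hook function. The value of jump[1] only fills space and can be anything.

                h->jump[0] = 0xe59ff000; // LDR pc, [pc, #0]; h->jump holds the hook instructions
                h->jump[1] = h->patch; // padding word, never loaded
                h->jump[2] = h->patch; // hook function address, loaded into pc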
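To check the claim that 0xe59ff000 reads jump[2], the instruction word can be taken apart by hand. The sketch below decodes only the fields of the A32 immediate-offset LDR form that matter here. struct ldr_imm, decode_ldr_imm and literal_word_index are hypothetical names, and this is not a general disassembler.

```c
#include <assert.h>
#include <stdint.h>

/* Selected fields of an A32 "LDR Rt, [Rn, #imm12]" instruction word. */
struct ldr_imm {
    unsigned cond;    /* condition code, 0xe = always */
    unsigned rn;      /* base register */
    unsigned rt;      /* destination register */
    unsigned imm12;   /* unsigned byte offset */
    int is_load;      /* L bit: 1 = load */
    int add;          /* U bit: 1 = offset is added to the base */
};

static struct ldr_imm decode_ldr_imm(uint32_t insn)
{
    struct ldr_imm d;

    /* bits 27..25 are 0b010 for the immediate-offset load/store form */
    assert(((insn >> 25) & 7u) == 2u);
    d.cond    = insn >> 28;
    d.add     = (int)((insn >> 23) & 1u);
    d.is_load = (int)((insn >> 20) & 1u);
    d.rn      = (insn >> 16) & 0xfu;
    d.rt      = (insn >> 12) & 0xfu;
    d.imm12   = insn & 0xfffu;
    return d;
}

/* For a pc-relative load placed in word 0 of a buffer, return the index
 * of the word it reads. In ARM state pc reads as instruction address + 8. */
static unsigned literal_word_index(uint32_t insn)
{
    struct ldr_imm d = decode_ldr_imm(insn);

    assert(d.rn == 15u && d.add);
    return (8u + d.imm12) / 4u;
}
```

For 0xe59ff000 both rn and rt decode to 15 (pc) and imm12 to 0, so the word read is at index 2, which is jump[2].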

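The Thumb-mode hook instructions: 20 bytes.

First, the address of the hook function is stored in the last 4 bytes of the jumpt array (an array of char), at indices 16 to 19.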

                h->jumpt[1] = 0xb4; 
                h->jumpt[0] = 0x30; // push {r4,r5}
                h->jumpt[3] = 0xa5;
                h->jumpt[2] = 0x03; // add r5, pc, #12
                h->jumpt[5] = 0x68;
                h->jumpt[4] = 0x2d; // ldr r5, [r5]
                h->jumpt[7] = 0xb0;
                h->jumpt[6] = 0x02; // add sp,sp,#8
                h->jumpt[9] = 0xb4;
                h->jumpt[8] = 0x20; // push {r5}
                h->jumpt[11] = 0xb0;
                h->jumpt[10] = 0x81; // sub sp,sp,#4
                h->jumpt[13] = 0xbd;
                h->jumpt[12] = 0x20; // pop {r5, pc}
                h->jumpt[15] = 0x46;
                h->jumpt[14] = 0xaf; // mov pc, r5 ; just to pad to 4 byte boundary
                memcpy(&h->jumpt[16], (unsigned char*)&h->patch, sizeof(unsigned int));