Injecting ".so" Libraries and Code into Running Linux Processes

https://www.cnblogs.com/LittleHann/p/4594641.html

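Contents

0. Introduction
1. Code injection by hijacking .so files
2. Code injection by modifying library function entry addresses in the procedure linkage table (PLT)
3. PLT redirection through shared object injection into a running process
4. Code injection using the Linux debugging API ptrace()
5. Linux Hotpatch
6. Code injection through input-handling vulnerabilities (overflow)
7. Protecting dynamic shared libraries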

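0. Introduction

Fundamentally, code injection and .so injection build on mechanisms that the operating system itself provides to help system administrators and developers debug software. The same mechanisms can be used by attackers and by security engineers, for attacks or for defensive hooks. The technique is a double-edged sword, and attackers and defenders often end up contesting the same layer of the system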

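0x1: Use cases for shared library injection

1. Third-party game cheats and helpers: DLL injection on Windows and .so injection on Linux let a game process load our helper UI
2. Malware hiding: to run without a process of its own, malware often injects its core code into a long-running system process
3. Live updates: the program is already running, for example as a service; we do not want to stop it, but we do want to update some of its functionality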

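0x2: Code injection versus hooking

Code injection and hooking are similar in many respects, but they differ as follows

1. Code injection is a prerequisite for hooking: a hook can only be installed once code has been injected
2. Every hook implies code injection, but injected code does not necessarily hook anything; the two are concepts at different levels
3. Hooking is clean and stable: the hook runs serially inside the target process, and complex hook logic can be written in standard C. Code injection is more of a hacking technique, better suited to a small, focused piece of RAT code with which an attacker can take control of the target process or download and run a remote payload
4. Code injection is "one-shot", whereas a hook is "stable and persistent"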

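0x3: .so injection techniques that work reliably

1. Use ptrace to modify the entry address of a chosen function in the target process's PLT, hijacking that function
2. Use ptrace to write shellcode into the target process and change EIP/RIP to hijack its execution flow; once the shellcode has finished, restore the original execution flow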

Relevant Link:

file:///C:/Users/zhenghan.zh/Downloads/%E5%8A%A8%E6%80%81%E5%85%B1%E4%BA%AB%E5%BA%93%E4%BF%9D%E6%8A%A4%E6%96%B9%E6%B3%95%E7%A0%94%E7%A9%B6.pdf

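1. Code injection by hijacking .so files

Replacing the dynamic shared library file

1. The attacker replaces a dynamic shared library that the process needs to load with a new library of the same name that contains malicious code
2. The original library is renamed
3. The new library exports exactly the same symbols as the old one
4. After doing its own work, each function in the new library must dlopen the renamed original library and call ori_func, so that callers see no difference in behavior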
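
Step 4 can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the helper name call_original and the assumed function signature (one string argument, size_t result) are hypothetical, and only dlopen, dlsym, dlerror and dlclose are real APIs.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Signature assumed for the hijacked function in this sketch. */
typedef size_t (*orig_fn_t)(const char *);

/*
 * Load the renamed original library and forward the call to the original
 * function, as a replacement library would do at the end of each wrapper.
 * Returns 0 on success and stores the original function's result in *out.
 */
int call_original(const char *renamed_lib, const char *symbol,
                  const char *arg, size_t *out)
{
    void *handle = dlopen(renamed_lib, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* dlsym returns void *; store it through a void ** to get a function pointer */
    orig_fn_t orig;
    *(void **)&orig = dlsym(handle, symbol);
    if (orig == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    /* ...the injected logic would run here, before forwarding... */
    *out = orig(arg);
    /* The handle stays open on purpose: the wrapper needs it for later calls. */
    return 0;
}
```

In a real replacement library the first argument would be the path of the renamed original. On glibc older than 2.34 the program has to be linked with -ldl.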

2. Code injection by modifying library function entry addresses in the procedure linkage table (PLT)

http://www.cnblogs.com/LittleHann/p/4244863.html
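//Search that page for "2. 地址无关代码: PIC" (position-independent code: PIC)

The Procedure Linkage Table (PLT) is what lets an infected file call external functions. Redirecting calls through it is far better than redirecting them with the LD_PRELOAD environment variable, first of all because no environment variable has to be changed

1. In an ELF file, the Global Offset Table (GOT) converts position-independent address calculations to absolute addresses, and the PLT does the same for function calls: it redirects position-independent function calls to absolute locations. The link editor cannot resolve execution transfers (such as function calls) from one executable or shared object to another, so it arranges for the program to transfer control to entries in the PLT. On System V the PLT resides in shared text, but it uses addresses in the private GOT. The dynamic linker (for example ld-2.2.2.so) determines the absolute addresses of the destinations and updates the in-memory image of the GOT accordingly. It can therefore redirect the entries without compromising the position independence and shareability of the program text. Executables and shared objects have separate PLTs
2. ELF shared libraries are position independent: the library can be loaded at any address in memory. To build one, pass -fpic to the compiler so that the object code is position independent and avoids absolute addresses when referencing variables wherever it can. This has a cost: on 32-bit x86 the compiler reserves a register to point at the global offset table (GOT), leaving one register fewer for optimization (x86-64 uses RIP-relative addressing instead). According to the source, the slowdown is only about 3% in the worst case and much smaller otherwise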

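0x1: Hooking approach

//By infecting a file's PLT, library calls made by the file can be redirected to the virus code, achieving hooking and infection
1. Infect the executable and modify its PLT
2. Before hijacking, save the original PLT entry so that control can return to the original library call after the virus code has run
3. Modify the function's entry address in the PLT to redirect the library call to the virus code
    1) Make the text segment writable (the .plt section lives in the text segment)
    2) Replace the original entry with the address of the new (virus) routine
    3) When the new routine finishes, call the original library function: call ori_func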

0x2: Trying PLT hijacking by hand in GDB

1. Find the address of the function to hook
objdump -h /bin/bash
objdump -d /bin/bash | grep chdir
/*
00000000004184f8 <chdir@plt>:
  458dfb:    e8 f8 f6 fb ff           callq  4184f8 <chdir@plt>
  458e70:    e8 83 f6 fb ff           callq  4184f8 <chdir@plt>
  458e8e:    e8 65 f6 fb ff           callq  4184f8 <chdir@plt>
*/

2. Attach to the target process
/*
[root@iZ23lobjjltZ ~]# gdb
(gdb) attach 18120
*/

3. Set a breakpoint at the entry of the target function
/*
(gdb) x/xg 0x4184f8
0x4184f8 <chdir@plt>:    0x016800299e2a25ff
(gdb) set *(0x4184f8) = 0xcc
(gdb) c
Continuing.
*/
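The bash session being debugged becomes responsive again at this point

4. Trigger the hook
Type cd in the hooked bash process. cd is a bash builtin that calls chdir, so bash hits the breakpoint and appears to hang, which shows that the breakpoint is in place. In a program of our own we could now process the function's arguments right away, which is the point of the hook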
/*
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000000004184f9 in chdir@plt ()
*/

5. Inspect the stack
/*
(gdb) x/10xg $rsp-72
0x7fff56db94a0:    0x000000000a1fd130    0x0000003c000757ab
0x7fff56db94b0:    0x0000000000000001    0x000000000a216520
0x7fff56db94c0:    0x0000000000000000    0x0000000000000000
0x7fff56db94d0:    0x000000000a1fd130    0x0000000000000000
0x7fff56db94e0:    0x000000000a2165d0    0x0000000000458e75

(gdb) x/s 0x000000000a1fd130
0xa1fd130:     "/root"
*/

6. Restore the original PLT contents
/*
(gdb) set *(0x4184f8) = 0x016800299e2a25ff
# step back to the address that hit the breakpoint and re-execute it; int3 is one byte long
(gdb) set $pc = $pc - 1
(gdb) c
Continuing.
*/
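
The byte arithmetic behind steps 3 and 6 can be checked in isolation. The sketch below assumes a little-endian machine such as x86-64, where the lowest-addressed byte of a word is its least significant byte; both function names are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Return word with its lowest-addressed byte replaced by int3 (0xcc). */
uint64_t with_breakpoint(uint64_t word)
{
    unsigned char bytes[sizeof word];
    memcpy(bytes, &word, sizeof word);
    bytes[0] = 0xcc;
    memcpy(&word, bytes, sizeof word);
    return word;
}

/* Address to resume at after the trap: int3 is one byte long. */
uint64_t resume_address(uint64_t pc_after_trap)
{
    return pc_after_trap - 1;
}
```

With the values from the session above, the word 0x016800299e2a25ff read at chdir@plt becomes 0x016800299e2a25cc once the breakpoint is in place, and the trap reported at 0x4184f9 resumes at 0x4184f8.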

0x3: Code Example

/***************************             ***************************/
/************************** ptrace_hook.c **************************/
/***************************             ***************************/

#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/user.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <signal.h>

#define STKALN 8

/* We use this to extract info from words read from the victim */
union pltval{
    unsigned long val;
    unsigned char chars[sizeof(unsigned long)];
};

void usage(char** argv)
{
    printf("Usage: %s pid plt_pos\n", argv[0]);
}

void peekerror()
{
    printf("Status: %s\n", strerror(errno));
}

/* overwrite the strings that the two captured argument words point to */
void mod_test(pid_t traced, void* addr1, void* addr2)
{
    union pltval buf;
    buf.val = ptrace(PTRACE_PEEKDATA, traced, addr1, NULL);
    printf("--- mod_test: ");
    peekerror();
    
    memcpy(buf.chars, "hooked", 6);
    buf.chars[6] = 0;
    
    ptrace(PTRACE_POKEDATA, traced, addr1, buf.val);
    printf("--- mod_test: ");
    peekerror();
    
    buf.val = ptrace(PTRACE_PEEKDATA, traced, addr2, NULL);
    printf("--- mod_test: ");
    peekerror();

    memcpy(buf.chars, "/hooked", 7);
    buf.chars[7] = 0;
    
    ptrace(PTRACE_POKEDATA, traced, addr2, buf.val);
    printf("--- mod_test: ");
    peekerror();
}

int main(int argc, char** argv)
{
    pid_t traced;
    struct user_regs_struct regs;
    int status, trigd=0;
    unsigned long ppos;
    union pltval buf;
    unsigned long backup;
    siginfo_t si;
    long flag = 0, args[2];

    if(argc < 3){
        usage(argv);
        exit(1);
    }

    traced = atoi(argv[1]);
    /* base 0 so that a hex address such as 0x4184f8 is accepted */
    ppos = strtoul(argv[2], NULL, 0);

    ptrace(PTRACE_ATTACH, traced, NULL, NULL);
    printf("Attach: ");
    peekerror();

    wait(&status);
    buf.val = ptrace(PTRACE_PEEKDATA, traced, ppos, NULL);
    backup = buf.val;
    buf.chars[0] = 0xcc;
    ptrace(PTRACE_POKEDATA, traced, ppos, buf.val);
    ptrace(PTRACE_CONT, traced, NULL, NULL);
    
    while(1){
        printf("I'm going to wait.\n");
        wait(&status);
        printf("Done waiting\n");

        if(WIFEXITED(status)) break;

        ptrace(PTRACE_GETSIGINFO, traced, NULL, &si);
        ptrace(PTRACE_GETREGS, traced, NULL, &regs);

        /* not our breakpoint: let the tracee continue */
        if((si.si_signo != SIGTRAP) || (regs.rip != (long)ppos +1)){
            ptrace(PTRACE_CONT, traced, NULL, NULL);
            continue;
        }

        printf("Hook triggered: %ld times\n", ++flag);
        printf("RSP: %llx\n", (unsigned long long)regs.rsp);
        int i;
        /* the stack offsets below are specific to the bash build used in the GDB session */
        for(i = 0; i < 2; i++){
            args[i] = ptrace(PTRACE_PEEKDATA, traced, regs.rsp-STKALN*(i+6), NULL);
            printf("Argument #%d: %lx\n", i, args[i]);
        }

        mod_test(traced, (void*)args[0], (void*)args[1]);

        buf.val = backup;
        ptrace(PTRACE_POKEDATA, traced, ppos, buf.val);

        regs.rip = regs.rip - 1;
        ptrace(PTRACE_SETREGS, traced, NULL, &regs);

        ptrace(PTRACE_SINGLESTEP, traced, NULL, NULL);
        // We have to wait after each call of ptrace(),
        wait(NULL);  
        
        ptrace(PTRACE_GETREGS, traced, NULL, &regs);
        
        buf.chars[0] = 0xcc;
        ptrace(PTRACE_POKEDATA, traced, ppos, buf.val);
        ptrace(PTRACE_CONT, traced, NULL, NULL);
    }

    return 0;
}
//gcc ptrace_hook.c -o ptrace_hook
//./ptrace_hook 20745 0x4184f8

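Points to watch when using this PLT-hijacking hook

1. Stack layout differs between Linux distributions and GCC versions, so the process memory layout that the hook code assumes may not hold elsewhere, which makes the hook code non-portable
2. Hook code is not portable between 32-bit and 64-bit
3. The hook code itself must preserve the original behavior, that is, call the original function address once the hook has finished
4. The PLT-patching hook architecture is not compatible with the LD_PRELOAD hook architecture. If LD_PRELOAD is already in use, the code has to be changed so that the hook function explicitly calls the original function at its end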
/*
How an LD_PRELOAD hook chains to the next definition
#if defined(RTLD_NEXT)
#  define REAL_LIBC RTLD_NEXT
#else
#  define REAL_LIBC ((void *) -1L)
#endif
*/
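
For comparison, an LD_PRELOAD-style hook that explicitly calls the original function at its end might look like the sketch below. The names next_chdir and hooked_chdir are assumptions for illustration (in a real preload library the hook would itself be named chdir), and the RTLD_NEXT fallback mirrors the snippet above.

```c
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
#  define RTLD_NEXT ((void *) -1L)   /* same fallback as the snippet above */
#endif

typedef int (*chdir_fn)(const char *);

/* Look up the next definition of chdir after this object in search order. */
chdir_fn next_chdir(void)
{
    chdir_fn fn;
    *(void **)&fn = dlsym(RTLD_NEXT, "chdir");
    return fn;
}

/* Hook body: run our own logic, then explicitly call the original function. */
int hooked_chdir(const char *path)
{
    chdir_fn real = next_chdir();
    if (real == NULL)
        return -1;
    /* ...inspect or rewrite path here... */
    return real(path);
}
```

Built as a shared object and listed in LD_PRELOAD, such a hook sits in front of libc; a PLT patch would instead have to store the original address itself before overwriting the entry.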

In real code, /proc/pid/maps can be used to look up function addresses dynamically
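
A minimal sketch of such a lookup, assuming the usual "start-end perms offset dev inode path" line format of /proc/pid/maps; the function name find_mapping_base is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the start address of the first mapping in /proc/<pid>/maps whose
 * line contains name (for example "libc" or "[stack]"), or 0 if none does.
 */
unsigned long find_mapping_base(pid_t pid, const char *name)
{
    char path[64], line[512];
    unsigned long start, end, base = 0;

    snprintf(path, sizeof path, "/proc/%d/maps", (int)pid);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* every line begins with the mapping's address range in hex */
        if (strstr(line, name) != NULL &&
            sscanf(line, "%lx-%lx", &start, &end) == 2) {
            base = start;
            break;
        }
    }
    fclose(fp);
    return base;
}
```

If the tracer and the target map the same library file, a function's address in the target can be estimated as the library base in the target plus the function's offset from the library base in the tracer.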

Relevant Link:

http://bbs.sysu.edu.cn/bbstcon?board=Linux&file=M.1220887954.A
http://www.doc88.com/p-51761632926.html 
http://www.cnblogs.com/guaiguai/archive/2010/06/11/1756427.htm
https://github.com/kubo/plthook

3. PLT redirection through shared object injection into a running process

To redirect a function imported by a dynamic library, you need to know the following:

1. The path to this library in the file system
2. The virtual address at which it is loaded
3. The name of the function to be replaced
4. The address of the substitute function
//The address of the original function is also needed, so that the redirection can be undone and everything put back in place.

The redirection function works as follows:

1. Open the library file.
2. Store the index of the symbol in the ".dynsym" section whose name matches the required function.
3. Look through the ".rel.plt" section for the relocation that refers to the symbol with that index.
4. If such a relocation is found, save the original address so that it can be returned later. Then write the address of the substitute function to the location specified in the relocation, which is the load address of the library plus the offset in the relocation. The substitution is now done, and every call the library makes to this function will be redirected. Exit the function and return the address of the original symbol.
5. If no such relocation is found in ".rel.plt", search the ".rel.dyn" section in the same way. In ".rel.dyn" the symbol with the required index can appear more than once, so do not end the search loop after the first redirection. The address of the original symbol can be stored at the first match and need not be computed again, since it does not change.
6. Return the address of the original function, or NULL if no function with the required name was found.
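
Steps 2 to 5 can be sketched on tables that are already in memory. The example assumes the x86-64 variants of the structures (Elf64_Sym, Elf64_Rela and the ".rela.plt"/".rela.dyn" sections rather than ".rel.*"); the function names are hypothetical and file parsing is left out.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Step 2: index of the named symbol in .dynsym, or -1 if it is absent. */
long find_dynsym_index(const Elf64_Sym *dynsym, size_t nsyms,
                       const char *dynstr, const char *name)
{
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(dynstr + dynsym[i].st_name, name) == 0)
            return (long)i;
    return -1;
}

/*
 * Steps 3 to 5: for every relocation that refers to sym_index, write the
 * substitute address at load_base + r_offset. Returns the contents of the
 * first matching slot before it was overwritten, or 0 if nothing matched.
 */
uintptr_t redirect_symbol(uintptr_t load_base, const Elf64_Rela *rela,
                          size_t nrela, long sym_index, uintptr_t substitute)
{
    uintptr_t original = 0;
    for (size_t i = 0; i < nrela; i++) {
        if ((long)ELF64_R_SYM(rela[i].r_info) != sym_index)
            continue;
        uintptr_t *slot = (uintptr_t *)(load_base + rela[i].r_offset);
        if (original == 0)
            original = *slot;   /* remember the original address once */
        *slot = substitute;     /* keep scanning: the symbol may match again */
    }
    return original;
}
```

Calling redirect_symbol again with the saved original address as the substitute undoes the redirection.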

Relevant Link:

http://www.codeproject.com/Articles/70302/Redirecting-functions-in-shared-ELF-libraries
http://www.codeproject.com/Articles/30824/PLT-redirection-through-shared-object-injection-in
http://www.codeproject.com/Articles/33340/Code-Injection-into-Running-Linux-Application

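4. Code injection using the Linux debugging API ptrace()

On Windows, DLL injection (Injlib) is a mature technique

1 Attach to the target process: OpenProcess()

2 Find the address of the library-loading function in the target process
KERNEL32.DLL is mapped at the same address in every process on Windows, so calling GetProcAddress() yields the address of the library-loading function, LoadLibrary()

3 Call the loader routine in the target process
Use VirtualAllocEx() to allocate readable, writable, executable memory in the target process, copy our loader routine into it, then call CreateRemoteThread() to create a thread in the target process that runs the copied routine

4 Do whatever we want by calling our code in the shared library

Linux has no comparable complete API set, so the same result has to be achieved with a few helper APIs

1 Attach to the target process: ptrace_attach(int pid)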
/*
void ptrace_attach(int pid)
{
    if((ptrace(PTRACE_ATTACH , pid , NULL , NULL)) < 0) 
    {
        perror("ptrace_attach");
        exit(-1);
    } 
    waitpid(pid , NULL , WUNTRACED);
}
*/

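2 Find the address of the library-loading function in the target process
On Linux, shared libraries are mapped at different addresses in each process (ASLR is on by default), so the Windows approach does not work. The functions that load a shared library are dlopen() and _dl_open(), and their addresses can be found by walking the link-map
The link-map is a structure used internally by the dynamic linker to keep track of loaded libraries and the symbols in them. It is in fact a linked list in which every entry points to a loaded library. As the dynamic linker does when it needs to look up a symbol, we can walk the list forwards or backwards and visit each library on it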
/*
/* Locate the link-map in the address space of the given process */
struct link_map *
locate_linkmap(int pid)
{
Elf32_Ehdr *ehdr = malloc(sizeof(Elf32_Ehdr));
Elf32_Phdr *phdr = malloc(sizeof(Elf32_Phdr));
Elf32_Dyn *dyn = malloc(sizeof(Elf32_Dyn));
Elf32_Word got;
struct link_map *l = malloc(sizeof(struct link_map));
unsigned long phdr_addr , dyn_addr , map_addr;


/* First inspect the ELF header, which is mapped at 0x08048000,
* use it to compute the offset of the program header table,
* and from there try to locate the PT_DYNAMIC segment.
*/

read_data(pid , 0x08048000 , ehdr , sizeof(Elf32_Ehdr));

phdr_addr = 0x08048000 + ehdr->e_phoff;
printf("program header at %#lx\n", phdr_addr);

read_data(pid , phdr_addr, phdr , sizeof(Elf32_Phdr));

while ( phdr->p_type != PT_DYNAMIC ) {
read_data(pid, phdr_addr += sizeof(Elf32_Phdr), phdr, 
sizeof(Elf32_Phdr));
}

/* Now search the dynamic section until we find the address of the GOT
*/

read_data(pid, phdr->p_vaddr, dyn, sizeof(Elf32_Dyn));
dyn_addr = phdr->p_vaddr;

while ( dyn->d_tag != DT_PLTGOT ) {
read_data(pid, dyn_addr += sizeof(Elf32_Dyn), dyn,
sizeof(Elf32_Dyn));
}

got = (Elf32_Word) dyn->d_un.d_ptr;
got += 4; /* the second GOT entry, remember? */

/* now just read the first link_map entry and return it */
read_data(pid, (unsigned long) got, &map_addr , 4);
read_data(pid , map_addr, l , sizeof(struct link_map));

free(phdr);
free(ehdr);
free(dyn);

return l;
}

/* Find DT_SYMTAB and DT_STRTAB and save them in global variables,
* and also save the number of chain entries from the hash table in nchains.
*/

unsigned long symtab;
unsigned long strtab;
int nchains;


void
resolv_tables(int pid , struct link_map *map)
{
Elf32_Dyn *dyn = malloc(sizeof(Elf32_Dyn));
unsigned long addr;

addr = (unsigned long) map->l_ld;

read_data(pid , addr, dyn, sizeof(Elf32_Dyn));

while ( dyn->d_tag ) {
switch ( dyn->d_tag ) {

case DT_HASH:
read_data(pid,dyn->d_un.d_ptr +
map->l_addr+4,
&nchains , sizeof(nchains));
break;

case DT_STRTAB:
strtab = dyn->d_un.d_ptr;
break;

case DT_SYMTAB:
symtab = dyn->d_un.d_ptr;
break;

default:
break;
}

addr += sizeof(Elf32_Dyn);
read_data(pid, addr , dyn , sizeof(Elf32_Dyn));
}

free(dyn);
}

/* Look a symbol up in DT_SYMTAB (the symbol table) */

unsigned long
find_sym_in_tables(int pid, struct link_map *map , char *sym_name)
{
Elf32_Sym *sym = malloc(sizeof(Elf32_Sym));
char *str;
int i;

i = 0;

while (i < nchains) {
read_data(pid, symtab+(i*sizeof(Elf32_Sym)), sym,
sizeof(Elf32_Sym));
i++;

if (ELF32_ST_TYPE(sym->st_info) != STT_FUNC) continue;

/* read the symbol name from the string table */
str = read_str(pid, strtab + sym->st_name);

if(strncmp(str , sym_name , strlen(sym_name)) == 0)
return(map->l_addr+sym->st_value);
}

/* return 0 if the symbol was not found */
return 0;
}
*/
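
The same link-map walk can be tried inside one's own process before doing it remotely with ptrace. The sketch below is an in-process analogue, not the remote version above: it finds the dynamic linker's r_debug structure through the DT_DEBUG entry of _DYNAMIC (both declared in glibc's <link.h>) and follows l_next. The function names are made up, and a dynamically linked glibc program is assumed.

```c
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Find the dynamic linker's r_debug structure through our own DT_DEBUG entry. */
struct r_debug *find_r_debug(void)
{
    for (ElfW(Dyn) *dyn = _DYNAMIC; dyn->d_tag != DT_NULL; dyn++)
        if (dyn->d_tag == DT_DEBUG)
            return (struct r_debug *)dyn->d_un.d_ptr;
    return NULL;
}

/* Walk the link_map list, print each object, and return how many there are. */
int list_loaded_objects(void)
{
    struct r_debug *dbg = find_r_debug();
    int count = 0;

    if (dbg == NULL)
        return 0;
    for (struct link_map *map = dbg->r_map; map != NULL; map = map->l_next) {
        const char *name = (map->l_name && map->l_name[0]) ? map->l_name
                                                           : "(main program)";
        printf("%#lx %s\n", (unsigned long)map->l_addr, name);
        count++;
    }
    return count;
}
```

Each entry's l_ld field points at that object's dynamic section, which is where resolv_tables() above starts its search for DT_SYMTAB and DT_STRTAB.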

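3 Call the loader routine in the target process
Linux has no equivalents of VirtualAllocEx() and CreateRemoteThread(), so this step has to be emulated by hand
We can write a short piece of assembly that calls _dl_open() to load our library. One thing to remember is that _dl_open() is defined as an 'internal_function', which means its arguments are passed slightly differently: in registers rather than on the stack. The arguments are passed in this order: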
EAX = const char *file
ECX = const void *caller (we set it to NULL)
EDX = int mode (RTLD_LAZY)
/*
_start: jmp string

begin: pop eax ; char *file
xor ecx ,ecx ; *caller
mov edx ,0x1 ; int mode

mov ebx, 0x12345678 ; addr of _dl_open()
call ebx ; call _dl_open!
add esp, 0x4

int3 ; breakpoint

string: call begin
db "/tmp/ourlibby.so",0x00
*/
A cleaner approach is to fetch the registers with ptrace(PTRACE_GETREGS, pid, ...) and write the arguments into the user_regs_struct structure

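4 Do whatever we want by calling our code in the shared library
The call can be made by building the arguments and setting up the stack inside the target process (setting registers and stack through ptrace)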
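
On x86-64 the register setup for such a call can be sketched as below. This is an illustration under stated assumptions, not the article's 32-bit code: it follows the System V x86-64 calling convention (first two integer arguments in rdi and rsi, 128-byte red zone below rsp), and the name prepare_remote_call is hypothetical.

```c
#include <stdint.h>
#include <sys/user.h>

#define RED_ZONE 128   /* bytes below rsp that the interrupted code may be using */

/*
 * Fill regs so that the tracee starts executing fn(arg1, arg2) when resumed.
 * The caller still has to write a return address at the new rsp with
 * PTRACE_POKEDATA (for example 0, so that the return faults and the tracer
 * regains control) and apply regs with PTRACE_SETREGS.
 */
void prepare_remote_call(struct user_regs_struct *regs, uint64_t fn,
                         uint64_t arg1, uint64_t arg2)
{
    uint64_t sp = regs->rsp - RED_ZONE;   /* skip the red zone */
    sp &= ~(uint64_t)0xf;                 /* align to 16 bytes... */
    sp -= 8;                              /* ...then mimic the pushed return address */

    regs->rsp = sp;
    regs->rdi = arg1;                     /* first integer argument */
    regs->rsi = arg2;                     /* second integer argument */
    regs->rax = 0;                        /* no vector registers used by varargs */
    regs->rip = fn;                       /* execution resumes at the function */
}
```

The original registers should be saved before this call and restored with PTRACE_SETREGS once the injected call has returned control to the tracer.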

0x1: Code Example

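After attaching, overwrite the instructions at the address the target's current eip points to, saving the original instructions first; once our instructions have run, restore the originals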
main.c

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
 
int main()
{
    int i;
    for(i = 0; i < 10; i++)
    {
        printf("my counter:%d\n",i);
        sleep(2);
    }
    return 0;
}
//gcc main.c -o main

hook.c

#include <sys/ptrace.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/user.h>

/* word size used by PTRACE_PEEKDATA / PTRACE_POKEDATA */
enum { long_size = sizeof(long) };

/*
 * Read len bytes of code starting at addr in the tracee into str
 * (len is the length of our shellcode). str must hold len + 1 bytes.
 */
void getdata(pid_t child, long addr, char * str, int len)
{ 
        char * laddr;
        int i, j;
        union u
        {
                long val;
                char chars[long_size];
        }data;
        i = 0;
 
        /* quotient: PTRACE_PEEKDATA reads one word (long_size bytes) per call,
           so read the whole words first */
        j = len / long_size;
        laddr = str;
        while(i < j)
        {
                data.val = ptrace(PTRACE_PEEKDATA, child, addr + i * long_size, NULL);
                memcpy(laddr, data.chars, long_size);
                ++i;
                laddr += long_size;
        }
 
        /* remainder: read the trailing partial word */
        j = len % long_size;
        if(j != 0)
        {
                data.val = ptrace(PTRACE_PEEKDATA, child, addr + i * long_size, NULL);
                memcpy(laddr, data.chars, j);
        }
        str[len] = '\0';
}

/* Write len bytes from str into the tracee at addr */
void putdata(pid_t child, long addr, char* str, int len)
{
        int i,j;
        char * laddr;
        union u
        {
                long val;
                char chars[long_size];
        }data;
        i = 0;
        j = len/long_size;
        laddr=str;
        while(i<j)
        {
                memcpy(data.chars,laddr,long_size);
                ptrace(PTRACE_POKEDATA,child,addr+i*long_size,data.val);
                ++i;
                laddr+=long_size;
        }
        j = len %long_size;
        if(j!=0)
        {
                /* read-modify-write so the bytes after len keep their value */
                data.val = ptrace(PTRACE_PEEKDATA,child,addr+i*long_size,NULL);
                memcpy(data.chars,laddr,j);
                ptrace(PTRACE_POKEDATA,child,addr+i*long_size, data.val);
        }
}
 
int main(int argc, char* argv[])
{
 
        /* pid of the process to attach to */
        pid_t traced_process;
 
        /* saved register values */
        struct user_regs_struct regs, newregs;
        long ins;
        int k,h;
        int len = 41;
 
        /* shellcode that prints "Hello World"; len above is its length: 41 bytes.
           It is 32-bit code (int 0x80, regs.eip), so build hook.c and main.c with -m32 */
        char shellcode[] = "\xeb\x15\x5e\xb8\x04\x00\x00\x00"
    "\xbb\x02\x00\x00\x00\x89\xf1\xba"
    "\x0c\x00\x00\x00\xcd\x80\xcc\xe8"
    "\xe6\xff\xff\xff\x48\x65\x6c\x6c"
    "\x6f\x20\x57\x6f\x72\x6c\x64\x0a\x00";
 
        /* backup holds the original code; getdata() writes len + 1 bytes */
        char backup[len + 1];
        long addr;
        if(argc != 2)
        {
                printf("usage: %s <pid>\n", argv[0]);
                exit(1);
        }
 
        /* convert the pid argument (argv[1]) to an integer */
        traced_process = atoi(argv[1]);
 
        /* attach to the process */
        ptrace(PTRACE_ATTACH, traced_process, NULL, NULL);
        wait(NULL);
 
        /* read the current registers */
        ptrace(PTRACE_GETREGS, traced_process, NULL, &regs);
        /* print eip and the word it points to */
        ins = ptrace(PTRACE_PEEKTEXT, traced_process, regs.eip, NULL);
        printf("EIP:%lx instruction executed : %lx\n", regs.eip, ins);
    
        /* save as many original bytes from the target as the shellcode will overwrite */
        getdata(traced_process, regs.eip, backup, len);
        /* print the original code */
        printf("backup is : \n");
        for(k = 0 ;k  < len; k++)
        {
                printf("%02x ", (unsigned char)backup[k]);
        }
        printf("\n");
        /* write the shellcode into the target process */
        putdata(traced_process,regs.eip,shellcode,len);
        printf("shellcode is :\n");
 
        for(h = 0; h < len; h++)
        {
                printf("%02x ", (unsigned char)shellcode[h]);
        }
        printf("\n");
 
        /* set the registers again so execution restarts from where it was at attach time */
        ptrace(PTRACE_SETREGS, traced_process, NULL, &regs);
 
        /* resume the target, which now runs from the eip set above */
        ptrace(PTRACE_CONT, traced_process, NULL, NULL);
 
        /* wait for the target to change state; the int 3 in the shellcode stops it */
        wait(NULL);
        printf("press the enter key to continue\n");
        getchar();
 
        /* copy the original code back */
        putdata(traced_process, regs.eip, backup, len);
        ptrace(PTRACE_SETREGS, traced_process, NULL, &regs);
        printf("execute original code\n");
        ptrace(PTRACE_DETACH, traced_process, NULL, NULL);
        return 0;
}

Relevant Link:

http://grip2.blogspot.com/2006/12/blog-post.html
http://0nly3nd.sinaapp.com/?p=529
http://blog.csdn.net/lingfong_cool/article/details/7949602
http://blog.csdn.net/myarrow/article/details/9630377
http://m.oschina.net/blog/97155
http://blog.chinaunix.net/uid-29482215-id-4135833.html

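5. Linux Hotpatch

Hotpatch is a C library that lets a running process dynamically load a .so library, similar to the CreateRemoteThread() API on Win32
Compared with other dynamic loading schemes, its advantage is that it restores the target process's original execution state after the library has been loaded. With Hotpatch a developer can

1. Load a .so library into an already running process
2. Call a custom function in that library
3. Pass serialized arguments to that function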

0x1: Code Example

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <hotpatch.h>

int main(int argc, char **argv)
{
    pid_t pid = argc > 1 ? atoi(argv[1]) : 0;
    /* the target process to inject into */
    hotpatch_t *hp = hotpatch_create(pid, 1);
    if (hp) 
    {
        unsigned char *data = (unsigned char *)"my custom serialized data";
        size_t datalen = strlen((char *)data) + 1;
        uintptr_t result1, result2;
        /* the .so file to inject and the symbol to call in it */
        hotpatch_inject_library(hp, "libhotpatchtest.so", "mysym", data, datalen, &result1, &result2);
        hotpatch_destroy(hp);
    }
    return 0;
}
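
On the receiving side, the injected library only needs to export the symbol named in the call. The sketch below shows what "mysym" in libhotpatchtest.so might look like; the (pointer, length) parameters follow how the injector passes the copied data, while the int return type and the messages are assumptions made for this example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Entry point that the injector looks up by name ("mysym" in the example above).
 * The target process calls it with a pointer to the copied data and its length.
 */
int mysym(const unsigned char *data, size_t datalen)
{
    if (data == NULL || datalen == 0) {
        fprintf(stderr, "mysym: called without data\n");
        return 0;
    }
    fprintf(stderr, "mysym: received %zu bytes: %.*s\n",
            datalen, (int)datalen, (const char *)data);
    return (int)datalen;
}
```

Such a file would be built as a shared object, for example with gcc -shared -fPIC, and its return value is what the injector reports through its last output parameter.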

hotpatch_t *hp = hotpatch_create(pid, 1);

hotpatch_t *hotpatch_create(pid_t pid, int verbose)
{
    int rc = 0;
    hotpatch_t *hp = NULL;
    do 
    {
        char filename[OS_MAX_BUFFER];
        if (pid <= 0) 
        {
            LOG_ERROR_INVALID_PID(pid);
            break;
        }
        memset(filename, 0, sizeof(filename));
        /* /proc/pid/exe is a symlink to the target process's executable image */
        snprintf(filename, sizeof(filename), "/proc/%d/exe", pid);
        if (verbose > 3)
        {
            fprintf(stderr, "[%s:%d] Exe symlink for pid %d : %s\n", __func__, __LINE__, pid, filename);
        }     
        hp = malloc(sizeof(*hp));
        if (!hp) 
        {
            LOG_ERROR_OUT_OF_MEMORY;
            rc = -1;
            break;
        }
        memset(hp, 0, sizeof(*hp));
        hp->verbose = verbose;
        hp->pid = pid;
        hp->is64 = HOTPATCH_EXE_IS_NEITHER;
        /* load the symbol table of the target executable */
        hp->exe_symbols = exe_load_symbols(filename, hp->verbose,
                &hp->exe_symbols_num,
                &hp->exe_entry_point,
                &hp->exe_interp,
                &hp->is64);
        if (!hp->exe_symbols) 
        {
            fprintf(stderr, "[%s:%d] Unable to find any symbols in exe.\n", __func__, __LINE__);
            rc = -1;
            break;
        }
        if (hp->exe_entry_point == 0) 
        {
            fprintf(stderr, "[%s:%d] Entry point is 0. Invalid.\n", __func__, __LINE__);
            rc = -1;
            break;
        }
        LOG_INFO_HEADERS_LOADED(verbose);
        /* load the target process's /proc/pid/maps */
        hp->ld_maps = ld_load_maps(hp->pid, hp->verbose, &hp->ld_maps_num);
        if (!hp->ld_maps) 
        {
            fprintf(stderr, "[%s:%d] Unable to load data in /proc/%d/maps.\n", __func__, __LINE__, pid);
            rc = -1;
            break;
        }
        if (verbose > 2)
        {
            fprintf(stderr, "[%s:%d] /proc/%d/maps loaded.\n", __func__, __LINE__, pid);
        }     
        if (hp->exe_symbols && hp->exe_symbols_num > 0) 
        {
            qsort(hp->exe_symbols, hp->exe_symbols_num, sizeof(*hp->exe_symbols), elf_symbol_cmpqsort);
        }
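        /* resolve the addresses of the functions we need from the symbol table */
        if (hotpatch_gather_functions(hp) < 0) 
        {
            fprintf(stderr, "[%s:%d] Unable to find all the functions needed. Cannot proceed.\n", __func__, __LINE__);
            rc = -1;
            break;
        }
    } while (0);
    /* clean up after the loop so that every error path (break) reaches it */
    if (rc < 0 && hp) 
    {
        hotpatch_destroy(hp);
        hp = NULL;
    }
    return hp;
}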

hotpatch_inject_library(hp, "libhotpatchtest.so", "mysym", data, datalen, &result1, &result2);

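/*
1. hp: pointer to the hotpatch_t
2. dll: name of the .so to inject
3. symbol: name of the symbol (function) to call in the .so
4. data: data to pass to that function
5. datalen: length of the data
6. outaddr: receives the value returned by dlopen
7. outres: receives the return value of the called function
*/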
int hotpatch_inject_library(hotpatch_t *hp, const char *dll, const char *symbol, const unsigned char *data, size_t datalen, uintptr_t *outaddr, uintptr_t *outres)
{
    size_t dllsz = 0;
    size_t symsz = 0;
    size_t datasz = 0;
    size_t tgtsz = 0;
    int rc = 0;
    unsigned char *mdata = NULL;
    if (!dll || !hp) 
    {
        fprintf(stderr, "[%s:%d] Invalid arguments.\n", __func__, __LINE__);
        return -1;
    }
    /* check that malloc and dlopen were found in the target process */
    if (!hp->fn_malloc || !hp->fn_dlopen) 
    {
        fprintf(stderr, "[%s:%d] No malloc/dlopen found.\n", __func__, __LINE__);
        return -1;
    }
    /* calculate the size to allocate */
    dllsz = strlen(dll) + 1;
    symsz = symbol ? (strlen(symbol) + 1) : 0;
    datasz = data ? datalen : 0;
    tgtsz = dllsz + symsz + datasz + 32; /* general buffer */
    tgtsz = (tgtsz > 1024) ? tgtsz : 1024;
    /* align the memory */
    tgtsz += (tgtsz % sizeof(void *) == 0) ? 0 : (sizeof(void *) - (tgtsz % sizeof(void *)));
    mdata = calloc(sizeof(unsigned char), tgtsz);
    if (!mdata) 
    {
        LOG_ERROR_OUT_OF_MEMORY;
        return -1;
    }
    memcpy(mdata, dll, dllsz);
    if (symbol) 
    {
        memcpy(mdata + dllsz, symbol, symsz);
    }
    if (data) 
    {
        memcpy(mdata + dllsz + symsz, data, datasz);
    }
    if (hp->verbose > 0)
        fprintf(stderr, "[%s:%d] Allocating "LU" bytes in the target.\n", __func__, __LINE__, tgtsz);
    do 
    {
        /* The stack is read-write and not executable */
        struct user iregs; /* intermediate registers */
        struct user oregs; /* original registers */
        int verbose = hp->verbose;
        uintptr_t result = 0;
        uintptr_t stack[4] = { 0, 0, 0, 0}; /* max arguments of the functions we
                                               are using */
        uintptr_t heapptr = 0;
        int idx = 0;
#undef HP_SETEXECWAITGET
#undef HP_NULLIFYSTACK
#undef HP_PASS_ARGS2FUNC
#define HP_NULLIFYSTACK() \
    do \
    { \
        uintptr_t nullcode = 0; \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Copying Null to stack.\n", __func__, __LINE__); \
        if ((rc = hp_pokedata(hp->pid, HP_REG_SP(iregs), nullcode, verbose)) < 0) \
            break; \
    } while (0)

#define HP_SETEXECWAITGET(fn) \
    do { \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Setting registers and invoking %s.\n", \
                __func__, __LINE__, fn); \
        /* set the registers */ \
        if ((rc = hp_set_regs(hp->pid, &iregs)) < 0) \
            break; \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Executing...\n", __func__, __LINE__); \
        /* let the target run from the instruction pointer that was just set */ \
        if ((rc = hp_exec(hp->pid)) < 0) \
            break; \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Waiting...\n", __func__, __LINE__); \
        if ((rc = hp_wait(hp->pid)) < 0) \
            break; \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Getting registers.\n", __func__, __LINE__); \
        if ((rc = hp_get_regs(hp->pid, &iregs)) < 0) \
            break; \
    } while (0)
#if __WORDSIZE == 64
    #define HP_PASS_ARGS2FUNC(A,FN,ARG1,ARG2) \
    do { \
        A.regs.rsi = ARG2; \
        A.regs.rdi = ARG1; \
        A.regs.rip = FN; \
        A.regs.rax = 0; \
    } while (0)
    #define HP_REDZONE 128
        /* David Yeager pointed this out. http://en.wikipedia.org/wiki/Red_zone_(computing) */
#else /* __WORDSIZE == 64 */
#define HP_PASS_ARGS2FUNC(A,FN,ARG1,ARG2) \
    do { \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Copying Arg 1 to stack.\n", __func__, __LINE__); \
        if ((rc = hp_pokedata(hp->pid, HP_REG_SP(iregs) + sizeof(size_t), ARG1, verbose)) < 0) \
            break; \
        if (verbose > 1) \
            fprintf(stderr, "[%s:%d] Copying Arg 2 to stack.\n", \
                __func__, __LINE__); \
        if ((rc = hp_pokedata(hp->pid, HP_REG_SP(iregs) + 2 * sizeof(size_t), \
                              ARG2, verbose)) < 0) \
            break; \
        A.regs.eip = FN; \
        A.regs.eax = 0; \
    } while (0)
    #define HP_REDZONE 0
#endif /* __WORDSIZE == 64 */
        /* Prepare the child for injection */
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Attaching to PID %d\n", __func__, __LINE__, hp->pid);
        /* attach to the target process */
        if ((rc = hp_attach(hp->pid)) < 0)
            break;
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Waiting...\n", __func__, __LINE__);
        if ((rc = hp_wait(hp->pid)) < 0)
            break;
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Getting original registers.\n", __func__, __LINE__);
        if ((rc = hp_get_regs(hp->pid, &oregs)) < 0)
            break;
        memcpy(&iregs, &oregs, sizeof(oregs));
        HP_REG_SP(iregs) -= HP_REDZONE;
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Copying stack out.\n", __func__, __LINE__);
        for (idx = 0; idx < sizeof(stack)/sizeof(uintptr_t); ++idx) {
            if ((rc = hp_peekdata(hp->pid, HP_REG_SP(iregs) + idx * sizeof(size_t), &stack[idx], verbose)) < 0)
                break;
        }
        if (rc < 0)
            break;
        /* Call malloc */
        HP_NULLIFYSTACK();
        HP_PASS_ARGS2FUNC(iregs, hp->fn_malloc, tgtsz, 0);
        HP_SETEXECWAITGET("malloc");
        result = HP_REG_AX(iregs);
        heapptr = HP_REG_AX(iregs); /* keep a copy of this pointer */
        /* Copy data to the malloced area */
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Copying "LU" bytes to 0x"LX".\n", __func__, __LINE__, tgtsz, result);
        if (!result)
            break;
        /* copy the library name, symbol name and data into the target process */
        if ((rc = hp_copydata(hp->pid, result, mdata, tgtsz, verbose)) < 0)
            break;
        /* Call dlopen */
        HP_NULLIFYSTACK();
        HP_PASS_ARGS2FUNC(iregs, hp->fn_dlopen, result /* value from malloc */, (RTLD_LAZY | RTLD_GLOBAL));
        HP_SETEXECWAITGET("dlopen");
        result = HP_REG_AX(iregs);
        if (verbose > 0)
            fprintf(stderr, "[%s:%d] Dll opened at 0x"LX"\n", __func__, __LINE__, result);
        if (outaddr)
            *outaddr = result;
        /* Call dlsym */
        if (symbol && hp->fn_dlsym && result != 0) 
        {
            HP_NULLIFYSTACK();
            HP_PASS_ARGS2FUNC(iregs, hp->fn_dlsym, result /* value from dlopen */, (heapptr + dllsz));
            HP_SETEXECWAITGET("dlsym");
            result = HP_REG_AX(iregs);
            if (verbose > 0)
                fprintf(stderr, "[%s:%d] Symbol %s found at 0x"LX"\n", __func__, __LINE__, symbol, result);
            if (result != 0) 
            {
                HP_NULLIFYSTACK();
                if (datasz > 0) 
                {
                    HP_PASS_ARGS2FUNC(iregs, result /* value from dlsym */, (heapptr + dllsz + symsz), datasz);
                } 
                else 
                {
                    HP_PASS_ARGS2FUNC(iregs, result /* value from dlsym */, 0, 0);
                }
                HP_SETEXECWAITGET(symbol);
                result = HP_REG_AX(iregs);
                if (verbose > 0)
                    fprintf(stderr, "[%s:%d] Return value from invoking %s(): %p\n", __func__, __LINE__, symbol, (void *)result);
                if (outres)
                    *outres = result;
            } 
            else 
            {
                if (verbose > 0)
                    fprintf(stderr, "[%s:%d] Unable to find %s(). Dll might "
                            "already have been injected earlier.\n",
                            __func__, __LINE__, symbol);
                if (outres)
                    *outres = 0;
            }
        } 
        else 
        {
            if (verbose > 1 && symbol)
                fprintf(stderr, "[%s:%d] %s not invoked as dlsym() wasn't "
                        "found.\n", __func__, __LINE__, symbol);
            else if (verbose > 1)
                fprintf(stderr, "[%s:%d] No symbol was specified. _init() might"
                        " have been invoked.\n", __func__, __LINE__);
            if (outres)
                *outres = 0;
        }
        /* Original reset */
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Setting original registers.\n", __func__, __LINE__);
        if ((rc = hp_set_regs(hp->pid, &oregs)) < 0) 
        {
            fprintf(stderr, "[%s:%d] PID %d will be unstable.\n", __func__, __LINE__, hp->pid);
            break;
        }
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Copying stack back.\n",
                    __func__, __LINE__);
        for (idx = 0; idx < sizeof(stack)/sizeof(uintptr_t); ++idx) 
        {
            if ((rc = hp_pokedata(hp->pid, HP_REG_SP(oregs) - HP_REDZONE + idx * sizeof(size_t), stack[idx], verbose)) < 0)
                break;
        }
        if (rc < 0)
            break;
        if (verbose > 1)
            fprintf(stderr, "[%s:%d] Executing...\n", __func__, __LINE__);
        if ((rc = hp_exec(hp->pid)) < 0)
            break;
    } while (0);
    if (rc < 0) {
        if (hp->verbose > 1)
            fprintf(stderr, "[%s:%d] Detaching from PID %d\n", __func__,
                    __LINE__, hp->pid);
        if (hp_detach(hp->pid) < 0) {
            if (hp->verbose > 0)
            fprintf(stderr, "[%s:%d] Error detaching from PID %d\n", __func__,
                        __LINE__, hp->pid);
            rc = -1;
        }
    }
    if (mdata)
        free(mdata);
    mdata = NULL;
#undef HP_PASS_ARGS2FUNC
#undef HP_SETEXECWAITGET
#undef HP_NULLIFYSTACK
#undef HP_REG_IP_STR
#undef HP_REG_IP
#undef HP_REG_SP
#undef HP_REG_AX
#undef HP_REDZONE
    return rc;
}

Relevant Link:

https://github.com/vikasnkumar/hotpatch
https://linuxtoy.org/archives/hotpatch.html
http://www.selectiveintellect.com/hotpatch.html

6. Code injection through input-handling vulnerabilities (overflow)

Relevant Link:

http://www.vulnhunt.com/products/zhuizong/
http://www.vulnhunt.com/down/%E8%BF%BD%E8%B8%AA%E6%8A%80%E6%9C%AF%E7%99%BD%E7%9A%AE%E4%B9%A6.pdf

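7. Protecting dynamic shared libraries

Given how dynamic shared libraries work, there are two main approaches to protecting them

1. Integrity verification of the library: digitally sign each dynamic shared library and verify the signature when a program loads it
    1) Before signing, add a section named signature to each library's ELF file to hold the signature, initialized to null bytes. To sign, compute a SHA1 digest of the file contents, encrypt the digest with an RSA private key, and store the result in the signature section
    2) Verify the signature. After an executable is loaded into memory, the system loads the dynamic shared libraries it needs one by one, as listed in its dynamic section. Before each library is loaded, its signature section is read and the stored signature is checked. If verification fails, a warning is raised

2. Legitimacy verification of library function calls: once the library signatures have been verified, go a step further and verify that calls to library functions are legitimate by collecting library call information. The complete call information consists of
    1) The address at which the library call occurs
    /*
    Linux uses lazy binding by default, so when a program starts running the GOT does not yet hold the addresses of library functions. Setting the environment variable LD_BIND_NOW makes the dynamic linker resolve all library function addresses and fill in the GOT as soon as the shared libraries have been loaded and dynamic linking is complete
    */
    2) The in-memory address of the library function
    The target of a library call instruction in the program is the address of that function's PLT entry. The target of the first jump instruction in the PLT entry is given by the corresponding GOT entry, which holds the address of the library function
Therefore, before the program runs, first read the PLT of the executable on disk and record the address of each entry. Next, use the target of the first instruction in each PLT entry to obtain the in-memory addresses of all library functions. Finally, scan the program's code and match the targets of call instructions against the PLT entry addresses to find every library call instruction and record the address of each call site. All call-site addresses and their targets are saved in a library call information table, a two-column table of [call site: target address of the called function]. This table serves as the baseline (hash) database: at run time the current state is compared against it to detect whether the process is in an abnormal state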

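0x1: Runtime monitoring and verification of library call information

The monitoring and verification points are placed at the program's function calls, and the call instructions are captured dynamically by hardware
Intel processors support this through the IA32_DEBUGCTL MSR. Its second bit (bit 1) is the BTF flag; when BTF and the TF flag in EFLAGS are both set to 1, the program traps on instructions that branch. After the trap we only need to recognize function call instructions and then, in further verification, identify the legitimate library calls
Verifying library call information means verifying the call-site address and the corresponding target address, by comparing the two against the address pairs stored in the call information table. There are four possible outcomes

1. If the call-site address and the target address match a pair in the table, the call is a legitimate library call
2. If neither the call-site address nor the target address appears in the table, it is an ordinary function call
3. If the call-site address is in the table but the target does not match, the function has been redirected or the calling code has been modified, that is, the process has been hijacked
4. If the call-site address is not in the table but the target address is, it is an illegal library call, that is, an overflow or code injection attack has occurred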
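
The four outcomes amount to a small lookup over the call information table. The sketch below is one possible rendering; the structure, enum and function names are assumptions made for illustration, and a linear scan stands in for whatever index a real monitor would use.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the call information table: [call site : target address]. */
struct call_entry {
    uintptr_t call_site;
    uintptr_t target;
};

enum call_verdict {
    CALL_LEGIT_LIBRARY,    /* outcome 1: the pair is in the table */
    CALL_ORDINARY,         /* outcome 2: neither address is in the table */
    CALL_REDIRECTED,       /* outcome 3: known call site, unexpected target */
    CALL_ILLEGAL_LIBRARY   /* outcome 4: unknown call site, known target */
};

enum call_verdict classify_call(const struct call_entry *table, size_t n,
                                uintptr_t call_site, uintptr_t target)
{
    int site_known = 0, target_known = 0;

    for (size_t i = 0; i < n; i++) {
        if (table[i].call_site == call_site) {
            if (table[i].target == target)
                return CALL_LEGIT_LIBRARY;
            site_known = 1;
        }
        if (table[i].target == target)
            target_known = 1;
    }
    if (site_known)
        return CALL_REDIRECTED;
    if (target_known)
        return CALL_ILLEGAL_LIBRARY;
    return CALL_ORDINARY;
}
```

The call-site check takes priority: a known call site whose target does not match is reported as redirected even if that target belongs to another entry in the table.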

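0x2: Implementation

1. Signature verification of dynamic shared libraries is done by a kernel module, which hooks the loading of dynamic shared libraries through the interfaces provided by LSM (Linux Security Modules)
2. Runtime monitoring and verification of library call information is done by a designated user-space monitoring program that runs periodically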

Relevant Link:

file:///C:/Users/zhenghan.zh/Downloads/%E5%8A%A8%E6%80%81%E5%85%B1%E4%BA%AB%E5%BA%93%E4%BF%9D%E6%8A%A4%E6%96%B9%E6%B3%95%E7%A0%94%E7%A9%B6.pdf

================ End