Mach-O Notes

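Mach-O is the executable file format used on macOS. A Mac has many kinds of runnable files: packaged shell scripts, executables compiled from C or C++, and binaries built with development tools such as Xcode. An executable with no accompanying resources (no bundle, no Info.plist) can be run directly by double-clicking it, because it is built as a Command Line Tool, which packs everything it needs apart from system libraries into the one executable. An application needs some preparation first: the system reads the bundle's Info.plist and initializes the sandbox before running it.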

Inspecting the structure of a Mach-O file

  • Run file on the binary (here a file named mach-o) to see its CPU architecture

    Mach-O 64-bit executable x86_64
    
  • The otool utility that ships with Xcode prints the detailed contents of a Mach-O file; its options are:

    -f print the fat headers
    -a print the archive header
    -h print the mach header
    -l print the load commands
    -L print shared libraries used
    -D print shared library id name
    -t print the text section (disassemble with -v)
    -x print all text sections (disassemble with -v)
    -p <routine name>  start dissassemble from routine name
    -s <segname> <sectname> print contents of section
    -d print the data section
    -o print the Objective-C segment
    -r print the relocation entries
    -S print the table of contents of a library (obsolete)
    -T print the table of contents of a dynamic shared library (obsolete)
    -M print the module table of a dynamic shared library (obsolete)
    -R print the reference table of a dynamic shared library (obsolete)
    -I print the indirect symbol table
    -H print the two-level hints table (obsolete)
    -G print the data in code table
    -v print verbosely (symbolically) when possible
    -V print disassembled operands symbolically
    -c print argument strings of a core file
    -X print no leading addresses or headers
    -m don't use archive(member) syntax
    -B force Thumb disassembly (ARM objects only)
    -q use llvm's disassembler (the default)
    -Q use otool(1)'s disassembler
    -mcpu=arg use `arg' as the cpu for disassembly
    -j print opcode bytes
    -P print the info plist section as strings
    -C print linker optimization hints
  • MachOView shows the same information in a graphical interface

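  • The official documentation gives a layered diagram of the format (header, load commands, then data), and the tool output confirms that layout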
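
The architecture check that file performs starts from the first four bytes of the binary. Below is a minimal sketch of that idea in Python; classify_magic is a hypothetical helper written for these notes, not part of any tool, and it only knows the three magic values listed in the table.

```python
import struct

# Magic values from <mach-o/loader.h> and <mach-o/fat.h>.
MAGICS = {
    0xFEEDFACE: "thin 32-bit Mach-O",
    0xFEEDFACF: "thin 64-bit Mach-O",
    0xCAFEBABE: "fat (universal) binary",
}


def classify_magic(first4: bytes) -> str:
    """Describe a file from its first four bytes (hypothetical helper)."""
    if len(first4) < 4:
        return "too short"
    # Try both byte orders: thin files use the target CPU's order,
    # while fat headers are always stored big-endian.
    for fmt, order in (("<I", "little-endian"), (">I", "big-endian")):
        (value,) = struct.unpack(fmt, first4[:4])
        if value in MAGICS:
            return f"{MAGICS[value]}, {order}"
    return "not Mach-O"
```

Typical use would be classify_magic(open(path, "rb").read(4)). Java class files also begin with 0xCAFEBABE, so a real tool has to look past the magic before calling something a fat binary.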

Mach-O structure: Header

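  • Every part shown in the visual tools above has a concrete definition in code
  • The header definition and its graphical view: it mainly describes the CPU architecture, the file type, and the load commands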
struct mach_header_64 {
    uint32_t    magic;      // magic number: marks the file as 64-bit Mach-O and reveals its byte order
    cpu_type_t  cputype;    // CPU type, e.g. x86_64
    cpu_subtype_t   cpusubtype; // CPU subtype
    uint32_t    filetype;   // file type, e.g. MH_EXECUTE, MH_FVMLIB, MH_DYLIB, MH_DYLINKER, MH_BUNDLE
    uint32_t    ncmds;      // number of load commands
    uint32_t    sizeofcmds; // total size of all load commands, in bytes
    uint32_t    flags;      // flags
    uint32_t    reserved;   // reserved
};
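
To make the header layout concrete, here is a rough Python sketch that decodes the eight fields with the standard struct module. It assumes a thin, little-endian 64-bit file (a fat binary would need its fat header handled first). parse_mach_header_64 is a name made up for this example; the constants are the values from the system headers.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
CPU_TYPE_X86_64 = 0x01000007  # CPU_TYPE_X86 | CPU_ARCH_ABI64
CPU_TYPE_ARM64 = 0x0100000C   # CPU_TYPE_ARM | CPU_ARCH_ABI64
MH_EXECUTE = 0x2

# Eight 32-bit fields, little-endian; cputype/cpusubtype are signed ints.
HEADER_FMT = "<IiiIIIII"
FIELDS = ("magic", "cputype", "cpusubtype", "filetype",
          "ncmds", "sizeofcmds", "flags", "reserved")


def parse_mach_header_64(data: bytes) -> dict:
    """Decode a mach_header_64 from the start of `data` (hypothetical helper)."""
    header = dict(zip(FIELDS, struct.unpack_from(HEADER_FMT, data, 0)))
    if header["magic"] != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return header
```

The header is 32 bytes long, so the load commands start at offset 32 and occupy the next sizeofcmds bytes.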

Mach-O structure: Load commands (names begin with LC_; one Mach-O file has many different types of load command)

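  • Load commands record the steps needed to load the program at run time. dyld itself is loaded first, because the other libraries depend on it for rebasing and binding. Every load command starts with the same two fields:

    struct load_command {
        uint32_t cmd;     // type of load command
        uint32_t cmdsize; // total size of the command, in bytes
    };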

  • Load commands come in many types, each with its own data structure
#define LC_SEGMENT  0x1 /* segment of this file to be mapped */
#define LC_SYMTAB   0x2 /* link-edit stab symbol table info */
#define LC_SYMSEG   0x3 /* link-edit gdb symbol table info (obsolete) */
#define LC_THREAD   0x4 /* thread */
#define LC_UNIXTHREAD   0x5 /* unix thread (includes a stack) */
#define LC_LOADFVMLIB   0x6 /* load a specified fixed VM shared library */
#define LC_IDFVMLIB 0x7 /* fixed VM shared library identification */
#define LC_IDENT    0x8 /* object identification info (obsolete) */
#define LC_FVMFILE  0x9 /* fixed VM file inclusion (internal use) */
#define LC_PREPAGE      0xa     /* prepage command (internal use) */
#define LC_DYSYMTAB 0xb /* dynamic link-edit symbol table info */
#define LC_LOAD_DYLIB   0xc /* load a dynamically linked shared library */
#define LC_ID_DYLIB 0xd /* dynamically linked shared lib ident */
#define LC_LOAD_DYLINKER 0xe    /* load a dynamic linker */
#define LC_ID_DYLINKER  0xf /* dynamic linker identification */
#define LC_PREBOUND_DYLIB 0x10  /* modules prebound for a dynamically */
...
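
Because each command begins with cmd and cmdsize, the whole list can be walked without understanding every command type. The following is a small sketch under the assumption of a thin little-endian 64-bit file; iter_load_commands and the LC_NAMES table are illustrative names, and LC_SEGMENT_64 (0x19) comes from the system header rather than from the excerpt above.

```python
import struct

# A few command values; unknown ones are reported as hex.
LC_NAMES = {
    0x2: "LC_SYMTAB",
    0xB: "LC_DYSYMTAB",
    0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER",
    0x19: "LC_SEGMENT_64",
}

MACH_HEADER_64_SIZE = 32  # load commands follow the 32-byte header


def iter_load_commands(data: bytes, ncmds: int, start: int = MACH_HEADER_64_SIZE):
    """Yield (offset, name, cmdsize) for each load command."""
    offset = start
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            # Smaller than the two fixed fields: the file is damaged.
            raise ValueError(f"corrupt load command at offset {offset}")
        yield offset, LC_NAMES.get(cmd, hex(cmd)), cmdsize
        offset += cmdsize  # cmdsize is the stride to the next command
```

The ncmds argument would come from the Mach-O header; the output is roughly the skeleton of what otool -l prints.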

Mach-O structure: segment_command

  • The segment load command defines how each segment in the Data region is mapped into memory; most of an app's content lives in these segments.

    struct segment_command_64 { /* for 64-bit architectures */
    uint32_t    cmd;        /* LC_SEGMENT_64 */
    uint32_t cmdsize; /* includes sizeof section_64 structs */
    char segname[16]; /* segment name */
    uint64_t vmaddr; /* memory address of this segment */
    uint64_t vmsize; /* memory size of this segment */
    uint64_t fileoff; /* file offset of this segment */
    uint64_t filesize; /* amount to map from the file */
    vm_prot_t maxprot; /* maximum VM protection */
    vm_prot_t initprot; /* initial VM protection */
    uint32_t nsects; /* number of sections in segment */
    uint32_t flags; /* flags */
    };
  • Diagram

  • Segment types

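#define SEG_PAGEZERO    "__PAGEZERO" // region with no access permissions at the start of the address space; catches NULL pointer dereferences

#define SEG_TEXT    "__TEXT"    // read-only code segment
#define SEG_DATA    "__DATA"    // data segment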
#define SECT_BSS    "__bss"     /* the real uninitialized data section*/
                    /* no padding */
#define SECT_COMMON "__common"  /* the section common symbols are */
                    /* allocated in by the link editor */

#define SEG_OBJC    "__OBJC"    /* objective-C runtime segment */
#define SECT_OBJC_SYMBOLS "__symbol_table"  /* symbol table */
#define SECT_OBJC_MODULES "__module_info"   /* module information */
#define SECT_OBJC_STRINGS "__selector_strs" /* string table */
#define SECT_OBJC_REFS "__selector_refs"    /* string table */

#define SEG_ICON     "__ICON"   /* the icon segment */
#define SECT_ICON_HEADER "__header" /* the icon headers */
#define SECT_ICON_TIFF   "__tiff"   /* the icons in tiff format */

#define SEG_LINKEDIT    "__LINKEDIT"    /* the segment containing all structs */
                    /* created and maintained by the link */
                    /* editor.  Created with -seglinkedit */
                    /* option to ld(1) for MH_EXECUTE and */
                    /* FVMLIB file types only */

#define SEG_UNIXSTACK   "__UNIXSTACK"   /* the unix stack segment */

#define SEG_IMPORT  "__IMPORT"  /* the segment for the self (dyld) */
                    /* modifing code stubs that has read, */
                    /* write and execute permissions */
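
As an illustration of how the segment_command_64 fields are read, here is a hedged Python sketch. It assumes little-endian data; parse_segment_64 and prot_string are hypothetical helpers, and the protection bits 1, 2, 4 are the VM_PROT_READ, VM_PROT_WRITE and VM_PROT_EXECUTE values.

```python
import struct

LC_SEGMENT_64 = 0x19
# cmd, cmdsize, segname[16], vmaddr, vmsize, fileoff, filesize,
# maxprot, initprot, nsects, flags: 72 bytes in total.
SEGMENT_64_FMT = "<II16sQQQQiiII"


def prot_string(prot: int) -> str:
    """Render VM protection bits as text, e.g. 5 -> 'r-x'."""
    return "".join(ch if prot & bit else "-"
                   for ch, bit in (("r", 1), ("w", 2), ("x", 4)))


def parse_segment_64(data: bytes, offset: int = 0) -> dict:
    """Decode one segment_command_64 located at `offset`."""
    (cmd, cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
     maxprot, initprot, nsects, flags) = struct.unpack_from(SEGMENT_64_FMT, data, offset)
    if cmd != LC_SEGMENT_64:
        raise ValueError("not an LC_SEGMENT_64 command")
    return {
        # segname is NUL-padded to 16 bytes
        "segname": segname.split(b"\0", 1)[0].decode("ascii"),
        "vmaddr": vmaddr, "vmsize": vmsize,
        "fileoff": fileoff, "filesize": filesize,
        "maxprot": prot_string(maxprot), "initprot": prot_string(initprot),
        "nsects": nsects,
    }
```

A __PAGEZERO segment decodes with initprot "---" and filesize 0, which shows that it reserves address space without taking any bytes from the file.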

Mach-O structure: Section

struct section_64 { /* for 64-bit architectures */
    char        sectname[16];   /* name of this section */
    char        segname[16];    /* segment this section goes in */
    uint64_t    addr;       /* memory address of this section */
    uint64_t    size;       /* size in bytes of this section */
    uint32_t    offset;     /* file offset of this section (physical offset in the file) */
    uint32_t    align;      /* section alignment (power of 2) */
    uint32_t    reloff;     /* file offset of relocation entries */
    uint32_t    nreloc;     /* number of relocation entries */
    uint32_t    flags;      /* flags (section type and attributes)*/
    uint32_t    reserved1;  /* reserved (for offset or index) */
    uint32_t    reserved2;  /* reserved (for count or sizeof) */
    uint32_t    reserved3;  /* reserved */
};
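
The section_64 headers sit directly after their segment command, nsects of them in a row, 80 bytes each. A minimal Python sketch of reading them follows, assuming little-endian data; parse_sections_64 is a hypothetical helper, and first_offset is expected to be the segment command's offset plus 72.

```python
import struct

# sectname[16], segname[16], addr, size, then eight 32-bit fields.
SECTION_64_FMT = "<16s16sQQ8I"
SECTION_64_SIZE = struct.calcsize(SECTION_64_FMT)  # 80 bytes


def _name(raw: bytes) -> str:
    # Names are NUL-padded; a full 16-character name has no terminator.
    return raw.split(b"\0", 1)[0].decode("ascii")


def parse_sections_64(data: bytes, first_offset: int, nsects: int) -> list:
    """Decode `nsects` consecutive section_64 headers."""
    sections = []
    for i in range(nsects):
        (sectname, segname, addr, size, offset, align, reloff, nreloc,
         flags, _r1, _r2, _r3) = struct.unpack_from(
            SECTION_64_FMT, data, first_offset + i * SECTION_64_SIZE)
        sections.append({
            "name": f"{_name(segname)},{_name(sectname)}",
            "addr": addr,
            "size": size,
            "offset": offset,
            "alignment": 1 << align,  # align is stored as a power of 2
        })
    return sections
```

The "segname,sectname" form matches the way otool -s expects a section to be named.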

Mach-O structure: LC_LOAD_DYLIB

struct dylib {
    union lc_str  name;         /* library's path name */
    uint32_t timestamp;         /* library's build time stamp */
    uint32_t current_version;       /* library's current version number */
    uint32_t compatibility_version; /* library's compatibility vers number*/
};
struct dylib_command {
    uint32_t    cmd;        /* LC_ID_DYLIB, LC_LOAD_{,WEAK_}DYLIB,
                       LC_REEXPORT_DYLIB */
    uint32_t    cmdsize;    /* includes pathname string */
    struct dylib    dylib;      /* the library identification */
};
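
The name field is an lc_str, which in a 64-bit file is just a 32-bit offset from the start of the command to a NUL-terminated path stored inside the same command. The sketch below (little-endian assumed, helper names invented for illustration) pulls out the path and decodes the version numbers, which are packed as X.Y.Z in 16, 8 and 8 bits.

```python
import struct

# cmd, cmdsize, name.offset, timestamp, current_version, compatibility_version
DYLIB_CMD_FMT = "<6I"


def version_string(v: int) -> str:
    """Decode a packed X.Y.Z version (16 bits, 8 bits, 8 bits)."""
    return f"{v >> 16}.{(v >> 8) & 0xFF}.{v & 0xFF}"


def parse_dylib_command(data: bytes, offset: int = 0) -> dict:
    """Decode one dylib_command located at `offset`."""
    cmd, cmdsize, name_off, timestamp, current, compat = struct.unpack_from(
        DYLIB_CMD_FMT, data, offset)
    # The path lives between name_off and the end of the command,
    # padded with NUL bytes up to cmdsize.
    raw_name = data[offset + name_off: offset + cmdsize]
    return {
        "name": raw_name.split(b"\0", 1)[0].decode("utf-8"),
        "timestamp": timestamp,
        "current_version": version_string(current),
        "compatibility_version": version_string(compat),
    }
```

Running this over every LC_LOAD_DYLIB command would give roughly the list that otool -L prints.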

Other common load commands

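  • dyld rebase and binding information; a Mach-O file can have only one copy

  • The dyld load command (LC_LOAD_DYLINKER), the entry point for loading the program; there can be only one

  • The code signature

  • Custom functions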

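Section(__TEXT,__cstring)

  • Holds the program's C string literals, such as the format strings passed to printf

Section(__DATA_CONST,__got)

  • Holds the redirected pointers: entries for external symbols that dyld fills in (binds) at load time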

References

https://blog.csdn.net/bjtufang/article/details/50628310

Original post: https://www.cnblogs.com/wwoo/p/macho-bi-ji.html