Proj EULibHarn Paper Reading: IntelliGen: Automatic Driver Synthesis for Fuzz Testing

Abstract

This paper: IntelliGen
Goal: generate fuzz drivers
Steps:

Identify entry functions and evaluate their weights
Generate the driver through hierarchical parameter replacement and type inference
Evaluation:
Targets: Android Open-Source Project; Google's fuzzer-test-suite; other industrial partners
Competing tools: FUDGE, FuzzGen
Results:
On average, 1.08×-2.03× more basic blocks and 1.36×-2.06× more paths covered
Found more than 10 bugs

P4: The paper identifies two challenges in generating drivers: 1. finding high-value entry functions (code coverage + memcpy); 2. how to synthesize the driver
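
A minimal sketch of the first challenge, ranking candidate entry functions. It assumes each function has already been summarized by two hypothetical counts: reachable basic blocks (a proxy for coverage) and memcpy-like memory operations. The FunctionSummary fields and the linear weights are illustrative assumptions, not the formula used in the paper.

```python
from dataclasses import dataclass


@dataclass
class FunctionSummary:
    """Hypothetical per-function statistics gathered by a static pass."""
    name: str
    reachable_blocks: int   # proxy for potential code coverage
    memory_ops: int         # memcpy-like calls and pointer dereferences


def score(fn: FunctionSummary, w_cov: float = 1.0, w_mem: float = 5.0) -> float:
    # Assumed linear weighting; the real metric may differ.
    return w_cov * fn.reachable_blocks + w_mem * fn.memory_ops


def rank_entry_functions(fns, top_k=3):
    """Return the names of the top_k highest-scoring candidates."""
    ranked = sorted(fns, key=score, reverse=True)
    return [fn.name for fn in ranked[:top_k]]


candidates = [
    FunctionSummary("parse_header", reachable_blocks=40, memory_ops=2),
    FunctionSummary("decode_frame", reachable_blocks=120, memory_ops=9),
    FunctionSummary("get_version", reachable_blocks=3, memory_ops=0),
]
print(rank_entry_functions(candidates, top_k=2))
```

With these made-up numbers, a function that touches a lot of memory outranks a trivial getter even before any fuzzing is run, which is the intuition behind weighting entry functions.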

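FUDGE: generates a large number of candidate drivers and leaves the user to pick among them (but the account given here, including the description of signatures in the related work and the claim that candidates are handed directly to the user, seems highly questionable)
FuzzGen: requires high-quality test cases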

A. Entry Function Locator
The paper says a good driver bypasses many checks and reaches deep logic directly?
The three kinds of functions emphasized here are:

B. Fuzz Driver Synthesizer

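Must correctly convert the input generated by the fuzz engine into arguments
Highlights:
Challenge: for pointer-typed parameters, the concrete expected type is often unknown until the pointer is dereferenced
This paper: assign a value to the pointed-to memory only at its first use (lazy store)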
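
The lazy store idea can be illustrated with a toy memory model. This is only a sketch under assumptions: LazyMemory, its zero padding, and the little-endian decoding are hypothetical choices for illustration, and the actual tool works on LLVM IR rather than Python objects.

```python
class LazyMemory:
    """Toy model of a pointer argument whose pointee type is unknown up front."""

    def __init__(self, fuzz_bytes: bytes):
        self._input = fuzz_bytes
        self._cursor = 0          # index of the next unread fuzz byte
        self._cells = {}          # offset -> stored byte (the "occupied" set)

    def _next_byte(self) -> int:
        # Pad with zeros once the fuzz input is exhausted (assumed policy).
        if self._cursor < len(self._input):
            b = self._input[self._cursor]
            self._cursor += 1
            return b
        return 0

    def load(self, offset: int, width: int) -> int:
        """Read `width` bytes at `offset`, storing fuzz data on first use only."""
        for i in range(offset, offset + width):
            if i not in self._cells:          # skip cells already occupied
                self._cells[i] = self._next_byte()
        raw = bytes(self._cells[i] for i in range(offset, offset + width))
        return int.from_bytes(raw, "little")


mem = LazyMemory(b"\x01\x02\x03\x04\x05\x06")
first = mem.load(0, 4)    # first use: consumes 4 fuzz bytes
again = mem.load(0, 4)    # region already occupied: no new bytes consumed
tail = mem.load(2, 4)     # overlap: only offsets 4 and 5 are newly filled
```

The access width is known only at the load, so deferring the store until then avoids guessing the pointee type in advance, and tracking occupied cells keeps a second access from overwriting earlier data.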

LLVM, LLVMFuzzerTestOneInput(): generate an IR-level driver + library code -> fuzzer
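
For orientation, the shape of a driver body can be sketched as a Python analogue of LLVMFuzzerTestOneInput(): raw bytes from the fuzz engine are split into typed arguments and passed to an entry function. Here target_entry and the two-uint16 header layout are hypothetical; a real IntelliGen driver is emitted at the IR level and linked with the library.

```python
import struct


def target_entry(width: int, height: int, payload: bytes) -> int:
    """Hypothetical library entry function standing in for the code under test."""
    return width * height + len(payload)


def fuzz_one_input(data: bytes) -> int:
    """Python analogue of a driver body: split raw bytes into typed arguments."""
    header = struct.Struct("<HH")          # two little-endian uint16 values
    if len(data) < header.size:
        return 0                           # too short to build the arguments
    width, height = header.unpack_from(data, 0)
    payload = data[header.size:]           # remaining bytes become the buffer
    return target_entry(width, height, payload)
```

For example, fuzz_one_input(b"\x02\x00\x03\x00abc") decodes width 2, height 3, and a 3-byte payload before calling the entry function.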

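Choosing effective entry functions: users have to find suitable entry functions themselves
Avoiding redundant memory assignments: lazy store only to regions that have not been marked as occupied
Synthesizing complex arguments for the entry function
Assigning appropriate values for arguments: IntelliGen scans the function's IR for comparison instructions
Filtering out useless drivers: run each driver automatically and discard it if it crashes. In addition, to avoid memory leaks, IntelliGen hooks all memory allocations and deallocations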
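
The comparison-instruction scan mentioned in the list above can be approximated on textual LLVM IR. The following is a rough sketch, assuming a regex over integer icmp instructions is good enough for illustration; comparison_constants and the sample IR are made up here, and a real implementation would walk the IR through the LLVM API instead.

```python
import re

# Matches integer comparisons such as: %c = icmp eq i32 %x, 42
ICMP_RE = re.compile(
    r"icmp\s+(?P<pred>\w+)\s+i(?P<bits>\d+)\s+(?P<lhs>\S+),\s+(?P<rhs>\S+)"
)


def comparison_constants(ir_text: str):
    """Collect (bit width, constant) pairs that the IR compares against."""
    found = []
    for m in ICMP_RE.finditer(ir_text):
        for operand in (m.group("lhs"), m.group("rhs")):
            if re.fullmatch(r"-?\d+", operand):   # keep literal integers only
                found.append((int(m.group("bits")), int(operand)))
    return found


sample_ir = """
define i32 @check(i32 %mode, i8 %flag) {
entry:
  %a = icmp eq i32 %mode, 42
  %b = icmp sgt i8 %flag, -1
  %c = icmp ult i32 %mode, %mode
  ret i32 0
}
"""
print(comparison_constants(sample_ir))
```

Constants recovered this way (42 and -1 in the sample) are candidates for argument values that satisfy the corresponding checks, so the fuzzer does not have to discover them by chance.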

Datasets:

Metrics: basic block coverage, path coverage
Collected with llvm-cov

Hardware environment: each driver is run 10 times, each run uses 4 threads for 6 hours; the tables report the averages.
Tools compared: IntelliGen, FuzzGen

Question: We do not list the bugs found in these libraries since these do not have a standard bug list.

Writing a fuzz driver manually seriously hinders the efficiency of fuzz testing.
The quality of fuzz drivers will drastically impact the performance of fuzz testing.
The performance of driver synthesis can be improved with more domain knowledge.
The criteria for identifying effective entry functions are largely undetermined.