Proj THUDBFuzz Paper Reading: Generating Highly-structured Input Data by Combining Search-based Testing and Grammar-based Fuzzing

Abstract

文章认为: search-based testing does not generate highly-structured data; grammar-based fuzzing does not generate test case structures
本文: G-EVOSUITE
实验: an empirical study
对象: 3个Json parser的20个java classes
效果: json相关的类branch coverage + 15%，其他的至少负面影响不太大

1. Intro

文章认为: 对于grammar-based fuzzing，用户不但需要指定entry points，还需要人工创建为每个method都要test structure
G-EVOSUITE用演化算法确定句子顺序，以此保证test case structure，而使用grammar-based fuzzing保证input data符合语法。
Evolutionary algorithms create and evolve the test case structure（statement sequence) while grammar-based fuzzing is used to evolve parts of the input data.
Fuzzer令演化算法的初始化population为structured JSON inputs并让变异出的数据结构保持JSON结构
The fuzzer injects structured JSON inputs in the initial population of EA with some prob and manipulates this data to maintain a well-formed JSON structure.
作者过去曾经实现了一个前沿的JAVA test case generator-EVOSUITE。
目前的实验细节:3个JSON parser库，20个类，其中16个需要JSON input，4个non-JSON。16个JSON input用于验证本文的方法增加了code coverage，4个non-JSON用于验证没有影响到EVOSUITE在非json输入类上面的性能。
对每一组，本文分别用了60s, 120s, 180s搜索时间来做对比。
结果: 更高的code coverage

Search-based Test Case Generation

已有的方法常依赖于：

test adequacy criteria: 常用来设置搜索目标。例如: application level，branch distance等
演化算法: 例如: EVOSUITE，是前沿的演化算法test case工具，内置多种遗传算法，包括monotonic genetic alogo, local solvers和many objective algos

这里为了支持同时优化多个目标，本文选择了DynaMOSA

DynaMOSA

将test case本身视为染色体，那么，每个语句都是一个gene。算法采用single point crossover的方式，通过重组语句来生成新的test case，并使用uniform mutation进一步修改子代

Grammar-based fuzzing

开发者往往需要指定application entry point，而且不同的entry point可能对应着不同的语法。
这些软件的目标往往是: allopw satisfying/cover different program conditions
常用策略包括:

符号执行
元信息启发
语法
hybrid

Reasons for combining

开发者往往需要指定application entry point，而且不同的entry point可能对应着不同的语法。

3. Approach

G-EVOSUITE

以EVOSUITE作为test case generator

在此基础上加了一个json fuzzer
稍微修改了DynaMOSA算法

以SNODGE为grammer fuzzer

Initialization

创建初代池

目的: 激活目标class的不同方法
步骤：
1. 随机挑选若干method calls
2. 创建对应的objects
3. 填入参数
- 随机
- 从常量池中抽取
本文的改进
1. 需要json参数的地方能传入json参数

Selection

用perference sorting algo结合多项指标(line, branch, exception和weak mutation)为test case打分，最好的是tournament selection。test case根据是否打开了新branch，会有机会复制自己传到下一代。

Grammar-based Mutation

使得test case不会快速困在局部最优点

uniform mutation
add new <k,v>
add new objs
remove <k,v>
modify <k,v>
reording <k,v>
检查测试用例T中所有input data, 确定string inputs，使用JSON parser重新验证是否有效

4. Empirical Study

主要回答两个问题

To what extend does grammer-based fuzzing improve the effectiveness of test case generation in EVOSUITE?
What is the effectiveness of combining grammar-based fuzzing and search-based testing over different search budgets?