Data-flow testing (DFT) checks the correctness of variable definitions by observing their corresponding uses. It has been empirically proved to be more effective than control-flow testing in fault detection, however, its complexities still overwhelm the testers in practice. To tackle this problem, we introduce a hybrid testing framework: (1) The core of our framework is symbolic execution, enhanced by a novel guided path search to improve testing performance; and (2) we systematically cast DFT as reachability checking in software model checking to complement SE, yielding practical DFT that combines the two techniques’ strengths. We implemented our framework on the state-of-the-art symbolic execution tool KLEE and model checking tools BLAST, CPAchecker and CBMC, and extensively evaluate it on 30 real-world subjects with collectively 22,793 def-use pairs. The enhanced SE approach not only covers more pairs, but also reduces testing time by 10∼43%. The model checking approach can effectively weed out infeasible pairs that KLEE cannot infer by 70.1∼95.8%. Finally, for all subjects, our hybrid approach can improve data-flow coverage by 28.7∼46.3%, and reduce testing time by up to 64.6% than the symbolic execution approach alone. This hybrid approach also enables the cross-checking of each component for reliable and robust testing results.
Our approach was built on four different tools, which are summarized as follows.
Our version of KLEE can be downloaded, which is adapted for data-flow testing.
We carefully constructed a repository of benchmark subjects for evaluating data-flow testing techniques. It includes subjects from three sources:
CBMC uses --unwind and --depth, respectively, to limit the number of times loops to be unwound and the number of program steps to be processed. The parameters used in our evaluation are listed below.
|Name||CBMC's --unwind||CBMC's --depth|
In summary, the cut point guided search strategy outperforms the existing state-of-the-arts. It covers more def-use pairs, and at the same time reduces the testing time by 10~43% and the number of explored paths by 13~69%. Therefore, the symbolic execution-based approach, enhanced with the cut point guided search strategy, is effective for data-flow testing.
In summary, model checking based approach is effective for data-flow testing. BLAST, CPAchecker and CBMC can give consistent conclusions on the majority of def-use pairs. CPAchecker and CBMC have the best performance in number of detected feasible and infeasible pairs, and in general CPAchecker is much faster than CBMC. Due to the algorithm differences, the BMC-based approach, i.e., CBMC, is more effective on small-scale subjects, while the CEGAR-based approach, i.e., BLAST and CPAchecker, is more effective on large and complex subjects.
In summary, the hybrid approach, which combines symbolic execution and model checking, is more effective than individual approach alone. The model checking approach can weed out infeasible pairs that the symbolic execution approach cannot tell by 70.1%~95.8%. Compared with KLEE, the hybrid approach can improve data-flow coverage by 28.7~46.3%. In particular, KLEE+CPAchecker achieves the best performance, which reduces testing time by 64.6% for all 30 subjects, and at the same time improves data-flow coverage by 37.1%.
For more information, please feel free to contact us: Ting Su, Chengyu Zhang, Yichen Yan.
Last modified: 2018.3