Mobile apps have become ubiquitous, and their number has increased drastically in recent years. However, guaranteeing their quality is challenging. First, they are event-centric programs with rich graphical user interfaces (GUIs) that interact with complex environments (e.g., users, devices, and other apps). Second, they are typically developed under time-to-market pressure and thus may be inadequately tested before release. When testing, developers tend to exercise the functionalities or usage scenarios they believe to be important, and may miss bugs that the tests they design fail to expose.
Stoat (STOchastic model App Tester) is a novel guided approach to stochastic model-based testing of Android apps. The idea is to thoroughly test an app's functionalities from its GUI model, and to validate the app's behavior by enforcing various user/system interactions.
Stoat tests an app in two phases: (1) it constructs a stochastic app model (in the form of a stochastic finite state machine); and (2) it iteratively mutates/refines the stochastic model by perturbing the probability values of the model transitions, guiding test generation toward high code and model coverage as well as diverse event sequences.
1. Model Construction: Stoat uses a dynamic analysis technique, enhanced by a weighted UI exploration strategy and static analysis, to effectively and efficiently construct app models.
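The two-phase idea can be illustrated with a minimal sketch. It is not Stoat's implementation: the toy model, the state and event names, the perturbation step, and the acceptance rule (keep a mutated model when the transition coverage of a sampled test suite does not drop) are all assumptions made for illustration.

```python
import random

def make_model():
    """Hypothetical stochastic FSM: state -> {event: (next_state, probability)}."""
    return {
        "Main": {"click_settings": ("Settings", 0.5), "click_about": ("About", 0.5)},
        "Settings": {"toggle_option": ("Settings", 0.5), "back": ("Main", 0.5)},
        "About": {"back": ("Main", 1.0)},
    }

def sample_sequence(model, start, length, rng):
    """Generate one test as a list of (state, event) pairs drawn from the model."""
    state, events = start, []
    for _ in range(length):
        transitions = model[state]
        names = list(transitions)
        weights = [transitions[name][1] for name in names]
        event = rng.choices(names, weights=weights, k=1)[0]
        events.append((state, event))
        state = transitions[event][0]
    return events

def mutate(model, rng, step=0.2):
    """Perturb the transition probabilities of one random state, then renormalize."""
    new_model = {state: dict(trans) for state, trans in model.items()}
    state = rng.choice(sorted(new_model))
    perturbed = {
        event: (nxt, max(1e-3, prob + rng.uniform(-step, step)))
        for event, (nxt, prob) in new_model[state].items()
    }
    total = sum(prob for _, prob in perturbed.values())
    new_model[state] = {e: (nxt, prob / total) for e, (nxt, prob) in perturbed.items()}
    return new_model

def transition_coverage(model, suite):
    """Fraction of model transitions exercised by a suite of sampled tests."""
    all_transitions = {(s, e) for s, trans in model.items() for e in trans}
    covered = {pair for seq in suite for pair in seq}
    return len(covered & all_transitions) / len(all_transitions)

def guided_search(model, iterations=20, suite_size=5, seq_len=6, seed=0):
    """Assumed acceptance rule: keep a mutant if sampled coverage does not drop."""
    rng = random.Random(seed)

    def score(candidate):
        suite = [sample_sequence(candidate, "Main", seq_len, rng) for _ in range(suite_size)]
        return transition_coverage(candidate, suite)

    best, best_score = model, score(model)
    for _ in range(iterations):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        if candidate_score >= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score
```

Calling `guided_search(make_model())` returns the last accepted model and its sampled transition coverage. A real tester would execute each sampled sequence on a device and score it by measured code coverage and event diversity rather than by model coverage alone.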
System-level events supported by Stoat.
Download the Stoat tool and user manual here.
Stoat has been compared with state-of-the-art tools.
Tool Name | Approach | Description |
MobiGUITAR | model-based GUI exploration | MobiGUITAR is an extension of AndroidRipper that implements both a systematic and a random exploration strategy for constructing models. |
PUMA | model-based GUI exploration | PUMA uses a generic UI automator to sequentially explore GUIs, and stops exploring when all app states have been visited. |
Tool Name | Approach | Description |
Monkey | random fuzzing | Monkey emits a stream of random input events, including both UI and system events, to maximize code coverage. |
A3E | systematic UI exploration | A3E systematically explores app pages and emits UI events by following a depth-first strategy. |
Sapienz | search-based testing | Sapienz uses Monkey to generate the initial test population, and exploits genetic algorithms to optimize the tests to maximize code coverage while minimizing test lengths. |
In Studies 1 and 2, Stoat is evaluated on 93 open-source Android apps from F-Droid (a popular open-source Android app repository): 68 benchmark apps (widely used in previous research) and 25 randomly selected apps (to avoid potential evaluation bias).
1. Statistics of the models produced by MobiGUITAR (systematic strategy, "M-S"; random strategy, "M-R"), PUMA ("PU"), and Stoat ("St") on the 93 open-source apps. Each tool is allocated one hour per app and runs on one emulator.
* Line Coverage, #Model States, #Model Transitions.
* Stoat uses a criterion similar to that of AMOLA (C-Lv4).
App Name | AMOLA (#States) | AMOLA (#Transitions) | Stoat (#States) | Stoat (#Transitions) | AMOLA (Coverage) | Stoat (Coverage) |
org.jtb.alogcat | 15 | 247 | 10 | 54 | 56% | 63% |
com.example.anycut | 8 | 33 | 7 | 40 | 55% | 67% |
com.evancharlton.mileage | 69 | 532 | 30 | 262 | 33% | 39% |
cri.sanity | 2 | 7 | 16 | 202 | N/A | 12% |
org.jessies.dalvikexplorer | 30 | 301 | 29 | 348 | 64% | 74% |
i4nc4mp.myLock | 5 | 51 | 5 | 51 | 11% | 28% |
com.bwx.bequick | 60 | 250 | 8 | 132 | 39% | 40% |
com.nloko.android.syncmypix | 20 | 96 | 13 | 76 | 17% | 21% |
net.mandaria.tippytipper | 13 | 102 | 17 | 117 | 61% | 81% |
de.freewarepoint.whohasmystuff | 24 | 143 | 18 | 103 | 51% | 82% |
3. Example app models from Stoat
1. Code coverage achieved by A3E (denoted by "A"), Monkey ("M"), Sapienz ("Sa"), and Stoat ("St"), grouped by app size, on the 93 open-source apps. Each tool is allocated three hours per app and runs on one emulator.
2. Fault detection statistics of A3E, Monkey, Sapienz, and Stoat on the 93 open-source apps. The configuration is the same as above.
Tool | #Buggy Apps | #Unique Crashes |
A3E | 8 | 8 |
Monkey | 40 | 76 |
Sapienz | 43 | 87 |
Stoat | 68 | 249 |
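To put the table in proportion, the following sketch derives two ratios from the reported numbers: the share of the 93 apps in which each tool found at least one crash, and the average number of unique crashes per buggy app. The function and key names are illustrative only; the input figures are copied from the table above.

```python
TOTAL_APPS = 93  # size of the evaluated app set

# tool -> (#buggy apps, #unique crashes), as reported in the table
RESULTS = {
    "A3E": (8, 8),
    "Monkey": (40, 76),
    "Sapienz": (43, 87),
    "Stoat": (68, 249),
}

def summarize(results, total_apps):
    """Return per-tool ratios derived from the raw fault detection counts."""
    summary = {}
    for tool, (buggy_apps, unique_crashes) in results.items():
        summary[tool] = {
            "buggy_share": buggy_apps / total_apps,
            "crashes_per_buggy_app": unique_crashes / buggy_apps,
        }
    return summary

if __name__ == "__main__":
    for tool, stats in summarize(RESULTS, TOTAL_APPS).items():
        print(
            f"{tool}: {stats['buggy_share']:.0%} of apps, "
            f"{stats['crashes_per_buggy_app']:.2f} unique crashes per buggy app"
        )
```

Under these numbers, Stoat found crashes in roughly 73% of the apps, at about 3.66 unique crashes per buggy app, compared with about 46% and 2.02 for Sapienz.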
Find more detailed information in our publications (* denotes the corresponding author):
Lingling Fan, Ting Su*, Sen Chen, Guozhu Meng, Yang Liu, Lihua Xu, Geguang Pu
The 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE'18), Montpellier, France.
Lingling Fan, Ting Su*, Sen Chen, Guozhu Meng, Yang Liu, Lihua Xu, Geguang Pu, Zhendong Su
The 40th International Conference on Software Engineering (ICSE'18), Gothenburg, Sweden.
Chunyang Chen, Ting Su*, Guozhu Meng, Zhenchang Xing, Yang Liu
The 40th International Conference on Software Engineering (ICSE'18), Gothenburg, Sweden.
Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, Zhendong Su
European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE), September 2017, Paderborn, Germany
Ting Su
International Conference on Software Engineering Companion (ICSE '16), May 2016, Austin, Texas, USA (ACM Student Research Competition First Place)