Symbolic execution

This post is part of the Automated analysis of Ethereum smart contracts series. Take a look there to learn more about related topics.

In a concrete execution, a program is run on a specific input and a single control flow path is explored. In contrast, symbolic execution [1] [2] can explore multiple paths that a program could take under different inputs. The key idea is to allow a program to take symbolic values as inputs. Execution is performed maintaining for each explored control path a formula that describes the conditions satisfied by the branches taken along that path, and a symbolic memory store that maps variables to symbolic expressions or values. Next we present a C code example, and its symbolic execution tree, along with its symbolic stores and path constraints.

void aFunction(int a, int b) {
	int x = 1, y = 0;
	
	if(a != 0) {
		y = 3 + x;
	
		if(b == 0) {
			x = 2 * (a + b);
		}
	}
	
	assert(x - y != 0);
}

Each execution state, labeled with an uppercase letter, shows the statement to be executed, the symbolic store σ, and the path constraints π. Leaves are evaluated against the condition in the assert statement. This example was taken from [1].

The main goal of this technique, in the context of software testing, is to explore as many different program paths as possible in a given amount of time; then with each path we can generate a set of concrete input values that execute it, to finally check for the presence of errors. From a test generation perspective, it allows the creation of high coverage test suites, while from a bug finding perspective, it provides developers with a concrete input that triggers the bug, which can be used to confirm and debug the error independently of the symbolic execution tool that generated it.

Given a path constraint, meaning the boolean formula mentioned, the generation of concrete input values that satisfy it is typically based on an off-the-shelf satisfiability modulo theories solver, or SMT solver. Constraint solving is one of the main challenges faced by symbolic execution, SMT solvers can scale to complex combinations of constraints over hundreds of variables, however, constructs such as non linear arithmetic pose a major obstacle to efficiency and decidability.

Another challenge of this technique is state space explosion. Language constructs such as loops might exponentially increase the number of execution paths. It is thus unlikely that a symbolic execution engine can exhaustively explore all the possible paths within a reasonable amount of time. There are two key approaches that have been used to address this problem: heuristically prioritizing the exploration of the most promising paths, and using sound program analysis techniques to reduce the complexity of the path exploration [2].

Classical symbolic execution cannot deal with external code not traceable by the executor, nor with complex constraints solving. To cope with these issues concrete and symbolic execution can be mixed into concolic execution.

One popular concolic execution approach, known as dynamic symbolic execution [1], is to have concrete execution drive symbolic execution. After choosing an arbitrary input to begin with, it executes the program both concretely and symbolically. Whenever the concrete execution takes a branch, the symbolic execution is directed toward the same branch and the constraints extracted from the branch condition are added to the current set of path constraints. As a consequence, the symbolic engine does not need to invoke the constraint solver to decide whether a branch condition is satisfiable. In order to explore different paths, the path conditions given by one or more branches can be negated to generate a new input.

There are several approaches for bug finding in smart contracts that rely on symbolic execution. Oyente [3] was one of the first published approaches, it focuses on known security vulnerabilities. It was later extended as Gasper [4] to detect gas costly patterns, Osiris [5] to focus on integer errors, and SASC [6] to generate a diagram of invocation relationship and to find potential logic risks. Pakala [7] aims at efficiently handling multiple transaction counterexamples, it symbolically executes once each public function that produces a state change, to then join resulting states in all the possible orders up to a bound. It only looks for vulnerabilities related to sending ether from the contract under study to an address in control of an attacker.

Another tool that relies on symbolic execution is Mythril [8], it wasn't formally published but it's open source, it was informally introduced at HITBSecConf 2018 [9]. Later on, MythX [10] integrated Mythril with Harvey (a fuzzer that will be discussed in one of the following sections of the series), and a new static analysis tool: Maru [11], along with Maestro, a component that syncs them all. Maestro also estimates the confidence of each vulnerability found according to what each tool reported. MythX is a private tool accessible through a paid subscription. Both Mythril and MythX only search for predefined security properties.

teEther

teEther [12] takes advantage of a common static analysis technique called slicing to reduce the state space explosion. A program slice is a subset of the original program statements relevant to a particular computation.

teEther's goal is to check whether a contract is vulnerable to arbitrary ether transfers, thus it only cares about possible execution paths that end up in specific instructions executed with parameters under an attacker's control. These instructions, referred to as "critical instructions", are: call, callcode, delegatecall, and selfdestruct.

To take advantage of an unprotected delegatecall or callcode, for example, a selfdestruct statement could be placed on the callee.

This tool first explores the contract to find these critical instructions, then calculates slices that reach them, and finally performs symbolic execution on the slices to potentially generate transaction sequences that allow arbitrary ether transfers.

Another approach that implements a similar technique is sCompile [13], but we won't expand on it because it's neither published nor has a public implementation.

Manticore

Manticore [14] is an open source program developed by Trail of Bits [15], a company that specializes in software security audits. It's a dynamic symbolic execution framework for analyzing binaries and an arbitrary number of interacting Ethereum smart contracts.

Its core engine is the source of its flexibility; it implements a generic platform-agnostic symbolic execution engine that makes few assumptions about the underlying execution model. State exploration also contributes to its flexibility, because it implements a variety of heuristics to select the next state to explore and to determine when a state must be concretized.

Users are encouraged to create modular, event-based analyses using Manticore's plugin system. Arbitrary points within the application can broadcast various symbolic execution events that can be handled by an event subscriber outside of the tool.

The authors affirm that Manticore achieves on average 66% code coverage with a default analysis. This analysis was performed on 100 contracts taken from mainnet, but they didn't specify how and why those specific contracts were chosen.

References

[1] R. Baldoni, E. Coppa, D. C. D’Elia, C. Demetrescu, and I. Finocchi, “A survey of symbolic execution techniques,” ACM Comput. Surv., vol. 51, no. 3, 2018.

[2] C. Cadar and K. Sen, “Symbolic execution for software testing: Three decades later.” 2010.

[3] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, and A. Hobor, “Making smart contracts smarter,” in Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pp. 254–269, ACM, 2016.

[4] T. Chen, X. Li, X. Luo, and X. Zhang, “Under-optimized smart contracts devour your money,” in 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 442–446, IEEE, 2017.

[5] C. F. Torres, J. Schütte, et al., “Osiris: Hunting for integer bugs in ethereum smart contracts,” in Proceedings of the 34th Annual Computer Security Applications Conference, pp. 664–676, ACM, 2018.

[6] E. Zhou, S. Hua, B. Pi, J. Sun, Y. Nomura, K. Yamashita, and H. Kurihara, “Security assurance for smart contract,” in 2018 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS), pp. 1–5, IEEE, 2018.

[7] Pakala’s repository. Accessed: 2019-10-03.

[8] Mythril’s repository. Accessed: 2019-10-22.

[9] B. Mueller, “Smashing ethereum smart contracts for fun and real profit.” Accessed: 2019-10-22.

[10] Mythx’s official website. Accessed: 2019-10-22.

[11] “The tech behind mythx smart contract security analysis". Accessed: 2019-11-28.

[12] J. Krupp and C. Rossow, “teether: Gnawing at ethereum to automatically exploit smart contracts,” in 27th {USENIX} Security Symposium ({USENIX} Security 18), pp. 1317–1333, 2018.

[13] J. Chang, B. Gao, H. Xiao, J. Sun, and Z. Yang, “scompile: Critical path identification and analysis for smart contracts,” arXiv preprint arXiv:1808.00624, 2018.

[14] M. Mossberg, F. Manzano, E. Hennenfent, A. Groce, G. Grieco, J. Feist, T. Brunson, and A. Dinaburg, “Manticore: A user friendly symbolic execution framework for binaries and smart contracts.”, jul 2019.

[15] Trail of Bits’ official website. Accessed: 2019-09-12.

5 Likes

Thanks for the great article Vera! I’d like to add that it is possible to check custom properties with MythX using:

Both symbolic analysis and grey-box fuzzing are used to search for counterexamples. We’ll also release a specification language shortly to allow checking of more complex properties without writing extra Solidity code.

5 Likes

Hey Vera,

Thank you for this great introduction to symbolic execution!

The authors affirm that Manticore achieves on average 66% code coverage with a default analysis. This analysis was performed on 100 contracts taken from mainnet, but they didn’t specify how and why those specific contracts were chosen.

Here are the contracts we used to evaluate Manticore in the paper. Code coverage was gathered with the default setting from the command-line interface, e.g., $ manticore file.sol

While 66% is a great deal of coverage without any configuration, Manticore was limited because most contracts require configuration to generate realistic inputs. For example, most contracts are provided an address for their constructor at the time of their deployment. In its default state, Manticore considers such calls to an unknown contract as failures which explains why so many paths (33%) are not explored.

Manticore is most useful when operated by a human who can overcome these configuration issues. As proof of its value and the quality of its engineering, we have used Manticore this way on more than half of our security audits to date.

We’re working on ways to let Manticore explore more of a contract with less configuration. In particular, we have a new feature in development that over-approximates state that should help. As an example, you can consider the return value of an external call symbolic to allow Manticore to explore more paths. This lets you obtain the CFG, however, you would be unable to reproduce the input that got you there.

Here’s the code we’re working on:

If your goal is to recover the CFG of a smart contract, then symbolic execution is not the best approach. Instead, I would point out our own evm_cfg_builder, which uses abstract interpretation to find all branches of the CFG from the contract memory.

Finally, we recently merged all our training material into a single resource. “Building Secure Contracts” shows how to maximize usage of Manticore to find subtle bugs that require precise modeling, like rounding issues. This resource also demonstrates other techniques, like optimal approaches for fuzzing with Echidna and static analysis with Slither. Please let us know if you have feedback on it!

  • Josselin Feist & Dan Guido
5 Likes

I knew the contracts used to evaluate Manticore in the paper are in GitHub, but I can’t find information on how the crawl was performed, and I’m just curious about why and how those 100 contracts were chosen :slight_smile:

3 Likes

Ah! My bad. We downloaded all the contracts on the Ethereum blockchain via Google’s BigQuery dataset. Then, we removed duplicates using ad-hoc heuristics (e.g., by reviewing an instruction frequency histogram). Then, we selected 100 random contracts from this “unique” dataset.

We felt this was sufficient for a short paper, but it’s not necessarily representative of what we see on engagements. We tend to get higher coverage, especially since we identify and overcome analysis roadblocks during the project and use the human operator to supply starting inputs where required.

2 Likes