Some Qtum-x86 Tech Details

Ashley Houston
15 min read · May 19, 2019

Here I want to give some details about the upcoming Qtum-x86 VM. There have been many questions from the community, and some from our own developers, about how certain aspects work at a higher level and how they actually end up working in practice. So, in this article I cover two primary topics. First, how an x86 contract is actually put into the blockchain and what it consists of. Second, how DeltaDB is designed to work as the underlying storage layer at a consensus level. These topics are very technical, but they are by no means “required reading” for developers intending to use Qtum-x86 for creating smart contracts, or really for anyone outside of some niche applications and Qtum Core contributors.

Contract creation from coding to putting bytecode into the blockchain

The format and process of implementing contracts is one of the biggest differences between Qtum-x86 and Ethereum’s EVM or even Qtum’s EVM. Typically, smart contract developers will work in Remix or use solc directly in order to compile their contracts into bytecode. In EVM contracts, the bytecode sent to the blockchain is simply that: bytecode that begins executing at the 0th byte. This bytecode executes when the miner (or staker) constructs a block and finalizes the contract transaction's placement on the blockchain. At a low level, this works as follows:

  • The bytecode-containing transaction is sent to a blockchain node
  • The miner/staker begins to construct a new block and begins to finalize the bytecode-containing transaction
  • The bytecode begins execution at “0”
  • The Solidity-generated bytecode for the smart contract constructor is executed
  • The relevant “persistent” bytecode is copied into memory (ie, all the data and bytecode that is needed typically excluding the constructor)
  • Constructor modified “constants” are changed in memory to match the final version to be deployed
  • The contract exits, while also specifying the range of memory which should be persisted to the blockchain
  • The blockchain saves the data to the Ethereum Global State Trie, associating the bytecode to be persisted with the address as a “code” field.
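
The creation steps above can be modeled with a small Python toy. This is only a sketch and not real EVM semantics: the blob layout (a length byte, then constructor bytes, then persistent code), the `PLACEHOLDER` marker, and the names `run_init_code` and `deploy` are all assumptions made up for illustration.

```python
from typing import Dict

# Hypothetical global state: address -> account record with a "code" field.
state: Dict[str, Dict[str, bytes]] = {}

PLACEHOLDER = b"\x00\x00"  # stand-in for a constructor-modified "constant"


def run_init_code(init_code: bytes, constructor_arg: bytes) -> bytes:
    """Execute from byte 0 and return the memory range that should be persisted."""
    constructor_len = init_code[0]  # toy convention: first byte = constructor length
    # Copy the persistent part (everything after the constructor) into "memory".
    memory = bytearray(init_code[1 + constructor_len:])
    # Patch the constructor-modified constant in memory before persisting.
    idx = memory.find(PLACEHOLDER)
    if idx >= 0:
        memory[idx:idx + len(PLACEHOLDER)] = constructor_arg
    return bytes(memory)


def deploy(address: str, init_code: bytes, constructor_arg: bytes) -> None:
    # The returned range is what gets associated with the address as "code".
    state[address] = {"code": run_init_code(init_code, constructor_arg)}


init = bytes([3]) + b"CTR" + b"RUN" + PLACEHOLDER + b"END"
deploy("0xabc", init, b"\x12\x34")
print(state["0xabc"]["code"])
```

The persisted code no longer contains the constructor bytes, which is why a later call starts at “0” of a different byte sequence than the one originally broadcast.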

When the contract is called again, the following process is used:

  • When the contract is called, execution begins at “0” of the persisted contract bytecode (ie, excluding the constructor)
  • The contract code generated automatically by Solidity parses the ABI data specifying which function and argument values should be acted on
  • The function is executed
  • Function return data is copied into memory, and its address range (begin and end) is specified when exiting contract execution

This is of course in much more detail than most smart contract coders ever need to worry about. However, the Solidity magic can be a leaky abstraction. Until recently, many smart contract writers knew very well about the fixed-size restriction of returning data from a smart contract. The EVM now supports dynamic length return data through some specialized opcodes, and of course Solidity will use them when needed. In addition, the magical generated ABI code has been the subject of at least 1 security vulnerability.

The process for x86 contracts is significantly different, and in some ways more complicated while also being more flexible. Some of this is due to inherent differences in using existing programming languages (ie, C or Rust) rather than a purpose-built language that abstracts away all of the details. To begin with, the process of writing a contract is, despite being more transparent, more complicated:

(Right now this is limited to just C, a very low level language compared to Solidity. With other higher level languages in the future we hope to simplify this)

  • A “SimpleABI” file is created which specifies the interface by which the contract can be called externally.
  • The SimpleABI program is run with the ABI file which generates the “dispatch” code, and everything else required so that smart contract developers do not need to worry about the low level details of parsing and creating ABI data
  • The developer then writes the code for all interface functions specified (and other non-interface functions as needed, of course). If an interface function is not implemented, the build fails with linker errors
  • The code is then compiled into separate object files, and the object files linked together into one cohesive ELF file (the standard binary format of Linux and some other unix operating systems)
  • The ELF file is then fed into a custom Qtum program which rips out the interesting data and compiles it into a Qtum-x86 bytecode format file.

This kind of process would be mostly automated after setup using something like Makefiles, and Qtum will of course provide template projects which can be modified to avoid the complicated setup process.

The Qtum-x86 bytecode format is not “flat” like the bytecode Solidity produces for the EVM. It has a header specifying some info about the bytecode, as well as 3 separate sections. The data in the header consists of:

  • Options section size
  • Code section size
  • Data section size
  • “reserved” (ie, not used yet)

The three sections contained are:

  • Options — things such as flags to opt-out or opt-in to certain Qtum-x86 features (such as disabling bytecode upgrades), or more rich data such as dependency graphs and metadata.
  • Code — The actual executable data for the contract. This data is stored as read-only and executable.
  • Data — The plain data which is used and/or modified during the execution of the contract. This data is stored as read-write and not executable
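
As a rough illustration of why such a small format is easy to parse identically on every node, here is a Python sketch of a packer and parser. The article only names the header fields. The field width (little-endian 32-bit), the field order, and the section order (options, code, data) used here are assumptions, as are the names `ContractImage`, `pack_image`, and `parse_image`.

```python
import struct
from dataclasses import dataclass

# Assumed header layout: options size, code size, data size, reserved.
HEADER = struct.Struct("<IIII")


@dataclass
class ContractImage:
    options: bytes  # feature flags, metadata
    code: bytes     # to be mapped read-only and executable
    data: bytes     # to be mapped read-write and not executable


def pack_image(options: bytes, code: bytes, data: bytes) -> bytes:
    header = HEADER.pack(len(options), len(code), len(data), 0)  # reserved = 0
    return header + options + code + data


def parse_image(blob: bytes) -> ContractImage:
    if len(blob) < HEADER.size:
        raise ValueError("blob is shorter than the header")
    opt_len, code_len, data_len, _reserved = HEADER.unpack_from(blob, 0)
    # Reject any mismatch outright so that every node reaches the same verdict.
    if HEADER.size + opt_len + code_len + data_len != len(blob):
        raise ValueError("section sizes do not match blob length")
    pos = HEADER.size
    options = blob[pos:pos + opt_len]
    pos += opt_len
    code = blob[pos:pos + code_len]
    pos += code_len
    data = blob[pos:pos + data_len]
    return ContractImage(options, code, data)


blob = pack_image(b"\x01", b"\x90\x90\xc3", b"hello")
image = parse_image(blob)
print(image)
```

A parser this small has very few places where two implementations could disagree, which is the property a consensus-critical format needs.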

Unlike the EVM, in x86 all memory is considered both data and code at the same time. There are protection mechanisms to limit this, but they require code and data to be clearly separated and stored in separate areas of memory. This protection is desirable to prevent some potential security problems. Mixing code and data is not a security issue in itself, but with improperly designed code it can be used to “amplify” a minor bug into something much bigger, by overwriting protective code or by executing data which the caller has control over. This necessitates a structured approach to the bytecode format for Qtum-x86. Of course, ELF has the ability to specify all of this and much more, so why not use ELF as the bytecode format? The biggest reason is to simplify the consensus model. Whatever bytecode format is used is “consensus critical”: it must be parsed and understood in precisely the same way by every node. ELF is significantly more complicated, and naive parsers in the past have had several security bugs. Thus, a greatly simplified format holding only what we need was decided to be the best way to go. This also allows for other potential intermediate output formats from compilers, such as the PE format (the format used for the famous Windows .exe files).

Now that the final bytecode is known, it can be broadcast to the blockchain. The Qtum-x86 method of executing and persisting contracts is similar, but with enough difference that it can impact smart contract developers:

  • The bytecode-containing transaction is sent to a blockchain node
  • The miner/staker begins to construct a new block and begins to finalize the bytecode-containing transaction
  • The bytecode format is parsed and each section is mapped into the VM’s memory
  • The bytecode begins execution at the 0th location of the “code” memory
  • The “crt0” code inserted by the compiler/linker is executed, to initialize the stack and prepare for execution
  • A system call is used to make the contract aware that it is being constructed (and thus shouldn’t expect that there is a contract call taking place)
  • The smart contract constructor code is executed
  • The code exits and ends execution
  • The blockchain persists the entire contract bytecode format file. The bytecode is saved into DeltaDB as simply the “bytecode” data field for the address. A “delta” indicating the construction, execution, and the bytecode is put into a merkle hash tree in the block header.

When the contract is called again, the following process is used:

  • When the contract is called, execution begins at “0” of the “code” section, in the same way as is done during contract creation
  • The “crt0” code inserted by the compiler/linker is executed, to initialize the stack and prepare for execution
  • A system call is used to make the contract aware that it is being called (and thus shouldn’t call the constructor)
  • The SimpleABI generated code is executed and parses the ABI data sent to the contract
  • The SimpleABI code gets the ABI data off of the “shared contract communication stack” and copies it into the argument stack and executes the user-implemented interface function
  • The user-implemented function and logic is executed
  • The return data from the logic is put onto the argument stack and the function exits
  • The SimpleABI generated code takes the return data and pushes it onto the contract communication stack
  • The contract exits
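
The call flow above can be sketched in Python as follows. This is a loose model, not the actual SimpleABI output: the function ID value, the encoding of arguments and return values, and the names `comm_stack`, `DISPATCH`, and `contract_main` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

comm_stack: List[bytes] = []  # stand-in for the shared contract communication stack
balances: Dict[bytes, int] = {b"alice": 1000}


def get_balance(owner: bytes) -> int:
    """User-implemented interface function (hypothetical)."""
    return balances.get(owner, 0)


# Table a generator might emit: function ID -> (handler, argument count).
DISPATCH: Dict[int, Tuple[Callable[..., int], int]] = {0x01: (get_balance, 1)}


def contract_main(is_creation: bool) -> None:
    """Entry point reached after the crt0 startup code has run."""
    if is_creation:
        return  # constructor path: no call data is expected
    func_id = int.from_bytes(comm_stack.pop(), "little")
    handler, argc = DISPATCH[func_id]
    args = [comm_stack.pop() for _ in range(argc)]  # copy ABI data off the stack
    result = handler(*args)                         # run the user's logic
    comm_stack.append(result.to_bytes(8, "little")) # push the return data back


# The caller pushes the arguments first, then the function ID on top.
comm_stack.append(b"alice")
comm_stack.append((0x01).to_bytes(4, "little"))
contract_main(is_creation=False)
returned = int.from_bytes(comm_stack.pop(), "little")
print(returned)
```

Note that the entry point is the same for creation and for calls; only the flag obtained via the system call decides which path is taken.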

State management

State in the EVM is quite significantly restricted. It consists of a key-value store where each key and value is fixed in size at 256 bits. This restriction can make it difficult for a human developer to manage the different state namespaces. As a result, Solidity automatically handles the location in the 256-bit key space where each storage variable is placed. This location is generated from the position of the variable's declaration in the source file, with hashing used to locate the entries of maps and dynamic arrays (the exact layout can change depending on the Solidity version etc). The actual way that this is stored in the database by Ethereum nodes/wallets is using the “state trie”, a Patricia tree. This article by Ethereum covers it in more detail than I probably could. The TL;DR, though, is that a verifiable cryptographic tree is used so that nodes can always prove, given the “root hash”, that a piece of state exists and is of the expected value. This can’t be done with a standard merkle tree because the entire state of all contracts in the blockchain is stored, and recalculating a naive merkle tree with hundreds of thousands of different state elements would be too computationally heavy, even if verification would be faster than producing the root hash.

In Qtum-x86, one of the biggest goals was to remove these restrictions on state in the EVM. So, in Qtum-x86 it is possible to have dynamic length keys and values. This required a completely new design. Qtum has always been Bitcoin-like in many of its designs, due to being a fork of Bitcoin. In Qtum-x86 we are introducing the DeltaDB design. It is a new database which operates on differences, rather than needing to encode the entire state of the blockchain into a single accessible data structure. There is no direct restriction on the size of keys or values of state, though overly large sizes can make updates overly expensive in gas costs, so it is still beneficial to separate large, seldom-updated data from smaller, frequently updated data. All state management in Qtum-x86 is manual for the moment. There is nothing similar to Solidity’s automatic state management for key names. There may be at some later point, but only for higher level languages than C. Regardless, the key space is trivial to manage using typical methods from traditional key-value databases. For instance, a key-value map named “balances” that is indexed by addresses and stores a simple 64-bit integer could be stored like so:

"balances_QddCMpVUf4gKTLseP5XFuVco6xy1YajbK7" -> 1000

Of course, “balances_” might be an integer prefix and the address would be stored as raw bytes (or better yet, the UniversalAddress bytecode format) to save on space. Namespaces and keys are thus much easier to manage manually than in the equivalent situation in the EVM with 256-bit keys and values. The actual way this state is stored in the node/wallet databases is very straightforward. Something like so:

"state_XZkyE7XAweUtrizgtry1RoMSv1p4zk9Rw8_balances_QddCMpVUf4gKTLseP5XFuVco6xy1YajbK7" -> 1000

Here, “XZkyE7XAweUtrizgtry1RoMSv1p4zk9Rw8” is the contract address. Of course, this is all stored as encoded bytes and not as wasteful strings; this is just for illustration. The straightforward manner in which this type of data can be stored makes modern database caches and prediction significantly more effective than in the equivalent situation in the EVM. A notorious issue of the EVM is that it requires full nodes to use an SSD because of the unpredictable nature of reading data indexed by “random” hashes. There is some historical data duplication in Qtum-x86 nodes for maintaining consensus, but this is also subject to pruning, and thus it is safe to delete most historical data after 500 blocks if the node operator is only interested in the current state, rather than historical states. Historical states in particular can be useful for auditing and some specific light wallet applications, but are not needed by most users.
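
The two key layouts shown above can be put together in a short Python sketch. It uses readable strings purely for illustration (the real encoding is bytes, as noted), and the helper names `contract_key`, `node_key`, `store`, `load`, and `keys_for_contract` are made up for this example. The prefix scan at the end is a node-side illustration only, not something exposed to contracts.

```python
from typing import Dict, List

node_db: Dict[bytes, int] = {}  # stand-in for the node's on-disk key-value store

CONTRACT = "XZkyE7XAweUtrizgtry1RoMSv1p4zk9Rw8"
HOLDER = "QddCMpVUf4gKTLseP5XFuVco6xy1YajbK7"


def contract_key(namespace: str, index: str) -> bytes:
    """Key as seen by the contract: a namespace prefix plus the map index."""
    return f"{namespace}_{index}".encode()


def node_key(contract_address: str, key: bytes) -> bytes:
    """Key as stored by the node: scoped under the owning contract's address."""
    return b"state_" + contract_address.encode() + b"_" + key


def store(contract_address: str, key: bytes, value: int) -> None:
    node_db[node_key(contract_address, key)] = value


def load(contract_address: str, key: bytes) -> int:
    return node_db.get(node_key(contract_address, key), 0)


def keys_for_contract(contract_address: str) -> List[bytes]:
    """One contract's keys share a prefix, so they sort next to each other."""
    prefix = b"state_" + contract_address.encode() + b"_"
    return sorted(k for k in node_db if k.startswith(prefix))


store(CONTRACT, contract_key("balances", HOLDER), 1000)
store(CONTRACT, contract_key("allowances", HOLDER), 5)
print(load(CONTRACT, contract_key("balances", HOLDER)))
print(keys_for_contract(CONTRACT))
```

Because related keys share prefixes instead of being scattered by a hash, a sorted on-disk store keeps them physically close, which is the cache-friendliness described above.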

The actual method by which this state data is proven on the blockchain is also simpler than the methods used by Ethereum (and Qtum’s current version of the EVM). In Bitcoin’s traditional light wallet “SPV” technology, a merkle tree root hash is stored in the block and this can be used to prove that a transaction occurred in a particular block. However, this concept is too limited for smart contract state as we need to prove more than simply that a contract transaction occurred. We need to prove what state is present, along with transaction receipts and “events” (Ethereum’s “log” concept).

To do this, Qtum-x86 uses a “staggered” merkle tree with nested per-account merkle trees. Most merkle trees encode a list of some singular type of data. With the staggered approach, the merkle tree instead encodes pairs of data, with the first item being a contract address and the second being the account’s “delta tree”. The staggered approach to the root merkle tree allows for simpler location of accounts of interest, ie, if a light node is interested in every time a contract’s state changes in some way, it is easy and lightweight to encode a proof that the account was modified. Accounts which are not modified in the current block will not be in the root tree. More importantly, this allows for a form of censorship resistance where the concept can be inverted to construct proofs that an account was not modified. This would work by getting a list of all the accounts modified and their respective delta trees. It is easy to detect that not all data was received (as the merkle tree hash would not match the block data) or that some of it was modified.

The delta tree itself encodes a list of “deltas”. A delta indicates a change to the contract account. This can include things like a simple (impure) contract execution, events being emitted by the contract, state changes, or balance changes. The “transaction receipt” concept of Ethereum is also encoded into a delta, so that it is easy to prove a contract execution took place and whether the contract encountered an error in execution.
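
To make the staggered root tree and the per-account delta trees more concrete, here is a Python sketch. The hash function (SHA-256), the JSON encoding of deltas, the sorting of accounts by address, and the handling of odd levels by duplicating the last node are all assumptions chosen for brevity, not the actual DeltaDB encoding. The function names are hypothetical as well.

```python
import hashlib
import json
from typing import Dict, List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Plain binary merkle tree; an odd level duplicates its last node."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def delta_tree_root(deltas: List[dict]) -> bytes:
    """Per-account tree over that account's deltas in one block."""
    return merkle_root([json.dumps(d, sort_keys=True).encode() for d in deltas])


def staggered_root(block_deltas: Dict[str, List[dict]]) -> bytes:
    """Root tree whose leaves alternate: address, delta tree root, address, ..."""
    leaves: List[bytes] = []
    for address in sorted(block_deltas):
        leaves.append(address.encode())
        leaves.append(delta_tree_root(block_deltas[address]))
    return merkle_root(leaves)


def proves_unmodified(address: str, claimed: Dict[str, List[dict]],
                      header_root: bytes) -> bool:
    """The full claimed list must match the header and must omit the address."""
    return staggered_root(claimed) == header_root and address not in claimed


block = {
    "contractA": [{"type": "state", "key": "balances_bob", "value": 5},
                  {"type": "receipt", "status": "ok"}],
    "contractB": [{"type": "event", "name": "Transfer"}],
}
header_root = staggered_root(block)
withheld = {"contractA": block["contractA"]}  # a peer hiding contractB's changes
print(proves_unmodified("contractC", block, header_root))
print(proves_unmodified("contractB", withheld, header_root))
```

In this sketch a peer that hides an account's deltas cannot produce a list that hashes to the root in the block header, which is the censorship detection property described above.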

The DeltaDB concept is technically simpler than Ethereum’s state trie concept, yet gives many benefits:

  • Simpler to implement and audit (as Qtum)
  • Allows for dynamic length keys and values
  • Extremely extensible to allow for additional data to be encoded in the consensus-critical tree in the future without breaking client and programmatic support (though this would still require a hard fork)
  • Simpler to scale and handle disk load due to non-random indexed keys
  • Allows for censorship detection by SPV nodes of any account
  • Allows for (slow, but capable) retrieval of all historical contract state for any and all contracts without replaying transactions, and with included proofs that data is missing.

Of course, it’s not without its disadvantages as well:

  • Unproven and new design
  • Getting all state of a contract of interest as an SPV node could require more data transfer with full nodes, since the untrusted setup cannot exclude all historical data generally. Depending on the exact contract setup though, it may be possible to safely exclude some historical data. The number of round-trip requests is also significantly higher when using censorship-resistance, and thus this is not expected to be used as the “normal” mode of communication
  • Is expected to increase disk usage on nodes compared to an equivalent setup on Ethereum, due to needed location, intermediate, and temporary (500 block) historical data that would otherwise be expensive to compute more than one time.

Contract Upgrades

Contract upgrades have long been a pain point in the smart contract ecosystem; the EVM does this no favors. The EVM does not directly allow code changes to take place in an established contract account. The workaround used by the community is to use the concept of proxy contracts. The summary of this is that a special “proxy” contract is used that can respond to certain requests (such as those to conduct upgrades), but for all other functionality delegates to the actual backing contract, which can be pointed to a new contract through upgrade functions. The proxy contract also holds all relevant state for the backing smart contract. This is done so that the code for the contract can be upgraded without needing to redeploy or modify the existing data used by the contract.
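
The proxy pattern described above can be modeled in a few lines of Python. This is a conceptual sketch only, with made-up names (`Proxy`, `v1_increment`, `v2_increment`); real EVM proxies do the delegation with DELEGATECALL rather than a function table.

```python
from typing import Any, Callable, Dict

Storage = Dict[str, Any]
Implementation = Dict[str, Callable[..., Any]]


def v1_increment(storage: Storage) -> int:
    storage["count"] = storage.get("count", 0) + 1
    return storage["count"]


def v2_increment(storage: Storage) -> int:
    storage["count"] = storage.get("count", 0) + 2  # changed logic after upgrade
    return storage["count"]


class Proxy:
    """Holds the state and forwards calls to a replaceable backing contract."""

    def __init__(self, owner: str, implementation: Implementation) -> None:
        self.owner = owner
        self.storage: Storage = {}
        self.implementation = implementation

    def upgrade(self, caller: str, implementation: Implementation) -> None:
        if caller != self.owner:
            raise PermissionError("only the owner may upgrade")
        self.implementation = implementation

    def call(self, function: str, *args: Any) -> Any:
        # Delegate: run the backing code against the proxy's own storage.
        return self.implementation[function](self.storage, *args)


proxy = Proxy("owner", {"increment": v1_increment})
first = proxy.call("increment")
proxy.upgrade("owner", {"increment": v2_increment})
second = proxy.call("increment")
print(first, second)
```

The state written by the first version survives the upgrade because it lives in the proxy, not in the backing contract.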

EVM smart contracts are of course typically written in Solidity. This further complicates the upgrade process. As discussed in the state database comparison above, Solidity auto-magically handles the location in which all state variables and arrays are placed in the single 256-bit namespace of EVM-based storage. In older versions of Solidity, simple refactoring such as placing a new state variable above old state variables would break this layout, causing Solidity to try to read/write a different location for the old state variables, of course leading to disastrous results. There are many workarounds, some more complex than others. This Stack Exchange post covers a good number of them along with their advantages and drawbacks. In general though, the trade-off triangle is Difficulty of Refactoring/Upgrades vs Additional Gas Costs vs Difficulty of Initial Implementation. The old adage applies, “pick two that you want to be optimal”.
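
The layout problem can be reproduced with a deliberately simplified Python model. It assumes one storage slot per variable, assigned in declaration order, and ignores packing, maps, and arrays. The variable names and the helpers `assign_slots` and `read` are hypothetical.

```python
from typing import Dict, List


def assign_slots(declarations: List[str]) -> Dict[str, int]:
    """Toy layout rule: one slot per variable, in declaration order."""
    return {name: slot for slot, name in enumerate(declarations)}


def read(storage: Dict[int, int], layout: Dict[str, int], name: str) -> int:
    return storage.get(layout[name], 0)  # unset slots read as zero


# Version 1 of the contract writes its state using its own layout.
v1 = assign_slots(["owner", "total_supply"])
storage = {v1["owner"]: 0xA11CE, v1["total_supply"]: 1000}

# The "upgrade" declares a new variable above the old ones.
v2 = assign_slots(["paused", "owner", "total_supply"])

owner_after = read(storage, v2, "owner")          # really the old total_supply
supply_after = read(storage, v2, "total_supply")  # a slot that was never written
paused_after = read(storage, v2, "paused")        # really the old owner
print(owner_after, supply_after, hex(paused_after))
```

Every old variable now resolves to a different slot, so the upgraded code silently reads the wrong values from the existing storage.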

Qtum-x86 intends to remove this triangle for the most part, and does so in four core ways:

  • Simple manual and explicit state management avoids the need to drop to assembly and removes other burdens placed on the smart contract developer.
  • Direct upgrades of contract code are possible, eliminating the need for proxy contracts.
  • New mechanisms for a coalition of (potentially untrusted) smart contracts are introduced to allow access to a central “registry” of state without undue burden in gas costs.
  • External code delegation (ie, DELEGATECALL) can be done safely with untrusted or semi-trusted contracts by introducing a permission system to establish what a delegate called contract is allowed to do on the calling contract’s behalf.

This is a lot, so let's start with explicit state management. We touched on that some in the above section about state databases, but the big takeaway is that the developer is in control of where state is placed. Although 256 bits is plenty of space to arrange any number of distinct state maps/arrays a human would need, it is needlessly complicated, especially since it would require hashing of the keys used for indexing in maps. Doing this explicitly in Solidity also requires dropping to the assembly level and typically writing a number of wrapper functions. These wrapper functions, used in lieu of normal state variables, are hard for the Solidity compiler to optimize, and the developer is completely on their own to handle uniqueness of locations, bounds checking, field packing, field splitting, etc. In Qtum-x86, state is handled as simply as in a typical key-value database. The only real difference in access is that there is no (planned) method of allowing a contract to do a “wildcard” search for keys, such as “give me every key in my database that starts with X”. This kind of query would be quite expensive, and it would be difficult to guard against exploits which cause an excessive amount of gas to be consumed. So, although the concept is possible in Qtum-x86’s database, there are no plans to expose that functionality at the moment. Regardless, simple key namespaces are made using prefixes, keys can be stored without requiring hashing (although after some length, Qtum-x86 will hash them internally), and there is no need to break up structures or manually pack bytes to fit into some constant size.

The next big thing is allowing upgrades of a contract’s code. This will be a feature that some contracts may wish to opt out of for provability reasons, but currently there is no good reason, security-wise, why even a blockchain like Ethereum should not allow it. This eliminates the need for proxy contracts and allows simpler implementation of upgrade mechanisms.

In Qtum-x86 it will be possible to read state directly from external contracts. There is no need for state accessor functions like getTokenBalance() to be implemented in an ERC20-like contract. Instead, something like this can be used: externalState(erc20Address, "balance", addressOfInterest, &balance). This allows any external contract to read state from a common space. Writing state is more complicated as permissions are involved, but basically the state contract would control who has access to writing to that state, either by using the Trusted Library permission features, or by using explicit modifier functions, which could validate the format of the data as well as check that the party attempting the modification is authorized. The Trusted Library system will allow for modular contract design, using a proxy-like system. However, instead of delegating all code to a single "code" contract, it will be possible to delegate specific functions to specific contracts. In addition, the permission system introduced will ensure that even if an exploit were discovered in one of these delegated function contracts, the impact would be limited. This means that if an ERC20-like contract delegated out a simple function like "getBalance", the delegated contract would not be allowed to modify state or send Qtum out of the contract.
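
The idea of delegating single functions under restricted permissions can be sketched in Python. Nothing here reflects the actual Trusted Library API, which the article does not specify. The `Permission` flags, the `StateView` handle, and the `external_state` helper are assumptions invented for this illustration.

```python
from enum import Flag, auto
from typing import Callable, Dict, Tuple


class Permission(Flag):
    NONE = 0
    READ_STATE = auto()
    WRITE_STATE = auto()
    SEND_FUNDS = auto()


class StateView:
    """Handle given to delegated code; enforces the granted permissions."""

    def __init__(self, state: Dict[str, int], granted: Permission) -> None:
        self._state = state
        self._granted = granted

    def read(self, key: str) -> int:
        if Permission.READ_STATE not in self._granted:
            raise PermissionError("read not permitted")
        return self._state.get(key, 0)

    def write(self, key: str, value: int) -> None:
        if Permission.WRITE_STATE not in self._granted:
            raise PermissionError("write not permitted")
        self._state[key] = value


class Contract:
    def __init__(self) -> None:
        self.state: Dict[str, int] = {}
        self.delegates: Dict[str, Tuple[Callable[..., int], Permission]] = {}

    def delegate(self, name: str, function: Callable[..., int],
                 granted: Permission) -> None:
        self.delegates[name] = (function, granted)

    def call(self, name: str, *args: str) -> int:
        function, granted = self.delegates[name]
        return function(StateView(self.state, granted), *args)


def external_state(contract: Contract, namespace: str, index: str) -> int:
    """Read another contract's state directly, with no accessor function."""
    return contract.state.get(f"{namespace}_{index}", 0)


def get_balance(view: StateView, owner: str) -> int:
    return view.read(f"balance_{owner}")


def exploited_get_balance(view: StateView, owner: str) -> int:
    view.write(f"balance_{owner}", 10**9)  # attempted state change
    return view.read(f"balance_{owner}")


token = Contract()
token.state["balance_alice"] = 1000
token.delegate("getBalance", get_balance, Permission.READ_STATE)
print(token.call("getBalance", "alice"))
print(external_state(token, "balance", "alice"))
```

If the delegated function were replaced with `exploited_get_balance` under the same read-only grant, the write attempt would raise `PermissionError` and the token's state would stay untouched.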

Ashley Houston

(archived) Blockchain Engineer, co-founder at Qtum, President of Earl Grey Tech. All in on blockchain tech. Also does some film photography