A Guide To Bonsai Tries in 2023


October 3, 2023


Karim Taam


We share recent enhancements to Bonsai—a state storage layout that improves efficiency and performance for the Hyperledger Besu Ethereum client.



In 2021, we introduced Bonsai—a new storage structure that introduces several enhancements to the Hyperledger Besu execution client. Bonsai primarily reduces storage requirements to run a full Ethereum node, whilst improving syncing times for new nodes and read performance for existing nodes. 

If you want a non-technical introduction to Bonsai, we encourage reading Bonsai Tries: A Big Update for Small State Storage in Hyperledger Besu. This blog dives deep into Bonsai’s architecture and highlights some of the improvements made to the Bonsai storage format since our last blog on the topic.

Recap: What makes Bonsai’s storage policy different? 

Storage formats implemented in other Ethereum clients often store and access nodes of Ethereum’s global state trie in a key-value store by each node’s hash. Conversely, Bonsai stores state trie nodes by their location in the trie and directly accesses an account’s data from storage using its key. 

With hash-based storage, an update to Ethereum’s world state—which happens at every block—adds new nodes to the global state trie. But old leaf nodes still remain in the underlying storage, increasing the size of the database over time and forcing the client to spend more time and computational resources when retrieving account data from the state trie. 

Bonsai keeps only one version of Ethereum’s state trie at any given time and supports natural pruning of state by replacing old nodes at the same position in the trie with new nodes. Furthermore, Bonsai uses trielogs (which we’ll discuss later in this article) to manage chain reorganizations—this ensures that Bonsai’s approach to storing state (persisting state at the head of the chain) doesn’t affect the capacity of Hyperledger Besu nodes to retrieve and serve Ethereum state data even when reorgs happen. 

How does Bonsai work under the hood?


The trie component of Bonsai stores nodes based on their location in the trie, enabling calculation of the root hash and block validation. By naturally pruning old nodes, the trie keeps the state representation compact and reduces storage requirements.

Image: A Merkle tree representation of Ethereum's state trie

Before Bonsai, nodes of the state trie were saved in the following structure:

Image: A simplified representation of the trie. There are also extension nodes, and each branch can have up to 16 children.

The trie is stored in a single column in the database, using a unique key for each element. Here’s how the keys are organized for the account and storage tries:

Account Trie:

  • Key: Account trie location

  • Value: Value of the trie node in a specific location (the location could be a branch, extension, or leaf in the trie) 

Storage Trie:

  • Key: Concatenation of the account hash and storage location 

  • Value: Value of the trie node in a specific location (the location could be a branch, extension, or leaf in the trie) 

Note: We need to concatenate the account’s hash with the location because, without completing this step, each account could write to the key of another account since they would all have keys like 0x, 0x00, etc. Therefore, we add the account hash to differentiate the entries in the database.
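To make the keying scheme concrete, here is a minimal sketch (the helper names are hypothetical, and real Besu keys are nibble paths stored in RocksDB columns rather than raw byte strings):

```python
# Illustrative sketch only: these helper names are not Besu's actual code.

def account_trie_key(location: bytes) -> bytes:
    # Account trie nodes are keyed purely by their location in the trie.
    return location

def storage_trie_key(account_hash: bytes, location: bytes) -> bytes:
    # Storage trie nodes prefix the location with the owning account's hash,
    # so two accounts' storage tries never collide at the same location.
    return account_hash + location
```

With the prefix, two accounts that both have a node at location 0x00 still map to distinct database keys.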

Flat Database 

The flat database is a storage optimization in Bonsai that stores all leaf nodes in a flat format. This design allows quick and efficient access to accounts and storage, enhancing the performance of SLOAD operations in the EVM (Ethereum Virtual Machine). Instead of traversing the trie structure, the flat database enables direct access to the required data in a single disk read, resulting in faster query processing.

The flat database is structured into three columns at the database level and stores data in a key-value format:

1. Accounts column (Stores account representations):

  • Key: Account hash

  • Value: Account data (balance, nonce, and other relevant information)

2. Storage column (Stores storage slot representations):

  • Key: Concatenation of account hash and slot hash

  • Value: Data associated with the storage slot

3. Code column (Stores code representations):

  • Key: Account hash

  • Value: Bytecode associated with the account

Note: In the future, opting to use Besu in snap sync mode will automatically trigger Bonsai to store a contract account’s codehash—instead of the account hash—as its key in the flat database.
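A minimal sketch of the three columns described above, with a Python dict standing in for each RocksDB column family (the class and method names are illustrative, not Besu's API):

```python
# Hypothetical sketch of the flat database's three key-value columns.

class FlatDatabase:
    def __init__(self):
        self.accounts = {}  # account hash -> account data
        self.storage = {}   # account hash + slot hash -> slot value
        self.code = {}      # account hash -> bytecode

    def get_account(self, account_hash: bytes):
        # A single direct read replaces a full trie traversal.
        return self.accounts.get(account_hash)

    def get_storage(self, account_hash: bytes, slot_hash: bytes):
        # The key concatenation mirrors the storage column layout above.
        return self.storage.get(account_hash + slot_hash)

    def get_code(self, account_hash: bytes):
        return self.code.get(account_hash)
```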


Trielogs

Trielogs are a key feature introduced by Bonsai to manage state rollback and rollforward. They represent the differences between two blocks’ states by capturing each modification made to an account or storage slot. Trielogs contain the prior and next versions for each modification, allowing state changes to be reverted or applied selectively.

The trielogs are stored in a specific column in the database and use the block hash as the key. Each trielog contains the modified data along with the previous and current values for accounts, storage slots, and code.

  • Key: Block hash

  • Value: Trielog data, including the modified information, along with the previous and next values for accounts, storage slots, and code.

Block Hash: 0xabcdef1234567890...


  - Account: 

    - Account address: 0x1234567890abcdef...

    - Previous Value: { balance: 100, nonce: 0 , ...}

    - Next Value: { balance: 150, nonce: 1 , ...}

  - Storage Slot:

    - Account address: 0x1234567890abcdef...

    - Slot key: 0x0111...

    - Previous Value: 0x01

    - Next Value: 0x02

  - Code:

    - Account address: 0x1234567890abcdef...

    - Previous Value: 0xaabbccddeeff...

    - Next Value: 0x112233445566...

Note: The above example is an illustrative representation of a trielog and may not exactly reflect the internal structure as it exists in the actual code. However, it provides a simplified understanding of how the data is organized within the trielog.
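The same structure can be sketched in Python; the Change type and field names are illustrative stand-ins for Besu's internal classes, not the real representation:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Change:
    prior: Any  # value before the block was applied
    next: Any   # value after the block was applied

# A hypothetical trielog for one block (stored elsewhere under the block hash):
trielog = {
    "accounts": {
        "0x1234...": Change(prior={"balance": 100, "nonce": 0},
                            next={"balance": 150, "nonce": 1}),
    },
    "storage": {
        ("0x1234...", "0x0111..."): Change(prior="0x01", next="0x02"),
    },
    "code": {
        "0x1234...": Change(prior="0xaabbccddeeff...", next="0x112233445566..."),
    },
}
```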


Accumulator

Bonsai introduces an object called the Bonsai accumulator that accumulates modifications made during block processing. It stores (in memory) changes to accounts and contract storage at each block; eventually, these modifications are merged and used to update the trie and the flat database.

This accumulator system allows for additional optimizations in Besu, such as preloading trie nodes during block processing. For example, if a read is detected in the flat database, a background task can be triggered to prefetch/preload the path of that account in the trie and reduce the time needed to update the trie at the end.
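As a rough sketch of the idea (class and method names are hypothetical, and dicts stand in for the flat database and trie columns):

```python
class BonsaiAccumulator:
    """Hypothetical sketch: buffer a block's changes in memory, then
    merge them into the trie and the flat database in a single commit."""

    def __init__(self):
        self.pending = {}  # key -> new value accumulated during the block

    def record(self, key, value):
        # Called during EVM execution whenever state is modified.
        self.pending[key] = value

    def commit(self, flat_db: dict, trie_db: dict):
        # Apply every accumulated modification to both representations.
        for key, value in self.pending.items():
            flat_db[key] = value
            trie_db[key] = value
        self.pending.clear()
```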


Rollback and Rollforward

Since Bonsai only persists the chain’s head, it is necessary to be able to revert to a previous state—this is achieved by passing one or more trielogs to the accumulator. The accumulator processes the modifications recorded in the trielogs, either using prior data (in the case of a rollback) or subsequent data (in the case of a rollforward); once all the modifications have been accumulated, the trie and flat database are updated as though importing a new block.

By utilizing the trielogs and the accumulator, the system enables the capability to revert to previous states. This is particularly useful in situations where a rollback is required due to a chain reorganization or other events. The accumulator efficiently processes the modifications recorded in the trielogs, allowing for the restoration of the state to a specific point in time.

Applying a trielog for rollback or rollforward does not require any block validation or transaction processing as in the case of a regular block import. If the trielog exists, it means that the validation has already been done previously because the trielog is created after the block import and its validation. Therefore, replaying the trielog will only change the values of the state and will be very fast.
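A minimal sketch of trielog replay, assuming each change is a (prior, next) pair keyed by the modified entry (illustrative, not Besu's internal representation):

```python
def apply_trielog(state: dict, changes: dict, rollback: bool) -> None:
    # Replaying a trielog needs no block validation or transaction
    # re-execution: simply pick the prior value for a rollback or the
    # next value for a rollforward.
    for key, (prior, nxt) in changes.items():
        value = prior if rollback else nxt
        if value is None:
            state.pop(key, None)  # the entry did not exist in the target state
        else:
            state[key] = value
```

Rolling forward and then back with the same trielog returns the state to its starting point.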

Preload cache 

The accumulator introduces an optimization mechanism that we describe below. This optimization is primarily designed to detect modifications to storage slots or accounts during the execution of the EVM.

To simplify: when a leaf is read from the flat database, a thread is launched to load the path of that leaf node in the trie into memory. This way, the majority, if not all, of the trie nodes are already in memory when we update the trie to calculate the root hash. As a result, it reduces the time spent waiting for disk access and improves performance.

By utilizing the accumulator, the system optimizes the process of updating the trie by preloading and caching the necessary leaf nodes. This reduces the reliance on disk access and improves the overall efficiency of the state management system.
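A simplified sketch of the preload, with dicts standing in for the trie column and the node cache (the function names and threading details are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def preload_path(trie_db: dict, path: bytes, cache: dict) -> None:
    # Walk every prefix of the leaf's location (0x, 0x01, 0x0102, ...)
    # and pull the corresponding trie nodes into the in-memory cache.
    for end in range(len(path) + 1):
        prefix = path[:end]
        if prefix in trie_db:
            cache[prefix] = trie_db[prefix]

def read_account(flat_db: dict, trie_db: dict, cache: dict,
                 path: bytes, executor: ThreadPoolExecutor):
    # Answer the read from the flat database immediately, while a
    # background task warms the cache with that leaf's trie path.
    executor.submit(preload_path, trie_db, path, cache)
    return flat_db.get(path)
```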

World State 

In Ethereum (and other account-based blockchain systems) the “world state” is defined as a set of variables—such as balances of accounts and storage values of smart contracts—describing the system at a specific point in time. Transactions, account updates, smart contract changes, and metadata are all updated at each block via state transitions in the world state. You can read more about world state in our Ethereum Explained blog post and see our handy infographic for an explainer on Merkle trees, which are used to store the world state. 

Image: The structure of Ethereum’s world state (“state trie”) at a glance. Source: "Diving into Ethereum’s world state" by Timothy McCallum

In Bonsai, we represent the world state trie differently from other Ethereum clients to gain improvements in storage and performance. Specifically, we snapshot what the Ethereum blockchain looks like at a point in time and accumulate only the differences in data between blocks using trielogs. The world state component in Bonsai comprises several subcomponents:

  • Storage: The storage component determines how data is stored and persisted. It can be either persistent, based on a RocksDB snapshot, or use a layered format. Each option has its own characteristics and benefits.

  • Root hash and block hash: The root hash and block hash of the state are derived from the storage component.

  • Accumulator: The accumulator is a data structure that maintains and manages a set of modifications made to the world state. It allows performing rollback or rollforward operations on the state; for example, an accumulator can record changes to the data over time (enabling selective undoing of changes and reverts to a previous state) or apply pending modifications to advance to a new state. Each instance of the world state has its own accumulator, allowing it to manage rollbacks on different states in a thread-safe manner.

  • Persistence flag: The persistence flag is a boolean value indicating whether the world state is allowed to persist modifications during a rollback or roll-forward operation.

As Bonsai persists a single state in the database, we need various optimizations in Besu to support multiple copies of the state at different heights simultaneously. This enables handling scenarios where, for example, one RPC request requires an account balance at block #4 while another concurrently needs it at block #5. This is achieved through a combination of RocksDB snapshots and in-memory diff layers.

The persisted, snapshot, and layered states work together to manage changes in the state. Periodically, a snapshot of the state is taken, capturing the state at a specific point in time. Each addition of a new block generates a layered diff in memory, representing the changes introduced by that block.

Once a sufficient number of blocks have been processed, a new snapshot is created—an approach that prevents the accumulation of an infinite list of diff layers and provides a fresh starting point for future modifications. Additionally, when a block is finalized and considered permanent, the state is persisted to maintain the most up-to-date version of the state.
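The cadence described above can be sketched as follows; the interval value and trigger logic are assumptions for illustration, not Besu's actual policy:

```python
SNAPSHOT_INTERVAL = 128  # assumed block count between fresh snapshots

def on_block_imported(block_number: int, diff_layers: list, new_diff: dict) -> bool:
    """Append the block's in-memory diff layer; return True when the
    accumulated layers should be folded into a fresh snapshot."""
    diff_layers.append(new_diff)
    if block_number % SNAPSHOT_INTERVAL == 0:
        diff_layers.clear()  # the layers are now captured by the new snapshot
        return True          # caller takes a fresh snapshot here
    return False
```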

To summarize, we have different types of world state representation in Bonsai:

  • Persisted state: This refers to the state representation that is persisted in a database. It represents the state at a specific point in time and is stored persistently for future reference.

  • Snapshot state: A snapshot state is a copy of the persisted one at a particular moment. It provides a view of the state at that specific point in time.

  • Layered state: Layered state refers to a snapshot state on which an in-memory diff is applied, with this diff representing the changes or updates made to the state snapshot. The layered state allows for efficient and incremental updates to the snapshot without modifying the original data directly. 

Layered state vs. trielog 

A layered state is different from a trielog: it contains the differences between two blocks across the entire state—including leaf nodes, branch nodes, and extension nodes—while a trielog only contains the before and after values of the leaf nodes. To put it differently:

  • The trielog provides a view of the modifications made between two blocks at the account and storage levels

  • The layered state is more a view of how these modifications have impacted the storage of the entire world state (both at flat database and trie levels).

The following snippets further highlight the high-level differences between trielogs and layered states in Bonsai.

Trielog:

  - Account: 

    - Account address: 0x1234567890abcdef...

    - Previous Value: { balance: 100, nonce: 0 , ...}

    - Next Value: { balance: 150, nonce: 1 , ...}

  - Storage Slot:

    - Account address: 0x1234567890abcdef...

    - Slot key: 0x0111...

    - Previous Value: 0x01

    - Next Value: 0x02

  - Code:

    - Account address: 0x1234567890abcdef...

    - Previous Value: 0xaabbccddeeff...

    - Next Value: 0x112233445566...


Layered state:

  0x  <> Branch node present in the layered state

  0x01  <> Branch node present in the layered state

  0x0102 <> Leaf node present in the layered state

  0x02  <> Branch node is missing, so the layered state asks the parent state
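The parent-fallback lookup illustrated above can be sketched as follows (dicts stand in for the diff layer and the parent snapshot; the names are hypothetical):

```python
class LayeredState:
    """Sketch of the lookup rule: serve a node from the in-memory diff
    when present, otherwise fall back to the parent (snapshot) state."""

    def __init__(self, parent: dict, diff: dict):
        self.parent = parent  # e.g. a snapshot state
        self.diff = diff      # nodes changed since the snapshot

    def get_node(self, location: str):
        if location in self.diff:
            return self.diff[location]
        return self.parent.get(location)  # missing node: ask the parent
```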

Image: Interaction between snapshot, layered world state, and persisted state in Bonsai

Life Cycle of the World State 

To protect the integrity of the chain head from inconsistencies, we never use the persisted state to simulate transactions, validate blocks, create blocks, or respond to RPC calls in Besu. The persisted state is only used to import a new block and change the head of Besu. It’s also important to note that we always have a copy of the persisted state in a snapshot, which is created right after the import or at the start of Besu. 

Choosing the right moment to create a snapshot of the persisted state is crucial. In particular, we must avoid snapshotting a state that is currently under modification: creating a snapshot during a block import could result in an invalid snapshot that represents a transitional state between two blocks.

To avoid this issue, it’s important to carefully synchronize the snapshot creation process with the block processing workflow. Typically, snapshots are created after the completion of a block processing cycle. This ensures that all modifications made during the block execution are finalized, and the state has reached a stable point.

Now, suppose we want to simulate a transaction on the state of block #4 while the head is at block #9. What do we do? 

Every time we need a world state, we request an instance of BonsaiWorldstate from the world state provider. If it’s for something other than changing the head, the provider looks in its cache to check whether an image of the world state corresponding to the requested block is available. This process is described below:

Scenario #1: The World State is available in the cache

When retrieving a world state, the system follows a copy-on-access approach. If an image of the world state instance exists in the cache, it creates a copy and returns the copy. This technique enables preserving a clean and intact world state image in the cache, which can be utilized to serve other requests if necessary.

If the retrieved world state is a snapshot, a copy of the snapshot is returned. Similarly, if it’s a layered state, a copy of the layered state is returned. With this copy, we can perform simulations or any other operations as needed. It’s important to note that this copy is never persisted and remains unique to our specific call, eliminating any potential concurrency issues.

Image: Workflow for retrieving world state in Bonsai when the world state is available in the cache

Scenario #2: The World State is not available in the cache

If the requested world state is not available in the cache, the world state provider will need to recreate an image of that world state. Here’s how it will proceed:

  • It will locate the nearest snapshot/layered state in the cache that corresponds to a block close to the requested block.

  • The provider will make a copy of that snapshot/layered state and perform rollback or rollforward operations to reach the state of the requested block.

  • Once it has obtained the desired state, it will save this copy for future requests and make another copy to return to the caller.

In summary, the world state provider uses the closest available snapshot/layered state as a starting point and applies necessary changes to reach the specific block’s state. The copied world state is then stored for caching purposes, and a separate copy is returned to the caller.
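A compact sketch of this retrieval flow, using block numbers as cache keys and (prior, next) pairs as trielog entries for simplicity (Besu keys trielogs by block hash, and the class name here is hypothetical):

```python
import copy

class WorldStateProvider:
    """Sketch: copy-on-access retrieval, with rollback from the nearest
    cached state when no cached image matches the requested block."""

    def __init__(self, cache: dict, trielogs: dict):
        self.cache = cache        # block number -> state dict
        self.trielogs = trielogs  # block number -> {key: (prior, next)}

    def get_world_state(self, block: int) -> dict:
        if block in self.cache:
            return copy.deepcopy(self.cache[block])  # copy-on-access
        # Start from the nearest cached state above and roll back,
        # applying the prior values of each intervening trielog.
        nearest = min(b for b in self.cache if b > block)
        state = copy.deepcopy(self.cache[nearest])
        for b in range(nearest, block, -1):
            for key, (prior, nxt) in self.trielogs[b].items():
                if prior is None:
                    state.pop(key, None)
                else:
                    state[key] = prior
        self.cache[block] = state      # keep one copy for future requests
        return copy.deepcopy(state)    # hand a separate copy to the caller
```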

Image: Workflow for retrieving world state in Bonsai when the world state is not available in the cache

Closing the World State 

Whenever we retrieve a snapshot/layered state, it is essential to ensure that it is closed properly to avoid any memory leaks. This means that whenever we create a block or simulate a transaction, we need to close the corresponding world state. To prevent any oversight in closing the world state, our code enforces strict rules and prevents any attempt to manually open it within an RPC method or similar contexts. 

The responsibility of opening and closing the world state is delegated to the TransactionSimulator, MainnetBlockValidator, and BlockCreator—minimizing the chances of developers forgetting to close it. We strive to maintain this rigor, and certain parts of the code explicitly prevent opening a world state anywhere else to avoid errors. That said, we acknowledge that mistakes can still happen and aim to protect against them as much as possible.

All the world states we create need to be closed at some point or another. For example, if we need a world state to validate a block, the provider will provide us a copy of the cache. Once the validator finishes using this copy, it will close it. The world state in the cache will be closed once it is evicted from the cache either because it is too old or we haven’t used it for a long time.

Closing world states is a very important consideration because leaving snapshots in RocksDB open increases the risk of a memory leak. To illustrate, a layered world state relies on a snapshot: if a layered world state is in the cache, but the snapshot is evicted, the snapshot has to remain open. The layered world state will keep the snapshot open, but as soon as the snapshot no longer has any open dependencies—that is, all layered states using it are closed and evicted from the cache—it is closed.

Image: Snapshot state infographic

However, this approach has a downside. Once a simulation or validation is terminated, we cannot access data from the world state anymore. To address this limitation, most methods accept an interface where the caller can provide their code to be executed if they need to access world state information before it is closed. This method is invoked just before the close, allowing the caller to retain control over the world state data.
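The callback pattern can be sketched as follows; the function name is hypothetical, and a Python dict stands in for an open world state:

```python
def with_world_state(open_state: dict, action):
    """Sketch of the callback pattern: the caller's code runs against
    the world state just before it is closed, so no open reference can
    leak past the close."""
    ws = dict(open_state)  # stand-in for opening a copied world state
    try:
        return action(ws)  # the caller extracts what it needs here
    finally:
        ws.clear()         # stand-in for closing/releasing the state
```

For example, an RPC handler would pass a function that reads the balance it needs, and receive the value back after the state is safely closed.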

Overall, this strict control over opening and closing world states ensures memory management and minimizes the risk of errors while still providing a mechanism for accessing data when necessary.

Image: Closing world states in Bonsai

Can Bonsai work with Verkle Tries? 

Verkle tries have long been touted as an important part of Ethereum’s scaling roadmap, and though a full explainer is out of scope for this post, Vitalik Buterin’s explainer on Verkle tries provides an excellent introduction to the topic. Instead, this section addresses a key question: can Bonsai’s approach to storing state adapt when Ethereum transitions from a Merkle Patricia Trie (MPT) to a Verkle trie in the future?

To answer this question, we make two important observations:

  • Bonsai is not tied to a specific trie structure (it only manages how we store and retrieve trie nodes) 

  • Bonsai allows us to go back in time by saving the account and storage modifications associated with each block.

From this perspective, we realize that Bonsai is not necessarily linked to the type of trie we use. Therefore, we can consider transitioning to the Verkle trie without having to rewrite all the optimizations provided by Bonsai or rethink the entire logic. This also means the work we are doing to enhance Bonsai will not be lost when Ethereum’s execution layer upgrades to Verkle tries.

Another factor behind Bonsai’s support for storing state in Verkle trie format is the design of the accumulator. In Bonsai, the accumulator plays a crucial role in managing and tracking the modifications made to the trie by temporarily storing changes to the state trie before they are committed to the underlying trie structure (MPT or Verkle trie).

The accumulator plays an integral role in the transition from the MPT to the Verkle trie. The modifications made to the trie are first accumulated in the accumulator, similar to how it works in the current Bonsai implementation. But instead of directly committing these modifications to the MPT, the accumulator applies them to the Verkle trie.

Thus, the accumulator ensures that the changes made to the trie are appropriately recorded and can be efficiently applied to the Verkle trie when necessary. This helps maintain consistency and facilitates the transition between trie structures without losing modifications to the state trie in the process.

To summarize, the accumulator remains involved during the transition and after it, allowing for the temporary storage and subsequent application of modifications to the Verkle trie, just as it previously did for the MPT in the Bonsai framework.

Get started with Bonsai tries today 

We’ve worked hard to enhance Bonsai and are pleased to share the latest updates to the storage format. Excited to try out Bonsai? Run a Hyperledger Besu node with Bonsai tries data storage format by using the command line option --data-storage-format=BONSAI. We also encourage reading the documentation on Bonsai tries for more information on storage requirements and instructions on accessing data and syncing nodes with Bonsai. 

We’d love to hear your experience with using new features in Besu (Bonsai inclusive). If you have feedback on Bonsai, or any issues with using it in your Besu node, come share your thoughts on the Hyperledger Discord community (#besu channel). If you’re new to developing with Besu, check out the documentation and try running your own Besu node by following the Developer Quickstart.

PS: Are you running a majority client for Ethereum’s Proof of Stake (PoS) consensus? Consider switching to Besu to support client diversity. Read the documentation on migrating to Hyperledger Besu for more details on running Besu as your validator’s execution client.