In 2021, we introduced Bonsai, a new storage structure that brings several enhancements to the Hyperledger Besu execution client. Bonsai primarily reduces the storage required to run a full Ethereum node, while also improving sync times for new nodes and read performance for existing nodes.
If you want a non-technical introduction to Bonsai, we encourage reading Bonsai Tries: A Big Update for Small State Storage in Hyperledger Besu. This blog dives deep into Bonsai’s architecture and highlights some of the improvements made to the Bonsai storage format since our last blog on the topic.
Recap: What makes Bonsai’s storage policy different?
Storage formats implemented in other Ethereum clients often store and access nodes of Ethereum’s global state trie in a key-value store by each node’s hash. Conversely, Bonsai stores state trie nodes by their location in the trie and directly accesses an account’s data from storage using its key.
With hash-based storage, an update to Ethereum’s world state, which happens at every block, adds new nodes to the global state trie. But the old leaf nodes remain in the underlying storage, so the database grows over time and the client spends more time and computational resources retrieving account data from the state trie.
Bonsai keeps only one version of Ethereum’s state trie at any given time and prunes state naturally by replacing old nodes with new nodes at the same position in the trie. Bonsai also uses trielogs (which we’ll discuss later in this article) to manage chain reorganizations. This ensures that Bonsai’s approach to storing state (persisting state at the head of the chain) doesn’t affect the capacity of Hyperledger Besu nodes to retrieve and serve Ethereum state data, even when reorgs happen.
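To make the contrast concrete, here is a minimal Python sketch of the two keying schemes. It is not Besu's code: the class names, the "0a03" location string, and the balance values are made up for illustration, and SHA-256 stands in for Ethereum's Keccak-256 so the example only needs the standard library.

```python
import hashlib


def node_hash(data: bytes) -> bytes:
    # Assumption: SHA-256 is a stand-in for Keccak-256 in this sketch.
    return hashlib.sha256(data).digest()


class HashKeyedStore:
    """Hypothetical store: each node is saved under the hash of its content."""

    def __init__(self):
        self.db = {}

    def put(self, location: str, node: bytes) -> None:
        # The location is ignored; new content always gets a new key,
        # so earlier versions of the node stay in the database.
        self.db[node_hash(node)] = node


class LocationKeyedStore:
    """Hypothetical store: each node is saved under its position in the trie."""

    def __init__(self):
        self.db = {}

    def put(self, location: str, node: bytes) -> None:
        # A new node at the same location overwrites the old one.
        self.db[location] = node


hash_store = HashKeyedStore()
location_store = LocationKeyedStore()

# Three consecutive blocks update the same leaf of the state trie.
for block_number in range(3):
    leaf = f"balance={100 + block_number}".encode()
    hash_store.put("0a03", leaf)
    location_store.put("0a03", leaf)

print(len(hash_store.db))      # prints 3: every old version is still stored
print(len(location_store.db))  # prints 1: only the latest version remains
```

After three updates to the same leaf, the hash-keyed store holds three entries while the location-keyed store holds one, which is the natural pruning described above.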
How does Bonsai work under the hood?
Trie
The trie component of Bonsai stores nodes based on their location in the trie, enabling calculation of the root hash and block validation. By naturally pruning old nodes, the trie keeps the state representation compact and reduces storage requirements.
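The idea can be sketched with a toy trie. The example below is a simplification under stated assumptions, not Besu's implementation: it uses a small binary trie addressed by bit strings instead of Ethereum's hexary Merkle Patricia trie, SHA-256 instead of Keccak-256, and a hypothetical put_leaf helper. It shows that when every node lives at its location, an update overwrites the nodes along one path and yields a new root hash that can be compared with the root expected for the block.

```python
import hashlib


def node_hash(data: bytes) -> bytes:
    # Assumption: SHA-256 is a stand-in for Keccak-256 in this sketch.
    return hashlib.sha256(data).digest()


# Placeholder hash used for any child that has not been written yet.
EMPTY = node_hash(b"")


def put_leaf(store: dict, path: str, value: bytes) -> bytes:
    """Write a leaf at `path` (a bit string such as "00") and return the new root hash.

    `store` maps a location in the trie to the hash of the node at that location.
    Each ancestor is recomputed and overwritten in place, so no stale nodes remain.
    """
    store[path] = node_hash(value)
    while path:
        path = path[:-1]  # move up to the parent location
        left = store.get(path + "0", EMPTY)
        right = store.get(path + "1", EMPTY)
        store[path] = node_hash(left + right)
    return store[""]  # the empty path is the root's location


store = {}
root_after_block_1 = put_leaf(store, "00", b"balance=100")
size_after_block_1 = len(store)

root_after_block_2 = put_leaf(store, "00", b"balance=101")
size_after_block_2 = len(store)

# The root changes with the state, but the number of stored nodes does not.
print(root_after_block_1 != root_after_block_2)   # prints True
print(size_after_block_1, size_after_block_2)     # prints 3 3
```

In this sketch, validating a block amounts to applying its state changes with put_leaf and checking that the returned root equals the state root the block claims.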