Besu Fleet: The Future of RPC Scaling

December 10, 2024

Karim Taam, Ameziane Hamlat & Amelie Chatelain


This blog post introduces the Besu Fleet, an innovative solution that simplifies RPC service scaling while cutting costs and enhancing efficiency. With features like trie log shipping, lightweight RPC nodes, and a flexible plugin system, Fleet transforms Besu into a modular, high-performance platform for both Ethereum mainnet and Layer 2 networks.

Key advantages include:

  • Efficient RPC scaling through a captain-follower architecture.

  • Reduced storage needs and faster deployment with lightweight nodes.

  • Modular integration via plugins, minimizing changes to Besu’s core.

The Besu team is excited to introduce Fleet as a big step forward for RPC providers. Paired with our versatile plugin system, Fleet unlocks new opportunities to extend and customize Besu, meeting diverse operational needs. This blog post explores these innovations, showing how they optimize processes and deliver value to developers and RPC providers.

Trie log shipping

Bonsai and the concept of trie logs

Before diving into the details of the Fleet solution, let’s set the stage. Besu currently uses Bonsai as its default state storage format. Bonsai offers several advantages, such as reducing the overall database size and paving the way for transaction parallelization. But there’s another compelling data structure Bonsai provides: the trie log.

The trie log is much like a state diff, capturing the differences between two blocks at the account, slot, and code levels. This makes it possible for Bonsai to efficiently roll forward or back the state between blocks.
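To make the "state diff" idea concrete, here is a minimal, hypothetical sketch of a trie log as a per-block diff. The format shown (a mapping of keys to before/after value pairs) is illustrative only, not Besu's actual trie log encoding, but it captures why the same structure can roll the state in either direction.

```python
# Hypothetical sketch of a trie log as a state diff (not Besu's actual format).
# Each entry records a value before and after the block, which is enough
# information to roll the state forward or backward.

def apply_trie_log(state, trie_log, forward=True):
    """Roll a flat state dict forward (prior -> updated) or back."""
    for key, (prior, updated) in trie_log.items():
        value = updated if forward else prior
        if value is None:
            state.pop(key, None)  # entry deleted in this direction
        else:
            state[key] = value
    return state

# State after block N, and the trie log produced by block N+1:
state = {"0xabc/balance": 100, "0xdef/balance": 50}
trie_log = {
    "0xabc/balance": (100, 70),   # (value before, value after)
    "0xdef/balance": (50, None),  # entry cleared by block N+1
}

apply_trie_log(state, trie_log, forward=True)   # roll forward to block N+1
assert state == {"0xabc/balance": 70}
apply_trie_log(state, trie_log, forward=False)  # roll back to block N
assert state == {"0xabc/balance": 100, "0xdef/balance": 50}
```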

The creation of trie log shipping

From that point, we began exploring the idea: what if we could send the trie log and make it accessible externally? This would provide an easy way to track all the changes within a block, potentially opening the door to a variety of new features and use cases. This idea led to the concept of trie log shipping.

With the help of a plugin, Besu can send the trie log for each block it generates to an external component. This makes it possible for the component to monitor changes in real time, stay in sync with the blockchain as it evolves, and maintain an up-to-date database, all without the need to re-execute the EVM. This approach provides an efficient and streamlined way to access and utilize the latest blockchain data.

Production use – Linea integration

We've started using this feature with Linea, a zkEVM Layer 2 solution on Ethereum built by Consensys. Linea leverages zero-knowledge proofs to provide scalable and efficient transactions while maintaining full compatibility with Ethereum.

After the EVM executes transactions in Besu, it generates a trie log that captures all state reads and changes within the block. This trie log is then sent to a secondary component called the state manager, which is designed to maintain a zk-friendly representation of the blockchain state. This approach eliminates the need to re-execute the EVM by constructing the zk-friendly state directly from the trie log. Unlike the traditional state trie (Patricia Merkle Trie), the state manager uses a different trie structure (Sparse Merkle Trie) specifically optimized for zero-knowledge proofs.

[Image 1]

This architecture simplifies the design of the secondary component, which no longer requires a full EVM implementation, and eases code management by keeping the modifications in plugins while leaving Besu's core code unchanged. By modularizing state maintenance, this architecture is well suited to the high-performance demands of zkEVM solutions like Linea. For more details on how this feature is implemented in Linea, check out the full explanation.

Seeing how effectively this approach worked, we decided to push the idea even further. 

A solution for scaling RPC providers

Our research on trie log shipping and Linea led us to another key use case: scaling RPC providers. Setting up an Ethereum client from scratch can take several hours and require massive amounts of disk space. A single client is often insufficient to meet scalability demands, and duplicating clients becomes necessary. However, syncing a new node every time is inefficient, as it requires significant time and storage, often terabytes of data. Creating a snapshot of a large database is also resource-intensive and slow, making it impractical for scalable solutions.

To address these challenges, why not leverage Bonsai's functionality, which has already been thoroughly tested and optimized? With trie log shipping, Bonsai's existing capabilities can be applied to this new problem.

Consider a setup with a Besu "captain" node that executes blocks, runs the EVM, communicates with the consensus layer (CL), and maintains peer connections. This captain node could manage a network of lightweight Besu "follower" nodes. These follower nodes would keep their state up to date by simply listening to the trie log and blockchain data provided by the captain. They wouldn’t require P2P connections, a transaction pool, CL integration, or EVM execution; they would only need to apply the trie log and blockchain data.
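The captain/follower flow described above can be sketched as follows. The class and method names are illustrative (not the Fleet plugin's real API), and the trie log is simplified to a flat key-to-value mapping: the captain executes blocks and ships each trie log, while followers only apply it in order.

```python
# Hypothetical sketch of the captain/follower flow (names are illustrative,
# not Besu's actual plugin API). The captain publishes (block number, trie log)
# pairs; each follower applies them in order to stay at the chain head.

class Captain:
    def __init__(self):
        self.followers = []

    def register(self, follower):
        self.followers.append(follower)

    def on_block_executed(self, block_number, trie_log):
        # After EVM execution, ship the trie log to every follower.
        for follower in self.followers:
            follower.apply(block_number, trie_log)

class Follower:
    def __init__(self):
        self.head = 0
        self.flat_db = {}  # flat database only: no trie, no P2P, no EVM

    def apply(self, block_number, trie_log):
        # Trie logs are diffs, so they must be applied sequentially.
        assert block_number == self.head + 1, "trie logs must apply in order"
        self.flat_db.update(trie_log)
        self.head = block_number

captain = Captain()
follower = Follower()
captain.register(follower)
captain.on_block_executed(1, {"0xabc/balance": 100})
captain.on_block_executed(2, {"0xabc/balance": 70})
assert follower.head == 2 and follower.flat_db["0xabc/balance"] == 70
```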

[Image 2]

We decided to consider the captain as trusted since it must be controlled by the same party as the followers, allowing for further optimization of this setup. By applying the trie log directly, the follower nodes could eliminate the need for the state trie and old blockchain data, operating instead on a streamlined, flat database.

Before diving deeper, here’s a quick reminder: Bonsai consists of two key components in the world state: the trie and the flat database. Both store critical information such as account details and slot data. By default, when a trie log is applied, a Besu node updates both the trie and the flat database. The trie is then used to verify the state root, while the flat database is the workhorse for accessing state data. 

If we rely solely on the flat database, the follower can still execute transactions or blocks (without verifying the state root), access account, slot, or code information, and handle most RPC calls. For a node dedicated to serving RPC requests, a flat database-only setup would be sufficient. This type of node would lack the ability to validate or create blocks, but that's not its intended purpose.

By removing the need to compute the state root, we can eliminate the trie. Additionally, we can remove blockchain history data, resulting in a more efficient node optimized for RPC services. By default, this approach limits the follower to storing only the last 2048 blocks, keeping it lightweight and highly efficient for near-head operations. This default can be adjusted as needed, and it is large enough to serve the RPC requests that need access to recent previous blocks.
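The near-head retention window behaves like a rolling buffer: as each new block arrives, anything older than the window is pruned. A minimal sketch, with an illustrative class name and a small window size for readability (the article's default is 2048):

```python
# Sketch of the near-head retention window: the follower keeps only the
# most recent blocks (2048 by default in Fleet), pruning older ones as
# new blocks arrive. Names and numbers here are illustrative.

from collections import OrderedDict

class NearHeadStore:
    def __init__(self, retained=2048):
        self.retained = retained
        self.blocks = OrderedDict()  # block_number -> block data

    def add_block(self, number, block):
        self.blocks[number] = block
        while len(self.blocks) > self.retained:
            self.blocks.popitem(last=False)  # prune the oldest block

store = NearHeadStore(retained=3)
for n in range(1, 6):
    store.add_block(n, f"block-{n}")
assert list(store.blocks) == [3, 4, 5]  # only the 3 most recent remain
```

Because pruning removes exactly as much as each new block adds, the follower's storage footprint stays flat over time, which is the behavior described in the next paragraph.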

Another important aspect to consider is that follower nodes should not grow too quickly in terms of storage size. By leveraging pruning and trie deletion, we observed almost no increase in the storage footprint of a follower node over an extended period. This suggests that it’s possible to run a follower node for a very long time without needing to increase disk size, which is a game-changer for scalability and long-term efficiency.

Lightweight nodes optimized for RPC requests

The result is a highly efficient system where a captain node manages a fleet of lightweight Besu followers, each with a database size of less than 80 GB on Ethereum mainnet. These followers are optimized to handle the majority of near-head RPC requests. They are incredibly easy to scale, as they can be spun up in less than 10 minutes to handle increased load. Since they don't execute the EVM to follow the head of the chain, they synchronize rapidly by applying trie logs from the captain, making the process simple and efficient.

An important point is that these nodes do not execute the EVM to stay synchronized with the chain, relying instead on the captain's trie logs. However, they are fully capable of executing the EVM for tasks such as eth_call operations, making them highly functional. 

What’s more, these followers can be started from a snapshot, thanks to their compact size. This means we can generate daily snapshots, and launching a follower from one of these snapshots takes only a few minutes. Once started, the follower smoothly syncs from the snapshot to the chain head. The small size of these snapshots (less than 80 GB) drastically reduces storage costs compared to the large and resource-heavy snapshots required for traditional nodes. This combination of speed, scalability, and cost-efficiency makes this approach a game-changer for managing RPC workloads.

Fleet implementation details – plugin-based architecture

The RPC provider scaling use case showcases just how versatile and effective plugins can be. To introduce this Fleet feature, we created a dedicated repository for our plugin. The Fleet plugin enables a Besu node to operate as either a captain or a follower, depending on a startup flag. In captain mode, the plugin manages sending the trie log, while in follower mode, it handles syncing via the trie log.

What makes this approach particularly appealing is that it requires almost no changes to Besu’s core code. All the logic is encapsulated within the plugin itself. To use the feature, we simply attach the plugin when starting Besu and select the desired mode. This setup allows us to implement advanced mechanisms like trie log shipping, a new method of syncing Besu via trie logs, and other features, all while keeping the core Besu codebase almost untouched. 

To help visualize how this works, here is a conceptual diagram of the Fleet plugin architecture with Besu.

Besu "captain" node:

  • Executes blocks and the EVM.

  • Validates the state root of each block. 

  • Communicates with the consensus layer (CL) and other peers.

  • Generates the trie log after each block execution.

  • Sends the trie log to Besu "Follower" nodes.

  • Maintains a list of follower nodes to manage their synchronization.

Besu "follower" nodes:

  • Receive the trie log from the Captain.

  • Do not require CL, P2P, transaction pool, or EVM execution.

  • Maintain an up-to-date state (only flat database) by continuously applying trie logs.

Trie log shipping:

  • The trie log captures the differences between two blocks, similar to a Git diff.

  • Trie logs are sent from the Captain to the Followers, enabling them to track state changes without re-executing the EVM.

Scalability:

  • Besu Follower nodes are lightweight, with a database size of less than 80 GB.

  • They can be deployed quickly to handle increased RPC load.

  • The Captain node manages the complexity, including EVM execution, P2P communication, and synchronization, while offloading state updates to Followers.

[Image 3]

Supported RPC calls

With this feature, follower nodes are capable of handling the majority of RPC requests. Below is a non-exhaustive list of what followers can manage effectively:

  • Accessing balance information for any account.

  • Accessing data from specific storage slots within smart contracts.

  • Retrieving details of processed transactions.

  • Getting the deployed code for smart contracts.

  • Retrieving headers, transactions, and other details for blocks within the recent block range (near head).

  • Performing eth_call requests.

  • Calculating the gas cost for transactions (eth_estimateGas).

Essentially, followers can support any operation that doesn’t require accessing the state trie (proof or root hash computation) and is focused on near-head data. This is an efficient and scalable solution for serving RPC calls in real-time without the overhead of a full node.

Follower nodes can handle a wide range of RPC calls efficiently, including but not limited to:

  • eth_blockNumber

  • eth_call

  • eth_chainId

  • eth_estimateGas

  • eth_gasPrice

  • eth_getBalance

  • eth_getBlock[*]

  • eth_getCode

  • eth_getLogs

  • eth_getStorageAt

  • eth_getTransaction[*]

  • debug_trace*

  • trace_*

But some RPC calls are not supported:

  • eth_getProof

This limitation exists because follower nodes do not maintain the full state trie, which is required for certain RPC calls like eth_getProof. 
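The supported/unsupported split above amounts to a simple routing rule: reject methods that need the state trie, serve everything else from the flat database and near-head block data. A hypothetical sketch of that check (the method lists mirror the article; the function itself is illustrative, not Besu code):

```python
# Sketch: deciding whether a follower can serve an RPC method. Methods that
# need the state trie (e.g. eth_getProof) are rejected; near-head state and
# block queries are served from the flat database. Illustrative only.

SUPPORTED = {
    "eth_blockNumber", "eth_call", "eth_chainId", "eth_estimateGas",
    "eth_gasPrice", "eth_getBalance", "eth_getCode", "eth_getLogs",
    "eth_getStorageAt",
}
TRIE_REQUIRED = {"eth_getProof"}

def can_serve(method):
    if method in TRIE_REQUIRED:
        return False  # requires the full state trie, which followers drop
    return method in SUPPORTED or method.startswith(
        ("eth_getBlock", "eth_getTransaction", "debug_trace", "trace_")
    )

assert can_serve("eth_getBalance")
assert can_serve("eth_getBlockByNumber")
assert not can_serve("eth_getProof")
```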

Performance

Sync time

On Ethereum mainnet, syncing a new follower node from a 9-hour-old snapshot on an i3en.2xlarge AWS VM takes approximately 6 minutes: 3 minutes to download the snapshot and another 3 minutes to synchronize to the latest block. Compared with the roughly one day needed to sync from scratch using vanilla Besu with checkpoint sync, this is a game-changing improvement.

RPC performance 

Since the follower nodes are dedicated to RPC tasks and do not execute blocks or handle peer-to-peer requests, we observed a significant improvement in RPC performance compared to vanilla Besu, resulting in more consistent and stable response times.

Captain’s capacity

We successfully tested over 300 follower nodes connected to a single captain, maintaining excellent performance on fleet_getBlock calls, with 99th percentile latency staying below 50 ms. fleet_getBlock is the call each follower uses to fetch new blocks from the captain. We observed that 300 followers generated approximately 75 RPS on fleet_getBlock under normal conditions, with occasional spikes reaching up to 1,000 RPS during sync and reorg events, without any noticeable performance degradation. This performance is largely due to a caching mechanism that stores the data of each block executed by the captain; it can be enabled with the --cache-last-blocks=n flag.
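The idea suggested by --cache-last-blocks=n can be sketched as a small in-memory cache on the captain: the last n executed blocks stay in memory, so fleet_getBlock requests from followers are answered without a database read, and only requests for older blocks fall back to storage. The class below is an illustrative model, not Besu's implementation.

```python
# Sketch of the block cache behind --cache-last-blocks=n: the captain keeps
# the last n executed blocks in memory so fleet_getBlock calls from followers
# avoid database reads. Illustrative only, not Besu's implementation.

from collections import OrderedDict

class BlockCache:
    def __init__(self, n):
        self.n, self.blocks = n, OrderedDict()
        self.hits = self.misses = 0

    def on_block_executed(self, number, block):
        self.blocks[number] = block
        if len(self.blocks) > self.n:
            self.blocks.popitem(last=False)  # evict the oldest cached block

    def fleet_get_block(self, number, db_lookup):
        if number in self.blocks:
            self.hits += 1
            return self.blocks[number]
        self.misses += 1
        return db_lookup(number)  # fall back to the database

cache = BlockCache(n=2)
cache.on_block_executed(1, "b1")
cache.on_block_executed(2, "b2")
cache.on_block_executed(3, "b3")
cache.fleet_get_block(3, lambda n: f"b{n}")  # served from cache
cache.fleet_get_block(1, lambda n: f"b{n}")  # evicted -> database
assert (cache.hits, cache.misses) == (1, 1)
```

Since followers request blocks at or very near the head, even a small n keeps the hit rate high during normal operation and absorbs the request spikes seen during sync and reorg events.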

Disk activity

During load testing of state calls (eth_call, eth_getBalance, and eth_getTransactionCount) in Fleet mode on Linea, we observed very low disk activity on nodes with an on-disk database. This can be attributed to the relatively small size of the Linea mainnet flat database. As shown below, all column families on a Linea mainnet Besu Fleet node occupy only 19 GiB, and with a well-configured Besu, we can fit all account data inside the database cache and benefit from a 99.9% hit ratio, where only 0.1% of database calls hit the disk. Additionally, since there is no block processing and no peer-to-peer traffic, the block cache remains dedicated to RPC requests. This was a key finding during our load testing.

[Image 4]

Conclusion

The Besu Fleet represents a paradigm shift in how RPC services are scaled, combining the efficiency of trie log (state diffs) shipping with the flexibility of Besu’s plugin architecture. By separating computational tasks between captain and follower nodes, Fleet mode enables rapid deployment, cost-efficient operations, and seamless scalability.

Fleet introduces an innovative way to handle the ever-increasing demand for RPC providers, delivering high performance, cost efficiency, and scalability for future growth.

We’d love to hear from you if you’re interested in finding out more! To contact us, please email [email protected].