In the lead-up to Ethereum’s Merge to Proof of Stake, Ethereum client teams and core researchers are performing several upgrades and tests on the consensus “Beacon Chain” to ensure a smooth transition.
The most recent of these, the Beacon Chain’s Altair hard fork, was largely an administrative upgrade and acted as a trial run for the next upgrade, Ethereum’s Merge to Proof of Stake. It took place on October 27th, 2021, and was successful overall: all client teams were able to navigate the upgrade.
Here’s a recap of the upgrade and how Consensys and Teku helped make it happen.
Learnings and issues from the upgrade
The Altair fork was the first upgrade to the Beacon Chain and went well, serving as a trial run for future upgrades. Network participants were instructed to update their clients ahead of the fork. Immediately after the fork, at 10:56 UTC on October 27th, participation dropped from 99.7% to ~95%, then shortly afterwards climbed back up to 98%. This was a positive sign: it indicated that the vast majority of validating nodes had been correctly upgraded in time. There was only one strange event during the fork.
Despite almost all attestations being present and correct, about 25% of blocks went missing, which was highly unexpected. The missing blocks seemed to come almost exclusively from validators running Prysm, but a root cause was difficult to identify, even though several theories were explored.
Eventually, the majority of misbehaving validators were traced to a single, very large staking operator that had apparently not fully upgraded its infrastructure in time for the Altair upgrade. After talking with core devs, the operator was able to complete the upgrade, and within 12 hours the Beacon Chain was once again running smoothly.
The Altair fork proved to be a rewarding exercise: useful lessons were learned by devs and operators alike that will stand teams in good stead when it comes to The Merge.
Teku’s role in Altair & the importance of client diversity
In spite of these challenges, the Beacon Chain remained up and running, and Teku performed perfectly throughout the upgrade. The Consensys team was able to set up the first Altair multi-client testnets, with Teku being the first fully Altair-compatible client. These testnets provided a useful reference point for other clients during their own implementations of the Altair upgrade.
The value of running these upgrades is to ensure risks and weaknesses are addressed ahead of The Merge. Client-specific protocol or interoperability bugs will always be a challenge in the Ethereum ecosystem. On this occasion, there were no client-specific bugs. But if there had been an issue with the current majority client, Prysm, the consequences would have been very severe. Avoiding a client supermajority is essential for de-risking events like Altair, and continued investment in client diversity is critical for the network.
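To make the supermajority risk concrete, here is a minimal sketch of how one might classify a client's share of stake. It relies on the fact that Proof of Stake finality needs at least two-thirds of stake to agree: a faulty client holding two-thirds or more could finalize a bad chain on its own, while one holding more than one-third could stop the rest of the network from finalizing. The function names and the client shares are hypothetical and for illustration only; they are not measured network data.

```python
from fractions import Fraction

# Finality needs at least 2/3 of stake to agree.
FINALITY_THRESHOLD = Fraction(2, 3)
# If a faulty client holds more than 1/3, the remainder is below 2/3
# and cannot finalize on its own.
LIVENESS_THRESHOLD = Fraction(1, 3)


def classify_client_risk(share: Fraction) -> str:
    """Classify the impact of a bug in a client holding `share` of stake."""
    if share >= FINALITY_THRESHOLD:
        return "supermajority"        # could finalize an incorrect chain
    if share > LIVENESS_THRESHOLD:
        return "can-halt-finality"    # remaining clients cannot reach 2/3
    return "contained"                # the rest of the network still finalizes


def assess_network(shares: dict) -> dict:
    """Map each client name to its risk class."""
    return {name: classify_client_risk(share) for name, share in shares.items()}


# Hypothetical distribution, NOT real network figures.
example_shares = {
    "client_a": Fraction(68, 100),
    "client_b": Fraction(20, 100),
    "client_c": Fraction(12, 100),
}
report = assess_network(example_shares)
```

In this made-up distribution, `client_a` is classified as a supermajority, which is exactly the situation client diversity efforts aim to avoid.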
The Teku team has worked hard to ensure interoperability with as many clients and configurations as possible, looking toward a diverse Proof of Stake Ethereum.
Consensys R&D, a group of industry-progressing researchers and engineers, has been working closely with the Teku team to help define specs and prototype key pieces for The Merge. Our lightweight client is leading the way, both on testnets and in interoperability between consensus and execution clients. Teku, the consensus client, and Hyperledger Besu, the execution client, have given our team insights into interoperability and the full-stack approach for next-generation Ethereum.
Now we look toward The Merge. Altair was successful and a great test of the network’s ability to upgrade, coordinate across clients, and prevent bugs. Consensys’ protocol engineers and researchers continue to work hard on execution and consensus clients, developing and formalizing Merge specifications, prototyping new tech like sharding, and creating and participating in testnets.
For those of you staking: if you upgraded your node only after the fork, please resync it using Teku and Infura’s pioneering initial-state option. This will get you up and running again in a few minutes.
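Once a resync has started, one way to confirm the node has caught up is to poll the standard Beacon API syncing endpoint, `/eth/v1/node/syncing`. The sketch below is only an illustration: it assumes the node's REST API is enabled and reachable at `http://localhost:5051`, and the helper names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: the beacon node's REST API is enabled and listening here.
DEFAULT_NODE_URL = "http://localhost:5051"


def parse_sync_status(body: str) -> dict:
    """Parse the JSON body returned by /eth/v1/node/syncing."""
    data = json.loads(body)["data"]
    return {
        # The Beacon API encodes slot numbers as decimal strings.
        "head_slot": int(data["head_slot"]),
        "sync_distance": int(data["sync_distance"]),
        "is_syncing": bool(data["is_syncing"]),
    }


def fetch_sync_status(node_url: str = DEFAULT_NODE_URL) -> dict:
    """Query a running node and return its parsed sync status."""
    with urlopen(f"{node_url}/eth/v1/node/syncing", timeout=10) as resp:
        return parse_sync_status(resp.read().decode("utf-8"))
```

Calling `fetch_sync_status()` against a running node returns a small dictionary; when `is_syncing` is false and `sync_distance` is 0, the node has reached the head of the chain.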