Special thanks to Justin Drake, Francesco, Hsiao-wei Wang, @antonttc, and Georgios Konstantopoulos.
Initially, there were two scaling strategies in the Ethereum roadmap. One (see an early paper from 2015) was 'sharding': each node only needs to verify and store a small fraction of transactions, rather than every transaction on the chain. Other peer-to-peer networks (such as BitTorrent) already work this way, so surely blockchains could too. The other was layer 2 protocols: networks that sit on top of Ethereum, benefiting fully from its security while keeping most data and computation off the main chain. 'Layer 2 protocols' meant state channels in 2015, Plasma in 2017, and then rollups in 2019. Rollups are more powerful than state channels or Plasma, but they require a large amount of on-chain data bandwidth. Fortunately, by 2019 sharding research had solved the problem of verifying 'data availability' at scale. As a result, the two paths merged, and we got the rollup-centric roadmap, which remains Ethereum's scaling strategy today.
The Surge, 2023 Roadmap Edition
The Rollup-centric roadmap proposes a simple division of labor: ETH L1 focuses on becoming a powerful and decentralized base layer, while L2 takes on the task of helping the ecosystem scale. This pattern is ubiquitous in society: the existence of the court system (L1) is not for pursuing ultra-high speed and efficiency, but for protecting contracts and property rights, while entrepreneurs (L2) need to build on this solid foundation to lead humanity towards Mars (both literally and metaphorically).
This year, the rollup-centric roadmap has hit important milestones: with the launch of EIP-4844 blobs, Ethereum L1's data bandwidth has increased significantly, and multiple EVM rollups have reached stage 1. Each L2 exists as a 'shard' with its own internal rules and logic, and diversity and pluralism in how shards are implemented is now a reality. But as we have seen, this path also comes with some unique challenges. Our task now is to complete the rollup-centric roadmap and address these problems, while preserving the robustness and decentralization that make Ethereum L1 special.
The Surge: Key Objectives
Reach more than 100,000 TPS on Ethereum through L2s;
Maintain the decentralization and robustness of L1;
At least some L2s fully inherit Ethereum's core properties (trustlessness, openness, censorship resistance);
Ethereum should feel like a unified ecosystem, not 34 different blockchains.
This Chapter
Scalability Trilemma
Further progress in data availability sampling
Data Compression
Generalized Plasma
Mature L2 proof system
Cross-L2 interoperability improvement
Expand execution on L1
Scalability Trilemma
The scalability trilemma is an idea proposed in 2017, which argues that there is a tension between three properties of a blockchain: decentralization (specifically, low cost of running a node), scalability (processing a large number of transactions), and security (an attacker would need to compromise a large fraction of the nodes in the network to make even a single transaction fail).
It is worth noting that the trilemma is not a theorem, and the post introducing it did not come with a mathematical proof. It does give a heuristic mathematical argument: if a decentralization-friendly node (such as a consumer laptop) can verify N transactions per second, and you have a chain that processes k*N transactions per second, then either (i) each transaction is only seen by 1/k of the nodes, which means an attacker only needs to compromise a few nodes to push through a malicious transaction, or (ii) your nodes become powerful while your chain does not stay decentralized. The purpose of the argument was never to prove that breaking the trilemma is impossible; rather, it shows that breaking it is hard and requires thinking outside the framework the argument implies.
For years, some high-performance chains have claimed to have solved the trilemma without any fundamental change to their architecture, usually by applying software engineering tricks to optimize the node. This is always misleading: running a node on such a chain is far harder than running a node on Ethereum. This article explores why that is, and why L1 client software engineering alone cannot scale Ethereum.
However, the combination of data availability sampling and SNARKs does solve the trilemma: it allows clients to verify that a certain amount of data is available and a certain number of computational steps are executed correctly by downloading only a small amount of data and performing a minimal amount of computation. SNARKs are trustless. Data availability sampling has a subtle few-of-N trust model, but it preserves the fundamental property of an unscalable chain that even a 51% attack cannot force bad blocks to be accepted by the network.
Another way to solve the trilemma is the Plasma architecture, which cleverly shifts the responsibility for monitoring data availability to users in an incentivized manner. As early as 2017-2019, when we only had fraud proof as a means to expand computational capacity, Plasma was very limited in terms of secure execution. However, with the widespread adoption of SNARKs (Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge), the Plasma architecture has become more feasible for a wider range of use cases than ever before.
Further Progress on Data Availability Sampling
What problem are we solving?
On March 13, 2024, when the Dencun upgrade went live, the Ethereum chain gained roughly three blobs of about 125 kB per 12-second slot, or about 375 kB of data availability bandwidth per slot. Assuming transaction data is published on-chain directly, an ERC20 transfer takes about 180 bytes, so the maximum TPS for rollups on Ethereum is roughly 375,000 / 12 / 180 ≈ 173.6 TPS.
If we add Ethereum's calldata (theoretical maximum: 30 million gas per slot / 16 gas per byte = 1,875,000 bytes per slot), this becomes 607 TPS. With PeerDAS, the blob count may increase to 8-16, which would give 463-926 TPS from blob data.
This is a major upgrade for Ethereum L1, but it is not enough. We want far more scalability. Our mid-term target is 16 MB per slot, which, combined with improvements in rollup data compression, would give us ~58,000 TPS.
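As a quick sanity check of the figures above, here is the arithmetic, assuming ~125 kB blobs, 12-second slots, and ~180 bytes per ERC20 transfer (the ~58,000 TPS figure additionally assumes compression gains, so it is not reproduced here):

```python
# Back-of-the-envelope TPS figures from the data-bandwidth numbers above.
BLOB_BYTES = 125_000      # approximate size of one blob
SLOT_SECONDS = 12
ERC20_TX_BYTES = 180      # rough rollup footprint of one ERC20 transfer

def rollup_tps(blobs_per_slot: int) -> float:
    """TPS if every blob byte carried ERC20 transfers."""
    return blobs_per_slot * BLOB_BYTES / SLOT_SECONDS / ERC20_TX_BYTES

print(rollup_tps(3))    # ~173.6 TPS  (3 blobs per slot today)
print(rollup_tps(8))    # ~463 TPS    (PeerDAS, lower end)
print(rollup_tps(16))   # ~926 TPS    (PeerDAS, upper end)

# Mid-term target: 16 MB of blob space per slot, before compression gains.
print(16_000_000 / SLOT_SECONDS / ERC20_TX_BYTES)   # ~7407 TPS
```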
What is it? How does it work?
PeerDAS is a relatively simple implementation of '1D sampling'. On Ethereum, each blob is a degree-4096 polynomial over a 253-bit prime field. We broadcast 'shares' of the polynomial, where each share consists of 16 evaluations at 16 adjacent coordinates drawn from a total of 8192 coordinates. Any 4096 of the 8192 evaluations (with the currently proposed parameters: any 64 of the 128 possible samples) can recover the blob.
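To see why any half of the evaluations suffice, here is a toy version of the same erasure-coding idea, with a degree-3 polynomial over a tiny prime field standing in for a degree-4096 polynomial over a 253-bit field (and ignoring the grouping of evaluations into 16-coordinate samples):

```python
# Toy illustration, not the real parameters: a polynomial of degree < k over
# a prime field is determined by any k of its evaluations. Here k = 4, p = 97.
P = 97                       # toy prime modulus
COEFFS = [5, 17, 42, 3]      # "blob" = degree-3 polynomial, k = 4

def evaluate(coeffs, x, p=P):
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

def interpolate(points, x, p=P):
    """Lagrange-interpolate the unique degree-<k polynomial through `points`
    (a list of (xi, yi) pairs) and evaluate it at x, mod p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

# Publish 8 evaluations (2x redundancy, like 8192 evaluations for k = 4096).
evals = {x: evaluate(COEFFS, x) for x in range(8)}

# Pretend we only sampled 4 of them -- any 4 suffice to recover the rest.
known = [(x, evals[x]) for x in (1, 3, 6, 7)]
assert all(interpolate(known, x) == y for x, y in evals.items())
print("recovered all 8 evaluations from any 4 of them")
```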
PeerDAS works by having each client listen to a small number of subnets, where the i-th subnet broadcasts the i-th sample of every blob, and by asking peers in the global p2p network (who listen to different subnets) for blobs on other subnets that it needs. A more conservative version, SubnetDAS, uses only the subnet mechanism, without the extra layer of asking peers. The current proposal is for nodes participating in proof of stake to use SubnetDAS, and for other nodes (i.e. clients) to use PeerDAS.
In theory, we can scale '1D sampling' quite far: if we raise the blob count maximum to 256 (with a target of 128), we reach the 16 MB goal, with data availability sampling costing each node 16 samples * 128 blobs * 512 bytes per sample = 1 MB of data bandwidth per slot. This is just barely within our tolerance: it is feasible, but it means bandwidth-constrained clients cannot sample. We can optimize this somewhat by decreasing the blob count and increasing blob size, but that makes reconstruction more expensive.
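The per-node bandwidth figure is just arithmetic on the stated parameters; spelling it out:

```python
# Per-node sampling bandwidth for the scaled-up 1D DAS scenario above
# (16 samples per node, 128 blobs, 512 bytes per sample, 12-second slots).
samples_per_node, blobs, sample_bytes, slot_seconds = 16, 128, 512, 12
per_slot = samples_per_node * blobs * sample_bytes
print(per_slot)                 # 1_048_576 bytes ≈ 1 MB per slot
print(per_slot / slot_seconds)  # ≈ 87 kB/s sustained download
```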
Therefore, we ultimately want to go further and do 2D sampling, which samples randomly not only within blobs but also between blobs. Using the linearity of KZG commitments, we extend the set of blobs in a block with a list of new 'virtual blobs' that redundantly encode the same information.
2D Sampling. Source: a16z crypto.
Crucially, computing the extension of the commitments does not require having the blobs, so the scheme is fundamentally friendly to distributed block building. The node that actually builds the block only needs to hold the blob KZG commitments, and can rely on data availability sampling (DAS) to verify that the blobs are available. 1D DAS is also inherently friendly to distributed block building.
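To illustrate why the extended commitments can be computed without the blob data, here is a toy with a deliberately simplified linear commitment standing in for KZG (a dot product with public random field elements; it is not KZG and not secure, it only shares the linearity that matters here):

```python
import random

# Toy illustration of the linearity property used in 2D sampling: the
# commitment to a linear combination of blobs equals the same linear
# combination of their commitments, so "virtual blob" commitments can be
# computed by anyone who holds only the commitments.
P = 2**31 - 1                                  # toy prime field
N = 8                                          # toy blob length
G = [random.randrange(P) for _ in range(N)]    # public "setup"

def commit(blob):
    return sum(b * g for b, g in zip(blob, G)) % P

blobs = [[random.randrange(P) for _ in range(N)] for _ in range(4)]
commitments = [commit(b) for b in blobs]

# A "virtual" blob: a publicly known linear combination of the real blobs
# (in the real scheme the coefficients come from the erasure-code extension).
coeffs = [3, 1, 4, 1]
virtual_blob = [sum(c * b[i] for c, b in zip(coeffs, blobs)) % P
                for i in range(N)]

# Computed from the commitments alone, without ever seeing the blob data:
virtual_commitment = sum(c * C for c, C in zip(coeffs, commitments)) % P
assert virtual_commitment == commit(virtual_blob)
print("commitment to the virtual blob computed without the blob data")
```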
What are the links to existing research?
Introduction to Data Availability Original Post (2018):
Follow-up paper:
Explainer article on DAS by Paradigm:
2D availability with KZG commitment:
PeerDAS on ethresear.ch: and the paper:
EIP-7594:
SubnetDAS on ethresear.ch:
Subtle differences in recoverability in 2D sampling:
What else needs to be done? What are the trade-offs?
The immediate next step is to implement and roll out PeerDAS. From there, it is a gradual process of continually increasing the blob count on PeerDAS while carefully watching the network and improving the software to ensure safety. In the meantime, we want more academic work formalizing PeerDAS and other versions of DAS, and their interactions with questions such as the safety of the fork choice rule.
Further out, we need much more work to figure out the ideal version of 2D DAS and prove its security properties. We also want to eventually migrate away from KZG to an alternative that is quantum-resistant and requires no trusted setup. At the moment, we are not sure which candidates are friendly to distributed block building. Even the expensive 'brute force' technique of using recursive STARKs to generate validity proofs for reconstructing rows and columns does not suffice, because while technically a STARK is O(log(n) * log(log(n))) hashes in size (with STIR), in practice a STARK is nearly as big as an entire blob.
In my view, the realistic long-term path is one of:
Implement the ideal 2D DAS;
Stick with 1D DAS, sacrificing sampling bandwidth efficiency and accepting a lower data ceiling for the sake of simplicity and robustness;
(Hard pivot) Abandon DA and fully embrace Plasma as the primary layer 2 architecture we focus on.
Note that these options exist even if we decide to scale execution on L1 directly. This is because if L1 is to handle a large volume of TPS, L1 blocks will become very large, and clients will want an efficient way to verify that they are correct, so we would have to use the same technology that powers rollups (such as ZK-EVMs and DAS) on L1.
How to interact with other parts of the roadmap?
If data compression is implemented, the need for 2D DAS is reduced, or at least delayed; if Plasma becomes widely used, it is reduced further. DAS also poses a challenge for distributed block building protocols and mechanisms: while DAS is theoretically friendly to distributed reconstruction, in practice this needs to be combined with inclusion list proposals and the fork choice mechanics around them.
Data Compression
What problem are we solving?
Each transaction in a rollup takes up a significant amount of on-chain data space: an ERC20 transfer requires about 180 bytes. Even with ideal data availability sampling, this caps the scalability of layer 2 protocols. With 16 MB per slot, we get:
16000000 / 12 / 180 = 7407 TPS
What if we could not only solve the problem of the numerator, but also the problem of the denominator, so that each transaction in the Rollup occupies fewer bytes on-chain?
What is it, how does it work?
In my opinion, the best explanation is this picture from two years ago:
In zero-byte compression, each long sequence of zero bytes is replaced with two bytes indicating how many zero bytes there are. Going further, we take advantage of specific properties of transactions (a minimal sketch of two of these techniques follows the list below):
Signature aggregation: we switch from ECDSA signatures to BLS signatures. BLS signatures have the property that many signatures can be combined into a single signature that attests to the validity of all the original signatures. On L1, the computational cost of verification is high even with aggregation, so BLS signatures are not considered worthwhile there. But on L2, where data is the scarce resource, using BLS signatures makes sense. The aggregation feature of ERC-4337 offers one path to implementing this.
Replacing addresses with pointers: if an address has been used before, we can replace the 20-byte address with a 4-byte pointer to a location in history.
Custom serialization of transaction values: most transaction amounts have only a few significant digits; for example, 0.25 ETH is represented as 250,000,000,000,000,000 wei. Max base fees and priority fees are similar. We can therefore use a custom decimal floating-point format to represent most currency values.
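As referenced above, here is a minimal sketch of two of these ideas, zero-byte run-length compression and a decimal floating-point value encoding; the byte formats are invented for illustration and are not any rollup's actual wire format:

```python
def compress_zero_bytes(data: bytes) -> bytes:
    """Replace each run of zero bytes with 0x00 followed by the run length."""
    out, i = bytearray(), 0
    while i < len(data):
        if data[i] == 0:
            run = 0
            while i < len(data) and data[i] == 0 and run < 255:
                run, i = run + 1, i + 1
            out += bytes([0, run])
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def encode_value(wei: int) -> bytes:
    """Encode a round amount as (exponent, mantissa): wei = mantissa * 10**exp."""
    exp = 0
    while wei > 0 and wei % 10 == 0:
        wei //= 10
        exp += 1
    return bytes([exp]) + wei.to_bytes((wei.bit_length() + 7) // 8 or 1, "big")

calldata = bytes([0] * 12 + [0xAB, 0xCD] + [0] * 20 + [0x01])
print(len(calldata), "->", len(compress_zero_bytes(calldata)))  # 35 -> 7

print(encode_value(250_000_000_000_000_000).hex())  # 0.25 ETH fits in 2 bytes
```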
What are the links to existing research?
Explore sequence.xyz:
L2 Calldata Optimized Contract:
Validity-proof-based rollups (aka ZK rollups) publishing state diffs instead of transactions:
BLS Wallet - Implementing BLS aggregation through ERC-4337:
What else needs to be done, what are the trade-offs?
The next main thing to do is to actually implement the above plan. The main considerations include:
Switching to BLS signatures takes significant effort and reduces compatibility with trusted hardware chips that enhance security. A ZK-SNARK wrapper around other signature schemes can be used instead.
Dynamic compression (for example, replacing Address with pointers) will make the client code more complex.
Publishing state diffs on-chain instead of transactions reduces auditability and breaks much existing software (e.g. block explorers).
How to interact with other parts of the roadmap?
Adopting ERC-4337, and eventually enshrining parts of it in the L2 EVM, can greatly accelerate the deployment of aggregation technology. Enshrining parts of ERC-4337 on L1 can in turn speed up its deployment on L2s.
Generalized Plasma
What problem are we solving?
Even with 16 MB blobs and data compression, 58,000 TPS may not be enough to fully cover consumer payments, decentralized social media, or other high-bandwidth domains, especially once we start taking privacy into account, which could reduce scalability by 3-8x. For high-volume, low-value use cases, one option today is Validium, which keeps data off-chain and has an interesting security model: the operator cannot steal users' funds, but it can temporarily or permanently freeze all users' funds. But we can do better.
What is it and how does it work?
Plasma is a scaling solution in which an operator publishes blocks off-chain and puts only the Merkle roots of those blocks on-chain (unlike rollups, which put full blocks on-chain). For each block, the operator sends every user a Merkle proof of what did or did not change about that user's assets. Users can withdraw their assets by providing a Merkle proof. Importantly, the proof does not have to be rooted in the latest state: even if data availability fails, users can still recover their assets by withdrawing against the latest state available to them. If a user submits an invalid proof (for example, exiting an asset they have already sent to someone else, or the operator creating an asset out of thin air), an on-chain challenge mechanism adjudicates who the asset rightfully belongs to.
Plasma Cash chain diagram. The transaction spending coin i is placed at the i-th position in the tree. In this example, assuming all previous trees are valid, we know that Eve currently owns Token 1, David owns Token 4, and George owns Token 6.
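For concreteness, here is a minimal sketch of the kind of Merkle-proof check a Plasma exit relies on: the user holds their leaf and a sibling path, while the chain only stores the root. The hashing and leaf encoding are illustrative, not any particular Plasma implementation:

```python
from hashlib import sha256

def h(a: bytes, b: bytes) -> bytes:
    return sha256(a + b).digest()

def merkle_root(leaves):
    layer = [sha256(l).digest() for l in leaves]
    while len(layer) > 1:
        layer = [h(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def prove(leaves, index):
    """Return the sibling path for leaf `index`: (sibling_is_left, sibling_hash) pairs."""
    layer = [sha256(l).digest() for l in leaves]
    path = []
    while len(layer) > 1:
        path.append((index % 2 == 1, layer[index ^ 1]))
        layer = [h(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return path

def verify(leaf, path, root):
    node = sha256(leaf).digest()
    for sibling_is_left, sibling in path:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    return node == root

leaves = [f"coin {i}: owner".encode() for i in range(8)]   # power-of-two tree
root = merkle_root(leaves)                                 # this goes on-chain
proof = prove(leaves, 4)                                   # operator sends this to the user
assert verify(leaves[4], proof, root)                      # user can exit with it
```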
Early versions of Plasma could only handle payment use cases, which hindered its further adoption. However, if we require every root to be verified by SNARKs, Plasma can become much more powerful. Every challenge game can be greatly simplified, as we exclude most of the possible paths for operator cheating. At the same time, new paths are also opened up, allowing Plasma technology to expand to a wider range of asset categories. Finally, in the case where operators do not cheat, users can withdraw funds immediately without waiting for a one-week challenge period.
One way (not the only way) to create an EVM Plasma chain is to use ZK-SNARK to build a parallel UTXO tree, which reflects the balance changes made by EVM and defines the unique mapping of the ‘same token’ at different points in history. Then a Plasma structure can be built on top of it.
A key insight is that the Plasma system does not need to be perfect. Even if you can only protect a subset of assets (for example, tokens that have not moved in the past week), you have already greatly improved on the status quo of hyper-scalable EVMs, which is Validium.
Another class of constructions is hybrid Plasma/rollups, such as Intmax. These constructions put a very small amount of data per user on-chain (e.g. 5 bytes), which gives them properties somewhere between Plasma and rollups: in the case of Intmax, you get very high scalability and privacy, although even in the 16 MB world capacity is theoretically capped at roughly 16,000,000 / 12 / 5 = 266,667 TPS.
What are the links related to existing research?
Original Plasma paper:
Plasma Cash:
Plasma Cashflow:
Intmax ( 2023):
What else needs to be done? What are the trade-offs?
The main remaining task is to bring Plasma systems into production. As mentioned above, 'Plasma vs. Validium' is not a binary choice: any Validium can improve its security properties at least somewhat by adding Plasma features to its exit mechanism. The research focus is on getting the best possible properties for the EVM (in terms of trust requirements, worst-case L1 gas costs, and resistance to DoS attacks), as well as on alternative application-specific constructions. In addition, Plasma is conceptually more complex than rollups, which needs to be addressed directly by researching and building better general frameworks.
The main trade-offs of Plasma designs are that they depend more on operators and are harder to make 'based' (i.e. sequenced by L1), though hybrid Plasma/rollup designs can often avoid this weakness.
How to interact with other parts of the roadmap?
The more effective Plasma solutions are, the less pressure there is for L1 to provide high-performance data availability. Moving activity to L2 also reduces MEV pressure on L1.
Mature L2 Proof System
What problem are we solving?
Currently, most rollups are not actually trustless: there is a security council with the ability to override the behavior of the (optimistic or validity) proof system. In some cases the proof system does not even run at all, or if it does, it only has an 'advisory' role. The furthest along are: (i) a few application-specific rollups that are trustless, such as Fuel; and (ii) as of this writing, Optimism and Arbitrum, two full-EVM rollups that have reached the partial-trustlessness milestone known as 'stage 1'. The reason rollups have not gone further is concern about bugs in the code. We need trustless rollups, so we must tackle this problem head-on.
What is it and how does it work?
First, let’s review the ‘stage’ system introduced in this article.
Stage 0: users must be able to run a node and sync the chain. It is fine if validation is fully trusted/centralized.
Stage 1: there must be a (trustless) proof system ensuring that only valid transactions are accepted. A security council that can override the proof system is allowed, but only with a 75% threshold vote. Additionally, a quorum-blocking share of the council (i.e. 26%+) must be outside the main company building the rollup. A weaker upgrade mechanism (e.g. a DAO) is allowed, but it must have a long enough delay that if it approves a malicious upgrade, users can withdraw their funds before it goes live.
Stage 2: there must be a (trustless) proof system ensuring that only valid transactions are accepted. The security council is only allowed to intervene in cases of provable bugs in the code, for example if two redundant proof systems disagree with each other, or if one proof system accepts two different post-state roots for the same block (or accepts nothing for a sufficiently long time, e.g. a week). An upgrade mechanism is allowed, but it must have a very long delay.
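As a toy illustration, the stage criteria above can be encoded roughly as follows; the field names and the numeric thresholds for what counts as a "long enough" delay are made-up assumptions, not part of the actual definitions:

```python
from dataclasses import dataclass

@dataclass
class RollupConfig:
    has_running_proof_system: bool
    council_override_threshold: float       # fraction of council needed to override
    council_outsiders_fraction: float       # fraction outside the main company
    council_limited_to_provable_bugs: bool  # council can only act on provable bugs
    upgrade_delay_days: int

def stage(cfg: RollupConfig) -> int:
    """Rough stage classification; delay thresholds below are illustrative guesses."""
    if not cfg.has_running_proof_system:
        return 0
    stage1 = (cfg.council_override_threshold >= 0.75
              and cfg.council_outsiders_fraction >= 0.26
              and cfg.upgrade_delay_days >= 7)      # "sufficiently long" is assumed
    if not stage1:
        return 0
    stage2 = cfg.council_limited_to_provable_bugs and cfg.upgrade_delay_days >= 30
    return 2 if stage2 else 1

print(stage(RollupConfig(True, 0.75, 0.30, False, 14)))  # 1
print(stage(RollupConfig(True, 0.75, 0.30, True, 45)))   # 2
```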
Our goal is to reach stage 2. The main challenge in getting there is gaining enough confidence that the proof system really is trustworthy enough. There are two main ways to do this:
Formal verification: we can use modern mathematical and computational techniques to prove that an (optimistic or validity) proof system only accepts blocks that satisfy the EVM specification. These techniques have existed for decades, but recent advances (such as Lean 4) have made them much more practical, and progress in AI-assisted proving may accelerate this trend further.
Multi-provers: build multiple proof systems, and put funds behind those proof systems together with a security council (or another gadget with minor trust assumptions, such as a TEE). If the proof systems agree, the security council has no power; if they disagree, the security council can only choose between their answers, and cannot unilaterally impose its own.
Schematic diagram of a multi-prover system combining an optimistic proof system, a validity proof system, and a security council.
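The adjudication logic of such a multi-prover setup is small enough to sketch; the interface below is hypothetical, but it captures the rule that the council can only pick among the provers' answers and never introduce its own:

```python
from typing import Optional, Sequence

def settle_state_root(
    prover_outputs: Sequence[str],          # post-state roots claimed by each proof system
    council_choice: Optional[str] = None,   # council input, only consulted on disagreement
) -> str:
    distinct = set(prover_outputs)
    if len(distinct) == 1:                  # all proof systems agree: council is powerless
        return prover_outputs[0]
    # Disagreement: the council must choose one of the already-proposed roots.
    if council_choice not in distinct:
        raise ValueError("council may only pick among the provers' answers")
    return council_choice

# Agreement: the council's opinion is irrelevant.
assert settle_state_root(["0xabc", "0xabc"], council_choice="0xdef") == "0xabc"
# Disagreement: the council breaks the tie, but only between existing answers.
assert settle_state_root(["0xabc", "0xdef"], council_choice="0xdef") == "0xdef"
```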
What are the links to existing research?
EVM K Semantics (formal verification work from 2017):
Presentation on the multi-prover idea (2022):
Taiko plans to use multi-proof:
What else needs to be done? What are the trade-offs?
For formal verification, the hard part is the sheer volume of work: we need to create a formally verified version of an entire SNARK prover for the EVM. This is an extremely complex project, although we have already started. There is one trick that greatly simplifies the task: we can build a formally verified SNARK prover for a minimal virtual machine (such as RISC-V or Cairo), and then implement the EVM inside that minimal VM (and formally prove its equivalence to some other EVM specification).
For multi-provers, two main pieces remain. First, we need enough confidence in at least two different proof systems, both that each is reasonably secure on its own and that, if they fail, they fail for different and unrelated reasons (so they do not fail at the same time). Second, we need a very high level of assurance in the underlying logic that merges the proof systems. This piece of code is much smaller; there are ways to make it very small, such as keeping funds in a multisig contract whose signers are contracts representing the individual proof systems, but this increases on-chain gas costs. We need to find a balance between efficiency and security.
How to interact with other parts of the roadmap?
Moving activity to L2 reduces MEV pressure on L1.
Cross-Layer 2 interoperability enhancement
What problem are we solving?
One of the main challenges of the L2 ecosystem today is that it is hard for users to navigate. Moreover, the easiest ways of navigating it often reintroduce trust assumptions: centralized bridges, RPC clients, and so on. We need using the L2 ecosystem to feel like using a single unified Ethereum ecosystem.
What is it? How does it work?
There are many categories of cross-L2 interoperability improvements. In theory, a rollup-centric Ethereum is the same thing as L1 execution sharding. In practice, however, today's Ethereum L2 ecosystem still falls far short of that ideal. Areas to improve include:
Chain-specific addresses: an address should contain chain information (L1, Optimism, Arbitrum, etc.). Once this is in place, sending across L2s can be done by simply pasting the address into the 'send' field, with the wallet working out how to make the send in the background (including using bridging protocols). A structured-data sketch of this and the next item follows this list.
Chain-specific payment requests: it should be easy and standardized to create a message of the form 'send me X tokens of type Y on chain Z'. The two main use cases are: (i) payments between people, or between people and merchant services; (ii) dapps requesting funds.
Cross-chain swaps and gas payment: there should be a standardized open protocol for expressing cross-chain operations, such as 'I send 1 ETH on Optimism to whoever sends me 0.9999 ETH on Arbitrum', and 'I send 0.0001 ETH on Optimism to whoever includes this transaction on Arbitrum'. ERC-7683 is an attempt at the former and RIP-7755 at the latter, though both are more general than these specific use cases.
Light clients: users should be able to actually verify the chains they are interacting with, rather than just trusting RPC providers. a16z crypto's Helios does this for Ethereum itself, but we need to extend this trustlessness to L2s. ERC-3668 (CCIP-read) is one strategy for getting there.
How a light client updates its view of the Ethereum header chain. Once you have the header chain, you can use Merkle proofs to verify any state object. And once you have the right L1 state objects, you can use Merkle proofs (plus, if you want to check pre-confirmations, possibly also signatures) to verify any state object on L2. Helios does the former already. Extending to the latter is a standardization challenge.
Keystore wallets: today, if you want to update the keys that control your smart contract wallet, you have to do it on all N chains where that wallet exists. Keystore wallets are a technique that lets the keys live in one place (either on L1, or potentially later on an L2), from which any L2 that has a copy of the wallet can read them. This means updates only need to happen once. To be efficient, keystore wallets require L2s to have a standardized way to read from L1 at no cost; two proposals for this are L1SLOAD and REMOTESTATICCALL.
How Keystore Wallet Works
A more radical 'shared token bridge' idea: imagine a world where all L2s are validity-proof rollups that commit to Ethereum every slot. Even then, moving assets 'natively' from one L2 to another would require withdrawals and deposits, which cost a significant amount of L1 gas. One way to solve this is to create a shared minimal rollup whose only function is to keep track of which L2 owns how much of each type of token, and to allow those balances to be updated in batches through a series of cross-L2 send operations initiated by any of the L2s. This would allow cross-L2 transfers to happen without paying L1 gas per transfer, and without liquidity-provider-based techniques such as ERC-7683.
Synchronous composability: allow synchronous calls between a specific L2 and L1, or between multiple L2s. This could help improve the financial efficiency of DeFi protocols. The former can be done without any cross-L2 coordination; the latter requires shared sequencing. Based rollups automatically get all of this.
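As a concrete sketch of the first two items in the list above, here is what a chain-specific address and a chain-specific payment request might look like as structured data; the chain short names and the request format are illustrative assumptions loosely modeled on the ERC-3770 "shortName:address" style, not a finished standard:

```python
from dataclasses import dataclass

KNOWN_CHAINS = {"eth", "oeth", "arb1"}   # assumed short names, for illustration only

@dataclass
class ChainAddress:
    chain: str          # e.g. "oeth" for Optimism (assumed)
    address: str        # 0x-prefixed 20-byte account address

    @classmethod
    def parse(cls, s: str) -> "ChainAddress":
        chain, address = s.split(":", 1)
        if chain not in KNOWN_CHAINS:
            raise ValueError(f"unknown chain short name: {chain}")
        if not (address.startswith("0x") and len(address) == 42):
            raise ValueError("malformed address")
        return cls(chain, address)

@dataclass
class PaymentRequest:
    """'Send me X tokens of type Y on chain Z' as structured data."""
    to: ChainAddress
    token: str          # token symbol or token contract address
    amount_wei: int

req = PaymentRequest(
    to=ChainAddress.parse("oeth:0x1234567890123456789012345678901234567890"),
    token="ETH",
    amount_wei=250_000_000_000_000_000,   # 0.25 ETH
)
print(req.to.chain, req.amount_wei)       # the wallet decides how to route this
```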
What are the links to existing research?
Chain-specific addresses:
ERC-3770:
ERC-7683:
RIP-7755:
Scroll keystore Wallet design style:
Helios:
ERC-3668 (sometimes referred to as CCIP Read):
Justin Drake's proposal on 'based (shared) preconfirmations':
L1SLOAD (RIP-7728): load-precompile/20388
REMOTESTATICCALL in Optimism:
AggLayer, which includes the idea of a shared token bridge:
What else needs to be done? What are the trade-offs?
Many of the examples above face the dilemma of when to standardize and which layers to standardize. If standardization is too early, it may entrench a poorer solution. If standardization is too late, it may result in unnecessary fragmentation. In some cases, there is both a short-term solution with weaker attributes that is easier to implement, as well as a long-term solution that is ‘ultimately correct’ but may take years to achieve.
These tasks are not just technical problems; they are also (perhaps even primarily) social problems, requiring cooperation between L2s, wallets, and L1.
How to interact with other parts of the roadmap?
Most of these proposals are 'higher layer' constructions and therefore have little effect on L1. One exception is shared sequencing, which has a large impact on maximal extractable value (MEV).
Expand Execution on L1
What problem are we solving?
If L2 becomes very scalable and successful, but L1 still can only handle a very small volume, then Ethereum may face many risks:
The economics of ETH the asset become riskier, which in turn affects the long-term security of the network;
Many L2s benefit from close ties to highly developed financial ecosystems on L1. If this ecosystem is greatly weakened, the incentives to become an L2 (rather than an independent L1) will be reduced.
It will take a long time for L2 to achieve the same level of security as L1.
If an L2 fails (e.g. because of a malicious or disappearing operator), users still need to go through L1 to recover their assets. So L1 needs to be powerful enough to at least occasionally handle the highly complex and messy wind-down of an L2.
For these reasons, continuing to expand L1 itself and ensuring that it can continue to accommodate more and more use cases is very valuable.
What is it? How does it work?
The simplest way to expand is to directly increase the Gas limit. However, this may lead to centralization of L1, weakening another important feature of Ethereum L1: the credibility as a robust foundational layer. There is still debate about how sustainable it is to simply increase the Gas limit, and this will also vary depending on the implementation of other technologies to make the validation of larger blocks easier (e.g. historical expiration, statelessness, L1 EVM validity proof). Another important thing that needs continuous improvement is the efficiency of Ethereum client software, which is much higher today than it was five years ago. An effective L1 Gas limit increase strategy will involve accelerating the development of these validation technologies.
EOF: a new EVM bytecode format that is friendlier to static analysis and allows faster implementations. Given these efficiency gains, EOF bytecode can be given lower gas costs.
Multidimensional Gas Pricing: Different base fees and limits are set for computation, data, and storage, which can increase the average capacity of Ethereum L1 without increasing the maximum capacity (thus avoiding new security risks).
Reduce the gas costs of specific opcodes and precompiles: historically, we have several times raised the gas costs of underpriced operations to prevent denial-of-service attacks. Something that could be done much more is lowering the gas costs of operations that are overpriced. For example, addition is much cheaper than multiplication, yet the ADD and MUL opcodes currently cost the same. We could make ADD cheaper, and even simpler opcodes like PUSH cheaper still. EOF as a whole is more optimized in this respect.
EVM-MAX and SIMD: EVM-MAX is a proposal to allow more efficient native big-integer arithmetic as a separate module of the EVM. Values computed by EVM-MAX operations can only be accessed by other EVM-MAX opcodes unless deliberately exported, which leaves more room to optimize how these values are stored. SIMD (single instruction, multiple data) is a proposal to allow the same instruction to be executed efficiently over arrays of values. Together, they can create a powerful coprocessor alongside the EVM for more efficient cryptographic operations. This is particularly useful for privacy protocols and for L2 proof systems, so it helps both L1 and L2 scale.
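As a toy model of the multidimensional gas pricing idea above, here is a sketch with a separate EIP-1559-style base fee per resource; the targets, limits, and update rule are illustrative assumptions, not proposed parameters:

```python
# Each resource (compute, calldata, storage growth) gets its own target,
# limit, and base fee, so a block full of calldata no longer crowds out
# compute. All numbers below are made up for illustration.
RESOURCES = {
    # resource: (target per block, hard limit per block)
    "compute_gas":    (15_000_000, 30_000_000),
    "calldata_bytes": (500_000,    1_000_000),
    "storage_slots":  (5_000,      10_000),
}
BASE_FEE_CHANGE_DENOMINATOR = 8   # same shape as EIP-1559's update rule

def update_base_fees(base_fees: dict, usage: dict) -> dict:
    """Move each resource's base fee toward equilibrium at its own target."""
    new_fees = {}
    for res, (target, limit) in RESOURCES.items():
        used = min(usage.get(res, 0), limit)
        delta = (used - target) / target / BASE_FEE_CHANGE_DENOMINATOR
        new_fees[res] = max(1, int(base_fees[res] * (1 + delta)))
    return new_fees

fees = {res: 10**9 for res in RESOURCES}                  # 1 gwei everywhere
block_usage = {"compute_gas": 30_000_000, "calldata_bytes": 100_000}
print(update_base_fees(fees, block_usage))
# compute fee rises (the block was compute-full); calldata and storage fees fall
```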
These improvements will be discussed in more detail in future Splurge articles.
Finally, the third strategy is native Rollups (or enshrined rollups): Essentially, creating many parallel running EVM replicas, thus generating a model equivalent to what Rollup can offer, but more natively integrated into the protocol.
What are the links to existing research?
Polynya’s ETH L1 Extension Roadmap:
Multidimensional Gas Pricing:
EIP-7706:
EOF:
EVM-MAX:
SIMD:
Native rollups:
Max Resnick interviewed on the value of expanding L1:
Justin Drake talks about using SNARK and native Rollups for scaling:
What else needs to be done, what are the trade-offs?
There are three strategies for scaling L1, which can be pursued individually or in parallel:
Improve technology (e.g. client code, stateless clients, history expiry) to make L1 easier to verify, and then raise the gas limit;
Reduce the cost of specific operations, increasing average capacity without increasing worst-case risk;
Native Rollups (i.e., creating N parallel replicas of the EVM).
Understanding these different technologies, we will find that each has different trade-offs. For example, native Rollups have many of the same weaknesses in terms of composability as regular Rollups: you cannot send a single transaction to perform operations across multiple Rollups synchronously, as you can with contracts on the same L1 (or L2). Increasing the gas limit would weaken other benefits that can be achieved through simplified L1 verification, such as increasing the proportion of users running validation nodes and increasing the number of solo stakers. Depending on the implementation, making specific operations in the Ethereum Virtual Machine (EVM) cheaper may increase the overall complexity of the EVM.
A major question that any L1 scaling roadmap needs to answer is: what ultimately belongs on L1 and what belongs on L2? Clearly, it is absurd to put everything on L1: potential use cases run into the hundreds of thousands of transactions per second, which would make L1 completely unverifiable (unless we go the native rollup route). But we do need some guiding principles, so that we do not end up in a situation where we raise the gas limit 10x and severely damage the decentralization of Ethereum L1.
One viewpoint on the division of labor between L1 and L2
How to interact with other parts of the roadmap?
Bringing more users onto L1 means improving not just scale but other aspects of L1 as well. It means more MEV will stay on L1 (rather than just becoming an L2 problem), making the need to handle it explicitly even more urgent. It greatly increases the value of fast slot times on L1. And it also depends heavily on L1 verification (the Verge) going well.
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
Vitalik's new article: The possible future of Ethereum, The Surge
Original author: Vitalik Buterin
Original text compilation: Karen, Foresight News
Special thanks to Justin Drake, Francesco, Hsiao-wei Wang, @antonttc, and Georgios Konstantopoulos.
Initially, there were two scaling strategies in the Ethereum roadmap. One (see an early paper from 2015) is ‘Sharding’: each Node only needs to validate and store a small portion of transactions, rather than all transactions on the chain. Any other peer-to-peer network (such as BitTorrent) also works in this way, so we can certainly make the blockchain work in the same way. The other is the Layer 2 protocol: these networks will sit on top of Ethereum, allowing it to benefit fully from its security while keeping most data and computations off the mainchain. Layer 2 protocol refers to state channels in 2015, Plasma in 2017, and then Rollup in 2019. Rollup is more powerful than state channels or Plasma, but they require a large amount of on-chain data bandwidth. Fortunately, by 2019, Sharding research had solved the problem of large-scale validation of ‘data availability’. As a result, the two paths merged, and we got a roadmap centered on Rollup, which is still Ethereum’s scaling strategy today.
The Surge, 2023 Roadmap Edition
The Rollup-centric roadmap proposes a simple division of labor: ETH L1 focuses on becoming a powerful and decentralized base layer, while L2 takes on the task of helping the ecosystem scale. This pattern is ubiquitous in society: the existence of the court system (L1) is not for pursuing ultra-high speed and efficiency, but for protecting contracts and property rights, while entrepreneurs (L2) need to build on this solid foundation to lead humanity towards Mars (both literally and metaphorically).
This year, the roadmap centered around Rollup has achieved important milestones: with the launch of EIP-4844 blobs, the data bandwidth of Ethereum L1 has significantly increased, and multiple Ethereum Virtual Machine (EVM) Rollups have entered the first stage. Each L2 exists as a ‘Sharding’ with its own internal rules and logic, and the diversity and pluralism of Sharding implementation have now become a reality. However, as we can see, this path also faces some unique challenges. Therefore, our current task is to complete the roadmap centered around Rollup, address these issues, while maintaining the robustness and decentralization unique to Ethereum L1.
The Surge: Key Objectives
In the future, Ethereum can achieve more than 100,000 TPS through L2.
Maintain the decentralization and robustness of L1;
At least some L2 fully inherit the core properties of Ethereum (Trustless, open, censorship-resistant);
Ethereum should feel like a unified ecosystem, not 34 different blockchains.
This Chapter
Scalability Trilemma
The Scalability Trilemma is an idea proposed in 2017, which suggests that there is a contradiction between the three characteristics of blockchain: Decentralization (specifically, low cost of running Nodes), scalability (processing a large number of transactions), and security (attackers need to destroy a large part of the network Nodes to cause a single transaction to fail).
It is worth noting that the Trilemma of Decentralization is not a theorem, and the posts introducing the Trilemma of Decentralization do not come with mathematical proofs. It does present a heuristic mathematical argument: if a Decentralization-friendly Node (such as a consumer laptop) can verify N transactions per second, and you have a chain that processes k*N transactions per second, then (i) each transaction can only be seen by 1/k of the Nodes, which means an attacker only needs to disrupt a few Nodes to pass a malicious transaction, or (ii) your Node will become powerful, but your chain will not be decentralized. The purpose of this article is not to prove that breaking the Trilemma of Decentralization is impossible; instead, it aims to show that breaking the Trilemma is difficult and requires thinking outside the framework implied by the argument.
For years, some high-performance chains have often claimed to have solved the Trilemma without fundamentally changing the architecture, usually by optimizing Nodes through software engineering techniques. This is always misleading, as running Nodes on-chain is much more difficult than running Nodes on ETH chain. This article will explore why this is the case, and why L1 client software engineering alone cannot scale ETH chain.
However, the combination of data availability sampling and SNARKs does solve the trilemma: it allows clients to verify that a certain amount of data is available and a certain number of computational steps are executed correctly by downloading only a small amount of data and performing a minimal amount of computation. SNARKs are trustless. Data availability sampling has a subtle few-of-N trust model, but it preserves the fundamental property of an unscalable chain that even a 51% attack cannot force bad blocks to be accepted by the network.
Another way to solve the trilemma is the Plasma architecture, which cleverly shifts the responsibility for monitoring data availability to users in an incentivized manner. As early as 2017-2019, when we only had fraud proof as a means to expand computational capacity, Plasma was very limited in terms of secure execution. However, with the widespread adoption of SNARKs (Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge), the Plasma architecture has become more feasible for a wider range of use cases than ever before.
Further Progress on Data Availability Sampling
What problem are we solving?
On March 13, 2024, when Dencun is upgraded and launched, there will be approximately 3 blobs of about 125 kB per 12-second slot on the Ethereum blockchain, or about 375 kB of available bandwidth per slot. Assuming that transaction data is published directly on-chain, the TPS (transactions per second) for ERC20 transfers on the Ethereum blockchain Rollup is approximately 173.6, calculated as 375000 / 12 / 180 = 173.6 TPS.
If we add the calldata of Ethereum (theoretical maximum: 30 million gas per slot / 16 gas per byte = 1,875,000 bytes per slot), it becomes 607 TPS. Using PeerDAS, the number of blobs may increase to 8-16, which will provide 463-926 TPS for calldata.
This is a major upgrade for Ethereum L1, but it’s not enough. We want more scalability. Our mid-term goal is 16 MB per slot, which, combined with the improvement of Rollup data compression, will bring ~58000 TPS.
What is it? How does it work?
PeerDAS is a relatively simple implementation of “1 D sampling”. In the ETH network, each blob is a 4096-degree polynomial in a 253-bit prime field. We broadcast shares of the polynomial, where each share contains 16 evaluation values from 16 adjacent coordinates out of a total of 8192 coordinates. Among these 8192 evaluation values, any 4096 (any 64 out of 128 possible samples according to the current proposed parameters) can recover the blob.
The working principle of PeerDAS is to make each client listen to a small number of subnets. In the i-th subnet, the i-th sample of any blob is broadcasted, and by asking peers in the global P2P network (who listen to different subnets), it requests blobs on other subnets that it needs. A more conservative version, SubnetDAS, only uses the subnet mechanism without additional querying at the peer layer. The current proposal is for nodes participating in Proof of Stake to use SubnetDAS, while other nodes (i.e., clients) use PeerDAS.
In theory, we can scale up a ‘1 D sampling’ to a fairly large scale: if we increase the maximum number of blobs to 256 (targeting 128), then we can achieve the goal of 16 MB, while data availability sampling each Node 16 samples * 128 blobs * 512 bytes per sample = 1 MB data bandwidth per slot. This is just barely within our tolerance range: it is feasible, but it means that bandwidth-limited clients cannot sample. We can optimize this to some extent by reducing the number of blobs and increasing the size of each blob, but this will increase the reconstruction cost.
Therefore, we ultimately want to go further and perform 2D sampling, which not only randomly samples within the blob, but also randomly samples between blobs. Using the linear property of KZG commitment, we extend the blob set in a Block through a new set of virtual blobs, which redundantly encode the same information.
Therefore, in the end, we want to go further and perform 2D sampling, which is not only within the blob, but also between the blobs for random sampling. The linear property promised by KZG is used to expand a block’s blob set, which contains a new virtual blob list that redundantly encodes the same information.![Vitalik新文:以太坊可能的未来,The Surge]()
2D Sampling. Source: a16z crypto.
It is crucial that the expansion of computational commitments does not require a blob, so the scheme is fundamentally friendly to distributed Block construction. The actual construction Node of Block only needs to have the blob KZG commitment, and they can rely on Data Availability Sampling (DAS) to verify the availability of data blocks. One-dimensional Data Availability Sampling (1D DAS) is also essentially friendly to distributed block construction.
What are the links to existing research?
What else needs to be done? What are the trade-offs?
Next is the implementation and launch of PeerDAS. After that, continuously increasing the number of blobs on PeerDAS, while carefully observing the network and improving the software to ensure security, this is a gradual process. At the same time, we hope to have more academic work to standardize the interaction of PeerDAS and other versions of DAS and their security with fork choice rule.
In the future, we need to do more work to determine the ideal version of 2D DAS and prove its security properties. We also hope to eventually move away from KZG to an alternative solution that is quantum-safe and does not require a trusted setup. Currently, we are not sure which candidate solutions are friendly to distributed Block construction. Even with the expensive “brute force” technique, using recursive STARK to generate validity proof for reconstructing rows and columns is not sufficient to meet the requirements, because although technically, the size of a STARK is O(log(n) * log(log(n)) hash values (using STIR), in practice, the STARK is almost as large as the entire blob.
My view of the long-term reality path is:
Please note that even if we decide to scale directly on the L1 layer, this option exists. This is because if the L1 layer is to handle a large number of TPS, the L1 Block will become very large, and clients will want an efficient way to verify their correctness, so we will have to use the same technology as Rollup (such as ZK-EVM and DAS) at the L1 layer.
How to interact with other parts of the roadmap?
If data compression is implemented, the demand for 2D DAS will be reduced, or at least the latency will be reduced. If Plasma is widely used, the demand will be further reduced. DAS also poses challenges to the construction protocol and mechanism of distributed blocks: although DAS is theoretically friendly to distributed reconstruction, this requires integration with the package inclusion list proposal and the surrounding fork selection mechanism in practice.
Data Compression
What problem are we solving?
Each transaction in Rollup will consume a large amount of on-chain data space: ERC 20 transfer requires about 180 bytes. Even with ideal data availability sampling, this also limits the scalability of the Layer protocol. With each slot being 16 MB, we get: 01928374656574839201
16000000 / 12 / 180 = 7407 TPS
What if we could not only solve the problem of the numerator, but also the problem of the denominator, so that each transaction in the Rollup occupies fewer bytes on-chain?
What is it, how does it work?
In my opinion, the best explanation is this picture from two years ago:
In zero-byte compression, each long zero-byte sequence is replaced with two bytes to indicate the number of zero bytes. Furthermore, we make use of specific attributes of the transaction:
Signature Aggregation: We have switched from ECDSA signatures to BLS signatures. The characteristic of BLS signatures is that multiple signatures can be combined into a single signature, which can prove the validity of all original signatures. In L1, the computational cost of verification is high even with aggregation, so BLS signatures are not considered. However, in L2 environments where data is scarce, using BLS signatures is meaningful. The aggregation feature of ERC-4337 provides a way to achieve this functionality.
Replace Address with pointers: If you have used a certain Address before, we can replace the 20-byte Address with a 4-byte pointer pointing to a position in the history.
Custom serialization of transaction values - Most transaction values have very few digits, for example, 0.25 ETH is represented as 250,000,000,000,000,000 wei. The maximum base fee and the priority fee are also similar. Therefore, we can use a custom decimal floating-point format to represent most currency values.
What are the links to existing research?
What else needs to be done, what are the trade-offs?
The next main thing to do is to actually implement the above plan. The main considerations include:
Switching to BLS signature requires a lot of effort and may drop compatibility with trusted hardware chips that enhance security. ZK-SNARK encapsulation using other signature schemes can be used as a substitute.
Dynamic compression (for example, replacing Address with pointers) will make the client code more complex.
Publishing state differences on-chain instead of in transactions will drop auditability and render many software (e.g., blockchain explorer) inoperable.
How to interact with other parts of the roadmap?
By adopting ERC-4337 and ultimately incorporating some of its content into L2 EVM, the deployment of aggregation technology can be greatly accelerated. Placing part of the content of ERC-4337 on L1 can speed up its deployment on L2.
Generalized Plasma
What problem are we solving?
Even with a 16 MB blob and data compression, 58, 000 TPS may not be enough to fully meet the needs of consumer payments, Decentralization social, or other high-bandwidth areas, especially when we start considering privacy factors, which may cause a drop in scalability by 3-8 times. For high volume, low-value use cases, one current option is to use Validium, which stores data off-chain and adopts an interesting security model: operators cannot steal user funds, but they may temporarily or permanently freeze all user funds. But we can do better.
What is it and how does it work?
Plasma is a scaling solution that involves an operator publishing Blocks to off-chain and putting the Merkle roots of these Blocks on-chain (unlike Rollup, which puts the complete Block on-chain). For each Block, the operator will send a Merkle proof to each user to prove what changes have occurred to the user’s assets, or if no changes have occurred. Users can extract their assets by providing a Merkle proof. Importantly, this proof does not have to be rooted in the latest state. Therefore, even if data availability becomes an issue, users can still recover their assets by extracting their available latest state. If a user submits an invalid proof (for example, extracting assets they have already sent to others, or the operator arbitrarily creating an asset), the legitimacy of the asset vesting can be judged through the on-chain challenge mechanism.
Plasma Cash chain diagram. The transaction spending coin i is placed at the i-th position in the tree. In this example, assuming all previous trees are valid, we know that Eve currently owns Token 1, David owns Token 4, and George owns Token 6.
Early versions of Plasma could only handle payment use cases, which hindered its further adoption. However, if we require every root to be verified by SNARKs, Plasma can become much more powerful. Every challenge game can be greatly simplified, as we exclude most of the possible paths for operator cheating. At the same time, new paths are also opened up, allowing Plasma technology to expand to a wider range of asset categories. Finally, in the case where operators do not cheat, users can withdraw funds immediately without waiting for a one-week challenge period.
One way (not the only way) to create an EVM Plasma chain is to use ZK-SNARK to build a parallel UTXO tree, which reflects the balance changes made by EVM and defines the unique mapping of the ‘same token’ at different points in history. Then a Plasma structure can be built on top of it.
A key insight is that the Plasma system does not need to be perfect. Even if you can only protect a subset of assets (for example, tokens that have not moved in the past week), you have already greatly improved the current state of the super scalable EVM (that is, Validium).
Another type of structure is a hybrid Plasma/Rollup, such as Intmax. These constructions put a minimal amount of data for each user on-chain (e.g., 5 bytes), which allows for certain characteristics between Plasma and Rollup: in the case of Intmax, you can achieve high scalability and privacy, although theoretically limited to about 266,667 TPS within a capacity of 16 MB.
What are the links related to existing research?
What else needs to be done? What are the trade-offs?
The main remaining task is to put the Plasma system into practical production applications. As mentioned above, Plasma and Validium’ is not an either-or choice: any Validium can enhance its security attributes to some extent by incorporating Plasma features into its exit mechanism. The focus of the research is on obtaining the best attributes for the EVM (considering trust requirements, worst-case L1 Gas costs, and the ability to resist DoS attacks), as well as alternative specific application structures. In addition, compared to Rollup, Plasma is conceptually more complex, which requires direct resolution through research and construction of a better general framework.
The main trade-offs of using Plasma designs are that they rely more on operators and are more difficult to base, although a hybrid Plasma/Rollup design can often avoid this weakness.
How to interact with other parts of the roadmap?
The more effective the Plasma solution is, the less pressure there is on L1 with high-performance data availability. Moving activities to L2 can also reduce MEV pressure on L1.
Mature L2 Proof System
What problem are we solving?
Currently, most Rollups are not Trustless in practice. There is a security committee that has the ability to override the behavior of the proof system (optimistic or validity). In some cases, the proof system does not even run at all, or even if it does, it only has a “consultation” function. The most advanced Rollups include: (i) some application-specific Rollups that are Trustless, such as Fuel; (ii) as of the time of writing, Optimism and Arbitrum are two implementations of the “phase one” milestone of fully trustless EVM Rollups. The reason Rollup has not made greater progress is the concern about bugs in the code. We need Trustless Rollups, so we must face and address this issue.
What is it and how does it work?
First, let’s review the ‘stage’ system introduced in this article.
Phase 0: Users must be able to run Node and synchronize the chain. It’s okay if the validation is completely trustworthy/centralized.
Stage 1: There must be a (trustless) proof system to ensure that only valid transactions are accepted. It is allowed to have a security council that can overturn the proof system, but there must be a 75% quorum vote threshold. In addition, the quorum-blocking portion of the council (i.e. 26%+) must be outside the main company building the Rollup. A weaker upgrade mechanism (e.g. DAO) is allowed, but it must have sufficiently long latency so that if it approves a malicious upgrade, users can withdraw their funds before they come online.
Stage 2: There must be a (trustless) proof system to ensure that only valid transactions will be accepted. The Security Council only allows intervention when provable errors exist in the code, for example. If two redundant proof systems are inconsistent with each other, or if a proof system accepts two different post-state roots of the same Block (or accepts no content for a long enough time, e.g. a week). Upgrade mechanisms are allowed, but must have very long latency.
Our goal is to reach phase 2. The main challenge of reaching phase 2 is to gain enough confidence to prove that the system is actually trustworthy. There are two main methods to achieve this:
Schematic of a multi-prover design combining an optimistic proof system, a validity proof system, and a security council.
What are the links to existing research?
What else needs to be done? What are the trade-offs?
For formal verification, the workload is huge. We need to create a formally verified version of an entire SNARK prover for the EVM. This is an extremely complex project, although we have already started. There is one trick that greatly simplifies the task: we can create a formally verified SNARK prover for a minimal virtual machine (such as RISC-V or Cairo), and then implement the EVM inside that minimal VM (and formally prove its equivalence to other EVM specifications).
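To make the target concrete, here is a tiny Lean 4 sketch of the kind of statement such an equivalence proof would ultimately establish. All names are hypothetical placeholders; this is not any existing formalization, only an illustration of the shape of the goal.

```lean
-- Hypothetical placeholders: an abstract state and transaction type.
axiom State : Type
axiom Transaction : Type

-- Reference EVM semantics (the specification we ultimately trust).
axiom evmSpecStep : State → Transaction → State

-- The EVM interpreter as executed inside the minimal VM (e.g. a RISC-V build).
axiom evmOnMinimalVmStep : State → Transaction → State

-- The equivalence statement: for every state and transaction, running the EVM
-- inside the minimal VM gives exactly the result demanded by the EVM specification.
def EvmEquivalence : Prop :=
  ∀ (s : State) (tx : Transaction), evmSpecStep s tx = evmOnMinimalVmStep s tx
```

Once a statement of this form is proven, confidence in the formally verified minimal-VM prover carries over to the EVM prover built on top of it.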
For multi-provers, there are still two main pieces of work. First, we need enough confidence in at least two different proof systems: both must be reasonably secure on their own, and if they fail, they should fail for different and unrelated reasons (so they do not fail at the same time). Second, we need a very high degree of trust in the underlying logic that merges the proof systems. This piece of code is much smaller. There are ways to make it very small, for example by storing funds in a multisig contract whose signers are contracts representing the individual proof systems, but this increases on-chain gas costs. We need to find a balance between efficiency and security.
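As a purely conceptual sketch (all names hypothetical, not any production system’s code), the merging logic in such a design can be pictured as a small arbitration function: finalize when the two independent proof systems agree, and let the security council decide only in the provable failure modes described under Stage 2.

```python
# Hypothetical sketch of the small "merging" logic in a multi-prover design.
# Real systems implement this on-chain (e.g. as a multisig whose signers are
# contracts wrapping each proof system); this is only conceptual.

WEEK_IN_SECONDS = 7 * 24 * 3600

def finalize_state_root(
    optimistic_root: str | None,    # root that survived the fraud-proof window, if any
    validity_root: str | None,      # root accepted by the SNARK verifier, if any
    council_root: str | None,       # root approved by >=75% of the security council, if any
    seconds_since_last_finalized: int,
) -> str | None:
    """Return the state root to finalize, or None to keep waiting."""
    # Normal path: the two independent proof systems agree.
    if optimistic_root is not None and optimistic_root == validity_root:
        return optimistic_root

    # The council may only intervene in provable failure modes:
    # the two proof systems disagree, or nothing has been accepted for ~a week.
    proof_systems_disagree = (
        optimistic_root is not None
        and validity_root is not None
        and optimistic_root != validity_root
    )
    stalled = seconds_since_last_finalized >= WEEK_IN_SECONDS
    if (proof_systems_disagree or stalled) and council_root is not None:
        return council_root

    return None
```

The point of keeping this arbitration layer tiny is that it, rather than the full provers, becomes the code that must be trusted unconditionally.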
How to interact with other parts of the roadmap?
Moving activity to L2 reduces MEV pressure on L1.
Cross-L2 Interoperability Improvement
What problem are we solving?
One of the main challenges facing the current L2 ecosystem is that it is difficult for users to navigate. In addition, the simplest methods often reintroduce trust assumptions: centralized bridges, RPC clients, and so on. We need to make using the L2 ecosystem feel like using a single, unified Ethereum ecosystem.
What is it? How does it work?
There are many kinds of cross-L2 interoperability improvements. In theory, a Rollup-centric Ethereum is the same thing as an L1 with execution Sharding; in practice, however, the current Ethereum L2 ecosystem still falls short of this ideal. In particular:
Chain-specific Addresses: an Address should contain chain information (L1, Optimism, Arbitrum, etc.). Once this is in place, a cross-L2 send can be done simply by putting the Address into the ‘send’ field, at which point the Wallet can figure out how to route the send in the background (including via a cross-chain bridging protocol); a sketch of this appears after this list.
Chain-specific Payment Requests: it should be easy and standardized to construct a message of the form ‘send me X Tokens of type Y on chain Z’. This has two main use cases: (i) payments between individuals, or from individuals to merchant services; (ii) DApps requesting funds.
Cross-chain swaps and Gas payment: there should be a standardized open protocol for expressing cross-chain operations, such as ‘I will send 1 ether on Optimism to whoever sends me 0.9999 ether on Arbitrum’ and ‘I will send 0.0001 ether on Optimism to whoever includes this transaction on Arbitrum’. ERC-7683 is an attempt at the former, and RIP-7755 is an attempt at the latter, although both are more general than these specific use cases.
Light clients: users should be able to actually verify the chains they are interacting with, rather than just trusting RPC providers. a16z crypto’s Helios does this for the Ethereum mainnet itself, but we need to extend this trustlessness to L2s. ERC-3668 (CCIP-read) is one strategy for doing so.
How a light client updates its view of the Ethereum header chain: once you have the header chain, you can use Merkle proofs to verify any L1 state object; and once you have the right L1 state objects, you can use Merkle proofs (and optionally signatures, if you want to check pre-confirmations) to verify any state object on L2. Helios already does the former; extending to the latter is a standardization challenge (see the light-client sketch after this list).
Keystore Wallets: a technique that lets the keys controlling a smart contract Wallet live in one place (on L1, or potentially on an L2) and be read from any L2 on which the Wallet exists, so that key changes do not need to be repeated on every chain.
How a Keystore Wallet works.
A more radical ‘shared Token bridge’ idea: imagine a world where all L2s are validity-proof Rollups that commit to Ethereum every slot. Even in such a world, moving an asset from one L2 to another in its native form would still require a withdrawal and a deposit, which costs a significant amount of L1 gas. One way to solve this is to create a shared minimal Rollup whose only function is to track which L2 owns how much of each type of Token, and to allow those balances to be updated in batches by series of cross-L2 send operations initiated by any of the L2s. This would allow cross-L2 transfers without paying L1 gas per transfer and without liquidity-provider-based techniques such as ERC-7683 (see the shared-bridge sketch after this list).
Synchronous composability: allowing synchronous calls between a specific L2 and L1, or between multiple L2s. This could help improve the financial efficiency of DeFi protocols. The former can be done without any cross-L2 coordination; the latter requires shared sequencing. Based Rollups are automatically friendly to all of these techniques.
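As a small illustration of the chain-specific address item above, here is a hypothetical parser for an ERC-3770-style ‘shortName:address’ string. The shortName-to-chain-ID registry below is a toy stand-in, not the real list maintained alongside the standard.

```python
# Hypothetical sketch of parsing an ERC-3770-style chain-specific address
# ("<shortName>:<address>"). The registry is illustrative only.
import re

CHAIN_REGISTRY = {          # shortName -> chain id (illustrative values)
    "eth": 1,
    "oeth": 10,
    "arb1": 42161,
}

ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")

def parse_chain_specific_address(value: str) -> tuple[int, str]:
    """Split 'shortName:address' into (chain_id, address)."""
    short_name, sep, address = value.partition(":")
    if not sep:
        raise ValueError("missing chain prefix, expected '<shortName>:<address>'")
    if short_name not in CHAIN_REGISTRY:
        raise ValueError(f"unknown chain shortName: {short_name}")
    if not ADDRESS_RE.match(address):
        raise ValueError(f"not a valid hex address: {address}")
    return CHAIN_REGISTRY[short_name], address

# Example: a wallet receiving "oeth:0x..." knows the payment must be delivered on
# OP Mainnet (possibly via a cross-chain bridging protocol) rather than on L1.
```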
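The two-step light-client argument above can be sketched as follows. This is a self-contained toy that assumes a simple binary Merkle tree rather than Ethereum’s actual trie formats, and it takes the verified L1 state root as given (in practice it would come from a sync-committee light client such as Helios).

```python
# Conceptual toy: verifying an L2 value against a verified L1 state root
# in two Merkle-proof steps.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_merkle_branch(leaf: bytes, branch: list[tuple[bytes, str]], root: bytes) -> bool:
    """Check a binary Merkle branch; each step is (sibling_hash, 'L' or 'R')."""
    node = h(leaf)
    for sibling, side in branch:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node == root

def verify_l2_object(l1_state_root: bytes,
                     l2_state_root: bytes, l1_branch: list[tuple[bytes, str]],
                     l2_object: bytes, l2_branch: list[tuple[bytes, str]]) -> bool:
    # Step 1: the L2's committed state root is part of L1 state (e.g. stored in
    # the rollup's bridge contract), proven against the verified L1 state root.
    if not verify_merkle_branch(l2_state_root, l1_branch, l1_state_root):
        return False
    # Step 2: the object the user cares about is proven against that L2 state root.
    return verify_merkle_branch(l2_object, l2_branch, l2_state_root)
```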
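Finally, a minimal sketch of the bookkeeping a shared minimal Rollup for Token bridging might maintain (purely illustrative; a real design would also need each L2 to prove that it authorized its outgoing transfers).

```python
# Toy ledger for a hypothetical shared token bridge: it only tracks how much of
# each token every L2 owns and applies batches of cross-L2 transfers, so that
# individual transfers do not each require an L1 withdrawal + deposit.
from collections import defaultdict

class SharedTokenBridge:
    def __init__(self) -> None:
        # balances[(l2_id, token)] = amount owned by that L2
        self.balances: dict[tuple[str, str], int] = defaultdict(int)

    def deposit(self, l2_id: str, token: str, amount: int) -> None:
        """Credit an L2 after a (rare) real L1 deposit into the shared bridge."""
        self.balances[(l2_id, token)] += amount

    def apply_batch(self, transfers: list[tuple[str, str, str, int]]) -> None:
        """Apply a batch of (from_l2, to_l2, token, amount) transfers atomically."""
        # Validate the whole batch first so a bad transfer cannot leave partial state.
        tentative = dict(self.balances)
        for src, dst, token, amount in transfers:
            if tentative.get((src, token), 0) < amount:
                raise ValueError(f"{src} lacks {amount} of {token}")
            tentative[(src, token)] = tentative.get((src, token), 0) - amount
            tentative[(dst, token)] = tentative.get((dst, token), 0) + amount
        self.balances = defaultdict(int, tentative)
```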
What are the links to existing research?
Chain-specific Addresses:
ERC-3770:
ERC-7683:
RIP-7755:
Scroll’s keystore Wallet design:
Helios:
ERC-3668 (sometimes referred to as CCIP Read):
Justin Drake’s proposal on ‘based (shared) preconfirmations’:
L1SLOAD (RIP-7728): load-precompile/20388
REMOTESTATICCALL in Optimism:
AggLayer, which includes the idea of a shared token bridge:
What else needs to be done? What are the trade-offs?
Many of the examples above face the dilemma of when to standardize and which layers to standardize. If standardization is too early, it may entrench a poorer solution. If standardization is too late, it may result in unnecessary fragmentation. In some cases, there is both a short-term solution with weaker attributes that is easier to implement, as well as a long-term solution that is ‘ultimately correct’ but may take years to achieve.
These tasks are not just technical problems; they are also (perhaps even primarily) social problems, requiring cooperation between L2s, Wallets, and L1.
How to interact with other parts of the roadmap?
Most of these proposals are at a ‘higher layer’ and therefore have little impact on L1. One exception is shared sequencing, which has a significant impact on maximal extractable value (MEV).
Expand Execution on L1
What problem are we solving?
If L2 becomes very scalable and successful but L1 can still only handle a very small volume of transactions, Ethereum may face many risks:
ETH as an asset becomes economically riskier, which in turn affects the long-term security of the network.
Many L2s benefit from close ties to the highly developed financial ecosystem on L1. If that ecosystem is greatly weakened, the incentive to become an L2 (rather than an independent L1) is reduced.
It will take a long time for L2 to achieve the same level of security as L1.
If an L2 fails (e.g., because its operator acts maliciously or disappears), users will still need to go through L1 to recover their assets. Therefore, L1 needs to be powerful enough to at least occasionally handle the highly complex and messy wind-down of an L2.
For these reasons, continuing to expand L1 itself and ensuring that it can continue to accommodate more and more use cases is very valuable.
What is it? How does it work?
The simplest way to scale is to directly increase the Gas limit. However, this risks centralizing L1 and thereby weakening another important property of Ethereum L1: its credibility as a robust base layer. How far a simple Gas-limit increase can sustainably go is still debated, and it also depends on which other technologies are implemented to make larger blocks easier to verify (e.g. history expiry, statelessness, L1 EVM validity proofs). Another thing that needs continuous improvement is the efficiency of Ethereum client software, which is far better today than it was five years ago. An effective strategy for increasing the L1 Gas limit will involve accelerating these verification technologies.
A second strategy is to make specific operations and applications cheaper without increasing the Gas limit for general computation; these improvements will be discussed in more detail in the upcoming Splurge article.
Finally, the third strategy is native Rollups (or ‘enshrined Rollups’): essentially, creating many copies of the EVM that run in parallel, producing a model equivalent to what Rollups can provide, but far more natively integrated into the protocol.
What are the links to existing research?
What else needs to be done? What are the trade-offs?
There are three strategies for scaling L1, which can be pursued individually or in parallel: increasing the Gas limit, making specific operations cheaper, and native Rollups.
Examining these techniques, we find that each comes with different trade-offs. For example, native Rollups share many of the composability weaknesses of ordinary Rollups: you cannot send a single transaction that synchronously performs operations across several of them, as you can with contracts on the same L1 (or L2). Increasing the Gas limit undercuts other benefits that come from making L1 easier to verify, such as increasing the proportion of users running verifying Nodes and increasing the number of solo stakers. Making specific operations in the EVM cheaper may (depending on how it is done) increase the overall complexity of the EVM.
A major question that any L1 scaling roadmap must answer is: what are the ultimate visions for what belongs on L1 and what belongs on L2? Obviously, it is absurd to run everything on L1: potential use cases amount to hundreds of thousands of transactions per second, which would make L1 completely unverifiable (unless we go the native Rollup route). But we do need some guiding principles, so that we do not end up in a situation where we increase the Gas limit 10x and severely damage the Decentralization of Ethereum L1.
One viewpoint on the division of labor between L1 and L2
How to interact with other parts of the roadmap?
Bringing more users onto L1 means improving not only scale but also other aspects of L1. It means more MEV will stay on L1 (rather than just becoming an L2 problem), making the need to handle MEV explicitly even more urgent. It also greatly increases the value of fast slot times on L1. And it depends heavily on L1 verification (‘the Verge’) going smoothly.