A Deep Dive into Data Availability on Filecoin



Editor’s Note: This article is reproduced from original content published on April 5, 2024 by Turan Vural and Yuki Yuminaga of Fenbushi Capital. Founded in 2015, Fenbushi Capital is a leading Asian blockchain asset management company with 1.6 billion USD in assets under management. The firm aims to play a significant role in shaping the future of blockchain technology through research and investment. This article is one example of those efforts and represents the independent views of the authors, who have agreed to its publication here.


Data Availability (DA) is a core technology for scaling Ethereum: it lets nodes verify efficiently that data is available on the network without hosting that data themselves. This is crucial for building rollups and other forms of vertical scaling efficiently, because it lets execution nodes confirm that transaction data is available at settlement time. It is equally crucial for sharding and other forms of horizontal scaling (planned as future upgrades to the Ethereum network), where nodes need to prove that transaction data (or blobs) stored in network shards is indeed available to the network.

Recently, several DA solutions have been discussed and released (e.g. Celestia, EigenDA, Avail), all aiming to provide high-performance and secure infrastructure for the deployment of DA in applications.

Compared with an L1 such as Ethereum, the advantage of external DA solutions is that they offer a cheap, high-performance vehicle for on-chain data. DA solutions usually consist of a public chain of their own, built to provide low-cost, permissionless storage. Even with such modifications, hosting data natively on a blockchain remains extremely inefficient.

In view of this, we find it natural to explore a storage-optimized solution (such as Filecoin) as the foundation of a DA layer. Filecoin uses its blockchain to coordinate storage deals between users and storage providers, but allows the data itself to be stored off-chain.

In this article, we study the feasibility of building a DA solution on top of a decentralized storage network (DSN). We consider Filecoin specifically, as it is the most widely adopted DSN to date. We outline the opportunities such a solution would bring and the challenges that must be overcome to build it.

A DA layer provides the following to the services that depend on it:

1. Client safety: no node can be convinced that unavailable data is available.

2. Global safety: all but a small minority of nodes agree on whether the data is available or unavailable.

3. Efficient data retrieval.

All of this must be done efficiently for the layer to help with scalability. A DA layer delivers these three properties at higher performance and lower cost. For example, any node could request a full copy of the data to prove custody, but that is inefficient. A system that satisfies all three properties gives us a DA layer that provides the security required for L2 and L1 coordination, along with a stronger lower bound on security in the presence of a malicious majority.

Data Hosting

Data published to a DA solution has a finite useful lifetime: long enough to settle disputes or verify state transitions. Transaction data only needs to remain available long enough to verify a correct state transition, or to give validators enough opportunity to construct fraud proofs. At the time of writing, Ethereum calldata is the most commonly used data availability solution for rollups.

Efficient Data Validation

Data Availability Sampling (DAS) is the standard approach to the DA problem. It brings additional security benefits and strengthens the ability of network participants to validate state information received from their peers. However, it depends on nodes actually performing the sampling: nodes must answer DAS requests to ensure that mined transactions are not rejected, but nodes have no positive or negative incentive to request samples in the first place. From the point of view of the node that would request samples, there is no penalty for not performing DAS. Celestia, for example, provides the first and only light client implementation of DAS, which gives users stronger security assumptions and lowers the cost of verifying data.
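The reason a handful of samples is enough can be illustrated with a short calculation. The sketch below assumes each sample is drawn independently and uniformly over the published chunks, and that an attacker must withhold some fixed fraction of them to make the data unrecoverable; the 25% figure in the usage note is an illustrative assumption, not a number taken from this article.

```python
import math


def detection_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` independent, uniform samples
    hits a withheld chunk (sampling with replacement, a simplification)."""
    if not 0.0 <= withheld_fraction <= 1.0:
        raise ValueError("withheld_fraction must be in [0, 1]")
    if samples < 0:
        raise ValueError("samples must be non-negative")
    return 1.0 - (1.0 - withheld_fraction) ** samples


def samples_needed(withheld_fraction: float, target: float) -> int:
    """Smallest sample count whose detection probability reaches `target`."""
    if not 0.0 < withheld_fraction < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("both arguments must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - withheld_fraction))
```

Under these assumptions, `samples_needed(0.25, 0.99)` shows how many samples a single light node would need for 99% confidence; the point is that the count grows only logarithmically as the target confidence tightens.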

Efficient Access

DA needs to provide efficient data access for the projects that use it. A slow DA can become a bottleneck for services that rely on it, leading to low efficiency or system errors.

Decentralized Storage Network

A Decentralized Storage Network (DSN), as described in the Filecoin white paper, is a permissionless network of storage providers that offers storage services to the network’s users. Informally, it lets independent storage providers coordinate storage deals with clients who need storage, and gives clients looking for affordable storage a cheap and flexible option. Coordination happens through a blockchain that records storage deals and supports the execution of smart contracts.

The DSN scheme is a tuple of three protocols: Put, Get, and Manage. This tuple has properties such as fault tolerance guarantees and participation incentives.

Put(data) → key

To store data under a unique identifier key, the client executes Put. In doing so it specifies how long the data is to be stored on the network, how many copies are to be kept for redundancy, and the price negotiated with the storage provider.

Get(key) → data

The client executes Get to retrieve the data stored under key.

Manage

Network participants invoke the Manage protocol to coordinate the storage space and services offered by providers and to repair faults. In Filecoin, this is handled by the blockchain. The chain records the storage deals between clients and storage providers, together with proofs that the data is being stored correctly, so that deals are upheld. Correct storage is demonstrated by publishing proofs that providers generate in response to challenges from the network. A storage fault occurs when a storage provider fails to produce a Proof of Replication or Proof of Spacetime in time, as required by the Manage protocol, and it results in the provider’s stake being slashed. If several providers host copies of the data, the deal can be repaired by finding a new storage provider to take it over, which makes the network self-healing.
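The Put, Get, and Manage tuple can be sketched as a toy in-memory model. Everything here is hypothetical and for illustration only: the `Provider` and `ToyDSN` names, the use of a SHA-256 digest as the key, and the repair policy are assumptions, and there are no proofs, prices, durations, or blockchain in this sketch.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Provider:
    name: str
    store: Dict[str, bytes] = field(default_factory=dict)
    online: bool = True


class ToyDSN:
    def __init__(self, providers: List[Provider]) -> None:
        self.providers = list(providers)
        self.deals: Dict[str, List[Provider]] = {}  # key -> current holders

    def put(self, data: bytes, replicas: int) -> str:
        """Store `replicas` copies and return a content-derived key."""
        if not 1 <= replicas <= len(self.providers):
            raise ValueError("invalid replica count")
        key = hashlib.sha256(data).hexdigest()
        holders = self.providers[:replicas]
        for provider in holders:
            provider.store[key] = data
        self.deals[key] = list(holders)
        return key

    def get(self, key: str) -> bytes:
        """Return the data from the first online holder whose copy matches the key."""
        for provider in self.deals.get(key, []):
            data = provider.store.get(key)
            if provider.online and data is not None:
                if hashlib.sha256(data).hexdigest() == key:
                    return data
        raise KeyError(key)

    def manage(self) -> None:
        """Repair: replace offline holders by copying from a healthy replica."""
        for key, holders in self.deals.items():
            healthy = [p for p in holders if p.online]
            if not healthy:
                continue  # nothing left to copy from
            spares = [p for p in self.providers if p.online and p not in holders]
            while len(healthy) < len(holders) and spares:
                replacement = spares.pop(0)
                replacement.store[key] = healthy[0].store[key]
                healthy.append(replacement)
            self.deals[key] = healthy
```

In the real protocol the repair step is driven by missed proofs recorded on-chain rather than by an `online` flag, and the failing provider is slashed.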

DSN Opportunity

So far, the work of DA projects has been to turn a blockchain into a hot storage platform. Because a DSN is already optimized for storage, we can instead take a storage platform and turn it into one that provides data availability. The collateral that storage providers post in native FIL tokens can supply cryptoeconomic security for the guarantee that data is stored. Finally, the programmability of storage deals allows the terms of data availability to be made flexible.

The strongest motivation for turning DSN functionality into a DA solution is to cut the cost of storing data under DA. As described below, storing data on Filecoin is far cheaper than storing it on Ethereum. At the current Ether/USD price, writing 1 GB of calldata to Ethereum would cost over 3 million USD, and the data would be pruned after 21 days. This calldata cost can account for more than half of the transaction cost of an Ethereum-based rollup. By contrast, storing 1 GB on Filecoin costs less than 0.0002 USD per month. At this price, or anything close to it, DA on Filecoin would lower users’ transaction costs and help improve the performance and scalability of Web3.
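The order of magnitude of the calldata figure can be reproduced with a back-of-the-envelope calculation. The sketch below uses the 16 gas charged per non-zero calldata byte (EIP-2028); the gas price and the ETH/USD price are illustrative assumptions rather than values from this article, and the per-transaction base gas and block gas limit are ignored.

```python
GAS_PER_NONZERO_CALLDATA_BYTE = 16  # EIP-2028; zero bytes are cheaper (4 gas)
GWEI_PER_ETH = 1e9


def calldata_cost_usd(num_bytes: int, gas_price_gwei: float, eth_usd: float) -> float:
    """Upper-bound cost of posting `num_bytes` of non-zero calldata."""
    gas = num_bytes * GAS_PER_NONZERO_CALLDATA_BYTE
    eth = gas * gas_price_gwei / GWEI_PER_ETH
    return eth * eth_usd


def filecoin_cost_usd(num_gb: float, months: float, usd_per_gb_month: float = 0.0002) -> float:
    """Storage cost at the per-GB monthly price quoted in the text."""
    return num_gb * months * usd_per_gb_month


# Assumed market conditions: 60 gwei gas price, 3,500 USD per ETH.
one_gb_calldata = calldata_cost_usd(10**9, gas_price_gwei=60, eth_usd=3500)
one_gb_filecoin = filecoin_cost_usd(1, months=1)
```

With these assumed inputs the calldata cost lands in the low millions of dollars per GB, in line with the figure quoted above, while the Filecoin cost stays at a fraction of a cent.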

Economic Security

In Filecoin, offering storage space requires collateral. If a provider fails to honor a deal or to uphold the network’s guarantees, that collateral is slashed. A storage provider who fails to deliver service risks losing both the collateral and any profit earned.

Incentive Alignment

Many of the Filecoin protocol’s incentives line up with the goals of DA. Filecoin deters malicious or lazy behavior: during consensus, storage providers must actively submit storage proofs in the form of Proof of Replication and Proof of Spacetime, continuously proving that the storage exists, without assuming an honest majority. A storage provider that fails to submit a proof has its stake slashed, is removed from consensus, and is subject to further penalties. Current DA solutions lack incentives for nodes to perform DAS and can only rely on ad hoc altruistic behavior to attest to DA.

Programmability

Customizable storage deals also make a DSN an attractive DA platform. Deals can have different durations, so users of a DSN-based DA pay only for the DA they need, and they can tune fault tolerance by choosing how many copies are stored across the network. Further customization comes from smart contracts on Filecoin (Actors), which run on the FEVM. The FEVM also underpins Filecoin’s growing DApp ecosystem, from compute-over-storage solutions such as Bacalhau to DeFi and liquid staking solutions such as Glif. Retriev uses Filecoin Actors to provide incentive-aligned retrieval with permissioned referees. Filecoin’s programmability can be used to tailor DA requirements to each solution, so that platforms relying on DA do not pay for more DA than they need.

Challenges Faced by the DSN-based DA Architecture

In our investigation, we identified significant challenges that must be overcome before a DA service can be built on a DSN. Since we are now discussing feasibility of implementation, we focus the discussion on Filecoin.

Proof Latency

The cryptographic proofs that Filecoin uses to establish the integrity of deals and stored data take time to produce. When data is committed to the network, it is divided into 32 GB sectors and “sealed”. Sealing is the foundation of both Proof of Replication (PoRep) and Proof of Spacetime (PoSt). The former proves that the storage provider has stored one or more unique copies of the data; the latter proves that the storage provider has stored a unique copy continuously for the duration of the storage deal. Sealing must be computationally expensive so that a storage provider cannot seal data on demand and thereby undermine the PoRep. When the protocol periodically challenges storage providers to prove unique and continuous storage, the time that sealing is guaranteed to take must be longer than the response window, so that providers cannot forge proofs or replicas on the fly. As a result, it can take a storage provider about three hours to seal one sector of data.

Storage Threshold

Because sealing is expensive, the size of the sector being sealed has to make economic sense. For storage providers, the price of storage must justify the cost of sealing; likewise, the resulting storage cost (here, for roughly 32 GB of data) must be low enough that clients are willing to store data on Filecoin. Smaller sectors can be sealed, but doing so would push up storage prices to compensate providers. To address this, data aggregators collect smaller pieces of data from users and submit them to Filecoin as a single piece close to 32 GB in size. The aggregator commits to each user’s data with a Proof of Data Segment Inclusion (PoDSI) and a sub-piece CID (pCID). The PoDSI guarantees that the user’s data is included in the sector, while the pCID is what the user uses to retrieve the data from the network.
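The aggregation step can be pictured as a simple batching buffer. The sketch below is a hypothetical illustration: the `SectorBatcher` name and its greedy flush policy are assumptions, and it ignores the padding and alignment that real Filecoin pieces require.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SECTOR_BYTES = 32 * 2**30  # nominal 32 GiB sector


@dataclass
class SectorBatcher:
    capacity: int = SECTOR_BYTES
    pending: List[bytes] = field(default_factory=list)
    used: int = 0

    def add(self, blob: bytes) -> Optional[List[bytes]]:
        """Queue a blob; return a completed batch if the blob would overflow it."""
        if len(blob) > self.capacity:
            raise ValueError("blob is larger than one sector")
        flushed = None
        if self.used + len(blob) > self.capacity:
            flushed = self.flush()
        self.pending.append(blob)
        self.used += len(blob)
        return flushed

    def flush(self) -> List[bytes]:
        """Hand the current batch over for sealing and start a new one."""
        batch, self.pending, self.used = self.pending, [], 0
        return batch
```

A production aggregator would also need a timeout so that a partially filled batch is still submitted when demand is low, at the price of wasting part of the sector.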

Consensus Constraints

Filecoin’s consensus mechanism, Expected Consensus, has a block time of 30 seconds and a finality time of several hours, which may improve in the near future (see FIP-0086 for more on fast finality in Filecoin). This is generally too slow to support the transaction throughput required by a Layer 2 that relies on DA for its transaction data. Filecoin’s block time is bounded below by storage provider hardware: the shorter the block time, the harder it is for storage providers to generate and submit storage proofs, and the more often they are penalized for missing the proof window even though they are storing the data correctly. To get around this, an InterPlanetary Consensus (IPC) subnet can be used to shorten consensus time. IPC uses a Tendermint-like consensus and obtains randomness from drand: where drand is the bottleneck, an IPC subnet can achieve a block time of 3 seconds; where Tendermint is the bottleneck, proof-of-concept implementations such as Narwhal achieve block times of a few hundred milliseconds.

Retrieval Speed

The last hurdle is retrieval. From the constraints above, we can infer that Filecoin is suited to cold or warm storage. DA data, however, is hot and has to support high-performance applications. Retrieval is hard to incentivize on Filecoin, and data has to be unsealed before it is served to the client, which adds latency. Today, fast retrieval is achieved through SLAs or by storing an unsealed copy of the data alongside the sealed sector, but neither can be relied upon in a secure, permissionless application architecture on Filecoin. In particular, although Retriev demonstrates that retrieval can be guaranteed through the FVM, incentive-aligned fast retrieval on Filecoin remains an area for further exploration.

Cost Analysis

In this section, we will consider the costs associated with these design factors. We present the costs of storing 32 GB as Ethereum calldata, Celestia blobdata, EigenDA blobdata, and Filecoin sectors (using prices close to the current market price).


The analysis highlights the price of Ethereum calldata: 32 GB of data costs about 100 million US dollars. This price reflects the cost of the security behind Ethereum’s consensus and moves with the prices of Ether and gas. The Dencun upgrade brought Proto-Danksharding (EIP-4844), which introduces blob transactions, targets 3 blobs per block of approximately 128 KB each, and adds variable blob gas pricing to hold the number of blobs per block at the target. The upgrade cuts the cost of Ethereum DA to one fifth: 32 GB of blob data costs about 20 million US dollars.
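The blob parameters also bound how quickly data can be published. The sketch below uses the EIP-4844 blob size of 4096 field elements of 32 bytes each (131,072 bytes) and Ethereum’s 12-second slot, and assumes, for illustration only, that a single publisher could fill every blob at the 3-blob target.

```python
BLOB_BYTES = 4096 * 32        # EIP-4844: 4096 field elements x 32 bytes
TARGET_BLOBS_PER_BLOCK = 3    # target stated in the text
SLOT_SECONDS = 12             # Ethereum slot time


def seconds_to_publish(num_bytes: int) -> int:
    """Time to publish `num_bytes` if every slot carries the target blob count."""
    bytes_per_slot = BLOB_BYTES * TARGET_BLOBS_PER_BLOCK
    slots = -(-num_bytes // bytes_per_slot)  # ceiling division
    return slots * SLOT_SECONDS


sector_days = seconds_to_publish(32 * 2**30) / 86_400
```

Under that assumption, pushing one 32 GiB sector’s worth of data through blobs takes about 12 days of the chain’s entire target blob capacity, which puts the 32 GB comparison in this section in perspective.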

Celestia and EigenDA are dramatic improvements: 32 GB of data costs about $8,000 and $26,000 respectively. Both prices move with the market and reflect, to some degree, the cost of securing the data through consensus: Celestia uses its native TIA token, while EigenDA uses Ether.

In none of these cases is the stored data permanent. Ethereum calldata is kept for 3 weeks and blobs for 18 days, and EigenDA’s default expiry for stored blobs is 14 days. In Celestia’s current implementation, archival nodes store blob data indefinitely, but light nodes can only sample data up to 30 days old.

The last two tables compare Filecoin directly with current DA solutions. The cost equivalence first lists the cost of a single byte of data on a given platform, and then shows how many bytes could be stored on Filecoin for the same cost over the same period.

This shows that Filecoin is several orders of magnitude cheaper than current DA solutions: storing the same amount of data for the same length of time costs a fraction of a cent. Unlike Ethereum nodes and the nodes of other DA solutions, Filecoin nodes are optimized for providing storage, and the proof system lets nodes prove storage rather than replicate the data on every node in the network. Setting aside storage provider economics (such as the energy cost of sealing data), the base cost of storage on Filecoin is negligible. Compared with Ethereum, this points to a market opportunity of up to millions of dollars per GB for a system that can provide secure, high-performance DA on Filecoin.
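The size of the gap can be checked using only the figures quoted in this article. The sketch below compares the stated 32 GB prices with Filecoin’s quoted price of 0.0002 USD per GB per month; treating every platform’s retention period as one month is a simplifying assumption, since the actual periods range from 14 to 30 days.

```python
import math

# Prices quoted in the text for storing 32 GB (USD).
COST_32GB_USD = {
    "Ethereum calldata": 100_000_000,
    "Ethereum blobs": 20_000_000,
    "Celestia": 8_000,
    "EigenDA": 26_000,
}
FILECOIN_USD_PER_GB_MONTH = 0.0002  # upper bound quoted in the text


def cost_ratio(platform: str, gigabytes: float = 32.0) -> float:
    """How many times more expensive `platform` is than one month on Filecoin."""
    filecoin_cost = gigabytes * FILECOIN_USD_PER_GB_MONTH
    return COST_32GB_USD[platform] / filecoin_cost


def orders_of_magnitude(platform: str) -> float:
    return math.log10(cost_ratio(platform))
```

Even the cheapest alternative listed comes out roughly six orders of magnitude above the raw Filecoin storage price under these assumptions, before any of the aggregation, retrieval, or bridging overheads discussed elsewhere are added.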

Throughput

Next, we will consider the capacity of the DA solution and the demand generated by the main Layer 2 rollups.


Because the Filecoin blockchain is organized into tipsets, with multiple blocks at each block height, the number of deals that can be made is not limited by consensus or block size. The hard data constraint on Filecoin is its network-wide storage capacity rather than what consensus allows.

For daily DA demand, we take data from Rollups DA and Execution by Terry Chung and Wei Dai, which includes both 30-day daily averages and figures for individual sampled days. This lets us consider average demand without overlooking deviations from the average (for example, Optimism’s demand on August 15, 2023 was approximately 261,000,000 bytes, more than four times its 30-day average of 64,000,000 bytes).

From this we can see that, although there is an opportunity to lower the cost of DA, DA demand would have to grow significantly to make effective use of Filecoin’s 32 GB sector size. Sealing less than 32 GB of data into a 32 GB sector wastes resources, but it can be done while still retaining a cost advantage.
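The mismatch between rollup demand and sector size is easy to quantify with the two Optimism figures quoted above. The sketch below treats a sector as exactly 32 GiB and ignores padding, both of which are simplifying assumptions.

```python
SECTOR_BYTES = 32 * 2**30  # nominal 32 GiB sector


def days_to_fill(daily_bytes: float) -> float:
    """Days of demand needed to fill one sector."""
    return SECTOR_BYTES / daily_bytes


def utilization(daily_bytes: float, days: float) -> float:
    """Fraction of a sector used if it is sealed after `days` of demand."""
    return min(1.0, daily_bytes * days / SECTOR_BYTES)


average_case = days_to_fill(64_000_000)   # Optimism 30-day average
peak_case = days_to_fill(261_000_000)     # Optimism on August 15, 2023
daily_sector_use = utilization(64_000_000, days=1)
```

At the average rate a single rollup would need well over a year to fill one sector, and a sector sealed daily would be less than 1% full, which is why aggregation across many DA users matters.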

Architecture

In this section, we consider a technical architecture that could be built today. We consider it in the context of an arbitrary L2 application and the L1 chain it serves. Because the solution is an external DA solution, like Celestia and EigenDA, we do not treat Filecoin as the example L1.


Components

Even at a high level, DA on Filecoin draws on many different features of the Filecoin ecosystem.

Transactions: Downstream users transact on a platform that requires DA, which may be an L2.

Platforms using DA: These are the platforms that consume DA as a service, for example an L2 that publishes its transaction data to Filecoin DA and its commitments to an L1 such as Ethereum.

Layer 1: This is any L1 that holds data commitments pointing to the DA solution. It could be Ethereum, or an L2 supported by the Filecoin DA solution.

Aggregator: The front end of a Filecoin-based DA solution is an aggregator, a centralized component that receives transaction data from L2s and other DA clients and aggregates it into 32 GB sectors suitable for sealing. A simple proof of concept would include one centralized aggregator, but platforms using the DA solution could also run their own, for example as a sidecar to the L2 sequencer. The centralization of the aggregator is comparable to that of an L2 sequencer or the EigenDA disperser. Once the aggregator has compiled a payload close to 32 GB, it makes a storage deal with a storage provider to store the data. Users receive assurance that their data will be included in the sector in the form of a PoDSI (Proof of Data Segment Inclusion), and the data is identified by its pCID once it is on the network. This pCID is included in the state commitment on L1 as the reference to the supporting transaction data.

Validators: Validators request data from storage providers to check the integrity of state commitments and to build fraud proofs. Where fraud is provable, the proofs are submitted to L1.

Storage Deal: Once the aggregator has compiled a payload close to 32 GB, it makes a storage deal with a storage provider to store the data.

Publish blob (Put): To initiate a Put, the DA client submits a blob containing its transaction data to the aggregator. This can be done off-chain, or on-chain through an on-chain aggregator oracle. To acknowledge receipt of the blob, the aggregator returns to the client a PoDSI, which proves that the blob is included in the aggregated sector that will be committed to the subnet, along with a pCID (sub-piece CID). The pCID is what the client and any other interested party will use to reference the blob once it is served on Filecoin.
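The role the PoDSI plays in this flow resembles a Merkle inclusion proof: the client can check that its blob is part of the aggregate without seeing the other blobs. The sketch below is a generic SHA-256 Merkle tree used as a stand-in; it is not the actual PoDSI format or Filecoin’s piece commitment scheme, and all function names are hypothetical.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling_is_left)


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(blobs: List[bytes]) -> bytes:
    if not blobs:
        raise ValueError("need at least one blob")
    level = [_h(b) for b in blobs]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # toy padding: duplicate the last node
        level = _next_level(level)
    return level[0]


def inclusion_proof(blobs: List[bytes], index: int) -> Proof:
    """Sibling hashes on the path from blob `index` up to the root."""
    if not 0 <= index < len(blobs):
        raise IndexError(index)
    level = [_h(b) for b in blobs]
    proof: Proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(blob: bytes, proof: Proof, root: bytes) -> bool:
    node = _h(blob)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

In this analogy the root plays the part of the aggregate’s commitment that ends up referenced on-chain, and the proof is what the aggregator hands back to the client as a receipt.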

The storage deal appears on-chain within minutes of being made. The largest contributor to latency is sealing, which can take up to three hours. This means that although the deal has been made and users can be confident the data will appear on the network, the data cannot be queried until sealing is complete. The Lotus client has a fast-retrieval feature in which an unsealed copy of the data is stored alongside the sealed copy. As long as the retrieval deal does not depend on proof that the sealed data is on the network, the data can be served as soon as the unsealed copy has been transferred to the storage provider. However, this feature is at the discretion of the storage provider and is not cryptographically guaranteed by the protocol. Guaranteeing fast retrieval would require changes to consensus and to the penalty and incentive mechanisms that enforce it.

Retrieve blob (Get): As with Put, a deal is required, in this case a retrieval deal, and it appears on-chain within a few minutes. Retrieval latency depends on the terms of the deal and on whether an unsealed copy of the data is kept for fast retrieval. With fast retrieval, latency depends on network conditions. Without it, the data must be unsealed before it is served to the client, which takes as long as sealing, about three hours. Without optimization, then, the maximum round trip is six hours. The data service needs to improve significantly before it becomes a viable DA or fraud-proof system.
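The round-trip figures above can be captured in a small model. The sketch below uses the three-hour sealing and unsealing times from the text; the function name, the `require_sealed_proof` switch, and the `overhead_hours` parameter (for chain inclusion and network transfer) are hypothetical simplifications.

```python
SEAL_HOURS = 3.0     # sealing time quoted in the text
UNSEAL_HOURS = 3.0   # unsealing takes about as long as sealing


def round_trip_hours(
    fast_retrieval: bool,
    require_sealed_proof: bool = True,
    overhead_hours: float = 0.0,
) -> float:
    """Worst-case time from submitting a storage deal to first download.

    fast_retrieval: the provider keeps an unsealed copy next to the sealed one.
    require_sealed_proof: the reader waits until the sector is sealed and proven.
    """
    if fast_retrieval:
        wait_for_seal = SEAL_HOURS if require_sealed_proof else 0.0
        return overhead_hours + wait_for_seal
    return overhead_hours + SEAL_HOURS + UNSEAL_HOURS
```

The model makes the trade-off explicit: the only way to get below the sealing time is to serve data before any proof of the sealed sector exists, which is exactly the case the protocol does not currently guarantee.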

DA Proof: The DA proof comes in two steps: the PoDSI given when data is submitted to the aggregator and the deal is made, and the ongoing commitment of PoRep and PoSt supplied through Filecoin’s consensus mechanism. As discussed above, PoRep and PoSt provide scheduled, provable guarantees that the data is stored and persists.

This solution will make extensive use of bridges, since any client that relies on DA (whether or not it builds proofs) needs to be able to interact with Filecoin. For state transitions published to L1 that include a pCID, validators can run an initial check that no bogus pCID has been committed. There are several ways to do this, such as publishing an oracle of Filecoin data on L1, or having validators verify that a storage deal or sector corresponding to the pCID exists. Likewise, verifying validity proofs or fraud proofs published to L1 may require a bridge in order to be convinced that a proof is valid or fraudulent. The bridges currently available are Axelar and Celer.

Security Analysis

Filecoin’s integrity is enforced by slashing collateral. Collateral can be slashed in two cases: storage faults and consensus faults. A storage fault is a storage provider’s failure to submit a proof of storage (PoRep or PoSt), which corresponds to data availability in our model. A consensus fault is malicious behavior in consensus, the protocol that governs the transaction ledger; the FEVM is an abstraction on top of that ledger.

  • The sector fault fee is the penalty incurred for failing to submit a proof of continued storage. Storage providers have a grace period of one day during which a storage fault is not penalized. A sector that has been faulty for 42 days is terminated. The fees incurred are burned.

BR(t) = ProjectedRewardFraction(t) * SectorQualityAdjustedPower

  • A sector is terminated if it has been faulty for 42 days or if the storage provider deliberately terminates the deal. The termination fee is equal to the most the sector has earned up to the point of termination, capped at 90 days of earnings. Unpaid deal fees are refunded to the user. The fees incurred are burned.

max(SP(t), BR(StartEpoch, 20 d) + BR(StartEpoch, 1 d) * terminationRewardFactor * min(SectorAgeInDays, 140))

  • Storage Market Actor slashing takes place at the end of a deal: it is the slashing of the collateral that the storage provider posted behind the deal.
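The termination fee expression above can be written out as a small function. The sketch below is a simplification under stated assumptions: it treats the expected daily block reward BR(StartEpoch, 1 d) as a constant `daily_reward`, so that BR(StartEpoch, 20 d) is twenty times that value, and it assumes a `termination_reward_factor` of 0.5, chosen because it reproduces the 90-day cap mentioned in the text. Neither assumption is stated in this article.

```python
def termination_fee(
    sp: float,
    daily_reward: float,
    sector_age_days: float,
    termination_reward_factor: float = 0.5,  # assumption, see lead-in
) -> float:
    """max(SP(t), BR(20 d) + BR(1 d) * factor * min(age, 140)).

    sp: the SP(t) term of the formula, taken here as a given lower bound.
    daily_reward: stand-in for BR(StartEpoch, 1 d), assumed constant.
    """
    br_20_days = 20 * daily_reward
    br_1_day = daily_reward
    capped_age = min(sector_age_days, 140)
    return max(sp, br_20_days + br_1_day * termination_reward_factor * capped_age)
```

With these assumptions, a sector older than 140 days pays 20 + 0.5 × 140 = 90 days of expected reward, while a 10-day-old sector pays 25 days’ worth unless SP(t) is larger.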

The security Filecoin provides is fundamentally different from that of other blockchains. On a traditional blockchain, data is typically secured by consensus. Filecoin’s consensus, however, secures only the deal ledger, not the data that the deals refer to. Data stored on Filecoin needs only enough security to incentivize storage providers to offer their services. This means that data stored on Filecoin is secured by fault penalties and by business incentives such as reputation with clients. Put differently, a data fault on a traditional blockchain is a breach of consensus that undermines the security of the chain or the validity of its transactions, whereas Filecoin tolerates faults in data storage and uses consensus only to secure its deal ledger and deal-related activity. A storage provider that fails to honor a storage deal faces a penalty of up to 90 days’ worth of storage rewards and loses the collateral it posted to secure the deal.

Therefore, the cost of a data withholding attack launched by Filecoin providers is merely the opportunity cost of the retrieval deal. Data retrieval on Filecoin relies on fees paid by clients to incentivize storage providers, and a provider that does not respond to a retrieval request suffers no negative consequence. To reduce the risk of an individual storage provider ignoring or refusing retrieval deals, data on Filecoin can be stored with multiple storage providers.
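How much replication helps can be estimated with a simple probability model. The sketch below assumes each provider independently refuses a retrieval with the same probability; that independence assumption is strong and does not hold if providers collude.

```python
def retrieval_success_probability(refusal_probability: float, replicas: int) -> float:
    """Chance that at least one of `replicas` independent providers serves the data."""
    if not 0.0 <= refusal_probability <= 1.0:
        raise ValueError("refusal_probability must be in [0, 1]")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return 1.0 - refusal_probability ** replicas


def replicas_needed(refusal_probability: float, target: float, limit: int = 1000) -> int:
    """Smallest replica count that reaches `target` success probability."""
    for replicas in range(1, limit + 1):
        if retrieval_success_probability(refusal_probability, replicas) >= target:
            return replicas
    raise ValueError("target not reachable within limit")
```

For instance, if each provider refused half of all requests, `replicas_needed(0.5, 0.99)` gives the number of independent copies needed for 99% retrievability; every extra copy also multiplies the storage cost, which remains small at Filecoin prices.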

Because the economic security behind Filecoin data is significantly lower than that of blockchain-based solutions, protection against data manipulation must also be considered. That protection comes from Filecoin’s proof system. Data is referenced by CID, so any corruption is immediately detectable: a provider cannot serve corrupted data, because it is easy to check whether the data received matches the requested CID. Nor can a provider store corrupted data in place of the original. After receiving client data, the provider must submit a proof of a correctly sealed sector in order to activate the storage deal, so a deal cannot be started on corrupted data. For the lifetime of the deal, PoSts are submitted to prove custody (note that a PoSt attests both to custody of the sealed sector at that moment and to custody since the previous PoSt). Since generating a PoSt depends on the sealed sector, a corrupted sector yields an invalid PoSt and therefore a sector fault. A storage provider therefore can neither store nor serve corrupted data, cannot collect rewards while holding corrupted data in place of the original, and cannot escape penalties for tampering with client data.

Security can be strengthened by increasing the collateral that storage providers commit to the Storage Market Actor. The collateral is currently decided between the storage provider and the client. Suppose the collateral were high enough (for example, the same as an Ethereum validator’s stake) to incentivize providers not to default; we can then ask what remains to be secured (although this would be highly capital-inefficient, since that collateral would be needed to secure every transaction blob or aggregated blob sector). A storage provider could still choose to make the data unavailable for up to 41 days before the storage deal is terminated by the Storage Market Actor. Since DA deals are short, we can assume the data could be unavailable until the last day of the deal. In the absence of coordinated malicious actors, this can be mitigated by replicating the data across multiple storage providers so that it continues to be served.

We can also consider the cost for an attacker to subvert consensus, either to have an invalid proof accepted or to rewrite the ledger’s history so that a deal is removed from the order book without the responsible storage provider being penalized. It is worth noting that with such a breach the attacker could manipulate Filecoin’s ledger at will. To mount the attack, the attacker would need a majority of the stake in the Filecoin chain. Stake is tied to the storage provided to the network, and the network currently holds 25 EiB of data (1 EiB ≈ 1.15 × 10¹⁸ bytes). A malicious actor would need at least 12.5 EiB to present its own chain and win the fork choice rule. This is further mitigated by the slashing associated with consensus faults, for which the penalty is the loss of all pledged collateral and block rewards and suspension from participating in consensus.

Aside: Withholding Attacks on Other DA Solutions

Although the discussion above shows that Filecoin falls short in protecting data against withholding attacks, it is not alone.

  • Ethereum: In general, the only way to be sure that a request to the Ethereum network is answered is to run a full node. Full nodes are under no obligation to serve data retrieval requests outside of consensus. Constructions such as PeerDAS introduce a peer scoring system for how nodes respond to data retrieval, in which nodes with a low score (essentially a DA reputation) may be isolated from the network.
  • Celestia: Celestia has stronger per-byte security against withholding attacks than our Filecoin construction, but the only way to benefit from that security is to host a full node yourself. Requests to Celestia infrastructure that is not owned and operated in-house can be censored without penalty.
  • EigenDA: As with Celestia, any service can run an EigenDA operator node to guarantee retrieval of its own data. Any data retrieval request made from outside the protocol can therefore be censored. Note also that EigenDA has a centralized, trusted disperser responsible for data encoding, KZG commitments, and data dispersal, similar to our aggregator.

Retrieval Security

Retrievability is necessary for DA. Ideally, market forces would motivate economically rational storage providers to accept retrieval deals and to compete with other providers, driving prices down for clients. Even assuming this is enough for providers to serve data, it is reasonable to ask for stronger security given how important DA is.

At present, retrieval cannot be guaranteed by the economic security described above. The reason is that it is cryptographically difficult to prove, with minimal trust, that data was not received (in the case where a client needs to rebut a storage provider’s claim to have sent it). Securing retrieval with Filecoin’s economic security requires a protocol-native retrieval guarantee. With minimal changes to the protocol, this means retrieval would have to be tied to sector faults or deal termination. Retriev is a proof of concept that provides guaranteed data retrieval by using trusted “referees” to mediate retrieval disputes.

Note: Retrieval in other DA solutions

As mentioned above, Filecoin lacks a protocol-level guarantee against selfish behavior by storage (or retrieval) providers. For Ethereum and Celestia, the only way to guarantee access to protocol data is to host a full node yourself or to trust an infrastructure provider’s SLA. Guaranteeing retrieval in Filecoin is less straightforward. The analogous setup is to become a storage provider yourself (which carries a significant infrastructure cost) and successfully accept the storage deals that match those the user has published, at which point one is paying oneself to provide storage.

Latency Analysis

Due to the design of the Filecoin proof system and the lack of retrieval incentives, Filecoin has not been optimized for low round-trip latency from the initial posting of data to its first retrieval. High-performance retrieval in Filecoin is an active area of research that keeps evolving as storage provider capabilities improve and new features are introduced. We define the “round trip” as the time from submitting a data deal to Filecoin to the earliest moment at which the data can be downloaded.

Block Time

In Filecoin’s Expected Consensus, a data deal can be included within one block time of 30 seconds. One hour is the typical confirmation time for sensitive on-chain data, such as coin transfers.

Data Processing

Data processing time varies with the storage provider and its configuration. On standard storage provider hardware, the sealing process takes 3 hours. Storage providers often shorten this time through special client configurations, parallelization, and investment in more powerful hardware. This variation also affects the duration of sector unsealing, which can be avoided altogether by the fast retrieval options in Filecoin client implementations such as Lotus. The fast retrieval setting stores an unsealed copy of the data alongside the sealed data, greatly speeding up retrieval. Based on this, we can assume a worst-case delay of 3 hours from the acceptance of a data deal to the data being available on-chain.
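
The figures above can be put on a common scale with a short calculation. This is a rough sketch: the constant and function names are illustrative, and adding one block time for deal inclusion on top of the sealing time (with no unsealing delay, as if fast retrieval were enabled) is our simplifying assumption rather than a measured result.

```python
BLOCK_TIME_S = 30              # Filecoin block time cited above
CONFIRMATION_S = 60 * 60       # typical confirmation time for sensitive data
SEALING_S = 3 * 60 * 60        # sealing on standard hardware, worst case


def epochs(seconds, block_time_s=BLOCK_TIME_S):
    """Number of whole block times needed to cover `seconds` (rounded up)."""
    return -(-seconds // block_time_s)


def round_trip_s(sealing_s=SEALING_S, unsealing_s=0):
    """Deal inclusion plus sealing plus any unsealing delay, in seconds.

    unsealing_s defaults to 0, modelling a provider that keeps an unsealed
    copy next to the sealed sector (the fast retrieval setting).
    """
    return BLOCK_TIME_S + sealing_s + unsealing_s


confirmation_epochs = epochs(CONFIRMATION_S)   # blocks in one hour
sealing_epochs = epochs(SEALING_S)             # blocks in three hours
worst_case_s = round_trip_s()                  # seconds until first download
```

Under these assumptions, sealing dominates the round trip: one block time for inclusion is negligible next to three hours of sealing.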

Conclusion and Future Directions

This article discusses how to build a DA layer on top of an existing decentralized storage network (DSN), Filecoin. We consider the requirements of DA as a key element of Ethereum’s scaling infrastructure. We examine the feasibility of building DA on a Filecoin-based DSN and use it to explore the opportunities that a Filecoin solution would bring to the Ethereum ecosystem, or to any ecosystem that would benefit from a cost-effective DA layer.

Filecoin as a DSN can significantly improve data storage efficiency in blockchain-based decentralized systems, saving $100 million for every 32 GB of data written at current market prices. Even if the demand for DA is not enough to fill a 32 GB sector, the cost advantage still holds when partially empty sectors are sealed. Although the current storage and retrieval latency on Filecoin is not suitable for hot storage needs, specific configurations by storage providers can deliver reasonable performance and make data available within 3 hours.

Trust in Filecoin storage providers can be tuned through variable collateral, as in EigenDA. Filecoin extends this tunable security by allowing many copies of the data to be stored across the network, which adds tunable Byzantine fault tolerance. To effectively prevent data withholding attacks, the problem of guaranteed, high-performance data retrieval still needs to be solved. However, as with any other solution, the only way to truly guarantee retrievability is to host your own node or to trust infrastructure providers.

We see an opportunity for DA in the further development of PoDSI, which could (together with Filecoin’s current proofs) replace DAS in ensuring that data is included in a larger sealed sector. Depending on the use case, this may make the slow turnaround of data tolerable, since fraud proofs can be posted within a window of 1 day to 1 week, while DA can be guaranteed on demand. PoDSI is still a new technology under heavy development, so we do not yet know how efficient it will be or what mechanisms will be required to build systems around it. Since there are already solutions for computing over Filecoin data, a solution for computing PoDSI over sealed or unsealed data may not be far away.
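
To give a feel for what an inclusion proof over a larger aggregate looks like, here is a generic Merkle inclusion sketch. It is not the PoDSI construction itself: the choice of SHA-256, the duplication of the last node on odd levels, and the function names are our own assumptions for illustration.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(segments, index):
    """Build a Merkle tree over `segments` and return (root, proof).

    The proof is a list of (sibling_hash, sibling_is_left) pairs leading
    from the leaf at `index` up to the root.
    """
    level = [_h(s) for s in segments]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # pad odd levels by duplicating the last node
        sibling = index ^ 1              # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(segment, proof, root):
    """Recompute the path from `segment` to the root and compare."""
    node = _h(segment)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


# A hypothetical aggregate of three client data segments.
segments = [b"blob-a", b"blob-b", b"blob-c"]
root, proof = merkle_root_and_proof(segments, 2)
included = verify_inclusion(b"blob-c", proof, root)
forged = verify_inclusion(b"blob-x", proof, root)
```

The verifier only needs the segment, the short proof, and the root, which is what makes checking inclusion cheap relative to downloading the whole aggregate.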

As the DA and Filecoin fields develop, new combinations of solutions and supporting technologies may enable new proofs of concept. As the integration of Solana with the Filecoin network shows, DSNs have potential as a scaling technology. The cost of data storage on Filecoin presents an open opportunity with significant room for optimization. Although the challenges discussed in this article are framed in the context of supporting DA, their eventual solutions will produce a large number of new tools and systems beyond DA.

The chart data is from Filecoin spec, EIP-4844, EigenDA, Celestia implementation, Celenium, Starboard, file.app, Rollups DA and ution, as well as the current approximate market price.

