Can data really be stored forever?

The statement that data can be stored forever for a one-time fee is a paradigm shift in the way we relate to and use data.

Many people react skeptically to the idea of data permanence because it seems impossible and has never been achieved before. As with any innovation of this magnitude, the claim will require further research and proof of its viability.

For those of you who are new to ArDrive, know that we did not invent data permanence. Arweave invented the ‘permaweb’, and you can read a more in-depth overview of Arweave here. ArDrive is one of many applications built upon the Arweave network, and it is the easiest and best way for the average person to get their files onto that network.

For our purposes here we will focus on how the Arweave network stores data permanently.

So how can data be stored forever? Two things combine to make it possible: an economic incentive structure that pays the storers of the data over long periods of time, and a technical innovation, a blockchain-like database that Arweave calls the blockweave.

The paragraph above has many concepts jammed into it. Let’s unpack them one by one to see how Arweave allows data to be stored securely and reliably for extremely long periods of time.

In this article, we will go over the following:

  1. The Economic Incentive of Permanent Storage
  2. The Technological Advance of the Blockweave

"Grave Naiskos of an Enthroned Woman with an Attendant," is on display at the J. Paul Getty Museum

Did the Greeks invent a laptop with USB ports?

1. The Economic Incentive of Permanent Storage

In almost every software business if someone does not pay for the service, it will not be provided. And if no one ever pays for the service in the future, the software will cease to exist.

This is very true with data storage: if nobody pays to host the data, it will not be hosted. And if nobody continues to pay for the data, it will stop being hosted and disappear. This is why the typical business model for web hosting companies requires ongoing or monthly subscriptions.

This business model seems so obvious we tend to believe it is the only way hosting businesses could operate. If I want an ongoing service, I need to provide ongoing payment.

So how can ArDrive, built on the Arweave network, claim that users can pay once for permanent storage?

The Arweave network is made up of people around the world who are willing to store other people’s data, with more joining this effort every week.

This article will focus on how those people can profitably store other people’s data for long periods of time. So, let’s break down the economics of how this works.

The first thing to know is that your payment to upload data onto the Arweave network goes towards paying upfront storage costs for 200 years. Now 200 years is a long time – isn’t that expensive?

It sounds expensive because we are used to prices going up in almost all areas of our life, but data storage is one of the few areas that goes against that trend. Over the past 50 years, data storage costs have decreased by an average of 30.5% per year.

It is important to note that this decrease in price for computer data storage has been fairly consistent and constant over that time span.

This trend actually extends much further back in time, well before the computer age. In ancient times, the papyrus housed in the world’s first libraries was extremely rare and expensive. But the cost and difficulty of recording information have steadily fallen, from animal skins and parchment to the printing press. For thousands of years, humanity has found ways to make the recording of vital information cheaper.

Parchment

Record around 2 KB of data per parchment

The past, of course, does not determine the future, so how can we be sure that the price of storage will continue to decrease over time?

There are two main variables that allow for storage to be improved: data density and data reliability. Arweave estimated that data density, which is the amount of data that can be stored in a physical space, has about 400+ years of improvement left at current rates before it reaches its theoretical limits. Data reliability has even more ability to be improved.

Arweave’s working assumption is that the price of storage will continue to decline over time. Given the technological improvements still to be made, along with society’s increasing appetite for data storage, this seems to be a highly plausible assumption to make.

So for the economics of permanent storage to work, is Arweave betting that data storage costs will keep going down 30.5% per year?

No, definitely not!

The working assumption to make the economics viable for permanent storage is extremely conservative: Arweave assumes that data storage costs will decline by just 0.5% per year.

The initial cost a user pays to upload data to the Arweave network covers the first 200 years of storage. If data storage declines are anything greater than 0.5% per year, this simply adds to the number of years that the data will be stored.
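One way to see how the 200-year figure and the 0.5% assumption fit together is to treat the yearly storage cost as a shrinking geometric series. The sketch below is an illustration under that assumption, not Arweave's actual pricing code; the function names `perpetual_cost` and `cost_of_years` are hypothetical.

```python
def perpetual_cost(annual_cost: float, decline: float) -> float:
    """Total cost of storing forever if the yearly cost shrinks by `decline` each year.

    Closed form of the geometric series
    annual_cost * (1 + q + q**2 + ...) with q = 1 - decline.
    """
    if not 0 < decline < 1:
        raise ValueError("decline must be strictly between 0 and 1")
    return annual_cost / decline


def cost_of_years(annual_cost: float, decline: float, years: int) -> float:
    """Cost of the first `years` years, summed year by year."""
    return sum(annual_cost * (1 - decline) ** t for t in range(years))


# With a 0.5% yearly decline, storing forever costs about
# 200 times the current annual price.
forever = perpetual_cost(1.0, 0.005)

# Numeric check: summing a very long horizon approaches the same value.
long_horizon = cost_of_years(1.0, 0.005, 5000)

print(f"perpetual cost: {forever:.2f} years of today's price")
print(f"5000-year sum:  {long_horizon:.2f} years of today's price")
```

Under this reading, a payment worth 200 years of storage at today's price is enough to keep paying indefinitely as long as costs fall by at least 0.5% per year, and any faster decline leaves a surplus.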

The result is extremely cheap data storage costs over time. At current rates, the first 200 years of storage on Arweave are only $.007 per MB.

The following table shows the number of years of data storage you will receive over time if the cost of storage declines by 30.5%, 20%, 10% or 1% per year over the next 200 years:

Year         30.5% decline       20% decline         10% decline         1% decline
1st Year     261 years           240 years           220 years           202 years
2nd Year     341 years           288 years           242 years           204 years
3rd Year     444 years           346 years           266 years           206 years
5th Year     757 years           498 years           322 years           210 years
10th Year    2,865 years         1,238 years         519 years           221 years
50th Year    Millions of years   1,820,088 years     23,478 years        329 years
100th Year   Millions of years   Millions of years   2,756,122 years     541 years
200th Year   Millions of years   Millions of years   Millions of years   1,449 years

As you can see, the number of years that data can be saved for increases exponentially if the cost of storage declines even at 1%. This length of time is really inconceivable to us.

Even at 1%, the storage time accumulated can be measured in lifetimes. At a decline of 10% or higher, the numbers run into the millions of years. For all intents and purposes, this is permanent data storage.
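The early rows of the table appear consistent with a simple compounding rule: start from 200 years and multiply by (1 + rate) for each year that passes. That rule is inferred from the numbers, not stated by Arweave, so treat the sketch below as an assumption; `years_covered` is a hypothetical helper name.

```python
BASE_YEARS = 200  # years covered by the initial payment, per the article
RATES = [0.305, 0.20, 0.10, 0.01]  # yearly cost declines used in the table


def years_covered(rate: float, elapsed_years: int) -> float:
    """Years of storage the original payment could buy after `elapsed_years`,
    assuming its purchasing power grows by `rate` each year (inferred rule)."""
    return BASE_YEARS * (1 + rate) ** elapsed_years


# Print the first few rows, one column per decline rate.
for n in (1, 2, 3, 5, 10):
    row = [f"{years_covered(r, n):>8,.0f}" for r in RATES]
    print(f"year {n:>2}: " + " ".join(row))
```

Because the growth compounds, even the 1% column keeps climbing, which is why the later rows grow so quickly at the higher rates.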

Storage Endowment: Paying over long periods of time

At this point, an objection is often made: “once a person is paid to store data on the Arweave network, what reason do they have to save it?”

An important element to understand is that the whole payment is not made to the person(s) initially storing the data. At the outset, only a small portion of it is paid. The rest of the payment goes into an endowment. The fees in this endowment will gain value over time, just like cash in a bank account accumulating interest.

The Eye of Uncle Sam

As the endowment gains value, it is designed to provide payouts as needed to keep the reward for storing data higher than the cost of storage.
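The payout rule described above can be sketched as a simple top-up: the endowment is touched only when the regular reward falls short of the storage cost. This is a simplified illustration, not Arweave's actual implementation; the function `endowment_payout` and its parameters are hypothetical.

```python
def endowment_payout(
    reward: float, storage_cost: float, endowment: float
) -> tuple[float, float]:
    """Return (payout, remaining_endowment) for one period.

    The endowment pays out only the shortfall between the storage cost and
    the regular reward, and never more than it currently holds.
    """
    shortfall = max(0.0, storage_cost - reward)
    payout = min(shortfall, endowment)
    return payout, endowment - payout


# Reward already covers the cost: the endowment is left untouched.
print(endowment_payout(reward=5.0, storage_cost=3.0, endowment=100.0))

# Reward falls short by 1.0: the endowment makes up the difference.
print(endowment_payout(reward=2.0, storage_cost=3.0, endowment=100.0))
```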

Arweave estimates that withdrawals from the endowment will not be required until the data set is many times larger than the current size of the surface web (the portion of the internet that is readily available to the general public), which can be measured in Petabytes (millions of Gigabytes). In short, we have a long way to go before the endowment needs to be used.

Even with the Arweave network being fairly new, a sizable amount of value has already been built up in the endowment.

So even if people who store the data come and go (which they will), others will come and take their place and continue storing the data. As long as the economic incentive is there (where the price they receive to store data is greater than the cost of storing data) people will continue to maintain the permaweb.

In sum: as new data is added to the Arweave network, upfront profits are generated that keep the people who store the data engaged, and the endowment is there to make sure that the economic incentive remains for them over long periods of time.

Never have to pay to retrieve data

It is important to mention that no one ever has to pay to retrieve their data from the Arweave network or ArDrive. The one-time payment covers the cost of putting it onto the network, and there is never an additional charge for retrieving that data.

Now that we have covered the economic overview, let’s shift our focus onto the technological innovation that Arweave developed.

Data Permanence, Not Network Permanence

One final note before we move on to technological innovation. Arweave does not promise network permanence but data permanence. What’s the difference?

Well, technological innovation comes in waves, so there may (and likely will) come a time when data can be stored permanently at an even lower cost. With storage costs that low, this new network would ‘subsume’ the Arweave network.

Therefore, the Arweave network may not be permanent, but the data on the network will be. The permanence of the data is what matters, and the data is what will live on.

2. The Technological Advance of the Blockweave

On the technological side there are a number of innovations that led to permanent storage but one is at the center of the innovation: the blockweave.

The blockweave is a blockchain-like structure designed to enable unchanging and scalable on-chain storage in a cost efficient manner for the first time.

Again, there is a lot to unpack in this statement, but let’s start with the term blockchain.

In case you are new to the blockchain technology popularized by Bitcoin, here is a short video from the BBC that explains the concept concisely:

Blockchain technology introduced a new type of database for verifying transactions. It is essentially a long ledger list that is continually appended as new transactions are added to it. Each time there is a new transaction, the whole blockchain records it by adding it to the end of the ‘chain’. This is an incredibly secure method of verifying transactions: no one has ever hacked Bitcoin. But it’s still not a good place for large amounts of data storage due to the extreme energy requirements and long transaction times.
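For readers who prefer code to video, here is a minimal sketch of the append-only ledger idea: each block stores the hash of the block before it, so altering an old entry breaks every link that follows. It is a toy illustration of the general blockchain concept, not Bitcoin's or Arweave's real data format, and the helper names are hypothetical.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Add a new block that points at the hash of the current last block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append(
        {"index": len(chain), "prev_hash": prev, "transactions": transactions}
    )


def is_valid(chain: list) -> bool:
    """Check that every block still points at the true hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


ledger: list = []
append_block(ledger, ["alice pays bob 1"])
append_block(ledger, ["bob pays carol 2"])
append_block(ledger, ["carol pays dave 3"])
print(is_valid(ledger))  # the untouched ledger passes the check
```

Editing any earlier block changes its hash, so the next block's `prev_hash` no longer matches and `is_valid` fails.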

To give you an idea of how inefficient the Bitcoin blockchain is, the whole ledger is only about 320 GB in size (as of 2021), yet the verification/reward competition consumes roughly as much electricity each year as a country like New Zealand or Austria.

To keep the features of verification, but also to add the ability to store large amounts of information in a cost-efficient manner and scalable way, Arweave came up with a variation on the blockchain they called the blockweave.

What makes the blockweave so unique?

At this point we will provide an overview of the technical component of Arweave, but if you would like a deep technical dive into the project please visit the Arweave yellow paper.

Proof-of-Access (SPoRA)

SPoRA stands for Succinct Proofs of Random Access, and it is at the center of what makes the blockweave.

Although SPoRA is not a name that rolls off the tongue, it is an extremely powerful innovation to the blockchain.

SPoRA requires the individual who is accepting new information into the blockweave to produce a randomly selected piece of information from a previous transaction into the network.

Instead of having to produce the entire ledger, a person only needs to recall a ‘chunk’ of information that has been randomly chosen by the algorithm. Once that previous ‘chunk’ of information has been verified, new data can be uploaded into the system.
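The recall step can be sketched in a few lines. The example below is a heavily simplified illustration of the idea of a random access challenge, assuming a chunk index derived from the previous block's hash. The real SPoRA protocol involves considerably more than this, and all names here are hypothetical.

```python
import hashlib
from typing import Dict, Optional


def recall_index(prev_block_hash: bytes, num_chunks: int) -> int:
    """Derive a pseudo-random chunk index from the previous block's hash."""
    digest = hashlib.sha256(prev_block_hash).digest()
    return int.from_bytes(digest, "big") % num_chunks


def prove_access(stored_chunks: Dict[int, bytes], index: int) -> Optional[bytes]:
    """A storer answers the challenge only if it actually holds the chunk."""
    return stored_chunks.get(index)


def verify_access(chunk: Optional[bytes], expected_hash: bytes) -> bool:
    """Anyone can check the answer against the known hash of that chunk."""
    return chunk is not None and hashlib.sha256(chunk).digest() == expected_hash


# Toy data set: eight chunks and their publicly known hashes.
chunks = {i: f"chunk-{i}".encode() for i in range(8)}
known_hashes = {i: hashlib.sha256(c).digest() for i, c in chunks.items()}

challenge = recall_index(b"hash of the previous block", len(chunks))
answer = prove_access(chunks, challenge)
print(verify_access(answer, known_hashes[challenge]))
```

Because the storer cannot know in advance which chunk will be requested, the most reliable way to pass is to keep as much of the old data as possible.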

Compared to Bitcoin, this system requires much, much less energy to operate, which results in significantly less overhead for the storers of information. This mechanism offsets the value that is normally wasted in blockchain networks with useful, energy-efficient data storage at extremely low prices.

Additionally, each piece of information put into the system is time stamped and given a unique transaction identifier. Once in the system, the information cannot be changed or removed.

What SPoRA does with all of the blocks of information is it creates a multi-directional ‘weave’ of data – instead of a single long line: hence, the blockweave.

This new and improved form of blockchain verification allows for large amounts of data to be stored in a highly secure and cost-efficient manner. It also means that each piece of data on the permaweb is marked with a time-stamped ID that cannot be changed. Uploading any new information to the Arweave network (and earning the economic benefit of storing this new data) requires that all the old information on the permaweb be kept intact, as it will be continuously verified.

Wildfire

The blockweave is further differentiated from most other blockchains by a concept called ‘Wildfire’.

The Wildfire component of the blockweave prioritizes cooperation over competition: people do not compete with optimized servers and nations’ worth of energy to solve math problems for rewards (as in Bitcoin) but are instead incentivized to share the data they have with each other.

Although this is a complex process, Wildfire can best be described as ‘if you share with me, I will share with you’.

Share earbuds

Sharing is caring

Each person in the network is ranked by their peers based on how fast they share data. To get new data you need to have old data, and if you are not sharing the old data you will be ranked lower. A lower rank means that you will not be able to store new information on the permaweb as easily – and therefore be less able to earn economic rewards.

What the system ultimately does is to incentivize the storers of data to share with each other.
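The ranking idea can be illustrated with a small sketch in which each peer is scored by how quickly it has served data, and faster sharers sort to the top. This is a toy model under that assumption, not the actual Wildfire implementation; the `Peer` class and `rank_peers` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peer:
    name: str
    bytes_served: int      # how much data this peer has shared with us
    seconds_taken: float   # how long it took to serve that data

    @property
    def throughput(self) -> float:
        """Bytes per second; peers that never shared anything score zero."""
        if self.seconds_taken <= 0:
            return 0.0
        return self.bytes_served / self.seconds_taken


def rank_peers(peers: List[Peer]) -> List[Peer]:
    """Order peers from fastest sharer to slowest."""
    return sorted(peers, key=lambda p: p.throughput, reverse=True)


peers = [
    Peer("slow-sharer", bytes_served=1_000, seconds_taken=10.0),
    Peer("fast-sharer", bytes_served=50_000, seconds_taken=5.0),
    Peer("non-sharer", bytes_served=0, seconds_taken=0.0),
]

for position, peer in enumerate(rank_peers(peers), start=1):
    print(position, peer.name, f"{peer.throughput:.0f} B/s")
```

In a scheme like this, a peer that withholds data drops to the bottom of everyone else's list and is served new data last.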

The end result is that data on the blockweave is intensely replicated. Instead of being siloed in a few storage areas, the data is spread around and replicated hundreds or thousands of times. So the family photo you upload to ArDrive won’t just be housed on a single cloud server; it will be saved in hundreds or thousands of locations around the world.

So how does data not get lost in the system?

The continual verification of data on the permaweb is almost beyond comprehension. The founder of Arweave, Sam Williams, shared the stats of how often the data within the system is validated because of the SPoRA technological innovation:

If you have a transaction on the network:

  • Read and validated approximately 1,890 times at the start
  • Greater than 5,670 validations of your transaction every day going forward
  • The integrity of your data is checked once every 14.4 seconds

Conclusion

Human beings are data savers. We’ve always found ways to record our ideas, beliefs, memories and artwork through the ages. Our methods of doing so have become remarkably more and more efficient.

Arweave does not hold the keys to the future of data storage. But as this article outlines, it has a reasonable claim to have developed a way to store data at a simple one-time price for centuries. This is a world first that we believe will see increasing adoption in the years to come. It has the potential of a zero-to-one innovation that changes the landscape of data storage forever.

But you’ll never know until you try.

ArDrive is the easiest way for you to get your personal photos, music, documents, and videos onto the Arweave network. Upload a few of your files and see how they remain ‘chiseled in digital stone’ while other online data disappears or requires ongoing payments.

Wayne Jones

Wayne Jones is a content writer and research genius for ArDrive. He's also a prolific hockey blogger for hockeyanswered.com and happens to own an obscene number of blankets.
