They say that history repeats itself, and for the decentralized internet right now, that might well be true.
For while there has been rapid blockchain innovation and frenetic activity in the area of decentralized storage, a problem that earlier digital pioneers encountered seems to be blocking progress once again.
Even before the decentralized internet, there were two primary data storage services – file system storage for files and database storage for data fields.
Files are usually over 10KB in size, so they are relatively large and of arbitrary size, while data fields are typically small and of fixed size. Files are also not generally structured in a way that makes them searchable, while data fields are organized into groups and collections for easy searching.
These different characteristics made the storage solutions for each type of data quite different too. While data fields could be quickly stored and retrieved to achieve the best security, performance and scalability, file systems were optimized to deliver the entire file but lacked the granularity to search and retrieve data within them.
We’re now totally familiar with the major players in these fields, from Dropbox and Google Drive on the file storage side to Oracle and Mongo on the database side. To a greater or lesser extent, these solutions solved the problems surrounding access to data in the internet world.
What is apparent now is that the same problems they faced then are occurring again in a decentralized internet. For while solutions such as IPFS/Filecoin, Storj, Sia and Ethereum’s Swarm have moved things forward, they can only progress so far.
Immutability is not an option
The immutability of the data on the blockchain is one of the key dynamics at play here. While never removing or changing data has many positive aspects to it, it also poses a serious issue when you consider how it works within important regulatory frameworks.
For example, the EU’s General Data Protection Regulation (GDPR), which is high on the agenda of many businesses right now, will require them to be able to purge customer data completely from their systems.
Even without the threat of GDPR, the immutability of data is seen as an unacceptable constraint for many real-world data storage scenarios and can therefore be a deal-breaker for software projects. This is one of the reasons we’ve seen so many decentralized software companies resort to traditional cloud-based, centralized databases.
The issue of decentralized storage
Coupled with the issue of immutability is the fact that files are not stored in a useful way in new decentralized file storage services. Often, they are broken up into chunks with the divisions made at arbitrary locations, demonstrating little regard for the data in the file. Trying to access data when the underlying storage mechanism does not understand the nature of it is inefficient and likely to be error-prone.
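To make the chunking problem concrete, here is a minimal sketch in Python. It is a simplified illustration, not the chunking algorithm of any particular storage service, and the record layout, the sample data and the function name fixed_size_chunks are all made up for the example. It splits a small record at fixed byte offsets and shows that the address field ends up spread across several chunks.

```python
from typing import List


def fixed_size_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Split data at fixed byte offsets, ignoring any structure inside it."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


# Hypothetical record with two fields; the storage layer sees only bytes.
data = b"name=Ada Lovelace;address=12 Example Road, London;"
needle = b"address=12 Example Road, London"

chunks = fixed_size_chunks(data, 16)

# The field is cut by chunk boundaries, so no single chunk contains it.
found_in_one_chunk = any(needle in chunk for chunk in chunks)

# It only becomes readable again once every chunk has been fetched and joined.
found_after_reassembly = needle in b"".join(chunks)

print(len(chunks), found_in_one_chunk, found_after_reassembly)
```

Because the split points ignore field boundaries, a reader that wants one field still has to fetch and reassemble the surrounding chunks before it can even look for it.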
The reality is that reading a simple mailing address from a relatively modest 10GB file on a storage service like IPFS would require the entire file to be downloaded and then searched for the relevant information.
Even at download speeds of 1Gb (one gigabit) per second, that would take 80 seconds every time the file is accessed.
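The 80-second figure follows from simple arithmetic if the link speed is read as one gigabit per second. The sketch below assumes decimal units (1GB = 10^9 bytes) and an idealized link with no latency or protocol overhead; the function name download_seconds is illustrative only.

```python
def download_seconds(file_bytes: int, link_bits_per_second: int) -> float:
    """Idealized transfer time: total bits divided by the link rate."""
    return file_bytes * 8 / link_bits_per_second


FILE_BYTES = 10 * 10**9   # 10GB file, decimal units (an assumption)
LINK_BPS = 1 * 10**9      # one gigabit per second
ADDRESS_BYTES = 32        # size of the mailing address itself

full_download = download_seconds(FILE_BYTES, LINK_BPS)
address_only = download_seconds(ADDRESS_BYTES, LINK_BPS)

print(f"whole file: {full_download:.0f} seconds")
print(f"address alone: {address_only * 1e6:.3f} microseconds")
```

Under these assumptions, almost all of the transfer time is spent moving bytes the reader never wanted, since the address itself is a tiny fraction of the file.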
A database from the ground up
Alternatively, if you imagine a real database containing the same 10GB of data, that same 32-byte mailing address could be read in 100 milliseconds. You could build a database layer on top of IPFS to tackle this issue too but, because the entire file would still need to be downloaded first, that would just add overhead and make performance worse.
The solutions that exist right now are file services. What is needed is a database service for the decentralized internet that can be quickly and cheaply scaled by dApps.
It is the missing piece in the decentralized internet, marrying the best aspects of blockchain decentralization with the lessons learnt from decades of database science, and it will future-proof the development of dApps.
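The difference between the two access patterns can be sketched in a few lines of Python. This is an illustration using the standard library's sqlite3 module as a stand-in for "a real database", not a model of any decentralized product; the table name, the sample records and both helper functions are hypothetical.

```python
import sqlite3
from typing import Optional

# Hypothetical customer records, made up for illustration.
records = [(i, f"{i} Example Street") for i in range(1, 1001)]

# File-style storage: one opaque blob that must be read in full to find a record.
blob = "\n".join(f"{cid},{addr}" for cid, addr in records).encode()


def address_from_blob(data: bytes, customer_id: int) -> Optional[str]:
    """Scan the whole blob line by line until the record is found."""
    for line in data.decode().splitlines():
        cid, addr = line.split(",", 1)
        if int(cid) == customer_id:
            return addr
    return None


# Database-style storage: the primary key lets the engine locate one row directly.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, address TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", records)


def address_from_db(conn: sqlite3.Connection, customer_id: int) -> Optional[str]:
    """Fetch a single field by key without touching the other rows."""
    row = conn.execute(
        "SELECT address FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


print(address_from_blob(blob, 742))
print(address_from_db(db, 742))
```

Both calls return the same address, but the first needs the entire blob in hand before it can start, while the second asks the storage layer for one keyed field.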
It’s this combination of old and new that makes me think Mark Twain was much closer to the mark with his assessment that “history doesn’t repeat itself, but it does rhyme”.
Pavel Bains is CEO of Bluzelle, as well as a futurist, entrepreneur, designer and investor in exponential technologies. Bluzelle Networks builds blockchain and distributed ledger solutions for the finance industry. Pavel also provides advisory, M&A, and capital raising services for companies in digital media and technology. Pavel is an investor in fintech startup Bench and virtual reality startup VR Chat. Bluzelle was named one of the World Economic Forum’s 2017 Technology Pioneers.