Data marketplaces are emerging as a key part of the data economy, and Blockchains are emerging as a promising way to address some of the key issues those marketplaces face.
What are the key components of a data marketplace?
• Data Portal – The marketplace portal through which data providers list data for sale and data consumers browse and purchase it. User experience is the most critical factor in this portal's success.
• Blockchain Platform – The key capabilities of a Blockchain implementation include a distributed ledger, support for cryptography, immutability guarantees, smart contracts, and oracles.
• Data Storage – The main considerations for storage are volume, variety, location of data, and sensitivity. Blockchains are not well suited to storing all of the actual data on-chain, which implies that external storage outside of the Blockchain is needed.
• Marketplace Engine – For the marketplace to function effectively, the connecting glue between on-chain and off-chain data must be a set of services that offer unified security, governance, management, and visibility across all data sets. This creates a seamless, unified data plane that both data providers and data consumers can easily leverage.
• Blockchain Access Layer – There are multiple protocols and mechanisms through which a Blockchain platform can be accessed, and Blockchain technology itself is still evolving.
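The smart-contract capability mentioned for the Blockchain Platform can be illustrated with a toy sketch. The Python class below simulates (it does not run on any real chain) an escrow-style data sale: payment is released to the seller only when the delivered data matches the hash committed up front. All class and method names here are illustrative.

```python
import hashlib


class DataSaleContract:
    """Toy simulation of a smart contract for selling a data set.

    The buyer deposits payment into escrow; funds are released to
    the seller only when the delivered bytes match the data hash
    committed when the contract was created.
    """

    def __init__(self, seller: str, price: int, data_hash: str):
        self.seller = seller
        self.price = price
        self.data_hash = data_hash  # immutable commitment to the data
        self.escrow = 0
        self.settled = False

    def deposit(self, amount: int) -> None:
        if amount < self.price:
            raise ValueError("insufficient payment")
        self.escrow = amount

    def deliver(self, data: bytes) -> int:
        # Release escrow only if the data matches the committed hash.
        if hashlib.sha256(data).hexdigest() != self.data_hash:
            raise ValueError("delivered data does not match commitment")
        self.settled = True
        paid, self.escrow = self.escrow, 0
        return paid  # amount transferred to the seller
```

Real smart-contract platforms enforce this logic on-chain; the sketch only shows the shape of the agreement.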
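The Data Storage point above suggests a common pattern: keep the payload in external storage and anchor only a small fingerprint on-chain, so integrity can be verified later. The sketch below uses plain dictionaries and a list as stand-ins for an external object store and a ledger; the names are hypothetical.

```python
import hashlib

off_chain_store = {}  # stands in for external storage (e.g. an object store)
on_chain_ledger = []  # stands in for the Blockchain ledger


def publish(dataset_id: str, payload: bytes) -> None:
    """Store the payload off-chain; record only its hash on-chain."""
    off_chain_store[dataset_id] = payload
    # Only a small, fixed-size fingerprint goes on the ledger.
    on_chain_ledger.append(
        {"id": dataset_id, "sha256": hashlib.sha256(payload).hexdigest()}
    )


def verify(dataset_id: str) -> bool:
    """Check the off-chain payload against its on-chain fingerprint."""
    payload = off_chain_store[dataset_id]
    record = next(r for r in on_chain_ledger if r["id"] == dataset_id)
    return hashlib.sha256(payload).hexdigest() == record["sha256"]
```

Any tampering with the off-chain copy makes `verify` fail, which is what gives the off-chain data its on-chain integrity guarantee.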
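As one concrete example of the Blockchain Access Layer, many node implementations (Ethereum clients among them) expose a JSON-RPC interface over HTTP. The sketch below only constructs such a request body with the standard library; it contacts no node, and the choice of method is just for illustration.

```python
import json


def make_rpc_request(method: str, params: list, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 request body, the wire format that many
    Blockchain nodes accept over HTTP."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


# Example: the request an Ethereum client would answer with the
# latest block number.
body = make_rpc_request("eth_blockNumber", [])
```

Because protocols differ across platforms and versions, keeping this construction behind a single access layer isolates the rest of the marketplace from those changes.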
Traditional integration tools cannot meet such demands, particularly at scale. Let us look at some technologies well suited to data ingestion and advanced analytics:
• Apache NiFi is an integrated data logistics platform for automating the movement of data between disparate systems. It provides real-time control that makes it easy to manage the movement of data between any source and any destination. It is data-source agnostic and supports disparate, distributed sources of differing formats, schemas, protocols, speeds, and sizes: machines, geolocation devices, clickstreams, files, social feeds, log files, videos, and more.
• Stream Processing/Analytics – Apache Kafka can deliver data movement at speeds of millions of transactions per second. Apache Storm, Apache Spark Streaming, and similar projects offer complex event processing and predictive analytics capabilities. Some data consumers in this marketplace architecture may simply purchase the data and do their analysis offline themselves.
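The kind of continuous computation these streaming engines perform can be sketched with a simple sliding-window aggregate over a stream of events. This is pure stdlib Python standing in for a real streaming job, with hypothetical names throughout.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events per key over the last `window` seconds; a minimal
    stand-in for a stream-processing aggregate."""

    def __init__(self, window: float):
        self.window = window
        self.events = deque()  # (timestamp, key), in arrival order
        self.counts = {}

    def add(self, timestamp: float, key: str) -> None:
        """Ingest one event and drop any that have left the window."""
        self.events.append((timestamp, key))
        self.counts[key] = self.counts.get(key, 0) + 1
        self._expire(timestamp)

    def _expire(self, now: float) -> None:
        # Evict events older than the window and decrement their counts.
        while self.events and self.events[0][0] <= now - self.window:
            _, key = self.events.popleft()
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]
```

Engines like Storm or Spark Streaming run this style of windowed computation distributed across many machines, with fault tolerance the sketch does not attempt.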
Many processes go into Blockchain data management, and together they represent a technologically sophisticated mode of operation. The world is making rapid advances in data preservation and usage, and Blockchain has an important role to play in that progress. Its importance to business operations cannot be overstated.