The ballad of open data

The tale of big and open data is growing exponentially into an epic saga.

By Jessie Rudd, Technical business analyst at PBT Group
Johannesburg, 26 Jun 2015

Enter stage right:
It is estimated that 2.5 quintillion bytes of data is created every single day; that 90% of the data in the world today has been created in the last two years alone [2]; that in 2015, the amount of data that crosses the Internet every five minutes or so will be equivalent in size to the total size of all movies ever made; and that annual Internet traffic will reach a zettabyte.

Let's put that in perspective. At 32GB apiece, 2.5 quintillion bytes would fill roughly 78 million iPads (the often-quoted figure of 57.5 billion [3] does not survive the arithmetic), and a zettabyte is roughly 200 times the total size of all words ever spoken by humans [1].
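
Comparisons like these are simple unit conversions, and they are easy to check. The sketch below is a back-of-envelope calculation only: it assumes decimal units (1GB = 10^9 bytes, a quintillion = 10^18, a zettabyte = 10^21), and the function and variable names are illustrative rather than taken from any of the cited sources.

```python
# Back-of-envelope conversion of byte counts into 32GB devices.
# Assumption: decimal (SI) units throughout.
GB = 10**9
QUINTILLION = 10**18
ZETTABYTE = 10**21


def devices_needed(total_bytes: int, device_gb: int = 32) -> float:
    """Return how many devices of device_gb gigabytes would hold total_bytes."""
    return total_bytes / (device_gb * GB)


# 2.5 quintillion bytes, the estimated volume of data created per day.
daily_bytes = 25 * QUINTILLION // 10

print(f"Daily data:    {devices_needed(daily_bytes):,.0f} devices")
print(f"One zettabyte: {devices_needed(ZETTABYTE):,.0f} devices")
```

Under these assumptions, a day's worth of data comes to about 78 million 32GB devices, and a full zettabyte to about 31 billion. Using binary gigabytes instead would lower both counts by a few percent without changing the order of magnitude.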

That is a massive volume of information being generated and stored. Mostly, it is passive data and completely lacking in context. It is 'dirty' big data, just waiting for the right question to be asked.

Enter stage left:
The idea of sharing geophysical and environmental data across countries and borders first appeared in a 1995 report, and it gave rise to the concept behind open data.

The authors of the report, written and released by an American scientific agency, were promoting an open and transparent dialogue between countries in order to better analyse and understand the global phenomena they were studying [4]. What it came down to, and what the open data movement has since embraced, is the idea that collective knowledge, applied to information, serves the greater good: the open and free dissemination of information for the benefit of the masses.

Enter the orchestra:
Across the globe, industry leaders in the world of open data/big data are becoming more and more wary of the amount of private and personal data being collected every day. Do people know who is collecting information about them? More importantly, do they know what it is going to be used for?

This staggering volume of freely available data is being stored and collated by businesses, governments and individuals. Granted, it gives them an unprecedented insight into consumers, allowing them to understand, analyse and ultimately change the world on a much more personal and fundamental level.

However, it also gives corporations and individuals an immense opportunity to do harm to consumers. The bigger and wider this digital world gets, the more of people's lives will be laid bare, for good or for bad. I am not sure there is anything that can be done about that. However, in understanding and knowing what is being collated and how it is being used, there is a salve, perhaps even independence.

The players:
Open data is perhaps the most effective way to ensure that the power inherent in possessing large amounts of governmental data is kept honest. Through the sharing and dissemination of big data, passive data, small data and archival data, there is an unprecedented opportunity to change the world. For a government, perhaps it can lead to a fairer and more honest reflection of democracy, one in which resources are channelled to the right places and departments and municipalities are held accountable.

For the individual, perhaps it can lead to transparency and a clearer understanding of how passive and big data is being used to market, sell and monitor.

For business, perhaps it can help in redefining the way marketing campaigns are run and funded.

The curtain:
Who people are online is increasingly becoming who they are. They knowingly and unknowingly divulge huge amounts of personal information every day, for better or for worse.

At the moment, the open data movement is focusing on making governmental data open to the public. However, how far is the world from becoming a society where every waking, and sleeping, moment is caught on camera somewhere? Where social media updates are more important than the physical entity sitting across the table? Where hackers and terrorist organisations can gain access to personal information relatively easily?

[1] The Zettabyte Era-Trends and Analysis
[2] http://www-01.ibm.com/software/data/bigdata/
[3] http://www.storagenewsletter.com/rubriques/market-reportsresearch/viawest-2-5-quintillion-bytes-each-day/
[4] http://www.paristechreview.com/2013/03/29/brief-history-open-data/
