
Open Gazettes South Africa


Johannesburg, 02 Aug 2017

Background

OpenUp, previously known as Code for South Africa, is a non-profit civic technology lab that uses data and technology to promote informed decision-making that drives social change. Its main goal is to promote the release of data under an open data licence to make it easily accessible and then build tools to make the data available and understandable to society.

Fact sheet
Solution: Fujitsu scanners
Industry: Non-profit data
Provider: Meniko Records Management Services
User: OpenUp

OpenUp is based in Codebridge, Cape Town and works closely with government, civil society and media to open data, disseminate it, and build a more data-literate society.

Challenge

Government gazettes are among our country's most valuable records. They contain a history and a culture we need to preserve for future generations. Currently, government gazettes still look like they're produced on the printing presses of the 1800s and can only be accessed at a few libraries.

OpenUp decided to embark on this project because:
* Gazettes from 2012 for all provinces (except the Free State) are available online for free from the Government Printing Works (GPW).
* Free State gazettes are available from the Free State government for a fee.
* Historical gazettes (especially those before 2000) are not available online for free, and there are very few (if any) complete copies in the country.

Most of these historical gazettes are unavailable for free and aren't digitised. Moreover, a few digital versions are available only behind paywalls and cost a fortune to access. Surprisingly, the government also charges for expensive subscriptions to gazettes, even though they are not subject to copyright and should be available to all South Africans.

These documents, stretching back to the turn of the previous century, are:
* Starting to moulder and crumble;
* Dwindling, as libraries are getting rid of paper copies due to space constraints;
* Not easily accessible;
* Becoming increasingly difficult to find; and
* Impossible to search as they are millions of pages long.


OpenUp therefore decided to scan all the documents, but each of its two options presented a challenge:

The first was to scan in full colour at an archival quality where you can see the texture of the page. It was the closest to capturing the original documents and preserving them for future generations. This was a very expensive option, however, as an archival quality scan of a 70-page gazette weighs in at around 300MB, and there are thousands of gazettes and millions of pages that need to be scanned.
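To give a rough sense of scale, the figures above (about 300MB for a 70-page gazette) can be turned into a back-of-the-envelope storage estimate. This is only a sketch: the page counts below are hypothetical stand-ins for "millions of pages", and the helper name `archival_storage_tb` is invented for illustration.

```python
# Figures quoted in the text: an archival-quality scan of a
# 70-page gazette weighs in at around 300 MB.
MB_PER_GAZETTE = 300
PAGES_PER_GAZETTE = 70


def archival_storage_tb(pages: int) -> float:
    """Estimate archival-scan storage in terabytes (1 TB = 1,000,000 MB)."""
    mb_per_page = MB_PER_GAZETTE / PAGES_PER_GAZETTE  # roughly 4.3 MB per page
    return pages * mb_per_page / 1_000_000


# Hypothetical collection sizes; the text says only "millions of pages".
for pages in (1_000_000, 5_000_000):
    print(f"{pages:,} pages: about {archival_storage_tb(pages):.1f} TB")
```

On these assumptions, every million pages scanned at archival quality needs a little over four terabytes of storage, before any backups or derived copies.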

The second was to scan at a much lower quality, which still allowed them to OCR the text and make it searchable, but it was a poor reproduction of the original and didn't meet ISO archival standards.

In addition to expensive scanning, high-quality OCR is also not cheap. This is where Meniko came in.

Solution

Meniko managed to scan about 200 gazettes using our Fujitsu scanners, which are optimised for high-volume bulk scanning. This allowed us to scan documents dating as far back as 1958!

We also ran optical character recognition (OCR) on the scans, which automatically analyses the printed text and turns it into a form that a computer can process more easily. This alone removed the need to search the documents manually for content.

Feel free to explore gazettes here.

Benefits

The Meniko solution made sure that:
* Government gazettes are all in one place;
* Anyone, anywhere in the world, has instant access;
* Death notices, name changes and progress on laws can be found instantly with keyword search;
* Gazettes are easy to share and link to, including directly to individual pages; and
* There's no extra expenditure on storage.

The partnership between Meniko and OpenUp has allowed us to preserve critical sources of information and records of South African history.


Editorial contacts

Rea Manonyane
Meniko Records Management Services
rea@meniko.co.za