SA astronomers go open source for massive MeerKAT data


The South African Radio Astronomy Observatory (SARAO) has joined the Ceph Foundation to advance open source storage.

SARAO manages SA's engineering, science and construction activities for the Square Kilometre Array (SKA) radio telescope.

It is a national facility managed by the National Research Foundation. It incorporates radio astronomy instruments and programmes such as the MeerKAT and KAT-7 telescopes in the Karoo, the Hartebeesthoek Radio Astronomy Observatory in Gauteng, the African Very Long Baseline Interferometry programme in nine African countries, as well as the associated human capital development and commercialisation endeavours.

This week in Berlin, Germany, SARAO joined 30 other members in establishing the Ceph Foundation, which aims to help manage the massive growth in data and information generated by cloud, container and artificial intelligence applications.

The Linux Foundation, a non-profit organisation enabling innovation through open source, announced that over 30 global technology leaders are forming a new foundation to support the Ceph open source project community.

The Ceph project develops a unified distributed storage system providing applications with object, block and file system interfaces.

"Ceph has a long track record of success when it comes to helping organisations effectively manage high growth and expanding data storage demands," says Jim Zemlin, executive director of the Linux Foundation.

Ceph is built on the Reliable Autonomic Distributed Object Store (RADOS), which provides a highly available and scalable storage fabric that can be consumed either directly or via the higher-level object, block and file services built on top of it.

"This partnership will assist us to store and retrieve the huge volumes of data that will be collected by the MeerKAT radio telescope," says Dr Rob Adam, MD of SARAO.

MeerKAT is a 64-antenna radio telescope array built on the SKA site in the Karoo, and it will be integrated into the first phase of the SKA.

MeerKAT has the capacity to process 275GBps in real-time, equating to approximately 58 DVDs per second.

The telescope launched in July 2018, marking a significant milestone in the lifespan of the SKA project, as it means astronomers will now be able to study the formation of the first galaxies, magnetic relations between planets, as well as details around the large-scale structure of the cosmos.

SARAO has since indicated the radio telescope has already observed a rare burst of activity from an exotic star, which demonstrates its capabilities as a new instrument for scientific exploration.

The SKA project requires substantial technology development, particularly in big data and ultra-fast computing. It could well be the world's largest public data project. In its first phase alone, the telescope will produce 160TB of raw data per second that the supercomputers will need to handle. At 4.7GB per disc, that's the equivalent of roughly 34 000 DVDs every second.
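The DVD comparisons in this article can be checked with simple division. The sketch below is illustrative only: it assumes a single-layer DVD capacity of 4.7GB and decimal units (1TB = 1 000GB), neither of which is stated by SARAO, and the function name is made up for the example. Under those assumptions, MeerKAT's 275GBps works out to a little over 58 DVDs per second and the SKA's 160TB per second to roughly 34 000; the exact count shifts with the disc capacity assumed.

```python
# Assumption: one single-layer DVD holds 4.7 GB, and GB/TB are decimal units.
DVD_CAPACITY_GB = 4.7


def dvds_per_second(rate_gb_per_s: float, dvd_gb: float = DVD_CAPACITY_GB) -> float:
    """Return how many DVDs a data rate (in GB per second) would fill each second."""
    return rate_gb_per_s / dvd_gb


# Figures quoted in the article.
meerkat_rate_gb = 275.0            # MeerKAT: 275 GB per second
ska_phase1_rate_gb = 160 * 1000.0  # SKA phase one: 160 TB per second, in GB

meerkat_dvds = dvds_per_second(meerkat_rate_gb)
ska_dvds = dvds_per_second(ska_phase1_rate_gb)

print(f"MeerKAT: about {meerkat_dvds:.1f} DVDs per second")
print(f"SKA phase one: about {ska_dvds:,.0f} DVDs per second")
```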

Beyond the transformational science it will carry out, advancing humanity's knowledge, the SKA will collect and process vast amounts of data and stimulate cutting-edge advances in high-performance computing and big data science, especially the processing, analysis and visualisation of very large data sets.

Computer hardware and processing algorithms are being developed in many of the SKA countries, and there is a great deal of technology development and transfer, as well as the creation of high-level skills.

SARAO currently uses Ceph to provide a 20PB object-based storage system for the data generated by the MeerKAT radio telescope array.

Ceph is used by organisations around the world, including financial institutions, cloud service providers, academic and government institutions, telecommunications infrastructure providers, auto manufacturers and software solution providers.

"SARAO uses Ceph in concert with locally manufactured hardware to lower storage capital expenditure for the MeerKAT storage infrastructure," says Thomas Bennett, senior software systems engineer at SARAO.

"In order to share our experiences and showcase Ceph, the SARAO storage team is in the process of establishing a Cape Town Ceph community forum."
