Lack of data scientists poses SKA problem

Some 50-60 data scientists in SA will be needed to interpret the data produced by the world's largest radio telescope.

There are not enough data scientists in SA to handle the vast amount of data that will need to be interpreted when the Square Kilometre Array (SKA) telescope is live.

While several local universities have recently identified big data skills as essential, it will take some time for these data science students to filter into the workplace.

Meanwhile, the fear is that the shortage of qualified data scientists will result in the data generated by the SKA being shipped and hosted overseas.

Admitting defeat

Bruce Mellado, associate professor at the Wits School of Physics, says SA has a shortage of highly qualified specialists in a wide range of areas, both in research and industry.

Mellado, who prefers to refer to "data specialists" rather than data scientists, says it is hard to assess the number of local data specialists that will be needed for the SKA project as that will depend on how much data will be produced.

"There is a definite shortage of data specialists in the country and if not remedied, SA will not be able to take leadership in the data management and analysis of data sets for large projects.

"The immediate effect would be that SA would not be able to host the bulk of the data generated by the SKA. The data would have to be shipped abroad and be hosted there. This would be unnatural."

Numbers game

Bernie Fanaroff, SKA SA outgoing project director, says: "It is difficult to quantify the number of data scientists that will be needed for SKA but it can be anything between 50 and 60 data scientists."

Graeme Bloch, education analyst at Wits University's Public and Development Management School, agrees there are not enough data scientists in the country.

"We need to popularise science to get more data scientists who can do stuff with the massive amount of information from SKA," he says. "Maths results must improve all over the country and racial barriers must be addressed."

According to Fanaroff, big data will be one of the biggest drivers of economic development in the next decade, and the SKA project presents growth opportunities for our economy. "Big data is a rapidly growing industry and we need to take advantage of that."

Data scientists gather data and analyse it to find meaningful patterns and insights. When the SKA project is complete, data scientists will be needed to collect and process vast amounts of data produced by the world's largest radio telescope.

Construction of the SKA is planned to start in Carnarvon, Northern Cape, in 2017/18, with some elements operational by 2020 and full operation under way in 2025. That timeline means SA needs to address the data scientist shortage now.

Increasing problem

Market analyst firm Gartner says the need for data scientists is growing at about three times the rate of that for statisticians and business intelligence analysts.

EMC Southern Africa country marketing manager, Sonelia du Preez, notes that across the Southern Africa region there is a big data skills gap that needs to be addressed.

She says the data scientist is a necessity for the modern business and a key factor in enabling competitive advantage.

Some hope

The new School of Computer Science and Applied Mathematics at Wits University recently introduced big data as its first major programme. The school will also run training that includes data management for the SKA.

"I believe the efforts made by the Wits School of Computer Science and Applied Mathematics are critical in order to train specialists. This is being complemented by efforts supported by the NRF and the Department of Science and Technology to train specialists in high-end techniques," Mellado says.

According to Fanaroff, Sol Plaatje University also recently started a course in data science to give first-year students access not only to computing and electronics but also to data analytics.
