
MS Azure, CPGR partner to enable African genomics research

By Staff Writer, ITWeb
Johannesburg, 02 Nov 2018
Ryno Rijnsburger, CTO of Microsoft 4Afrika.

A new partnership between Microsoft and the Cape Town-based Centre for Proteomic and Genomic Research (CPGR) aims to allow African scientists and academics to perform and collaborate on ground-breaking genomics research.

Microsoft, through its 4Afrika initiative, and the CPGR have joined forces to bring together intelligent cloud and genomics research to create a scalable, cost-effective technology platform to power advanced medical analysis and research on the continent.

The project aims to make genomic applications such as non-invasive prenatal testing, BRCA sequencing for breast cancer, HLA typing for stem cell matchmaking and ancestry profiling available on a cloud-based platform.

It will be enabled by Microsoft Azure, which will support the data transfer, storage and processing capabilities for genomics datasets.

"It is the first such initiative in Africa and could have enormous social and healthcare provision benefits, including avoiding prescription of ineffective drugs and sidestepping potential side effects, as well as enhancing the analysis of data, the dissemination of information and the aggregation of data to support regional genomic research, innovation and health provision," a joint statement reads.

"The project allows us to use an existing cloud-based data management ecosystem, while amplifying our own expertise in developing and running a genomics technology platform," says Reinhard Hiller, MD of CPGR. "We envisage creating a system that allows us to deliver value, and collaborate with others across the continent."

Based in Cape Town, the CPGR provides advanced 'omics' services to the life science and biotech communities in SA.

Omics informally refers to the fields of study in biology whose names end in -omics, including proteomics and genomics, which 'zoom in' on the proteome (the full set of proteins produced by an organism) and the genome (the genetic makeup of an organism) respectively.

The non-profit organisation uses leading technologies and bio-computational data pipelines to create and support tailored services for both academia and industry customers.

Arguably the most famous omics project was the Human Genome Project, which set out to map the entire human genome and took 13 years, from 1990 to 2003, to be declared complete.

Cloud benefit

Common diseases, such as cancer or diabetes, are influenced by the interplay of many different genetic markers. Studying these linkages requires the aggregation and analysis of very large datasets. In addition, provision of gene-based diagnostic testing requires the careful comparison of an individual patient's data with appropriate reference data sets.

The Microsoft and CPGR partnership, officially launched at the end of October, speaks directly to the potential of cloud technologies in enabling cutting-edge science and medicine. Microsoft says that, with a scalable platform like Azure as the backbone of a project, medical researchers can get to work on unlocking the secrets embedded in our cells.

"Given the rapidly expanding nature of biomedical science and the genetic diversity found in African populations, having a means to store and analyse data is a key pillar of Africa-led research and innovation. Such a solution will aid in reducing 'data drain' from Africa, as researchers will have solutions to better manage and analyse their data," the groups said in a statement.

"For Microsoft 4Afrika, it is one of the most exciting fruits of our long-term investment in Africa's economic, social and technological development," says Ryno Rijnsburger, CTO of Microsoft 4Afrika. "We are providing both financial and technical support to the CPGR and also intend to bring out Microsoft experts to work with the team through our MySkills4Afrika volunteer programme.

"The sheer volume of data that can flow through the system will drive genomic research and medical innovation, and public health across Africa will reap the benefits. Another of the intended outcomes is to stimulate interest and support the CPGR in accessing additional funding to continue operationalising the platform, which will in turn lead to additional technology investments into things like mobile apps and reporting capabilities," adds Rijnsburger.

Phased approach

Rijnsburger says the technology platform for the project will be built over a number of phases.

"During the first phase, the main aspects that were addressed were storage requirements for raw genomics data, processed data, as well as a system to allow capture of metadata pertaining to the raw datasets. Phase two will introduce the ability to perform analysis of the data on the cloud, rather than relying on hardware based in the lab environment. Future phases will then work on commercialising this platform and making it available to stakeholders outside CPGR," he told ITWeb.

"With the storage of datasets in the cloud, it becomes possible to easily share not just data with approved third parties, but also centralise the reporting around that data and make it available to stakeholders in the value chain. The constraints around how genomics datasets and derived, calculated analysis data is made available will be defined by the business process and legislation/regulation more so than the technology platform," Rijnsburger says.

He explains that research studies that require massive amounts of genomics data in a centralised, accessible infrastructure were previously difficult or impossible to do given the significant challenges in making storage and compute capacity available at scale.

"With the system in place, organisations such as CPGR can now build a repository of datasets that never needs to discard or lose data due to infrastructure challenges, and concentrate on the medical research and analysis aspects at the core of their business, rather than the IT challenges around it," according to Rijnsburger.

"As additional data is added over time, and datasets grow, the quality and scope of results obtained from it will continue to positively impact the scientific community and healthcare in general. The framework being created is applicable to management of datasets and research results to a wide variety of situations."
