A true data specialist knows no boundaries

True data specialists aren’t constrained by tools or technology: the core principles and best practices remain uncompromised, and are amplified in the context of cloud.
By Windsor Gumede
Johannesburg, 14 Dec 2020

The new realm of cloud computing services has created quite the controversy when it comes to skills in the data and analytics world.

As a result, there are those who are looking to elevate themselves into new roles in the context of cloud, and there are companies that are trying to extend their footprint and service offering into cloud specialisation.

One thing, however, remains true: the fundamental role of the data specialist in getting cloud computing right.

The phrase ‘data in the cloud’ is associated with data that does not use, or only partially uses, an enterprise’s local infrastructure, specifically regarding storage. Cloud computing, on the other hand, refers to the delivery of compute power, database storage, applications and other IT services.

However, what is intriguing (or confusing) is that the phrase and term are sometimes used interchangeably. And this to my mind is incorrect − cloud computing should be viewed as the overarching concept that encompasses the storage factor of data, which is then referred to as ‘data in the cloud’.

Why the focus on data in the cloud?

Often, at some of our clients, we hear the words ‘data in the cloud’ and ‘data on-premises’. This, I believe, is an over-referenced subject, especially when the organisation is undergoing digital transformation and transitioning towards using cloud services.

The ‘data’ in both phrases refers to the same enterprise data that requires the expertise of a data specialist, and the only difference between the two statements is the location where the data is stored or processed.

This leads me to question why organisations believe they require new skills simply because the location of the data changes to the cloud.

As we know, data specialists are the people who acquire, analyse, transform, manage and make data available for the enterprise. This holds irrespective of where the data resides or is processed.

A data specialist should be an expert in the field of data, no matter the technology used. Further, the field of data is very broad and can be subcategorised into data architecture, data management, data governance, database administration and many others.

I therefore believe there is a common misconception: “I need a cloud specialist in order to migrate/manage data in the cloud”.

The reality is that a wide range of new tools has emerged with the introduction of cloud computing services in the data and analytics space. Any good data specialist who understands the concepts of distributed processing and distributed storage is well equipped with the knowledge required to adapt to cloud computing effectively.

One of the many benefits of cloud computing is the ability to separate storage and compute to achieve phenomenal processing speeds. Separating storage and compute is not, in the broader sense, a new concept.

In fact, data archiving strategies have leveraged this type of architecture for many years. The computation and storage of frequently used data is the responsibility of a “processor and storage” machine, while data that has reached its retention period is offloaded onto a network-attached storage device, or any other storage mechanism that is not attached to the machine that processed the data. It is a process any data specialist is familiar with.
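As a rough illustration of that archiving pattern, the sketch below keeps compute and storage apart in a few lines of Python. Everything in it is hypothetical and chosen only for the example: the `Record` and `StorageTier` classes, the `archive_expired` and `total_value` functions, the 365-day retention period and the sample values. It is not drawn from any particular product or from a real client environment.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Record:
    key: str
    value: float
    last_used: date


@dataclass
class StorageTier:
    """Hypothetical stand-in for a store, e.g. local disk or a NAS device."""
    name: str
    records: dict = field(default_factory=dict)

    def put(self, record):
        self.records[record.key] = record

    def pop(self, key):
        return self.records.pop(key)


def archive_expired(hot, cold, today, retention_days):
    """Offload records older than the retention period from hot to cold."""
    cutoff = today - timedelta(days=retention_days)
    expired = [k for k, r in hot.records.items() if r.last_used < cutoff]
    for key in expired:
        cold.put(hot.pop(key))
    return expired


def total_value(*tiers):
    """Compute step: reads from any tier, but does not own the data."""
    return sum(r.value for t in tiers for r in t.records.values())


# Illustrative data only; the retention period of 365 days is an assumption.
today = date(2020, 12, 14)
hot = StorageTier("processor-and-storage machine")
cold = StorageTier("network-attached storage")
hot.put(Record("recent", 10.0, today - timedelta(days=5)))
hot.put(Record("stale", 20.0, today - timedelta(days=400)))

moved = archive_expired(hot, cold, today, retention_days=365)
```

The point of the sketch is that `total_value` works the same way wherever the records live; only the storage location changes, which is the same shift an organisation makes when it moves data to the cloud.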

But wait, don’t panic!

The title of ETL developer has evolved into data engineer over recent years. Yet, the core responsibilities of the role have not changed. A data engineer, data architect, data analyst, data modeller, etc, on-premises still has the same role to play in the cloud. The difference lies in the platform and technology the role works with.

Any company on a cloud adoption journey must keep in mind that the evolution of concepts, technology and roles is an opportunity to skill up on new technologies. The use cases in the data and analytics domains in the cloud, such as enterprise data lake, enterprise operational data store and enterprise data warehouse, are different, and thus the diverse domains might not necessarily use the same technologies; however, they all require the skills of a data specialist.

This, of course, does not negate the fact that system admins and security admins have become more relevant, but once the infrastructure and guardrails have been set up, the data specialists are able to demonstrate their unprecedented value.

True data specialists are unbound by tools or technology, as the core principles and best practices remain uncompromised. In fact, they are amplified in the context of cloud, given the cost factor.

Poor or inefficient development of any cloud data solution results in a high cost of running the solution. However, having a data specialist on board can support a far more successful outcome when moving data to the cloud.