Real-time big data: Unlocking potential of mainframe data by bringing analytics closer to data


Johannesburg, 09 Jun 2015

The new millennium has brought significant technological change, and we now find ourselves in a diverse, globalised, social media-addicted society. This is further compounded by an ever-increasing need for mobility, with more employees working out of the office and using mobile devices and cloud services to perform business tasks. All of these advancements have a profound impact on data, driving data growth within every aspect of 21st-century society.

Big data, legacy data, operational data and streaming data all contribute to the ever-increasing volume, velocity and variety of data. Regardless of industry, comprehensive access to data has become essential to formulating critical business decisions. However, not all of this data growth is attributable to transactional and application activity alone; other contributors include regulatory requirements and machine-to-machine transactions, such as the radio frequency identification (RFID) tags generally used for tracking.

For many organisations, the value of big data systems can be summed up in one word: "analytics". These systems allow organisations to analyse data in ways that were previously not possible.

New big data storage technology is capable of analysing massive amounts of data in a fraction of the time needed with more conventional technology. New tools are introduced daily to analyse and investigate all that big data, but unfortunately, most big data systems don't contain all the data needed for analytics. Analysing big data alone rarely paints the full picture.

More data is needed, and most of it is gathered by organisations in their classic IT systems; in many organisations, this data is stored on mainframes. Some may think that mainframes are obsolete, but they are not. More importantly, mainframe data has definitely not ceased to exist. Hence, to fully exploit the value of the investment in big data systems and analytical tools, big data has to be integrated with other data sources, including those on the mainframe. Today, data is both an extreme challenge and a huge opportunity for business.

So, while this new wave of unstructured data, coined "big data", has captured the headlines, an immense amount of system-of-record data still resides on mainframe servers. In fact, mainframe data is the original big data: it arrives in extreme volumes at very high rates and demands precise, high-performance analytics. Some of the earliest mainframe data can aptly be described as "unstructured", particularly when you consider flat files and file storage access methods like Virtual Storage Access Method (VSAM). This is well illustrated by a large bank that uses mainframes to process thousands of customer transactions per second, 24 hours a day. These systems support thousands of applications concurrently accessing terabytes of storage, held in databases with extensive customer account information that can be instantly and securely accessed from ATMs around the globe.

To be effective in the realm of business intelligence and analytics, mainframe data must be fully accessible and available instantly, or as close to real-time as possible. This means enterprises should be able to effortlessly blend relational and non-relational data, open systems and mainframe, and retrieve the combined information via a simple, single query or request. This entails avoiding traditional data integration methods that physically move volumes of data into a central repository before analytics can be performed.
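
As a minimal sketch of what such a single request might look like, the example below assumes a data virtualisation layer exposed as a standard ODBC data source; the DSN, credentials, table and column names are purely illustrative and not drawn from any particular product.

    # One SQL request spanning mainframe and open-systems data, assuming a
    # data virtualisation layer exposed through ODBC. All names are invented.
    import pyodbc

    conn = pyodbc.connect("DSN=MainframeVirtualisation;UID=analyst;PWD=secret")
    cursor = conn.cursor()

    # A single query joins a virtual table backed by non-relational mainframe
    # records (accounts) with a relational table on an open-systems platform
    # (web_clickstream), without physically copying either data set beforehand.
    cursor.execute("""
        SELECT a.account_id, a.balance, c.last_visit
        FROM   accounts        AS a
        JOIN   web_clickstream AS c ON c.account_id = a.account_id
        WHERE  a.balance > ?
    """, 10000)

    for row in cursor.fetchall():
        print(row.account_id, row.balance, row.last_visit)

    conn.close()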

The ideal scenario is data that is integrated in place, virtually combined with other data regardless of platform or location. Combining relational and non-relational data sources via data virtualisation maximises the effectiveness of enterprise analytics, providing a comprehensive view of the business, the customer or the market. The challenge is getting mainframe data, particularly non-relational data, into a form that is compatible with modern analytics tools. Extract, Transform and Load (ETL) has been a widely employed approach to integrating mainframe data. ETL extracts and replicates whole segments of mainframe data, transforms the data into a compatible format, then loads it into another database or data warehouse. As soon as the data is extracted, it is no longer current, which diminishes a core strategic value of data - timeliness.
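
To make the ETL pattern and its timeliness problem concrete, here is a deliberately simplified, self-contained sketch; in-memory SQLite databases stand in for the operational source and the warehouse, and the table names and layouts are invented for illustration.

    # Simplified ETL sketch: extract from a source system, transform the data,
    # and load it into a warehouse. Everything here is illustrative only.
    import sqlite3
    from datetime import datetime, timezone

    source = sqlite3.connect(":memory:")     # stand-in for the operational system
    warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse

    # Seed the "operational" source with a couple of records.
    source.execute("CREATE TABLE transactions (id INTEGER, amount_cents INTEGER)")
    source.executemany("INSERT INTO transactions VALUES (?, ?)",
                       [(1, 125000), (2, 9950)])

    # Extract: copy out whole segments of the source data.
    rows = source.execute("SELECT id, amount_cents FROM transactions").fetchall()

    # Transform: reshape into the warehouse format (cents -> decimal amount).
    extracted_at = datetime.now(timezone.utc).isoformat()
    transformed = [(tx_id, cents / 100.0, extracted_at) for tx_id, cents in rows]

    # Load: write the snapshot into the warehouse. Anything that changes in the
    # source after this point is invisible to analytics until the next ETL run.
    warehouse.execute(
        "CREATE TABLE fact_transactions (id INTEGER, amount REAL, extracted_at TEXT)")
    warehouse.executemany("INSERT INTO fact_transactions VALUES (?, ?, ?)", transformed)
    warehouse.commit()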

Although ETL is widely used, the data it delivers is, by its nature, retrospective. So, in today's "we-need-it-now" environment, creating a copy, moving data and reformatting all add delay, cost and complexity.

What is needed is a more agile data architecture. More recently, mainframe data virtualisation software has emerged that closes the gap between data and analytics. It runs directly on the mainframe, with an important distinction: it diverts the majority of its integration-related processing to an IBM System z specialty processor, most commonly the System z Integrated Information Processor (zIIP). Unlike the mainframe central processors (CPs), specialty engines do not incur software licence charges, so no MIPS capacity is expended on integration-related processing. The result is a dramatically reduced mainframe total cost of ownership (TCO), while mainframe production data remains in place and undisturbed.

The practical impact of this next-generation data integration solution is that analytics are brought closer to the data on the mainframe. Latency is eliminated and consistency is maintained, while information requests are honoured with live data responses. It enables standards-based SQL query access to mainframe non-relational data and file structures without the need to move data. This advanced middleware approach increases business efficiency and helps control TCO. It complements BI/analytics tools that are used in combination with larger federated data architectures, including data warehouses. It provides an abstraction layer that removes the barriers facing developers and BI/analytics specialists who would otherwise be unable to use mainframe data for lack of skills or experience in that environment.
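
Conceptually, that abstraction layer maps non-relational record layouts - for example, fixed-length VSAM records described by COBOL copybooks - onto named columns that any SQL-capable tool can consume. The toy illustration below shows the idea of such a mapping; the field names, offsets and sample record are invented.

    # Toy illustration: slicing a fixed-length, non-relational record into
    # named, relational-style columns. Layout and sample data are invented.
    RECORD_LAYOUT = [            # (column name, start offset, length in bytes)
        ("account_id", 0, 10),
        ("branch_code", 10, 4),
        ("balance_cents", 14, 12),
    ]

    def record_to_row(record: bytes) -> dict:
        """Slice one fixed-length record into a dict of named columns."""
        text = record.decode("ascii")
        row = {name: text[start:start + length].strip()
               for name, start, length in RECORD_LAYOUT}
        row["balance_cents"] = int(row["balance_cents"])
        return row

    sample = b"0000123456JHB1000000125000"
    print(record_to_row(sample))
    # {'account_id': '0000123456', 'branch_code': 'JHB1', 'balance_cents': 125000}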

Executives are driven by objectives to increase business growth and mitigate risk. The timeliness, accuracy, usability and accessibility of data are critical factors in their successful decision-making. To support this, the mainframe's crucial role as the pre-eminent data server demands seamless integration with BI and analytics applications and core systems. Direct, real-time access to legacy data is vital for sharing information across the enterprise, giving organisations insight into evolving customer expectations, competitive threats, fraud prevention, risk mitigation and emerging market opportunities.


In4Group

In4Group, a level 1 BBBEE technology organisation, offers innovative out-of-the-box solutions for direct data access. Feel free to contact In4Group on info@in4group.com or +27 (10) 045 0320 for more information and assistance in your real-time big data initiatives.
