Unstructured data single greatest contributor to information overload

Johannesburg, 06 Feb 2006

It is estimated that as much as 80% of today's corporate data is now unstructured, leading to unmanageable levels of information overload. This is largely attributable to the fact that the cost of storage is decreasing while capacity increases.

"Collecting and storing vast amounts of information is simply becoming cheaper," says Colin Severin, Technical Support Manager of Bateleur Software. "Organisations are capturing increasing volumes of unstructured information with the result that the data becomes unwieldy and cumbersome, making it difficult to find the right piece of information quickly and efficiently."

Unstructured data includes most data that doesn't reside in database fields, such as e-mails, electronic documents, and free text fields of enterprise systems such as CRM applications. It also encompasses customer communications, market knowledge and research notes.

Bateleur Software is the South African distributor for Identity Systems (IDS), a worldwide leader in identity searching and matching technology.

Changes in the legislative and regulatory landscape, as well as corporate governance policies, have compelled organisations to fine-tune their reporting capabilities and extend their record-keeping capabilities. In South Africa, laws such as the Financial Intelligence Centre Act (FICA) and the Financial Advisory and Intermediary Services (FAIS) Act have echoed the requirements of Sarbanes-Oxley legislation in the US.

According to Severin, companies need to question whether they are able to store unstructured information while simultaneously obtaining maximum business benefit and avoiding legal and regulatory liability.

"The question is whether it is still a safe or viable strategy to confine enterprise information analysis to structured databases," he says. "Organisations need to extract valuable information from both structured and unstructured data to discover emerging market opportunities, the next threat to customer relationships as well as the next threat to the enterprise in terms of fraud or regulatory non-compliance."

Version 2.6 of IDS's Identity Search Server software, which has just been released, can search and match unstructured data, thereby helping organisations gain control over these vast volumes of information. Advantages include the ability to extract unambiguous meaning from unstructured text, to add various forms of structure to words, and to tag words to indicate whether they represent names of people, companies, geographical locations and so on. In addition, companies can extract currencies, financial indices, Internet addresses, measures, times, time periods and vehicles.
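
To illustrate the general idea of tagging entities in free text, here is a minimal Python sketch. It is not Identity Search Server's method or API, which this release does not describe; the `PATTERNS` table, the `tag_entities` function and the regular expressions are all hypothetical, and they cover only three of the entity types mentioned above (currencies, Internet addresses and times).

```python
import re

# Hypothetical patterns for illustration only; they are not taken from
# Identity Search Server and handle only simple, well-formed cases.
PATTERNS = {
    "currency": re.compile(r"(?:R|\$|USD|ZAR)\s?\d[\d,]*(?:\.\d+)?"),
    "internet_address": re.compile(r"(?:https?://|www\.)[^\s,]+"),
    "time": re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b"),
}


def tag_entities(text):
    """Return (label, matched_text) pairs in order of appearance in text."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    found.sort()  # order by position in the text
    return [(label, value) for _, label, value in found]


sample = "Paid R1,500.50 at 14:30, see www.example.com for details."
for label, value in tag_entities(sample):
    print(label, value)
```

Simple patterns like these only suit entities with a regular surface form. Names of people, companies and places generally need dictionaries or statistical matching, which is beyond this sketch.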

"With Version 2.6, this high-quality search and match capability can be used on all types of name, address and identification data in business documents, letters, e-mails, case notes, report depositions, police statements, press articles, call logs - in fact, any free form text containing identification entities," Severin says.

The new release of Identity Search Server includes the Address Standardisation Module (ASM), a single platform for addressing the challenge of managing complex searching and matching requirements as well as robust data standardisation. The new ASM solution is designed to provide key cost-saving benefits, including increased effectiveness of mailing campaigns, fewer reshipment incidents and lower reshipment costs, and discounted mailing costs. A number of major South African blue-chip corporations are already using Identity Systems' software solutions.
