The work now under way to make efficient business use of vast volumes of unstructured data closely resembles the development that took place 20 years ago in the field of structured data.
This is according to Bill Hoggarth, head of business intelligence (BI) at CQS Technology Holdings and a speaker at the upcoming ITWeb Data Warehousing Conference.
Hoggarth says: “We need to do for unstructured content what data warehousing has done for structured data.”
At the moment, he says, it is possible to run a logic-based search of a company's structured data, but you would have to write a dedicated program to answer a question like: “How many clients did we invite to World Cup matches?”
Currently, Hoggarth says, you might spend days searching varied sources such as Word documents, e-mails and spreadsheets, sifting through a lot of irrelevant information before coming up with an answer.
There's no doubt that a vast amount of valuable information resides in the unstructured data within each company and on the Internet as a whole, he says.
Hoggarth says: “If you want to be truly information-centric, you must recognise that the data warehouse is just one source of necessary information - only a part of the overall information landscape.
“You must look at how this information interacts with all the vast amounts of data that exist outside of the data warehouse.”
“The importance of unstructured data must be recognised if you want your data warehouse and BI programme to keep up with changing trends. You can't just throw more horsepower at it.
“You must revisit the value of information and recognise that there are useful information sources that don't look like a chart or a table. If we, in the data warehouse and BI industry, are going to deliver effectively, we must recognise there are many useful sources of information.”
ITWeb's Data Warehousing 2010 conference
More information about the ITWeb Data Warehousing 2010 conference, which takes place on 30 November at Gallagher Estate in Midrand, is available online here.
Hoggarth says many firms are now looking at how to make sense of this wealth of unstructured data.
“There are vendors building products with context-sensitive BI functionality for unstructured data,” he says. “This reminds me so much of the emergence of SQL for managing structured relational data 20 years ago.
“History is repeating itself. Except we have gone from data warehousing to information warehousing.”
Hoggarth will address the annual ITWeb Data Warehousing conference at Gallagher Estate in Midrand on 30 November. For more information or to book your place at the event, click here.