
The world's production of unstructured data continues as rapidly as ever and mining it for useful information is big business.
Much of the current concern about Facebook's ever-decreasing respect for its users stems from the fact that all of the unstructured data it holds can be mined for deeply personal information. But it's not just Web giants that are overflowing with lack of structure. Kevin Kemp, commercial sales manager at SAS Institute, says it's true of his enterprise clients as well.
"The biggest challenge is that 80% of all data in our customer base is unstructured," he says. "So that needs to be put into a structured format and standards need to be implemented so you can do something with it. There's a bunch of ways to do it, but the challenge is getting bigger as the data exponentially multiplies."
One way is to avoid the problem altogether. Charl du Toit, Oracle Sales manager at EOH, says it's not always necessary to extract structure from the lack of it.
"Sometimes you can just deliver documents through your business intelligence portal with context. That might be enough instead of spending two years generating yet more data."
Technologies built to make sense of the Web's proliferation of data are also crossing over into the business intelligence (BI) domain.
"There's a growing crossover between search and BI, between traditional access to information via BI tools and search technologies," says Ryan Jamieson, MD of IS Partners. "They are feeding off each other. One of the things that I've seen emerge are tools that do meta-data analysis against unstructured information."
Michael Jones, BI solution manager at SAP, agrees.
"There are two approaches," he says. "There's search, in which I have to know what I'm looking for initially, and then there's the question that says 'let's extract meaning from the data'. I could be looking for sentiment analysis, for example, which is one way of deriving meaning from text."
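To make the second approach concrete, here is a minimal sketch of how sentiment analysis can turn free text into a structured row that a BI tool could chart. It is an illustration under stated assumptions, not a description of any vendor's product: the word lists are hand-picked and hypothetical, and the function names `sentiment_score` and `to_record` are invented for this example. Real text analytics tools use far richer models.

```python
import re

# Hypothetical, hand-picked word lists for illustration only.
POSITIVE = {"good", "great", "helpful", "fast", "reliable"}
NEGATIVE = {"bad", "slow", "broken", "late", "unhelpful"}


def sentiment_score(text: str) -> int:
    """Return the count of positive words minus the count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def to_record(doc_id: str, text: str) -> dict:
    """Turn one unstructured comment into a structured row (id, score, label)."""
    score = sentiment_score(text)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return {"id": doc_id, "score": score, "label": label}


if __name__ == "__main__":
    comments = {
        "c1": "Delivery was late and the support line was unhelpful",
        "c2": "Great product, fast and reliable",
    }
    for doc_id, text in comments.items():
        print(to_record(doc_id, text))
```

The point of the sketch is the shape of the output rather than the scoring method: once each comment carries a numeric score and a label, it can sit alongside the financial metrics that BI tools already handle.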
In some industries the job is even harder. Barbara Luckemeyer, BI consultant at KID, has worked with unstructured data in the medical field.
"It's a very different approach to working with traditional financials," she says. "Medical and research organisations know that they're looking for trends but they can't predict what those are ahead of time. And each hospital and clinic will have certain assets but they'll be in a different format. To try to bring those together and make sense of them is a huge challenge. You sit with different professors and analysts and try to figure out what they mean. It does take a long time. Patient records, diagnostics, observations from CAT scans, all are very wordy and different people have different styles of writing, which is the challenge."
But the bread and butter for BI at the moment is finances, as Keith Jones, MD of Inca, notes.
"The financial department makes some numbers, BI measures those metrics and works out the performance. Unstructured data is interesting but not really useful there. BI tools can now present unstructured data and let me drill down if I want to or if needed. Inevitably, structured and unstructured will converge and it won't matter where the data came from, but I'm not seeing that yet."
Should businesses be looking to get their unstructured information more integrated with their core systems? SAS's Kemp says yes.
"Should we be integrating unstructured data? The answer is absolutely yes. Looking forward over the next 18 months, as opposed to looking backwards, more of the decisions that businesses are making are in an unstructured format. Yes, the financials are there and structured, but unless we're able to access unstructured information, we'll miss out. Integration could be very costly and time-consuming, but that's what middleware is for; rather create a layer and keep the intelligence there."
Making it matter
David McWilliam, MD of Pro Solutions, says businesses need to be careful to distinguish between true BI and mere content management.
"People get confused between the indexing and retrieval of content and the extraction of meaning from unstructured data from a BI system," he says. "Yes, the technology can get structure out of unstructured data. The trick is to understand what you want to do. It's not about technology or tools. Very clever people can spend a lot of time working with tools, making sense out of that data to no good effect whatsoever. There's a lot of technology around: text analytics, content analytics and so on. You can always find tools to do it but the important thing is to understand what the business wants."
That's a good starting premise for any IT implementation, but particularly for business intelligence. Sean Paine, MD of EnterpriseWorx, says any solution has to start with the business's strategic goals and work back from there.
"That's really what BI is all about. We had a client who needed a solution to manage its inventory. The company needed to reduce inventory and the associated costs. It had been flying blind, or at best, working with spreadsheets that would take a week to produce, by which time the information was irrelevant. What we did was find the KPIs for what was needed and deliver the right information when it was needed."
But it can depend very much on the client. Nitesh Vallabh, director at PBT, explains: "We have clients who are willing to go on the journey and work with us. Then there are clients who create the perception that they know what they want, but it's only when you put the can on the table that they complain that it's Grapetiser rather than Coca-Cola. CIOs out there need to put their egos aside a bit and say, 'I don't know'. If you're working with someone who says, 'I don't know', you have a better chance of success. There are a lot of CIOs who come from an OLTP background and have no clue about how data needs to be organised to get insight from it."
Luckemeyer has experienced this too.
"Business knows it needs something, but it can't formulate the requirements into something specific," she says. "You eventually have to find a solution iteratively."
SAP's Jones says vendors often joke about businesses not knowing what they don't know, "but the B in BI stands for business and it's our job to start with the business requirements and work back from there," he says.
EOH's Du Toit summarises the problem: "A lot of the work I've done was with financial departments. Once I expanded and started working more with the IT department, I found the thinking quite scary. It's often diametrically opposed to business although standardisation and optimisation are common to both. Business doesn't care much about the plumbing - it just wants something that will help to make a decision. As BI practitioners, we're on that line between business and IT and it's what makes the job so challenging."
Cometh the cloud
Cloud computing is shaking up a few traditional industries. Could business intelligence be one of them? Jeremy Waterman, MD of Softline Accpac, says not in this country yet.
"We're finding no demand for it," he says, "but rather for some kind of virtual managed solution. Cloud-based BI is putting the cart before the horse. If people aren't really putting their ERP systems into the cloud, then why would they look at taking BI off site? I had a query last week about taking someone's inventory data into the cloud, where we would do some processing and hand him back some useful information. All fantastic in theory, but until there's widespread demand, it won't happen in South Africa."
Ashley Pillay, divisional director at Softline Accpac, points out another good reason.
"The take-up of software as a service in general is very, very slow because SME clients don't want to put the crown jewels in someone else's hands."
For Edward van der Walt, consultant at BI Practice, the keyword is legislation.
"How do you deal with private data in the cloud? It would be fine for younger people who don't mind exposing their personal data on the cloud. The older generation doesn't like it at all. How will companies deal with that? The other thing is that cloud is scalable and cheap, but can I trust where it's being hosted?"
And as Jones points out, cloud-based BI isn't making money yet.
"Right now, the largest hosted provider of BI in the world is Google. It has 100 000 users, the project is basically a toy and no one else in the hosted BI sector is making any money at all. Google did it basically to annoy Microsoft and it's worked. The rest of the sector isn't financially viable."
There are also some logistical problems with cloud-based BI on a large scale.
"If you're a large corporate, then you can't move terabytes of data into the cloud," says EOH's Du Toit. "You can't host your data warehouse in the cloud. If I'm an SME with money constraints and lower BI requirements, then of course, I'm very much going to be in the cloud. Not worrying about hardware and people becomes a very valuable proposition. It's immature at the moment, but these things take off very quickly."
SAS's Kemp agrees and says it will be a generational thing.
"The cloud as a platform is ideally suited for software as a service and we will start seeing it being used for BI at the low end. And the younger generation is not too concerned about what's your content and what's their content; they're quite happy to combine their own content with someone else's using BI tools. From a governance perspective, it might be viewed as risqué, but the younger generation doesn't give a damn about that."
* Article first published on brainstorm.itweb.co.za