
Natural language processing a cornerstone of DLP


Johannesburg, 24 Nov 2008

Powerful natural language processing (NLP) technology can help organisations dramatically reduce the time and effort involved in identifying the sensitive information their data loss prevention (DLP) implementation should protect while enhancing the accuracy of detection of confidential data.

That's according to Guy Golan, managing director of New Generation Solutions, a subsidiary of SecureData Holdings. He says that, with the explosion in the amount of data the average organisation needs to manage and secure, companies need an easy-to-use discovery solution to help them accurately identify and classify confidential corporate information.

In the past, companies depended on fingerprinting of information and/or linguistic solutions to identify and classify sensitive information so it could be protected from deliberate or accidental leakage.

Fingerprinting is a powerful way to detect data because it's extremely accurate, but it consumes a lot of processor resources during the initial phases, when the data that must be fingerprinted is identified and located. Despite this drawback, it is highly recommended as a DLP solution for unstructured data.
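As a rough illustration of the idea, the sketch below fingerprints a protected document by hashing overlapping runs of words and then measures how much of an outbound message matches those hashes. It is a minimal sketch, not a description of any vendor's product: the function names, the five-word window and the sample texts are all assumptions made for the example.

```python
import hashlib

def shingles(text, k=5):
    """Return the set of SHA-256 hashes of every k-word window in text."""
    words = text.lower().split()
    if len(words) < k:
        windows = [" ".join(words)] if words else []
    else:
        windows = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha256(w.encode("utf-8")).hexdigest() for w in windows}

def overlap(protected_fp, outbound_text, k=5):
    """Fraction of the outbound text's windows found in the protected fingerprint."""
    out = shingles(outbound_text, k)
    if not out:
        return 0.0
    return len(out & protected_fp) / len(out)

# Hypothetical protected document and two outbound messages.
protected = ("Total assets for the 2008 financial year were revised "
             "after the audit of subsidiary loans")
fingerprint = shingles(protected)

leak = "FYI: total assets for the 2008 financial year were revised after the audit"
harmless = "Lunch is at noon on Friday in the main boardroom for everyone"

print(overlap(fingerprint, leak))      # high: most windows match the protected text
print(overlap(fingerprint, harmless))  # zero: nothing in common
```

The expensive part is building the fingerprint for every protected document up front, which is the processor cost described above; checking an outbound message afterwards is a set lookup.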

The other approach companies have used to prevent the leakage of sensitive unstructured information is content filtering, based on linguistic dictionaries. Linguistic solutions, however, tend to produce an unacceptable number of false positives (harmless content that is wrongly blocked) and false negatives (content that should be stopped from leaking, but isn't).

Most linguistic solutions operate by searching for keywords, which may be harmless in some contexts but may point to sensitive information in others. For example, a user may write an e-mail containing the words 'business plan' and 'balance sheet', but may not attach any confidential information to it. "In other words, context matters," says Golan. "For that reason, linguistics-based solutions are by far the weakest form of DLP in use today."
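The weakness of keyword matching is easy to reproduce. The following is a minimal sketch of a dictionary-based filter, assuming a hypothetical two-term dictionary and invented sample messages; it blocks a harmless e-mail and lets through figures that never mention the dictionary terms.

```python
# Hypothetical dictionary of "sensitive" terms.
KEYWORDS = ("business plan", "balance sheet")

def keyword_filter(message):
    """Return True (block the message) if any dictionary term appears in it."""
    text = message.lower()
    return any(term in text for term in KEYWORDS)

# Harmless message that happens to mention the terms: a false positive.
harmless = "Can we meet on Tuesday to discuss the business plan and balance sheet?"

# Sensitive figures that never use the terms: a false negative.
sensitive = ("Assets 4,200,000; liabilities 3,100,000; "
             "long-term loans 900,000; equity 1,100,000")

print(keyword_filter(harmless))   # True: blocked although nothing confidential is sent
print(keyword_filter(sensitive))  # False: allowed although the content is sensitive
```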

NLP addresses the drawbacks of linguistic solutions by using sophisticated algorithms to provide contextual analysis of even non-fingerprinted data. NLP can understand the context around the data to accurately identify both structured data (identity numbers and credit card numbers, for example) and unstructured data (balance sheets, budgets, business plans, intellectual property like formulas and blueprints).
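For structured data such as credit card numbers, even a simple validity check cuts down on false matches. The sketch below is not NLP; it is a plain pattern match followed by the Luhn checksum that payment card numbers carry, offered only to show how a digit string can be checked for plausibility before it is flagged. The pattern, function names and sample strings are assumptions for the example.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text):
    """Return digit strings in text that look like card numbers and pass Luhn."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found

# 4111 1111 1111 1111 is a widely used test number, not a real card.
print(find_card_numbers("Card: 4111 1111 1111 1111 exp"))
print(find_card_numbers("Order ref 4111 1111 1111 1112"))
```

A 16-digit order reference that fails the checksum is ignored, while a well-formed card number is reported.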

"An NLP solution doesn't only scan the content, but also looks at the context," says Golan. "For example, one could configure the solution to stop any transmission by SMTP of an e-mail that contains the term 'balance sheet' and has an attached file with the characteristics of a balance sheet."

In a linguistic solution, the software engine will only stop the message from leaving if it encounters the exact term 'balance sheet'. An NLP engine will stop the leakage of information that is characteristic of a balance sheet (numbers, capital, loans, etc.) even if the exact expression 'balance sheet' does not appear in the data.
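The difference can be sketched with a deliberately crude stand-in for an NLP engine. The code below does not perform real language analysis; it simply scores a text on two characteristics a balance sheet tends to have (accounting vocabulary and several monetary amounts) instead of looking for the exact phrase. The term list, thresholds and sample texts are assumptions chosen for the illustration.

```python
import re

# Hypothetical vocabulary that tends to appear in balance sheets.
ACCOUNTING_TERMS = {"assets", "liabilities", "equity", "capital", "loans", "receivables"}

def looks_like_balance_sheet(text, min_terms=3, min_amounts=3):
    """Return True if text has enough accounting terms and numeric amounts."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    term_hits = len(words & ACCOUNTING_TERMS)
    # Amounts such as 4,200,000 or plain numbers with four or more digits.
    amounts = re.findall(r"\b\d{1,3}(?:,\d{3})+\b|\b\d{4,}\b", text)
    return term_hits >= min_terms and len(amounts) >= min_amounts

attachment = "Assets 4,200,000\nLiabilities 3,100,000\nLoans 900,000\nEquity 1,100,000"
email = "Please send me the balance sheet template when you have a moment"

print(looks_like_balance_sheet(attachment))  # True: flagged without the exact phrase
print(looks_like_balance_sheet(email))       # False: the phrase alone is not enough
```

A production engine would weigh far richer signals, but the principle is the same: the policy describes what the sensitive data looks like, not which words it must contain.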

The beauty of an NLP solution is that it allows businesses to intelligently create protection policies across all of their sensitive data. They know what characteristics define their sensitive data, even if they don't necessarily know where all of this data is. NLP enables companies to put a policy-based DLP solution in place that maps closely to their business needs.

Concludes Golan: "NLP is a vital part of the artificial intelligence that any DLP solution should provide. It allows enterprises to minimise the resources they need to prevent data leakages while improving the level of protection for their sensitive data. It should be one of the basic elements companies look for when they're choosing a DLP suite. NLP used in conjunction with fingerprinting technology is a powerful solution that will prevent nearly any data leakage, allowing you to optimise both detection accuracy and resource usage."


NGS

NGS specialises in providing software, services and solutions that address internal security threats within companies of all sizes. A subsidiary of JSE-listed SecureData Holdings, NGS has established itself as South Africa's authority in securing organisations' information from threats from within.

The company's services stretch from helping clients to identify and assess internal information security risks, through to designing, implementing and supporting the solutions that allow enterprises to secure their systems from insider threats. NGS has partnered with a range of best-of-breed vendors, including Websense, ActivIdentity and Cyber-Ark, to offer its clients complete, proven solutions for their internal security needs.

Editorial contacts

Guy Golan
NGS
guy@ngs.co.za