Harnessing Big Data to Find Fraud?

First Step: Find the Right Data

Houston – January 2014 – Big data analytics holds big promise when it comes to helping companies identify fraud, even before a compliance failure unfolds. There is, however, a daunting impediment to getting started: knowing what data to gather and analyze from the heaps of information that most companies generate.

Using analytics to ferret out potential fraud is, of course, not a new concept, but traditionally it’s been limited to the analysis of structured data, such as spreadsheets and database records. Now that Big Data methods provide the ability to harness mass volumes of both structured and unstructured data simultaneously, and at speeds once inconceivable, companies are gaining greater insights into potential fraud and more accurate red flags.

“Big data analytics have given companies a lot more flexibility around how they look for fraud from a compliance perspective,” says Richard Sibery, head of fraud and investigations for the Americas for EY’s Fraud Investigation & Dispute Services practice. Companies can now look across a much broader spectrum of data sources in real time, making it easier to identify anomalies and suspicious activity that may indicate fraud.

The trouble with traditional data analytics tools for finding fraud is that they often result in a lot of false positives, says Vincent Walden, a partner in EY’s Fraud Investigation & Dispute Services practice. “Nothing is more frustrating to an internal auditor than having a whole bunch of hits that don’t turn out to be anything,” he says.

Consider the task of a global financial institution, for example, that has to check the names of individuals against those on a sanctions blacklist. Such a task can pose significant challenges for any multinational company with thousands of customers who may share the same name as those on the blacklist.

But now with Big Data capabilities, companies can collect a broader variety of information—such as the individual’s nationality, the names and locations of family members, and whether they’ve traveled to, or received money from, sanctioned countries—to more easily identify those who are truly sanctioned individuals versus those who only share a name with them.
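
The screening idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` fields, the sample records, the country name, and the rule that a name match needs at least two corroborating attributes are all hypothetical, and are not taken from any real sanctions list or from any firm's actual method.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    # Hypothetical record layout, for illustration only.
    name: str
    nationality: str
    relatives: set = field(default_factory=set)
    countries_visited: set = field(default_factory=set)


def normalize(name):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(name.lower().split())


def corroborating_signals(customer, listed, sanctioned_countries):
    """Count attributes beyond the name that support a true match."""
    signals = 0
    if customer.nationality == listed.nationality:
        signals += 1
    if customer.relatives & listed.relatives:
        signals += 1
    if customer.countries_visited & sanctioned_countries:
        signals += 1
    return signals


def screen(customer, watchlist, sanctioned_countries, min_signals=2):
    """Return watchlist entries matching on name AND on enough other attributes."""
    hits = []
    for listed in watchlist:
        if normalize(customer.name) != normalize(listed.name):
            continue
        if corroborating_signals(customer, listed, sanctioned_countries) >= min_signals:
            hits.append(listed)
    return hits


# Made-up sample data: two customers who share a name with a listed person.
watchlist = [Person("John Doe", "Freedonia", relatives={"Jane Doe"})]
sanctioned = {"Freedonia"}
namesake = Person("john  doe", "Canada", relatives={"Mary Doe"},
                  countries_visited={"France"})
suspect = Person("John Doe", "Freedonia", relatives={"Jane Doe"},
                 countries_visited={"Freedonia"})
```

A name-only check would flag both customers. With the extra attributes, `screen(namesake, watchlist, sanctioned)` returns no hits while `screen(suspect, watchlist, sanctioned)` returns the listed entry, which is the reduction in false positives the article describes.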

Using Big Data analytics tools effectively reduces false positives and enables companies to home in more accurately on which risk areas to audit or monitor on a much more targeted basis, says Walden, “which enhances overall internal audit and compliance success.”

Put the Risks First

Before a company determines what Big Data tools it needs to implement, it must first identify what specific risks the company is trying to protect against, says Eric Thompson, IT threat strategist at RSA, which provides information security and governance, risk, and compliance solutions. Only then can the company accurately assess what data it has against what data it needs, and how to go about accessing it. The question to ask, he says, is: “Where are the gaps?”

The amount and scope of data that companies could leverage is so vast that trying to collect it all can be costly, “so you really want to collect and analyze only the data that is going to be of most value to you,” says David Jonker, senior director of Big Data marketing at business software company SAP. The question companies need to ask themselves is, “what data, if it could only be mined, would provide insight into a potentially fraudulent activity that is happening within my organization?”

Building effective Big Data analysis tools, however, requires having the right team of people at the table. “You really need data scientists in addition to subject matter experts,” says Thompson.

That means assembling a team of IT experts (who know what data is available, where it is located, and how it’s stored) and an investigative team of subject-matter experts, such as internal audit, legal, and compliance professionals, who can interpret the results.

“Human intelligence is critical to any kind of analytics,” says Jill Davies, vice president of professional services for Audimation Services, a data analysis technology company. “Getting the right people with the right skills is very important. Every one of them brings a different perspective and a different set of experiences to the table.”

Part of spotting fraud means knowing what red flags to look for in the data, Davies adds. “Otherwise, you’re shooting in the dark,” she says.

Sibery and Walden agree that big data analytics in itself won’t prove fraud. “It’s only going to tell you where to look,” says Walden.

Using Big Data at Express Scripts

Express Scripts, an online pharmaceutical company, for example, has put together a fraud, waste, and abuse team whose mission is to identify and fight prescription drug fraud and abuse. Achieving that “involves a combination of detective work and state-of-the-art technology,” Jo-Ellen Nader, a senior accountant for Express Scripts, writes on the company website.

To do this, the company uses “proprietary data analytics to uncover patterns of a potential fraud or abuse and scans for behavioral red flags to identify when someone is involved in wrongdoing,” she writes. In this way, the fraud, waste, and abuse team has been able to identify 290 potential indicators of pharmacy fraud, including:

The number of doctors visited;
Distance traveled to the physician or pharmacy;
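
As a rough illustration of how indicators like the two listed above might be computed, the Python sketch below flags patients who see many prescribers or travel unusually far. The claim records, field order, and thresholds are hypothetical and are not Express Scripts' actual rules or data.

```python
from collections import defaultdict

# Hypothetical claim records: (patient_id, prescriber_id, distance_in_miles)
claims = [
    ("p1", "dr_a", 3.0),
    ("p1", "dr_a", 3.0),
    ("p2", "dr_a", 4.5),
    ("p2", "dr_b", 2.0),
    ("p2", "dr_c", 120.0),
    ("p2", "dr_d", 6.0),
]


def red_flags(claims, max_prescribers=3, max_distance=50.0):
    """Map each flagged patient to the list of indicators that fired.

    Both thresholds are illustrative assumptions, not industry values.
    """
    prescribers = defaultdict(set)
    farthest = defaultdict(float)
    for patient, prescriber, distance in claims:
        prescribers[patient].add(prescriber)
        farthest[patient] = max(farthest[patient], distance)

    flags = {}
    for patient in prescribers:
        reasons = []
        if len(prescribers[patient]) > max_prescribers:
            reasons.append("many_prescribers")
        if farthest[patient] > max_distance:
            reasons.append("long_distance")
        if reasons:
            flags[patient] = reasons
    return flags
```

Here `red_flags(claims)` flags only patient `p2`, on both indicators. As the article notes, a flag like this only tells investigators where to look; it does not prove fraud.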

Below is an excerpt from the Cloud Security Alliance’s paper on big data and security intelligence that explains big data’s role in improving information security.

Data-driven information security dates back to bank fraud detection and anomaly-based intrusion detection systems. Fraud detection is one of the most visible uses for Big Data analytics. Credit card companies have conducted fraud detection for decades. However, the custom-built infrastructure to mine Big Data for fraud detection was not economical to adapt to other fraud detection uses. Off-the-shelf Big Data tools and techniques are now bringing attention to analytics for fraud detection in healthcare, insurance, and other fields.

In the context of data analytics for intrusion detection, the following evolution is anticipated:

1st generation: Intrusion detection systems – Security architects realized the need for layered security (e.g., reactive security and breach response) because a system with 100% protective security is impossible.
2nd generation: Security information and event management (SIEM) – Managing alerts from different intrusion detection sensors and rules was a big challenge in enterprise settings. SIEM systems aggregate and filter alarms from many sources and present actionable information to security analysts.
3rd generation: Big Data analytics in security (2nd generation SIEM) – Big Data tools have the potential to provide a significant advance in actionable security intelligence by reducing the time for correlating, consolidating, and contextualizing diverse security event information, and also for correlating long-term historical data for forensic purposes.

Analyzing logs, network packets, and system events for forensics and intrusion detection has traditionally been a significant problem, and traditional technologies fail to provide the tools to support long-term, large-scale analytics for several reasons:

Storing and retaining a large quantity of data was not economically feasible. As a result, most event logs and other recorded computer activity were deleted after a fixed retention period (e.g., 60 days).
Performing analytics and complex queries on large, structured data sets was inefficient because traditional tools did not leverage Big Data technologies.
Traditional tools were not designed to analyze and manage unstructured data. As a result, traditional tools had rigid, defined schemas. Big Data tools (e.g., Pig Latin scripts and regular expressions) can query data in flexible formats.
Big Data systems use cluster computing infrastructures. As a result, the systems are more reliable and available and provide guarantees that queries on the systems are processed to completion.
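
To make the point about querying data in flexible formats concrete, here is a small Python sketch that uses regular expressions to count failed logins per source address across log lines with no shared schema. The log lines and patterns are invented for illustration, and a real deployment would run comparable logic at much larger scale on a cluster.

```python
import re
from collections import Counter

# Hypothetical log lines in mixed formats (no common schema).
LOG_LINES = [
    "2014-01-07 10:02:11 sshd: Failed password for root from 10.0.0.5 port 4242",
    "Jan 7 10:02:15 host app[311]: login failure user=alice src=10.0.0.5",
    "2014-01-07 10:03:00 sshd: Accepted password for bob from 10.0.0.9 port 5151",
    "free-text note: nothing to see here",
]

# Illustrative patterns: two phrasings of a failed login, and an IPv4 address.
FAILURE = re.compile(r"failed password|login failure", re.IGNORECASE)
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def failed_logins_by_source(lines):
    """Count failed-login lines per source IP, whatever the line format."""
    counts = Counter()
    for line in lines:
        if FAILURE.search(line):
            match = IPV4.search(line)
            if match:
                counts[match.group()] += 1
    return counts
```

Because the patterns look for content rather than fixed columns, the two differently formatted failure lines are both attributed to the same source address.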

New Big Data technologies, such as databases related to the Hadoop ecosystem and stream processing, are enabling the storage and analysis of large heterogeneous data sets at an unprecedented scale and speed. These technologies will transform security analytics by (a) collecting data at a massive scale from many internal enterprise sources and external sources such as vulnerability databases; (b) performing deeper analytics on the data; (c) providing a consolidated view of security-related information; and (d) achieving real-time analysis of streaming data. It is important to note that Big Data tools still require system architects and analysts to have a deep knowledge of their system in order to properly configure the Big Data analysis tools.
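
Point (d), real-time analysis of streaming data, can be illustrated with a sliding-window counter. The Python sketch below is a single-process toy with hypothetical names and thresholds, and it assumes events arrive in timestamp order; stream processing systems apply the same windowing idea across many machines.

```python
from collections import deque


class SlidingWindowCounter:
    """Track how many events fell within the last `window_seconds`."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.timestamps = deque()

    def observe(self, timestamp):
        """Record one event; return True if the window now exceeds the threshold.

        Assumes timestamps (in seconds) are passed in non-decreasing order.
        """
        self.timestamps.append(timestamp)
        # Drop events that have aged out of the window.
        while self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.threshold


# Usage: alert when more than 3 events arrive within 60 seconds.
counter = SlidingWindowCounter(window_seconds=60, threshold=3)
alerts = [counter.observe(t) for t in [0, 10, 20, 30, 200]]
```

Each event is evaluated as it arrives, so an alert can fire on the fourth event in the burst rather than after a batch job completes.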

Source: Cloud Security Alliance.

Reprinted from Compliance Week, January 2014

By Audimation Team
