Audimation Services has been acquired by Caseware International Learn More.


Blog Image

Data Acquisition – Tips for Success

Advice for Getting the Right Data & Preparing it for Analysis

More than ever, organizations are realizing the valuable insights and opportunities that lie hidden within both structured and unstructured data. Obtaining access to data is another story. Data acquisition remains a top challenge for auditors. A recent survey conducted by Protiviti found that the primary challenge of internal auditors is difficulty obtaining, accessing and compiling data. There is value to be gained from data, you just need the right approach to identify where it lives and understand how to access it.


IT – The Place to Be


If your organization falls within the lower- to mid-range size in revenues, chances are your IT department is primarily focused on day-to-day services such as desktop support, virus protection, troubleshooting, installing software, etc. They may not be focused on enhancing their software platforms through data integration or reporting, which can present a challenge as knowledgeable resources are not available when requesting raw system data in the proper format for analysis. Even larger companies sometimes lack skilled personnel to assist with data extraction.

Here are some tips to work through the obstacles described above:

  • Begin by educating IT on what you’re doing and why you need the data
  • Start early! – Identify what data you need and make requests as soon as possible
  • Work with company management to identify skilled IT resources who is able to provide you data acquisition services
  • Rally management to gain the support and trust of IT

Software-as-a-Service (SaaS) is becoming more prevalent, which allows for backend data access and on-premise or cloud data warehousing. Gain an understanding of available application program interfaces (APIs).


Prepare for Pushback


When requesting data, there will inevitably be a string of client questions, comments and objections such as:

  • You don’t really need that data for your audit
  • You’ve never asked for this before
  • We don’t know how to access the data
  • We can’t get the data in the format you’ve requested

However, there are very few accounting systems that cannot provide the necessary information in the format needed. Make the data request prior to starting your fieldwork, and be specific about exactly what you need. Canned reports often lack complete data, such as date, time, owner, etc. Be sure to include all the fields you will need to get a complete set of data.


Tone at the Top


Having audit objectives that demonstrate the value of using data to help meet strategic business goals is paramount. Meet with the CFO to review your audit plan and objectives, and be sure to communicate the benefits of using data analytics including:

  • Data integrity – Read-only access to the data means source data remains protected
  • Audit trail – IDEA automatically tracks all the steps taken during the analysis and provides them in a diagram
  • 100% coverage – IDEA allows you to analyze 100% of the data population rather than a sample to gain a complete picture of what is transpiring within the business

These capabilities will provide management with higher levels of assurance, and build support for the audit objectives.


Tips for Working with the 3 Most Common File Types


If you have the ability to select the export file type, ask for a delimited file using the pipe symbol ‘|’ as the delimiter. The reason for asking for pipe as a delimiter is that it is less commonly used in traditional datasets, whereas commas are used frequently in description or note fields. However, if this is not possible most systems allow and data extraction tools allow for extraction using the following file types:


1) Comma Separated Values (.csv)


Files are commonly used to transport large amounts of tabular data between companies or applications that are not directly connected. The files are easily editable using common spreadsheet applications like Microsoft Excel. Fields are separated by commas, and records are separated with system end of line characters.

The key to importing .csv files or any other delimited files, is knowing the delimitating character or field separator, which is often a “,” or “;”. The import assistant feature in IDEA will help you select “Delimited” as the file type and specify the field separator based on whether characters are separated by commas, colons, semicolons, tabs or spaces.


2) Print Reports (.pdf or .txt)


Can range from simple to extremely complex. The Report Reader within IDEA reverse engineers PDFs into clean data table imports, and can read both structured PDFs and those with irregular sizing and positioning. Some key criteria to keep in mind are:

  • Files should be system print files – avoid scanned paper when possible
  • Files should be true-text font type – ask for this format when making the data request
  • Files should not contain any shapes, text boxes, references, etc.

Original generation files are easier to import into IDEA. When in doubt, contact the IDEA Help Desk for assistance and additional tips.


3) Microsoft Excel (.xls)


The key to importing spreadsheets is formatting the information properly prior to importing.

  • Require First row field names (no spaces)
  • Remove blank columns (creates noise in the spreadsheet)
  • Remove No (sub)totals
  • All fields in Excel format (i.e., character, number, date, amount, etc.)


Reconcile Source Data


Once information is imported into IDEA, all information must be reconciled to the source data. The most effective way to do this is through “Control Totals.” The “Control Totals” can be used to calculate a grand total for numeric fields. In addition, you can use the “Field Statistics” to view fields of data type: Numeric, Dates and Time. Through this view you can see the grand total as well as the absolute grand total and check for any items that may have come in as errors. If the data does not reconcile, import errors may go undetected. This may including underlying issues with data supplied by the client. You can view both the “Control Total” and the “Field Statistics,” under the Properties window.


ODBC – Not as Complex as it Sounds


Open Database Connectivity (ODBC) is a standard programming language middleware for accessing information stored in structured files or database management systems (DBMS). This information typically but not always resides on database servers located within your network environment. ODBC is available on all Microsoft Windows environments, including your PC or laptop.

While it may seem difficult at first glance, it is simply a program interface available to IDEA that provides a connection to source systems for data import. In some cases, your IT department may need to provide you the proper security access and credentials as well as giving you the proper drivers for the source system.

Screenshot of ODBC Data Source Administrator for CaseWare IDEA


The data “lives” in the databases. ODBC simply allows you to extract data from the databases directly into IDEA. ODBC allows you to capture the required data fields, control filtering and establish repeatable processes to enable continuous auditing.

For example, if you request a canned report of journal entries, it may lack critical information about who posted the entry and when it was posted. By moving away from standard reports and using ODBC to extract data directly from the database, you can get the required fields you need.


ETL Process


Define What You Need Prior to Pulls


Before you start, be sure your audit program is fully developed. What are you trying to do? What tests will you perform? Do you know exactly what fields you need? Create a catalog of steps to perform, including the required columns, test steps and routines.

Define your scope before interrogating the system. Develop a catalog of data required for each test. For example, to test Accounts Payable – Procurement you will need:

  • Vendor name
  • Vendor street address
  • Vendor city
  • Vendor state
  • Vendor zip code
  • Vendor phone number
  • Invoice date
  • Invoice amount
  • Invoice number
  • Approver
  • Unique record number
  • Check or ETF number
  • Payment date
  • Payment amount
  • Purchase request/Purchase order number
  • Purchase request/Purchase order approver
  • Debit/Credit memo indicator
  • Any unusual fields in the system data

Sidebar – The AICPA Audit Data Standard Working Group has developed a library of standardized data models to help improve the timeliness and effectiveness of the audit process. Visit for information and data standard tools.


Accessing Business Cycle Data


The process of acquiring the right data often falls on the auditors. To further complicate matters, data is likely stored in different systems and in multiple formats. Extract, Transform and Load (ETL) is a process used to collect data from various sources, transform data in to a usable form where it can be loaded into a destination database or format. ETL is typically used to migrate data from one database to another or to convert databases from one format to another. ETL can be used by auditors pull and map data to the AICPA’s Audit Data Standard (ADS). Using ETL, auditors can gain access to business cycle data. It also allows for script scheduling and recurring data downloads.

Operational system, ERP, CRM and flat files are processed through ETL tools which transform their data in to formats such as: metadata, raw data or summary data. Typically, ETLs import data in to data warehouses for online analytical processing (OLAP) reporting and data mining.


More Tips for Success


ERP Manuals


Technical reference manuals are available online or by request from the vendor. They offer diagrams outlining the data structure to help the auditor identify where the fields reside within the system and the data relationships among tables. All tables and relational databases have indexes, which make it more efficient to pull the data you need. When extracting large amounts of data through a database query, indexes are important to understand and use.


Data Dictionary


All databases have a data dictionary, although each vendor has their own version.


Understanding Tables


Most large ERP systems like Oracle and SAP have standard tables. Auditors can request the Entity Relationship Diagram (ERD) located within the Technical Reference Manual to decipher where the data “lives” and the tables required for analysis. Some companies create custom extensions which are built in to the application. The database administrator should have information about the data stored within these custom tables.


Query Design Tool (via ODBC)


Visual query tools, such as MS Access or SQL Server can be used to see data connections. You can build a SQL query to create your own view of data stored within tables.

SQL statements can be used to select columns and tables as follows:

SELECT: Column 1, Column 2, SUM (Column 3)
FROM: Table 1, Table 2
WHERE: Table1.column1=Table2.column2
GROUP BY: Column1, Column


Documenting the Process


Forms can be used to document the use of data extraction software used to perform automated audit procedures. It is most effective when used to plan the engagement. The auditor should start by identifying the specific objectives to be addressed and determine what data extraction tools are needed to meet those objectives. The form should include client information, system information, time requirements, budget and suggestions for the next year.

Example of a Data Extraction Software Analysis Documentation Form:

Audit Area Audit Objective Reports Produced Data File Name Description of Data File Content Data File Type Method of Data File Transfer Data Corroboration Procedures Working Paper Reference

Acquiring the right data is critical to your audit success. It’s important to understand your organization’s IT environment which includes finding skilled resources that can provide assistance. Developing a data-centric IT audit program, including a data catalog for data requests, will ensure you have what you need to meet your audit objectives. Using ODBC to gain direct access to data can help you move towards a repeatable scripted process. All the resources discussed in this article can be used to help you get the right data, right from the start.

Information in this article was sourced from a presentation provided by BKM Sowan Horan, an IDEA Integration Partner, which offers a mix of Information Technology and Accounting knowledge to help clients get the right kind of data, develop tests that meet internal controls, search for fraud and errors, and generate meaningful insights. For additional information, visit or contact Kevin Rockecharlie, Director of IT Assurance Services, at [email protected].

Automate Procedures , Best Practices , CaseWare IDEA

Posted By

By Sarah Palombo
Sarah Palombo founded Avery Public Relations in 2007 and took on Audimation Services as her first client. She has more than 20 years of experience developing communications programs and creating content.

Related Posts
Innovation Starts with IDEA
Nov 14 More than 125 IDEA users, partners and CaseWare representatives gathered in Houston on November 1-2 to learn new strategies for implementing and growing their u...
Internal Auditors are Encouraged to use Technology to Improve Effectiveness
Mar 21 Abstract.   The Institute of Internal Auditors recently published a number of papers under their Practitioner Survey Series. The series reflects output ...
Wayfair Woes One Year Later
Jul 24 The rise in online shopping has sharply impacted state sales tax revenue. Rather than increasing the tax rate, states are setting parameters to define nexus, wh...

This website has been designed for modern browsers. Please update. Update my browser now