
The 21st Minute



Infusing Python into Your Analysis with IDEA

IDEA Tech Tip


The tech community often refers to Python as the official programming language for non-programmers. It’s free and easy to learn (even if you don’t know where to start). In our August IDEA Update, we included an article about Exploring IDEA’s Built-In Power of Python, which covered some of the benefits of this popular programming language and various scenarios to help you assess your readiness.

With this post, we share some examples of how equivalent tasks in IDEAScript can be made much simpler in Python.

Extend Your Reach

The language of Python gives you increased flexibility and expansive access to your data. For those willing to give it a try and to get the most out of it, here are a few things to keep in mind:

For everyone: The quirks mentioned in our previous post have to do with the lack of user interactivity. “input()” statements cannot be used to get information from the user and “print()” statements cannot be used to output to a console. All output (and input, if required) is handled through accessing files on disk and IDEA .IMD tables.
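As a minimal sketch of that file-based approach, a script can append its status messages to a text file instead of calling print(). The helper name log_message and the log location are hypothetical, chosen only for illustration; any writable folder would work.

```python
import os
import tempfile
from datetime import datetime


def log_message(message, log_path):
    """Append a timestamped line to a text file in place of print()."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write("{} {}\n".format(datetime.now().isoformat(), message))


# Hypothetical log location (an assumption, not an IDEA convention).
LOG_PATH = os.path.join(tempfile.gettempdir(), "idea_python_demo.log")

log_message("Extraction started", LOG_PATH)
log_message("Extraction finished", LOG_PATH)
```

Opening the log file after a run then shows what the script did, in the order it happened.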

IDEA installs with sample Python scripts to illustrate usage, which you’ll find in the local library under the Custom Functions.ILB and Macros.ILB directories.

View of Library Folders in IDEA Showing Custom Functions

An Excerpt Taken Directly from IDEA’s Python Documentation:

Python files and features are not supported on IDEA Server. Python is an interpreted, object-oriented programming language that is becoming one of the most popular programming languages used for analytics/data mining and machine learning. Python support has been integrated into IDEA to let users execute their existing Python scripts to perform analytic tasks.

IDEA is packaged with the Python interpreter, Python packages, and associated files for Python 3.5.3. This does not affect or interfere with any Python installation you may already have. However, if you currently have Python 3.5.3 installed and have libraries installed via pip (Python’s preferred installer program), you are able to access those libraries when executing your Python scripts from IDEA.

Package            Version
scikit-learn       0.18.1
matplotlib         2.0.0
numpy              1.12.1+mkl
pandas             0.20.1
et_xmlfile         1.0.1
jdcal              1.3
jinja2             2.9.6
markupsafe         1.0
openpyxl           2.4.7
python-dateutil    2.6.0
cycler             0.10.0
SciPy              0.19.1
pypiwin32          219
pytz               2017.2
pyparsing          2.2.0
six                1.10.0

Only Python packages that are delivered with the IDEA installation are supported.

By default, Python scripts that reference any of the Python packages that are delivered with IDEA will use the Python packages that were delivered with IDEA; even if you have the same packages as part of a custom installation of Python.

If you want your scripts to use packages that are part of your custom Python installation, you must edit your Python script to point to the desired location. IDEA does not support this configuration.
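One common way to point a script at another package location is to adjust Python's import search path before importing anything from it. The sketch below assumes that standard sys.path mechanism is what applies here; the folder shown is a hypothetical example, and as the documentation says, this configuration is not supported by IDEA.

```python
import sys

# Hypothetical folder: replace with the site-packages directory
# of your own Python installation.
CUSTOM_SITE_PACKAGES = r"C:\Python35\Lib\site-packages"


def prefer_custom_packages(package_dir):
    """Put package_dir at the front of the import search path (only once)."""
    if package_dir not in sys.path:
        sys.path.insert(0, package_dir)


prefer_custom_packages(CUSTOM_SITE_PACKAGES)

# Imports that follow this point are looked up in CUSTOM_SITE_PACKAGES first.
```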

Compatibility

  • Due to the open-source nature of Python, backwards compatibility is not supported.
  • Only 32-bit Python packages and libraries are supported.

A Familiar Approach to Start

The fastest way to get a Python script running is to take a small IDEAScript that works with only one table and wrap a Python function around it. Aside from the new Python syntax and its libraries, the IDEA-specific code that works with .IMD tables hasn’t really changed. Consider how familiar this example looks:

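import win32com.client as win32ComClient

if __name__ == "__main__":
    try:
        resultDbName = "Sample-Employees-Extraction.IMD"
        # The inner quotes are escaped so they stay part of the criteria.
        criteria = "COUNTRY==\"Mexico\""

        # Connect to IDEA and open the source table.
        idea = win32ComClient.Dispatch(dispatch="Idea.IdeaClient")
        table = idea.OpenDatabase("Sample-Employees.IMD")

        # Extract all fields for the records that match the criteria.
        task = table.Extraction()
        task.IncludeAllFields()
        task.AddExtraction(resultDbName, "", criteria)
        task.PerformTask(1, table.Count)

        # Open the resulting table in IDEA.
        idea.OpenDatabase(resultDbName)
    finally:
        # Release the COM references.
        task = None
        table = None
        idea = None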

Shown in Atom:

Python Script for IDEA Shown in Atom

For proficient Python users (aka User A from our last post), the quickest way to get started is by understanding how IDEA calls Python scripts. There are three ways IDEA can run a Python script. The most straightforward is to click Run under the Macros tab in IDEA and select the .py file. This runs a fully standalone script that does not rely on passed-in arguments.

Macros Tab on the IDEA Ribbon

Selecting Python Script in IDEA

The other two ways involve passing data to the script from the caller. An IDEAScript may call a .py file and give it arguments with the line Client.RunPythonEx "Macros.ILB\ExamplePythonScript.py", variableToBePassedIn.

The Python script may then access those arguments with something like:

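import locale
import sys

if __name__ == '__main__':
    # Use the user's regional settings when parsing numbers.
    locale.setlocale(locale.LC_ALL, '')

    # sys.argv[0] is the script path; the passed-in values follow it.
    numbers = [locale.atof(num) for num in sys.argv[1:]]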
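If the caller might pass something that isn't a number, the conversion can be moved into a small function that skips values it cannot parse. This is only a sketch: the name parse_numbers and the sample argument list are hypothetical, and in a real run the input would be sys.argv[1:].

```python
import locale


def parse_numbers(args):
    """Convert locale-formatted numeric strings to floats, skipping the rest."""
    numbers = []
    for arg in args:
        try:
            numbers.append(locale.atof(arg))
        except ValueError:
            # Not a number in the current locale; leave it out.
            continue
    return numbers


# Hand-built stand-in for sys.argv[1:], for illustration only.
sample_args = ["42", "7", "not-a-number"]
values = parse_numbers(sample_args)
```

Keeping the parsing in its own function also makes it easy to try out in IDLE without launching IDEA.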

The final way to execute Python from IDEA is to use a custom function. In IDEA’s Equation Editor, one would use @Python("GetPercent", SALES_TAX, SALES_BEF_TAX) to run this .py script:

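def GetPercent(numerator, denominator):
    # Avoid dividing by zero when there is no pre-tax amount.
    if denominator == 0:
        return "N/A"
    # Format the ratio as a percentage with two decimal places.
    return "{0:.2f} %".format(numerator / denominator * 100)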
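Because a custom function like this is plain Python, it can be tried on a few values before it is used in the Equation Editor. The sketch below repeats the function so it runs on its own; the sample rows are made up for illustration.

```python
def GetPercent(numerator, denominator):
    if denominator == 0:
        return "N/A"
    return "{0:.2f} %".format(numerator / denominator * 100)


# Made-up (SALES_TAX, SALES_BEF_TAX) pairs, including a zero denominator.
sample_rows = [(25, 200), (0, 50), (10, 0)]
results = [GetPercent(tax, before_tax) for tax, before_tax in sample_rows]
# results == ["12.50 %", "0.00 %", "N/A"]
```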

For those who are familiar with programming or new to it (aka Users B & C from our last post), start by working with Python’s IDLE environment to pick up the language and don’t worry too much about other tools.

Other tools can be explored later if Python becomes something you use extensively. If you’re hesitant, or if your environment won’t allow installation of new software, Python can be used online at any time.

Give https://repl.it/ a try. It comes highly recommended and you are able to share your code with others.

Is it Worth the Effort?

Answering that question should be a blog post of its own, which we may do if there’s enough interest in this topic, but the short answer is that it depends on whether you need the expressiveness and flexibility Python has over IDEAScript.

Python supports escape sequences and interpolation in strings, so criteria like…

"@GetNextValue(" & Chr(34) & EmployeeNumberField & Chr(34) & ")"

…which can very quickly become clunky if more items are added, simplifies in Python to:

f"@GetNextValue(\"{EmployeeNumberField}\")"

Because of the escape sequence (\") and interpolation ({}), we avoid needing any concatenation for formatting. The backslash tells the interpreter to treat the quote that follows as part of the string’s value rather than as a delimiter, and the braces tell it to replace the placeholder EmployeeNumberField with whatever that variable holds, which greatly improves readability. By the way, that f at the beginning of the string isn’t a typo. That’s Python’s formatted string syntax, which requires Python 3.6 or later; on an older interpreter such as the 3.5.3 release mentioned above, str.format() produces the same result. Feel free to Google that phrase or look out for our future tutorials where we go over all the basics and even some advanced topics!
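As a sketch of a version-independent alternative, the same criteria string can be built with str.format(), which works on Python 3.5 as well as on later versions. The helper name build_criteria and the sample field name are hypothetical.

```python
def build_criteria(field_name):
    """Return an @GetNextValue criteria string with the field name in quotes."""
    # Single quotes delimit the Python string, so the inner double quotes
    # need no escaping; {} is replaced by field_name.
    return '@GetNextValue("{}")'.format(field_name)


# "EMPNO" is a made-up field name for illustration.
criteria = build_criteria("EMPNO")
# criteria == '@GetNextValue("EMPNO")'
```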

Python supports the object-oriented paradigm, which is exceedingly useful in modern program design. It also has far more succinct built-in functionality. For example, reversing a string in IDEAScript is pretty straightforward (StrReverse()), but getting every other letter would require something like…

Dim letters As String
Dim evens As String

letters = "only even letters"

For i = 1 To Len(letters)
    If i Mod 2 = 0 Then
        evens = evens + Right(Left(letters, i), 1)
    End If
Next i

…but in Python, this is achieved with something simpler…

letters = "only even letters"
evens = letters[1::2]

…or, even shorter…

evens = "only even letters"[1::2]

This is an example of string slicing in Python, which treats a string as a sequence of characters that can be indexed much like a character array in other languages. (In Python, the term “array” is more associated with the NumPy library.)
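A slice is written as [start:stop:step], and leaving a part out falls back to its default. The short sketch below shows a few variations on the same sample string; the variable names are only for illustration.

```python
letters = "only even letters"

# Start at index 1 and take every second character
# (the 2nd, 4th, 6th... letters, matching the IDEAScript loop).
evens = letters[1::2]       # "nyee etr"

# Start at index 0 instead to get the 1st, 3rd, 5th... letters.
odds = letters[0::2]        # "ol vnltes"

# A negative step walks backwards, which reverses the string.
backwards = letters[::-1]   # "srettel neve ylno"

# Start and stop without a step take a plain substring.
first_word = letters[0:4]   # "only"
```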

Try your hand at using Python in place of IDEAScripts to see how you like it. If we have enough interest in further instruction, we will introduce some “crash courses” to help you take advantage of this new IDEA capability. We want to hear from you – share your comments and questions with us at [email protected]. Have fun playing with Python – we promise it won’t bite!

