Indexing data from Saas solutions running on relational databases

By Elias Haddad

As we began work on building the Salesforce.com app, I was again face to face with a familiar challenge…a challenge that you would encounter anytime you want to ingest structured data coming from any Saas based application that is running on a back-end relational database. In such a Saas based environment, the data is usually exposed via a REST, Webservices API or similar. As you know, in a typical relational database, all data is stored in multiple tables and records are linked across tables using ID’s. For instance the Incident table in ServiceNow does not have the Username that created that ticket but has a User Identifier (long cryptic string) referencing another record in the “Users” table that includes the Username, First Name, Last Name and other information.

The problem that you would face in this scenario would be: how do you ingest that data in order to make it easier to search? How do you build these lookups in Splunk knowing that the data is distributed across multiple tables?

The first solution that would come to mind would be to have the same python script that polls the data from SFDC or ServiceNow (as a scripted input) save the data in a CSV format directly and store it under the lookup directory of the app. The app would then use that CSV file as an automatic lookup on the searches that apply. However, this approach poses some limitations:

In a distributed environment, the lookups CSV files have to be stored on the search head. This means that either you have to come up with a mechanism to periodically copy the CSV files to the search head from the forwarder or use the search head to pull the data in directly. Both are not good options.
The python code has to handle updates to the lookup tables – in other words, update existing records in the CSV file with newly polled data that matches. This is doable but can be a bit tricky and would require some python coding skills.
What if I want to track changes to that lookup tables from an auditing perspective? Take the example of the Users table in ServiceNow, if someone changes the First Name of a given user in ServiceNow, the python script will poll the changes and update the existing record in the CSV lookup file erasing the previous information.

After many discussions, an alternative solution presented itself. This solution relies on indexing the lookup data as opposed to storing it as a CSV. Then the app runs periodic searches to build that lookup table in Splunk and keeps it up to date.

A good example of that search would rely on the “inputlookup” command to get the existing data in and “outputlookup” command to append the new result in the CSV file.

Elias Haddad

Elias is an Emerging Market Presales Architect working out of the Dubai office. Prior to that, he was a Product Manager responsible for Splunk data ingestion and held various pre-sales, post-sales and business development positions. Elias lives in Dubai and graduated from Purdue University with a master’s degree in computer engineering.

Tips & Tricks 2 Min Read

Windows Event Logs in Splunk 6

Tips & Tricks 1 Min Read

DIY: Software Asset Management with the Splunk Enterprise Free Edition

Handy tips on how to build your own SAM with Splunk to monitor software installs and usage

Tips & Tricks 1 Min Read

Handling HTTP Event Collector (HEC) Content-Length too large errors without pulling your hair out

Answer for dealing with HTTP Event Collector (HEC) error message 413 content too large: reset configurable pre-defined limit for max content using limits.conf.

About Splunk

The Splunk platform removes the barriers between data and action, empowering observability, IT and security teams to ensure their organizations are secure, resilient and innovative.

Founded in 2003, Splunk is a global company — with over 7,500 employees, Splunkers have received over 1,020 patents to date and availability in 21 regions around the world — and offers an open, extensible data platform that supports shared data across any environment so that all teams in an organization can get end-to-end visibility, with context, for every interaction and business process. Build a strong data foundation with Splunk.

Learn more about Splunk

Indexing data from Saas solutions running on relational databases

Related Articles

Windows Event Logs in Splunk 6

DIY: Software Asset Management with the Splunk Enterprise Free Edition

Handling HTTP Event Collector (HEC) Content-Length too large errors without pulling your hair out

About Splunk

Subscribe to our blog

Connect with Splunk on X

Connect with Splunk on Instagram