My Data takes me back to HD Videos

Last month I wrote about indexing video feeds, and Vimeo was the site I featured for HD videos. The idea was to use the Vimeo REST API to gather all the metadata about your favorite Vimeo HD video channels and index it into Splunk, either for historical lookup or simply as a one-stop dashboard where you can not only view the indexed information but also use a workflow action to watch the video itself.


Click on Show Video

Then the output of the REST API, which I call from Python, changed: instead of nicely formatted XML, I was getting one huge line per channel. My code had logic to skip every line containing the words video, videos, or xml?. Naturally, the one huge line contained these words, so it was skipped and nothing got indexed.
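To make the failure concrete, here is a minimal sketch of that kind of line filter. It is not the code from the Vimeo app: the function name `lines_to_index`, the sample XML, and the exact matching rule (plain substring checks) are assumptions for illustration.

```python
# Substrings that cause a line to be skipped (taken from the description above;
# the real scripted input may match them differently).
SKIP_WORDS = ("video", "videos", "xml?")


def lines_to_index(response_text):
    """Return the lines that survive the skip filter, ignoring blank lines."""
    kept = []
    for line in response_text.splitlines():
        if not line.strip():
            continue  # nothing to index on an empty line
        if any(word in line for word in SKIP_WORDS):
            continue  # wrapper elements such as <videos> and <video> are dropped
        kept.append(line)
    return kept


# Hypothetical sample data: one element per line versus everything on one line.
pretty = "<videos>\n<video>\n<title>Surf</title>\n</video>\n</videos>"
one_line = pretty.replace("\n", "")

print(lines_to_index(pretty))    # only the <title> line survives
print(lines_to_index(one_line))  # the single line contains "video", so nothing survives
```

With one element per line, only the wrapper lines are dropped. Once the whole channel arrives as a single line, that line matches the filter and the result is empty.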

I ended up changing the code in my scripted input to put a newline character in front of every occurrence of < and after every occurrence of >, and then to strip out any blank lines. With that change, the skip logic worked as intended again and the data got indexed.
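The change described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the actual scripted input: the function name `split_tags_onto_lines` and the sample response are hypothetical.

```python
def split_tags_onto_lines(raw):
    """Put a newline before every '<' and after every '>', then drop blank lines."""
    spaced = raw.replace("<", "\n<").replace(">", ">\n")
    # Adjacent tags now leave empty lines between them; keep only non-blank lines.
    return [line for line in spaced.splitlines() if line.strip()]


# Hypothetical one-line response of the kind the API started returning.
raw = "<video><title>Surf</title><user_url>http://vimeo.com/someuser</user_url></video>"

for line in split_tags_onto_lines(raw):
    print(line)
```

Each tag and each value ends up on its own line, so a line-based filter can once again drop the wrapper tags without discarding the values.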

My main dashboard was still unpopulated, though. The problem was that the field extractions used to populate the reports no longer worked, since the raw format of the data had changed. Thanks to Splunk’s late binding, in which field extractions are applied at search time rather than being tied to a database schema, this was easy to fix. PCRE has a prefix, (?m), which can be placed in front of a regular expression to tell it that it is a multi-line regular expression. An example from the Vimeo app is:


This example extracts the user URL from between its XML elements, and it also works with the prior Vimeo raw format. I did not use the spath command from Splunk 4.3 because I wanted to remain backwards compatible with Splunk 4.2. I also did not use xmlkv, so that drilldown works out of the box with Simple XML dashboards.
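As a rough illustration of an extraction that tolerates both raw formats, here is a Python sketch. Several things are assumptions: the element name `user_url`, the pattern itself, and the helper name `extract_user_url` are hypothetical and are not taken from the Vimeo app. Python's `re` module is also similar to, but not the same as, PCRE. In `re`, the (?m) prefix changes how ^ and $ match, while the `\s*` in this pattern is what spans the newlines.

```python
import re

# Hypothetical element name; the real feed's tag names may differ.
USER_URL = re.compile(r"(?m)<user_url>\s*(\S+?)\s*</user_url>")


def extract_user_url(event):
    """Return the text between the user_url tags, or None if they are absent."""
    match = USER_URL.search(event)
    return match.group(1) if match else None


# Prior format: tag and value on one line.
old_event = "<user_url>http://vimeo.com/someuser</user_url>"
# Newer format: tag, value, and closing tag each on their own line.
new_event = "<user_url>\nhttp://vimeo.com/someuser\n</user_url>"

print(extract_user_url(old_event))
print(extract_user_url(new_event))
```

Both calls return the same URL, which is the property that lets one search-time extraction serve events indexed before and after the format change.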

In conclusion, Splunk’s flexibility to ingest any text-based data and to change field extractions at search time let me see the list of videos from my Vimeo channels again, with Splunk as my launch pad. Where will your data take you? Let’s find out at .conf2012. Register today.

Posted by Nimish Doshi

Nimish is Director, Technical Advisory for Industry Solutions, providing strategic, prescriptive, and technical perspectives to Splunk's largest customers, particularly in the Financial Services Industry. He has been an active author of Splunk blog entries and Splunkbase apps for a number of years.