Zeromq as a Splunk Input

Occasionally, people ask me how to get a message queue such as JMS to deliver its messages into Splunk. I point them to the approach I posted on Splunkbase, where a JMS listener, called by Splunk as a scripted input, dequeues messages from the queues of interest. Obviously, once a message is dequeued this way, it goes into Splunk; no other business application has subsequent access to the same message on the same queue. Therefore, if you want a pure messaging system that is not part of your application to send time series messages to Splunk, this is not the approach you should be taking. You should use another queuing system that is lightweight and flexible enough for your needs.

I looked around and turned to Zeromq (also known as 0MQ). It is open source, does not rely on a persistent store, which keeps it lightweight and fast, and offers client bindings for a variety of languages, although the core implementation is in C/C++. Zeromq also implements a variety of messaging design patterns, making it flexible enough to adapt to your needs. Essentially, I built a few scripted inputs that call a zeromq listener to receive messages. These messages are then sent to standard output so they can be indexed into Splunk for further search and analysis. In my reference implementation, I wrote senders that emit random time series temperature data, as it is easy to understand and Splunk's reporting features can produce nice real-time graphs from such input. The approach can be summarized in this rather simple diagram.

As usual, I have posted the reference implementation on Splunkbase for you to download and try out. Although the senders are very specific, the recipients are generic in that they receive a message and send it to standard output to be indexed. I implemented several of the design patterns described in the zeromq documentation.

Pipeline Pattern: This is written in Python. The sender pushes a message over TCP to a host:port combination where a listener is waiting; the listener receives the message and, in my case, sends it to standard output.
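A minimal sketch of the pipeline (PUSH/PULL) pattern using the pyzmq bindings. The port, socket names, and the sample temperature payload are illustrative assumptions, not taken from the app's actual code; the listener's print is what Splunk would index as a scripted input.

```python
import threading
import zmq  # pyzmq bindings

received = []

def listener(ctx, ready):
    # PULL side: bind and wait for a TCP message (the scripted-input role)
    pull = ctx.socket(zmq.PULL)
    pull.bind("tcp://127.0.0.1:5557")  # illustrative port
    ready.set()
    msg = pull.recv_string()
    print(msg)  # in the Splunk app, standard output is what gets indexed
    received.append(msg)
    pull.close()

ctx = zmq.Context()
ready = threading.Event()
t = threading.Thread(target=listener, args=(ctx, ready))
t.start()
ready.wait()

# PUSH side: the sender connects and puts a message on the pipeline
push = ctx.socket(zmq.PUSH)
push.connect("tcp://127.0.0.1:5557")
push.send_string("2011-09-01 12:00:00 temperature=72.5")  # sample payload
t.join()
push.close()
ctx.term()
```

In the reference implementation the two ends run as separate processes; threads are used here only to keep the sketch self-contained.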

Publisher Subscriber Pattern: This is similar to the above pattern, but here a filter accompanies each message. The subscriber is the listener; if the message matches the filter criteria, it is received and sent to standard output. This is also written in Python, so it should be fairly easy to edit the code and change the filter to match your needs.
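A sketch of the publisher/subscriber pattern, again with pyzmq. The "TEMP" prefix filter, port, and payload are illustrative assumptions; the retry loop works around zeromq's well-known "slow joiner" race, where messages published before the subscription propagates are dropped.

```python
import threading
import time
import zmq  # pyzmq bindings

received = []

def subscriber(ctx, ready):
    sub = ctx.socket(zmq.SUB)
    sub.bind("tcp://127.0.0.1:5558")  # illustrative port
    # Only messages whose prefix matches the filter are delivered
    sub.setsockopt_string(zmq.SUBSCRIBE, "TEMP")
    ready.set()
    received.append(sub.recv_string())
    sub.close()

ctx = zmq.Context()
ready = threading.Event()
t = threading.Thread(target=subscriber, args=(ctx, ready))
t.start()
ready.wait()

pub = ctx.socket(zmq.PUB)
pub.connect("tcp://127.0.0.1:5558")
# Resend until the subscription has taken effect (slow-joiner workaround)
while not received:
    pub.send_string("TEMP 68.4")
    time.sleep(0.05)
t.join()
pub.close()
ctx.term()
```

Changing the filter is a one-line edit to the `zmq.SUBSCRIBE` option; an empty string subscribes to everything.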

Request Reply Pattern: This implementation is written in Java, where the receiver receives a message and also sends a reply back to the sender. In my case, I simply send the same string back to the sender after each message is received. You could use this string to communicate an acknowledgement or other condition back to the sender.
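The app's version of this pattern is in Java, but for consistency with the other sketches here is the same REQ/REP exchange in Python with pyzmq. The fixed "ACK" reply string, port, and payload are illustrative assumptions.

```python
import threading
import zmq  # pyzmq bindings

def responder(ctx, ready):
    # REP side: receive a message, then send a reply back to the sender
    rep = ctx.socket(zmq.REP)
    rep.bind("tcp://127.0.0.1:5559")  # illustrative port
    ready.set()
    msg = rep.recv_string()
    print(msg)            # what the scripted input would hand to Splunk
    rep.send_string("ACK")  # same string every time, as in the blog's example
    rep.close()

ctx = zmq.Context()
ready = threading.Event()
t = threading.Thread(target=responder, args=(ctx, ready))
t.start()
ready.wait()

req = ctx.socket(zmq.REQ)
req.connect("tcp://127.0.0.1:5559")
req.send_string("temperature=70.1")
reply = req.recv_string()  # the acknowledgement from the receiver
t.join()
req.close()
ctx.term()
```

The reply slot is where you could signal a processing condition back to the sender instead of a constant acknowledgement.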

One more note: since each of these patterns has Splunk call a program that in turn launches another program as the listener, be sure to kill the listener when you stop or restart Splunk. Splunk may stop the original program called via scripted input, but the second program will still be running. If the listener survives a Splunk shutdown, its address will already be in use when Splunk starts again, and the new listener will fail to start.
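One way to avoid the orphaned listener is to have the wrapper script clean up its child on exit. This is a hypothetical sketch, not code from the app: the child here is a stand-in sleep process where the real wrapper would launch the zeromq listener script, and it assumes Splunk stops scripted inputs with a signal the wrapper can catch.

```python
import atexit
import signal
import subprocess
import sys

# Stand-in for launching the real zeromq listener script (hypothetical)
listener = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(600)"])

def _shutdown(signum=None, frame=None):
    # Terminate the child listener if it is still running
    if listener.poll() is None:
        listener.terminate()
        listener.wait()

atexit.register(_shutdown)
signal.signal(signal.SIGTERM, _shutdown)  # catch the stop signal on Splunk shutdown

# The real wrapper would relay messages here; for the sketch, shut down at once
_shutdown()
print("listener exit code:", listener.returncode)
```

With the child reliably terminated, the listener's address is free the next time Splunk starts the input.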

Finally, just to try it out, I included a sample Java program, along with the jar file from the Splunk Beta Java SDK, that searches Splunk and sends the results to a zeromq request-reply listener, as a pattern for taking data out of Splunk and putting it on a message queue for remote processing. For demonstration purposes it sends one line at a time to the listener; in real life you would probably want to modify the sender to send one event at a time so as not to break up multi-line events.

In summary, zeromq or an equivalent gives you a high-speed messaging system for delivering events into Splunk, with a variety of language bindings and design patterns, making Splunk an integral part of any time series data processing solution that involves queues.

Posted by Nimish Doshi

Nimish is Director, Technical Advisory for Industry Solutions providing strategic, prescriptive, and technical perspectives to Splunk's largest customers, particularly in the Financial Services Industry. He has been an active author of Splunk blog entries and Splunkbase apps for a number of years.