At .conf2015, we introduced HTTP Event Collector, an exciting new capability for developers to send events from applications, DevOps tools, and IoT devices into Splunk. In this post I’ll explain what it is and how it can help.
Why something new?
A common request we’ve heard over and over from you, the Splunk developer community, is “How can I send data directly to Splunk?” When you say direct, what you really mean is without needing a local forwarder, and generally you are talking about sending from clients living outside the corporate network.
Up until now, your options have been to use TCP/UDP inputs or the REST API. Each of these is usable, but they have their challenges and limitations, as they were not built specifically for this purpose.
HTTP Event Collector in Splunk 6.3 offers you a newer and better option.
HTTP Event Collector
HTTP Event Collector (HEC, pronounced H-E-C) is a new, robust, token-based JSON API for sending events to Splunk from anywhere without requiring a forwarder. It is designed for performance and scale. With a load balancer in front, it can be deployed to handle millions of events per second. It is highly available and it is secure. It is easy to configure, easy to use, and best of all it works out of the box.* A few other cool tidbits: it supports gzip compression, batching, HTTP keep-alive, and both HTTP and HTTPS.
If you are a developer looking to get visibility into your applications within Splunk, looking to capture events from external systems and devices (IoT), or you offer a product that you’d like to integrate with Splunk, HTTP Event Collector is the way to go.
Not only is HTTP Event Collector great for developers building custom apps, but it is also enabling our partners to provide more real-time integration between their products and Splunk. At .conf we announced integrations with AWS Lambda, Docker, and several IoT solution vendors including Xively and Octoblu. You’ve also probably seen Damien’s great post on how he is using HEC to turbocharge many of his inputs, like the AMQP Input. Each of these integrations is feeding events in real time to Splunk via Event Collector. This is just the beginning of what is now possible!
Using HTTP Event Collector
Now let’s take a look at how you can configure and use HEC.
Enabling Event Collector
To use HEC, the first thing you have to do is enable it*. You can do this from the Splunk UI by going to Settings -> Data Inputs -> HTTP Event Collector. Once the page opens, click on “Global Settings” and then “Enabled” in the Edit Global Settings dialog. Notice the default port is 8088. HEC runs on its own dedicated port. This means you don’t need to expose 8089, the Splunk REST API port, in order to make Event Collector accessible from the outside. This new port is used only for sending events.
*Note: In Splunk Cloud, you must open a support ticket to use HTTP Event Collector.
HTTP Event Collector uses a new token-based authentication mechanism. These tokens are not Splunk users, nor are they associated with Splunk users; they only allow sending events. This means that if a token is compromised, you don’t have to worry about an attacker using it to get access to Splunk data, something that is a concern with REST API credentials. Because these tokens are so restricted, an administrator can delegate to Engineering / DevOps the ability to create and manage them.
You can create a token right in the UI or using the Splunk CLI, for example a new token named “test token”.
Once “Submit” is clicked, the token is created.
Sending raw events
With a token you can now go to town and start sending events over HTTP/S to Splunk in our simple JSON format. For example, you can send a batch of two events, one specifying the event as a raw string and the other as a JSON object. The JSON can be any structure, such as an object with a single message field.
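As a rough sketch of what such a batch looks like on the wire, here is a small Python example using only the standard library. It assumes HEC is reachable at `https://localhost:8088/services/collector` and uses a placeholder token; the host, the token value, and the helper names (`build_batch`, `build_request`, `send`) are made up for illustration and are not from this post. Each event is wrapped in an `event` envelope, the envelopes are concatenated into one request body, and the token travels in the `Authorization` header with the `Splunk` scheme.

```python
import json
import urllib.request

# Assumptions for illustration only: replace with your own host and token.
HEC_URL = "https://localhost:8088/services/collector"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"


def build_batch(events):
    """Wrap each event in an {"event": ...} envelope and concatenate them.

    A batch is simply several JSON objects back to back in one request body.
    """
    return "".join(json.dumps({"event": e}) for e in events).encode("utf-8")


def build_request(url, token, events):
    """Build (but do not send) the POST request for a batch of events."""
    return urllib.request.Request(
        url,
        data=build_batch(events),
        headers={"Authorization": "Splunk " + token},
        method="POST",
    )


def send(request):
    """Send the request and return (HTTP status, response body).

    Note: an instance with a self-signed certificate will fail TLS
    verification here unless you supply a suitable SSL context.
    """
    with urllib.request.urlopen(request) as resp:
        return resp.status, resp.read().decode("utf-8")


# One event as a raw string, one as a JSON object with a message field.
request = build_request(
    HEC_URL,
    HEC_TOKEN,
    ["hello world", {"message": "hello world"}],
)
print(request.data.decode("utf-8"))
# Call send(request) once the URL and token point at a real instance.
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.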
Both get happily ingested into Splunk, and the JSON event shows up as a JSON object, as it should. No complex configuration on the server, no messing with .conf files, no messing with source types!
This is just the bare basics of sending events. There’s a bunch of rich capabilities the protocol enables, in particular with regard to metadata. You can find out more in our docs.
Sending raw JSON is one way. Another is to use the growing list of logging libraries (currently .NET, Java, and Node.js) that we’re shipping, which are compatible with many of the common logging frameworks you probably already use. For example, if you are a Node developer, you may already be using bunyan, a popular Node logger. Pull in “splunk-bunyan-logger” from npm, and with some slight code changes everything sent to bunyan can be sent directly to Splunk.
A big question I know you’ll have is: can it scale? The answer is YES. Out of the box HEC runs on a single instance, but we have designed it from the ground up for scale. You can scale out to as many nodes as you need and throw a load balancer in front. For example, the simple distributed deployment we showed at .conf had HEC running on three indexers with NGINX in front.
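To make the logging-library idea concrete, here is a minimal sketch of the same pattern in Python's standard `logging` module. To be clear, this is not one of the shipped libraries and not the splunk-bunyan-logger API: `HecHandler` is a hypothetical class written for this illustration, and the URL and token are placeholders. It only shows the general shape: a handler plugs into the logging framework you already use and turns each log record into an HEC event.

```python
import json
import logging
import urllib.request


class HecHandler(logging.Handler):
    """Hypothetical handler that forwards each log record to HEC as one event."""

    def __init__(self, url, token, post=None):
        super().__init__()
        self.url = url
        self.token = token
        # `post` is injectable so the handler can be tried without a network.
        self.post = post or self._post

    def _post(self, body):
        """Default transport: POST the JSON body with the HEC token header."""
        req = urllib.request.Request(
            self.url,
            data=body,
            headers={"Authorization": "Splunk " + self.token},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()

    def emit(self, record):
        try:
            envelope = {
                "time": record.created,  # epoch seconds of the log call
                "event": {
                    "level": record.levelname,
                    "logger": record.name,
                    "message": record.getMessage(),
                },
            }
            self.post(json.dumps(envelope).encode("utf-8"))
        except Exception:
            self.handleError(record)


# Demo: capture payloads in a list instead of sending them anywhere.
sent = []
logger = logging.getLogger("hec_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(
    HecHandler("https://localhost:8088/services/collector", "placeholder-token", post=sent.append)
)
logger.info("user %s logged in", "alice")
print(sent[0].decode("utf-8"))
```

Dropping the `post=` argument makes the handler use its own HTTP transport, so existing `logger.info(...)` calls would not need to change.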
Go get it!!!!
To get started with the HTTP Event Collector, download Splunk 6.3, then head over to our Event Collector dev center. You’ll find documentation on how to use the collector, how to use our logging libraries, and how you can integrate with services like AWS Lambda.
What are you waiting for? Start sending your events directly to Splunk!