Recently, we were lucky to join the Eclipse Foundation’s IoT team for a webinar on “Practical Operational Intelligence for the Internet of Things”. Emphasis was on the practical. As I discussed in a recent blog for the IoT Solutions World Congress, when it comes to the IoT, turning data into insights shouldn’t be so hard. With the proliferation of complex architectures, interfaces, and slow time-to-value, it’s no wonder the “hype” of both big data and the IoT sometimes eclipses their successes.
With that in mind, I’m kicking off a multi-part blog series on “Practical” IoT Operational Intelligence and Analytics with Splunk. The goal here is to get you to value from IoT-generated data as quickly and easily as possible. General concepts to be discussed in this series include:
- Overview (why Splunk?)
- Getting Started (Download, Installation and App and Input Configuration)
- Searching, Alerting, Visualizing and Predicting with Splunk Enterprise
- Developing on and Extending Splunk Enterprise for Advanced IoT Applications
So without further ado, here goes:
Part 1. So why use Splunk for IoT Anyway?
Splunk enables you to easily connect to remote devices and applications, ingest the machine data generated by those sources, and securely deliver that data across complex networks to a massively scalable time-series index. You can then search that data, alert on its content, extract information from it, enrich it, and process it statistically. Finally, you can run time-series analytics on that data (including prediction and anomaly detection), and visualize and report on the raw data or the output of any of those analytics using the platform’s native charting libraries or third-party visualization platforms.
The above capabilities are exposed to end users via a web interface, and to developers and third party applications via well-documented APIs and a host of SDKs. Platform configuration and management features (including role-based access management for both data and content) are exposed to administrators via web browser, API and SDK as well.
The architecture is installed with a simple executable or tarball, and can be deployed on a laptop, in the datacenter, in a private cloud, as SaaS, or as a hybrid. It is designed to make machine data accessible, usable and valuable to anyone (that’s actually our mission), and does so through a unique combination of ease of use, self-service deployment and content creation, and fast time to value for solutions. Large corporations and individual developers alike already use Splunk as part of their IoT strategy.
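To make the "anomaly detection" idea mentioned above a little more concrete, here is a minimal sketch in plain Python. It is not how Splunk implements its analytics; it is a generic rolling z-score check, and the function name `rolling_anomalies`, the window size, the threshold, and the sample temperature readings are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


def rolling_anomalies(readings, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate sharply from the trailing window.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    recent = deque(maxlen=window)  # trailing window of earlier readings
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Skip flat windows, where the standard deviation is zero.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append((i, value))
        recent.append(value)
    return flagged


# Hypothetical temperature readings from a single sensor (illustrative data).
temperatures = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 35.0, 20.0, 20.2]
print(rolling_anomalies(temperatures))
```

With this sample data, the spike to 35.0 is the only reading flagged. In practice you would express this kind of logic as a search over indexed data rather than in application code, but the underlying idea is the same.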
As I see it, there are benefits to this approach vs. common alternatives:
- No worries about scalability. You can use the same application to scale from a single machine install to massive horizontal and vertical deployments collecting and analyzing data at any volume, velocity, or variety.
- You don’t need to custom code against every new data source. The Modular Input Framework allows you to wrap common communication libraries as a native Splunk input, complete with web user interface. Teams have already built Modular Inputs for accessing data from MQTT, CoAP, AMQP, REST, and JMS. Splunk also has native inputs for file system monitoring, TCP, and UDP, and can run scripts to collect data. Community-supported apps such as Protocol Data Inputs enable collection of data from binary sources and just about anything else you can imagine.
- You can collect data from any point of origin – cloud, datacenter, or device. With Splunk’s flexible forwarding, SDKs and APIs, and IoT relevant add-ons on Splunkbase, data can be securely pulled or pushed from any connected location.
- You can manage your entire stack via a web browser. Yup, you can connect to IoT or industrial environments, forward and process data, enrich time-series data with structured data, search, investigate, analyze, and visualize, all through a web interface. Oh, and you use that same web interface to manage the entire architecture and the users who access it, no matter how distributed or complex the architecture.
- You can use one application to do both real-time and historical visualization, analytics, and alerting. Splunk’s SPL is an incredibly powerful search and analytics language. Queries written in SPL can be applied to historical and real-time windows of data for time-series analytics and statistical processing. And like many other features of the platform, SPL is extensible via custom Python search commands. The community has taken this a step further by using custom commands to extend SPL with R and with algorithms like haversine.
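For readers who haven’t met haversine before, it computes the great-circle distance between two latitude/longitude points, which is handy for location-aware device data. The sketch below is the standard formula in standalone Python, not the community custom command itself; the function name `haversine_km`, the Earth-radius constant, and the sample coordinates are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical positions reported by two connected devices (illustrative only).
device_a = (37.7749, -122.4194)
device_b = (40.7128, -74.0060)
print(f"{haversine_km(*device_a, *device_b):.1f} km")
```

Wrapping a function like this as a custom search command would let a query compute distances per event, which is the kind of extension the community has built.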
So hopefully this makes the case for the technical “why”. The next installment will try the same for the first bit of technical “how”, including downloading, installing and configuring an instance of Splunk on a local machine and in an AWS account, and configuring an MQTT Modular Input on that instance to collect data from a remote broker. If you have any questions or comments for me on this first installment, or on anything related to Splunk for IoT, feel free to reach out on Twitter or on LinkedIn.
Solution Expert, IoT and Industrial Data