Splunk Edge Processor and Federated Search: Do I Need It?

In today's data-driven landscape, organizations are confronted with an overwhelming volume of data, which is often accompanied by budgetary constraints. To address these challenges, a thoughtful data tiering strategy is crucial. This can be done by developing the practice of:

  1. Understanding and ranking datasets, from the most critical to the least used.
  2. Storing these datasets across platforms that have the right balance of cost and performance.
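To make the two steps above concrete, here is a minimal sketch of how a team might rank datasets and assign each one to a storage tier. The `Dataset` fields, the scoring formula, the threshold, and the tier names are all assumptions made up for illustration. They are not part of any Splunk product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    name: str
    criticality: int         # hypothetical scale: 1 (nice to have) to 5 (business critical)
    searches_per_day: float  # how often the data is actually queried


def score(ds: Dataset) -> float:
    # Illustrative formula: criticality carries most of the weight, and
    # usage is capped so heavy querying alone has a bounded influence.
    return ds.criticality * 10 + min(ds.searches_per_day, 50)


def rank(datasets: List[Dataset]) -> List[Dataset]:
    # Step 1: order datasets from most to least valuable.
    return sorted(datasets, key=score, reverse=True)


def assign_tier(ds: Dataset, threshold: float = 60.0) -> str:
    # Step 2: high scorers go to the faster, costlier tier; the rest go
    # to cheaper object storage. The threshold is an arbitrary example.
    return "splunk_index" if score(ds) >= threshold else "amazon_s3"


datasets = [
    Dataset("debug_traces", criticality=1, searches_per_day=2),
    Dataset("auth_logs", criticality=5, searches_per_day=40),
    Dataset("netflow", criticality=3, searches_per_day=10),
]

for ds in rank(datasets):
    print(ds.name, score(ds), assign_tier(ds))
```

In practice the scoring inputs would come from your own usage and compliance requirements. The point is only that tiering becomes a repeatable decision once the ranking is explicit.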

Once that is in place, powerful data management and federated search capabilities become imperative: with these, you’ll have the flexibility to access datasets across different platforms and correlate them when you need to, based on the use case at hand.

Simplifying Data Management and Reach With Splunk

Our goal at Splunk is to make data management and accessibility easy and flexible for our customers, so you can gain value from your voluminous data more efficiently. To that end, we’ve made a couple of big announcements this year:

  1. Launch of Splunk Edge Processor
  2. General availability of Federated Search for Amazon S3.

Splunk Edge Processor is a service offering deployed at the edge with a data control plane accessible from Splunk Cloud Platform. It is designed to help customers achieve greater efficiencies in data transformation close to the data source, data placement and improved visibility into data in motion. With Edge Processor, customers can filter, transform, and route data from the edge into Splunk indexes or Amazon S3 buckets.

Federated Search for Amazon S3, on the other hand, is a new capability that allows customers to search data from their Amazon S3 buckets directly from Splunk Cloud Platform without the need to ingest it into Splunk.

In this blog, we will dive into how Splunk Edge Processor and Federated Search for Amazon S3 can help build and implement data strategies to efficiently maximize the value derived from your data.

Edge Processor Streamlines Data Management

When addressing data transformation, Splunk Edge Processor is designed to extract only the critical data, employing data reduction techniques to streamline data ingestion into Splunk indexes.

Capturing and cleaning data close to the source is crucial, especially for sensitive datasets that cannot leave the organization's network boundaries. This way, organizations can ensure that only essential, clean data gets ingested into Splunk. Any extraneous data? You can store that in external storage such as Amazon S3.
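The idea of cleaning data near the source and then splitting it between an index and cheaper storage can be sketched in a few lines. This is plain Python for illustration only, not an Edge Processor pipeline or SPL2. The event shape, the card-number pattern, and the destination names `splunk_index` and `s3_archive` are hypothetical.

```python
import re
from typing import Dict, Tuple

# Hypothetical pattern for card numbers written as four dash-separated groups.
CARD_PATTERN = re.compile(r"\b\d{4}-\d{4}-\d{4}-(\d{4})\b")


def mask_card_numbers(raw: str) -> str:
    # Keep only the last four digits so the sensitive value never
    # leaves the network boundary in clear text.
    return CARD_PATTERN.sub(r"XXXX-XXXX-XXXX-\1", raw)


def route_event(event: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    # Transform first, then decide where the cleaned event should go.
    cleaned = dict(event, raw=mask_card_numbers(event["raw"]))
    if event.get("level") in {"ERROR", "WARN"}:
        return "splunk_index", cleaned  # critical data: index it
    return "s3_archive", cleaned        # extraneous data: low-cost storage


events = [
    {"level": "ERROR", "raw": "card 1234-5678-9012-3456 declined"},
    {"level": "DEBUG", "raw": "cache warm-up finished"},
]

for event in events:
    destination, cleaned = route_event(event)
    print(destination, cleaned["raw"])
```

The same three steps (filter, transform, route) are what an Edge Processor pipeline expresses. The actual syntax and supported functions are described in the Splunk documentation.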

Now, let's look at how you can implement these policies on edge processors.

In addition to the two major announcements, Splunk also announced SPL2, an updated version of Splunk’s search language. SPL2 caters to users with diverse query language backgrounds, blending SPL and SQL syntax for familiarity. It offers an array of robust features, including built-in functions, the ability to create custom functions and custom data types, comment integration, and many more, setting a new standard for concise and powerful data queries.

Now imagine this: anything that can be implemented in SPL2 can be implemented in Edge Processor! That means that any task you implement using SPL2 can be part of your Edge Processor pipelines.

All this to say: you can now build data pipelines specific to your organization’s needs.

Today, Splunk Edge Processor can receive data from many different sources, such as Universal Forwarders, HTTP Event Collector, and syslog, and route data to destinations including Splunk Cloud Platform, Splunk Enterprise, and Amazon S3. Check out the full list of supported sources and destinations.

Splunk Federated Search For Amazon S3 Transforms Data Exploration

In recent years, Amazon S3 has become one of the most popular storage platforms because of its ease of use and storage capabilities. It is used for a wide variety of use cases: web applications writing data to S3, analytical data, data kept for compliance or long-term retention, and many more.

Now, with Splunk Federated Search for Amazon S3, you can make these datasets available to Splunk, which means you can use Splunk’s powerful search language to explore them and correlate them with data in Splunk. Yes, this includes data that an Edge Processor sends to Amazon S3.

And an added benefit that Edge Processor provides: data written by Edge Processor is partitioned by time and stored in JSON format in Amazon S3. This enables Splunk Federated Search for S3 to work with the dataset efficiently.
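To see why time partitioning helps, consider the sketch below. It assumes a `year=/month=/day=` prefix layout, which is a common convention chosen here for illustration and not necessarily the exact layout Edge Processor writes. With such a layout, a search over a time range only has to read the prefixes that fall inside that range.

```python
from datetime import date, timedelta
from typing import List


def partition_prefix(day: date) -> str:
    # Hypothetical object-key prefix for one day of data.
    return f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"


def prefixes_for_range(start: date, end: date) -> List[str]:
    # Only the days inside the search window need to be scanned;
    # every other partition can be skipped without reading it.
    days = (end - start).days
    return [partition_prefix(start + timedelta(days=i)) for i in range(days + 1)]


for prefix in prefixes_for_range(date(2023, 12, 30), date(2024, 1, 1)):
    print(prefix)
```

A three-day search touches three prefixes no matter how many years of data sit in the bucket, which is where the efficiency comes from.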

Federated Search for Amazon S3 works by integrating Splunk with the AWS Glue Data Catalog, which provides the schema and metadata that Splunk Cloud Platform needs to interpret compatible datasets in Amazon S3. This integration allows Splunk to search various data formats such as JSON, CSV, Parquet, and ORC, as well as compressed files like bzip and gzip, and many more.
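The role a catalog plays can be illustrated with a small schema-on-read sketch. The `CATALOG_TABLE` dictionary below is a made-up stand-in for a catalog entry, not the AWS Glue API or its data model. It shows how one shared schema lets a reader interpret the same records whether they are stored as JSON lines or as headerless CSV.

```python
import csv
import io
import json
from typing import Any, Dict, List

# Hypothetical catalog entry: column names plus the type each value should have.
CATALOG_TABLE = {
    "name": "web_events",
    "columns": [("status", int), ("path", str), ("bytes", int)],
}


def apply_schema(record: Dict[str, Any], table: Dict[str, Any]) -> Dict[str, Any]:
    # Cast every column to the type the catalog declares.
    return {name: cast(record[name]) for name, cast in table["columns"]}


def read_json_lines(text: str, table: Dict[str, Any]) -> List[Dict[str, Any]]:
    return [apply_schema(json.loads(line), table)
            for line in text.splitlines() if line.strip()]


def read_csv(text: str, table: Dict[str, Any]) -> List[Dict[str, Any]]:
    # The raw file has no header row, so column names come from the catalog.
    names = [name for name, _ in table["columns"]]
    reader = csv.DictReader(io.StringIO(text), fieldnames=names)
    return [apply_schema(row, table) for row in reader]


json_rows = read_json_lines('{"status": 200, "path": "/a", "bytes": 512}\n', CATALOG_TABLE)
csv_rows = read_csv("200,/a,512\n", CATALOG_TABLE)
print(json_rows == csv_rows)
```

Because the schema lives outside the files, the stored data stays in its original format and is only interpreted at search time.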

This integration enhances the search capabilities for Splunk users, providing a comprehensive and streamlined data exploration experience.

What Next?

Now that we have learned how Edge Processor and Federated Search for Amazon S3 work together to simplify data management and reach, let's see this in action. Here’s a video of how Buttercup Enterprises, a fictional gaming company, is looking into using Splunk Edge Processor and Federated Search for Amazon S3 to solve their data engineering problems.

[Embedded YouTube video]

While there are many ways customers could use Federated Search and Edge Processor, this blog is meant as a primer on these two features and as a source of ideas for applying them to a specific challenge in your organization.

Get Edge Processor Today

If you are a current Splunk Cloud Platform customer hosted in the US, EMEA (Dublin, Frankfurt, Germany), UK (London), or APAC (Tokyo, Japan and Singapore) Splunk Cloud regions, you can get access to Edge Processor today. Contact your Splunk sales representative, or send an email to EdgeProcessor@splunk.com with your company name, Splunk Cloud stack name, and Splunk Cloud region. If you are hosted in another Splunk Cloud region, contact your Splunk sales representative or send an email to the same address to get on the list to be enabled once Edge Processor is available in your region.

For more about Edge Processor, including release plans to support additional sources, destinations, and new functionality, see release notes and documentation.

This blog was co-authored by Joseph Kandatilparambil, Principal Technical Marketing Engineer and Raja Tamilarasan, Senior Sales Engineer at Splunk.
