Introducing Edge Processor: Next Gen Data Transformation

We get it — not only can it take a lot of time, money and resources to get data into Splunk, but it also takes effort to shape the data in a way that will provide you the most value. But it doesn’t have to anymore, thanks to Splunk’s latest innovation in data processing.

Splunk is pleased to announce the general availability of Splunk Edge Processor, a service offering within Splunk Cloud Platform designed to help customers achieve greater efficiencies in data transformation close to the data source, and improved visibility into data in motion. Edge Processor provides customers new abilities to filter and mask, and otherwise transform their data, before routing it to supported destinations. Edge Processor joins Ingest Actions as part of Splunk’s pre-ingest data transformation capabilities. All current Edge Processor features are free to all Splunk Cloud customers.

What gives Edge Processor its data transformation power is Splunk’s next generation data search and preparation language, SPL2. With SPL2, customers have much more flexibility to shape data so that it is formatted exactly how they want before sending it to be indexed.

Unique to Edge Processor is its architecture, chiefly the cloud-based control plane. Edge Processor nodes are easily installed and configured on customer servers or customer cloud infrastructure using a single command, and managed completely from Splunk Cloud Platform. These nodes are an intermediate forwarding tier, and receive data from edge sources. Customers manage their entire fleet of edge processors and have visibility into both inbound and outbound data volumes through their edge processor network, all from a single place. Any node can then scale horizontally to handle increasing processing or data volume requirements by simply adding instances.

Customers have detailed metrics to view the impact of their pipelines on data flowing through each of their edge processors and can closely track unexpected spikes or troughs in their data

From the central cloud control plane, customers define data processing logic — pipelines — that dictate their desired filtering, masking and routing logic, and can apply their pipelines to any or all edge processors in their network. Edge Processor pipelines are constructed using SPL2 in the new pipeline editor experience, where users can see previews of the data showing the impact of applying a pipeline before making a change.

The data plane remains completely within the customer control — customers point data sources to an edge processor node that is installed on their hosts, and that data is only sent to where customers direct it to be sent. At launch, Edge Processor can receive data from Splunk Universal and Heavyweight Forwarders, and route data to Splunk Enterprise, Splunk Cloud Platform, and Amazon S3.1

Customers have a guided pipeline editor experience with the ability to preview the effect of their pipeline on sample data that they provide

Edge Processor using SPL2 makes data transformation easy and flexible. One of the most common use cases for Edge Processor is to filter verbose data sources, such as Windows event logs, to retain selected events or content within an event. An explicit set of examples for this use case is retaining only Windows events that match a certain event code, masking the extensive message field at the end of Windows events, and routing an unfiltered copy of data to an AWS S3 bucket. The pipelines below show how these examples are constructed; the user controls what data the pipeline applies to, how that data is to be processed, and then where the processed data is routed to.

Pipeline definition (SPL2)
$source
$destination
Filter Windows system events on event id, route to Splunk Cloud index “Security”
$pipeline =
| from $source
// Extract event code field
| rex field=_raw
/EventCode=(?P<event_code>\d+)/
// retain all events with windows event code = 9
| where event_code = 9
| into $destination;
sourcetype =
winEventLog:
system
Splunk index:
Security
Mask Windows system events to remove the final “Message” contents, route this copy to Splunk Cloud index “Main”
$pipeline =
| from $source
| eval _raw=replace(_raw,
/(Message=.*[\r\n?|\n])((?:.|\r\n?|\n)
*)/, "\\...")
| into $destination;
sourcetype =
winEventLog:
system
Splunk index:
Main
Route unfiltered copy of ALL Windows events to AWS S3 bucket “Windows”
$pipeline =
| from $source
| into $destination;
Sourcetype =
winEventLog*
S3 bucket:
Windows

With Edge Processor, customers will experience increased visibility of data in motion and improved productivity, simplicity, and control of data transformations, all at scale. What’s more, Edge Processor is another capability to help customers manage costs and boost value from your Splunk investment, serving as a sort of forcing function to organize and prioritize your data according to use case so that you work with just the data you want, in the location you need it.

If you are a current Splunk Cloud Platform customer hosted in the US or Dublin Splunk Cloud regions, you can get access to Edge Processor today. Contact by your Splunk sales representative, or send an email to EdgeProcessor@splunk.com with your company name, Splunk cloud stack name, and Splunk Cloud region. If you are a Splunk Cloud Platform customer hosted in other Splunk Cloud regions, also contact your Splunk sales representative or send an email to get on the list to be enabled once Edge Processor is available in your region.

For more about Edge Processor, including release plans to support additional sources, destinations, and new functionality, see release notes and documentation.

[1] See release notes for updates on new features, including additional supported sources and destinations.

----------------------------------------------------
Thanks!
Jodee Varney

Related Articles

Developing the Splunk App for Anomaly Detection
Platform
13 Minute Read

Developing the Splunk App for Anomaly Detection

A technical overview of the Splunk App for Anomaly Detection, which uses machine learning to automatically configure anomaly detection jobs on time series data.
Enhancements To Ingest Actions Improve Usability and Expand Searchability Wherever Your Data Lives
Platform
4 Minute Read

Enhancements To Ingest Actions Improve Usability and Expand Searchability Wherever Your Data Lives

Along with the respective Splunk Enterprise version 9.1.0 and Splunk Cloud Version 9.0.2305 releases, Ingest Actions has launched a new set of features and capabilities that improve its usability and expand on configurability of data routed by Ingest Actions to S3.
Flatten the SPL Learning Curve: Introducing Splunk AI Assistant for SPL
Platform
3 Minute Read

Flatten the SPL Learning Curve: Introducing Splunk AI Assistant for SPL

At .conf23, we announced the preview release of Splunk AI Assistant - Splunk's first offering powered by generative AI.
Splunk Edge Processor Enhancements Offer Greater Data Access and Improve Data Management
Platform
1 Minute Read

Splunk Edge Processor Enhancements Offer Greater Data Access and Improve Data Management

On the heels of an exciting GA in March and the April announcement of its regional expansion, we are excited to share the latest updates to Splunk Edge Processor that will make it even easier for customers to have more flexibility and control over just the data you want, nothing more nothing less.
Fastest Time-to-Value Anomaly Detection in Splunk: The Splunk App for Anomaly Detection 1.1.0
Platform
3 Minute Read

Fastest Time-to-Value Anomaly Detection in Splunk: The Splunk App for Anomaly Detection 1.1.0

Splunk App for Anomaly Detection simplifies ML, making anomaly detection easy. It streamlines tasks, enabling ML integration in everyday workflows. Just load data, select the field, and click "Detect Anomalies."
Swimming in Sensors and Drowning in Data: The Role of Splunk Partners in Delivering Splunk Edge Hub
Platform
3 Minute Read

Swimming in Sensors and Drowning in Data: The Role of Splunk Partners in Delivering Splunk Edge Hub

With the proliferation of edge computing and the release of Splunk Edge Hub, partners have additional functionality to accelerate the detection, investigation and response of threats and issues that will inevitably occur in physical and industrial environments.
Introducing New Deep Learning NLP Assistants for DSDL
Platform
6 Minute Read

Introducing New Deep Learning NLP Assistants for DSDL

The Splunk App for Data Science and Deep Learning (DSDL) now has two new assistant features for Natural Language Processing. DSDL has been offering basic natural language processing (NLP) capabilities using the spaCy library.
Announcing the General Availability of Cloud Monitoring Console’s Maintenance Dashboard
Platform
3 Minute Read

Announcing the General Availability of Cloud Monitoring Console’s Maintenance Dashboard

The new Maintenance Dashboard in the Cloud Monitoring Console app aims to assist Splunk Cloud Platform admins in effectively managing maintenance tasks and staying informed about Splunk-initiated maintenance for improved operational efficiency.
What is Splunk Virtual Compute (SVC)?
Platform
7 Minute Read

What is Splunk Virtual Compute (SVC)?

Learn about what SVCs are, how they fit in with workload pricing, and how to size, monitor, and manage workload to get the most out of Splunk.