Addition of Syslog in Splunk Edge Processor Supercharges Security Operations with Palo Alto Firewall Log Reduction

Now generally available, Splunk Edge Processor supports syslog-based ingestion protocols, making it well-equipped to wrangle complex and superfluous data. Users can deploy Edge Processor as an end-to-end solution for handling syslog feeds such as PAN logs, including the functionality to act as a syslog receiver, process and transform logs and route the data to supported destination(s).

Before you start writing an SPL2 pipeline to process and transform incoming logs, configure Edge Processor to listen natively for events arriving over syslog.

Once you have configured Edge Processor to receive syslog events, you will see it appear in the Edge Processor console.

The addition of syslog for Edge Processor expands the ability to filter, mask, transform and route data generated by network devices, Linux and Unix-like operating systems, and more.

Supercharge Security with Palo Alto Networks Firewall Log Reduction

Let’s delve into real-world examples of wrangling data in motion in the context of cybersecurity, specifically, log reduction of Palo Alto Networks (PAN) sources. TL;DR – no worries, check out this video demo to see this use case in action.

Have you ever been swamped by the relentless surge of log data from your Palo Alto Networks (PAN) devices? Ever felt like finding the crucial information within these logs is akin to searching for a needle in a haystack? You're not alone.

The volume and frequency of PAN firewall log data arriving over syslog result in delayed incident detection, longer search processing times, and slower response and automation. And because not all log types are created equal, or equally meaningful to an organization's needs, there are also increased log management costs to contend with. The table below characterizes the most common log types according to size and volume:

Log Type         Splunk Sourcetype   Log Size   Log Volume
Traffic          pan:traffic         Large      Very High
Threat           pan:threat          Large      Low
Threat : url*    pan:threat          Large      Very High
Threat : file*   pan:threat          Large      High
System           pan:system          Medium     Medium
Configuration    pan:config          Small      Low
Correlation      pan:correlation     Small      Low
HIP Match        pan:hipmatch        Small      Medium

*Note: URL and File logs are of type Threat, but are called out separately because they have a different frequency than most threat logs.

You can see from the table that Traffic logs and URL logs are the most frequent and largest, with File logs coming in second. These log types will make up the bulk of what would be ingested and indexed in Splunk.

Generally speaking, Edge Processor could support log reduction in the following ways:

  1. If you use Splunk in a SOC for security, but are not responsible for the operational health of the firewalls, you could consider filtering out System and Config log types.
  2. Traffic logs are large and high volume. You can trim their volume by filtering out 'Start' logs. 'Start' logs often have an incorrect app because they are logged before the app is fully determined. The 'End' logs will have the correct app and other data such as the session duration.
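The two reduction ideas above can be prototyped offline before any pipeline is written. The following is a minimal Python sketch, not SPL2 and not an Edge Processor API. It reuses the header regex from the extract_useful_fields function shown later in this post. The type and subtype values ("SYSTEM", "CONFIG", "TRAFFIC", "start"), the helper name keep_event, and the sample lines are assumptions made for illustration, so check them against your own firewall output.

```python
import re

# Header pattern copied from the article's extract_useful_fields function.
PAN_HEADER = re.compile(
    r"(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}),([\w\d]+),"
    r"(?P<event_type>[A-Z]+),(?P<event_subtype>[\w\d]*),\d*,"
    r"(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2})"
)

# Assumed type values; verify them against your own PAN logs.
OPERATIONAL_TYPES = {"SYSTEM", "CONFIG"}


def keep_event(raw: str) -> bool:
    """Return True if the event should still be sent to the index."""
    match = PAN_HEADER.search(raw)
    if match is None:
        return True  # sketch choice: pass through anything we cannot classify
    if match["event_type"] in OPERATIONAL_TYPES:
        return False  # idea 1: drop System and Config logs
    if match["event_type"] == "TRAFFIC" and match["event_subtype"] == "start":
        return False  # idea 2: drop session-start Traffic logs
    return True


# Synthetic, heavily truncated sample lines (not real firewall output).
SAMPLES = [
    "1,2023/10/09 10:19:15,0123456789,TRAFFIC,start,0,2023/10/09 10:19:15,10.0.0.5",
    "1,2023/10/09 10:19:45,0123456789,TRAFFIC,end,0,2023/10/09 10:19:45,10.0.0.5",
    "1,2023/10/09 10:20:01,0123456789,SYSTEM,general,0,2023/10/09 10:20:01,",
    "1,2023/10/09 10:20:07,0123456789,THREAT,url,0,2023/10/09 10:20:07,10.0.0.5",
]

kept = [line for line in SAMPLES if keep_event(line)]
for line in kept:
    print(line)
```

Running the predicate over a captured sample of your own logs gives a rough idea of how many events each rule would remove before you commit to it in a pipeline.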

Cutting Through (PAN Log) Noise

More specifically, consider this example: backup software runs every night, generating thousands of connections from endpoints to a backup server. This generates a large volume of low-value data that is not critical to detecting threats. Enter Edge Processor! Create a pipeline in Edge Processor for this backup app that retains only threats and drops all other events for traffic sessions, URLs, or files.
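To make the backup rule concrete, here is a small Python sketch of the decision it describes, written against already-parsed events. It is an illustration only: the field names (app, event_type, event_subtype), the application name "nightly-backup", and the sample events are all hypothetical, and the real rule would live in an SPL2 pipeline.

```python
BACKUP_APP = "nightly-backup"  # hypothetical application name


def keep_backup_event(event: dict) -> bool:
    """Keep everything except low-value events generated by the backup app."""
    if event.get("app") != BACKUP_APP:
        return True  # other applications are untouched by this rule
    event_type = event.get("event_type")
    event_subtype = event.get("event_subtype")
    if event_type == "TRAFFIC":
        return False  # traffic session logs from the backup job
    if event_type == "THREAT" and event_subtype in ("url", "file"):
        return False  # URL and file logs from the backup job
    return True  # remaining threat logs are retained


# Hypothetical parsed events, for illustration only.
events = [
    {"app": "nightly-backup", "event_type": "TRAFFIC", "event_subtype": "end"},
    {"app": "nightly-backup", "event_type": "THREAT", "event_subtype": "file"},
    {"app": "nightly-backup", "event_type": "THREAT", "event_subtype": "vulnerability"},
    {"app": "web-browsing", "event_type": "TRAFFIC", "event_subtype": "end"},
]

retained = [e for e in events if keep_backup_event(e)]
print(len(retained), "of", len(events), "events retained")
```

Scoping the rule to one application first keeps the change narrow: traffic from every other app still reaches the index unchanged.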

Let’s take a closer look at the challenge of gaining control of PAN logs via syslog, where our ultimate goal is to improve search performance. How? By reducing event size; removing unnecessary, “noisy” fields; and routing a full-fidelity copy of the data that is to be maintained for compliance purposes in AWS S3 — all of which, in turn, reduces ingestion and storage costs.

In particular, we aim to:

PANning for Gold – Creating Edge Processor Pipelines

Now, let’s get started with creating pipelines in Edge Processor to transform those PAN logs and ultimately, supercharge your security operations!

The two pipelines below show how a user controls what data the pipeline applies to, how that data is to be processed, and then where the processed data is routed. The first pipeline shows how to filter and minimize data volume on the way to a Splunk index, and the second keeps a raw copy in an AWS S3 bucket for compliance reasons! However, it's essential to note this is one example of how the Edge Processor can be employed. Just like SPL, the actual query definition depends on the nuances of the data (and your creativity!), and we encourage you to tailor the Edge Processor pipelines to best fit your unique needs.

Below, you will see references to commands you may not recognize, like remove_readable_timestamp. These aren’t out-of-box SPL2 commands, but are custom functions that you can define to improve usability. Continue reading to the “Making Security Function-al” section to learn more about user-defined functions.

Pipeline 1: Filter Palo Alto Firewall logs, route to Splunk Cloud

Pipeline definition (SPL2):

$pipeline = | from $source
// First, drop the human readable timestamp which is added by syslog, as this is redundant and not used.
| remove_readable_timestamp
// Then, extract the useful fields like other timestamps and event type
| extract_useful_fields
// Drop events of specific type and subtype which are not useful for security analysis
| drop_security_noise
// As field extraction generates extra fields which are not needed at index-time, use the fields command to keep only _raw
| fields _raw
// Lastly, route the filtered events to a specific index used for security incident analysis
| eval index="security_paf_index"
| into $destination;

$source: sourcetype = pan:firewall
$destination: Splunk index = security_paf_index, Splunk destination = splunk_stack_security_s2s

Pipeline 2: Route unfiltered copy of all PAN firewall logs to AWS S3 bucket

Pipeline definition (SPL2):

$pipeline = | from $source
| into $destination;

$source: sourcetype = pan:firewall
$destination: S3 bucket = security_compliance_s3

Making Security Function-al

As you review the first pipeline definition, you might be thinking, “wow, those SPL2 commands are super readable and straightforward!” And you’d be right! Or, on second thought, you may wonder, “hang on, there’s no way extract_useful_fields is an out-of-box SPL2 command, so how does Splunk know what’s a useful field?” And you’d also be right!

The extract_useful_fields command is made possible through custom SPL2 functions. Custom SPL2 functions are named, reusable blocks of SPL2 code that can wrap a bunch of complex SPL2 in a simple custom command or eval function; think of it like an SPL macro, but way more powerful! Let's explore this capability further.

// Strip the syslog-style readable timestamp (for example "Oct 09 10:19:15") from _raw
function remove_readable_timestamp($source) {
    return | from $source
    | eval readable_time_regex = "\\w{3}\\s\\d{2}\\s\\d+:\\d+:\\d+"
    | eval _raw=replace(_raw, readable_time_regex, "")
    | fields - readable_time_regex
}

// Extract event_type and event_subtype from the leading fields of the PAN log
function extract_useful_fields($source) {
    return | from $source
    | rex field=_raw /(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}),([\w\d]+),(?P<event_type>[A-Z]+),(?P<event_subtype>[\w\d]*),\d*,(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2})/
}

// Drop CORRELATION and HIPMATCH events, plus SYSTEM events with subtype routing or ras
function drop_security_noise($source) {
    return | from $source
    | where not(event_type in ("CORRELATION", "HIPMATCH"))
    | where not(event_type in ("SYSTEM")) or (event_type in ("SYSTEM") and not(event_subtype in ("routing", "ras")))
}

As you can see, the body of each of these custom SPL2 functions is composed of standard SPL2, just like a macro. All a user needs to do is use the functions in the pipeline, but if you prefer to inline all SPL2 in your pipeline without using custom functions, you absolutely can:

$pipeline = | from $source
// Remove readable & redundant timestamp
| eval readable_time_regex = "\\w{3}\\s\\d{2}\\s\\d+:\\d+:\\d+"
| eval _raw=replace(_raw, readable_time_regex, "")
| fields - readable_time_regex
// Extract useful fields
| rex field=_raw /(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}),([\w\d]+),(?P<event_type>[A-Z]+),(?P<event_subtype>[\w\d]*),\d*,(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2})/
// Drop security noise
| where not(event_type in ("CORRELATION", "HIPMATCH"))
| where not(event_type in ("SYSTEM")) or (event_type in ("SYSTEM") and not(event_subtype in ("routing", "ras")))
| fields _raw
| eval index="security_paf_index"
| into $destination;
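If you want to sanity-check the regular expressions in the pipeline above before deploying it, one option is to replay its three steps offline. The Python sketch below does that with the same two patterns. It is only an approximation of the SPL2 behavior: the process helper, the synthetic sample lines, and the choice to pass unclassified lines through are assumptions of this sketch, so confirm how your pipeline treats events where the extraction finds nothing.

```python
import re

# Patterns copied from the pipeline above.
READABLE_TIME = re.compile(r"\w{3}\s\d{2}\s\d+:\d+:\d+")
PAN_HEADER = re.compile(
    r"(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}),([\w\d]+),"
    r"(?P<event_type>[A-Z]+),(?P<event_subtype>[\w\d]*),\d*,"
    r"(\d{4}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2})"
)


def process(raw: str):
    """Return the reduced _raw, or None if the event would be dropped."""
    raw = READABLE_TIME.sub("", raw)  # step 1: remove the readable timestamp
    match = PAN_HEADER.search(raw)    # step 2: extract type and subtype
    if match is None:
        return raw  # sketch choice: pass through anything we cannot classify
    event_type = match["event_type"]
    event_subtype = match["event_subtype"]
    # step 3: drop security noise
    if event_type in ("CORRELATION", "HIPMATCH"):
        return None
    if event_type == "SYSTEM" and event_subtype in ("routing", "ras"):
        return None
    return raw


# Synthetic, heavily truncated sample lines (not real firewall output).
SAMPLES = [
    "Oct 09 10:19:15 1,2023/10/09 10:19:15,0123456789,TRAFFIC,end,0,2023/10/09 10:19:15,10.0.0.5",
    "Oct 09 10:19:20 1,2023/10/09 10:19:20,0123456789,SYSTEM,routing,0,2023/10/09 10:19:20,",
    "Oct 09 10:19:25 1,2023/10/09 10:19:25,0123456789,SYSTEM,general,0,2023/10/09 10:19:25,",
    "Oct 09 10:19:30 1,2023/10/09 10:19:30,0123456789,HIPMATCH,0,0,2023/10/09 10:19:30,",
]

kept = [r for r in (process(s) for s in SAMPLES) if r is not None]
bytes_before = sum(len(s) for s in SAMPLES)
bytes_after = sum(len(r) for r in kept)
print(f"events: {len(SAMPLES)} -> {len(kept)}, bytes: {bytes_before} -> {bytes_after}")
```

Replaying a captured sample this way shows both effects the pipeline is after: fewer events, and smaller events once the redundant timestamp is gone.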

Put Edge Processor to Task

Given the capabilities described above, the Edge Processor stands out with its resilient approach to modern log reduction. Its robust security foundation filters essential information and efficiently manages log data, reducing incident analysis time and accelerating threat identification. Edge Processor goes beyond these core functions, unifying security operations through effortless integration with Splunk Cloud Platform and paving the way for easier alert enrichment in future updates. The result is a tool that empowers security teams to detect and respond to threats with unmatched speed and precision, ensuring minimal disruptions to the current infrastructure.

Splunk Cloud Platform customers can access the Edge Processor for free! To activate an Edge Processor tenant in your environment, contact your Splunk sales representative or shoot an email to EdgeProcessor@splunk.com with your details.

Together, let’s make your security operations smarter, faster, and more robust!

This blog was co-authored by Xi He and Sri Tejaswi Gattupalli, Product Manager Interns, Summer 2023 and Aditya Tammana, Senior Product Manager.
