
Splunk Connect for Syslog: Turnkey and Scalable Syslog GDI - Part 2

In part 1 of this series, we explored the design philosophy behind Splunk Connect for Syslog (SC4S), the goals of the design, and the new HEC-based transport architecture. In this installment, we'll cover the high-level configuration of SC4S and highlight the sections of the documentation that provide the details needed for deployment in a production environment.

SC4S configuration is modular and templated, with separate syslog-ng configuration sections that are highlighted below. In the container version of SC4S, the bulk of this is hidden from the administrator, and only a local mount point is exposed for local configurations. In the BYOE (Bring Your Own Environment) version, the entire configuration is available for inspection and modification.

The sections below outline how to configure SC4S for the “Out-of-the-Box” (OOTB) data sources with minor modifications.

Syslog-ng Data Flow

When a message arrives at the syslog server, it is subject to a number of filtering operations that determine how the message is processed. In the case of SC4S, the goal of that processing is to send the message to Splunk in the proper index, properly formatted, and with the proper metadata (sourcetype, timestamp, etc.) included. For any given data source format (usually, but not always, defined simply by the vendor), a set of filters is developed and encapsulated in a “log path”. A log path defines which filters are applied to a message, and in which order, to ascertain whether the message belongs in a specific category (which in Splunk’s case is a sourcetype).
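As a rough illustration of the idea (not how SC4S is implemented, since the real log paths are written in syslog-ng configuration), a log path can be thought of as an ordered list of filters that must all match before a sourcetype is assigned. In the Python sketch below, the `classify` function, the filter functions, and the sourcetype names are all hypothetical.

```python
def classify(message, log_paths, fallback):
    """Return the sourcetype of the first log path whose filters all match.

    Each log path is a (sourcetype, filters) pair; filters run in order and
    evaluation stops at the first one that fails.
    """
    for sourcetype, filters in log_paths:
        if all(f(message) for f in filters):
            return sourcetype
    return fallback


# Illustrative log paths; the ports, patterns, and sourcetypes are made up.
log_paths = [
    ("example:firewall", [lambda m: m["port"] == 5514]),
    ("example:router", [lambda m: m["port"] == 514,
                        lambda m: "%LINK-" in m["text"]]),
]

msg = {"port": 514, "text": "%LINK-3-UPDOWN: Interface down"}
sourcetype = classify(msg, log_paths, fallback="example:unclassified")
```

The ordering matters in this sketch: a message is assigned to the first log path it satisfies, so more specific paths would be listed ahead of more general ones.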

Once this category is determined, parsers are applied to format the message appropriately, the metadata is set, and the whole package is bundled in a JSON blob and sent to Splunk in batches via the HEC /event endpoint. The /event endpoint also allows additional, enriched metadata to be sent with each event, and that metadata can be used for further classification, such as PCI or geographical scope, among many other use cases.
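To make the shape of that JSON blob concrete, here is a minimal Python sketch of how events for the HEC /event endpoint could be assembled and batched. It only builds the request body and does not send anything; the helper names, index names, sourcetypes, and the `pci_scope` field are assumptions for illustration, not values that SC4S itself produces.

```python
import json


def build_hec_event(text, sourcetype, index, host, timestamp, fields=None):
    """Build one event in the JSON shape accepted by the HEC /event endpoint."""
    event = {
        "time": timestamp,         # epoch seconds
        "host": host,
        "sourcetype": sourcetype,
        "index": index,
        "event": text,             # the formatted syslog message
    }
    if fields:
        # Extra indexed metadata travels with the event under "fields".
        event["fields"] = fields
    return event


def build_batch(events):
    """Serialize several events, one JSON object per line, into one request body."""
    return "\n".join(json.dumps(e) for e in events)


batch = build_batch([
    build_hec_event("link down on eth0", "example:router", "netops",
                    "rtr01", 1700000000, fields={"pci_scope": "true"}),
    build_hec_event("login accepted", "example:firewall", "netfw",
                    "fw01", 1700000001),
])
```

Stacking several JSON objects into one body is what lets a sender amortize the cost of each HTTP request across many messages.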

SC4S Environment Customization

We will discuss the container architecture in this section; the “BYOE” architecture is similar and utilizes the same concepts outlined here. In general, SC4S is configured through a specific set of environment variables that are collected in a single file. The contents of this file are documented in the "Getting Started" guide and the accompanying runtime documents, and allow the administrator to customize much of SC4S without a deep knowledge of the syntax of syslog-ng itself. The following are the high-level tasks needed to configure SC4S:

Basic Configuration (Required)

There are only a few items that SC4S needs to start up out of the box:

  • HEC URL (either a list of endpoints or load balancer VIP)
  • HEC Token
  • Default Data collection port (typically 514)
  • Number of HEC endpoints (needed to properly configure syslog-ng for scale)
  • Disk Buffer Size
  • An empty directory to be “mounted” to the container for local customizations, which will be populated with template examples at first SC4S run. Many shops will need this to override the default indexes.

These are set via environment variables in the file detailed above. If you do nothing but configure the above, you will have a meaningful OOTB SC4S experience!
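Because everything required lives in one file of KEY=VALUE settings, it is easy to sanity-check that file before starting the container. The Python sketch below is one way to do that. Note that `HEC_URL` and `HEC_TOKEN` are placeholder names chosen for this example, not the documented SC4S variable names; consult the "Getting Started" guide for the real ones.

```python
# Placeholder names for illustration; substitute the documented SC4S variables.
REQUIRED = ["HEC_URL", "HEC_TOKEN"]


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(settings, required=REQUIRED):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not settings.get(key)]


sample = """
# placeholder values for illustration only
HEC_URL=https://splunk.example.com:8088
HEC_TOKEN=
"""
problems = missing_settings(parse_env_file(sample))
```

Here `problems` lists the settings that still need a value, which in this sample is the empty token.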

Pre-Instantiation Customization (Optional)

There are certain key items the container must “know” before starting up.  These are:

  • TCP, UDP, or TLS port(s) to listen on (other than the default, typically TCP/UDP 514).  These ports can be customized and are often the sole filter criterion in a “log path” used to determine one or more sourcetypes.
  • Whether or not TLS and/or custom certificates are used.

Each of these needs to be determined before the container starts, and they are fixed regardless of the nature of the traffic arriving on any of the listening ports. To set these parameters in the underlying syslog-ng configuration, a templating operation is performed at startup so that syslog-ng starts with the proper port/TLS parameters in place, based on the values of the environment variables the administrator has set. These environment variables are documented in the "Configuration" portion of the document set. This templating operation is automatic for the OOTB sources. When developing custom log paths (below), elements of the template are exposed to the administrator and must be properly accounted for.
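The templating step can be pictured as filling in a listener stanza from environment values before syslog-ng ever reads its configuration. The following Python sketch shows the concept only: SC4S uses its own templates, the stanza here is deliberately simplified, and the `EXAMPLE_LISTEN_..._PORT` variable name is hypothetical.

```python
from string import Template

# Simplified syslog-ng source stanza; the real SC4S templates are more involved.
SOURCE_TEMPLATE = Template(
    'source s_$name {\n'
    '    network(transport("$transport") port($port));\n'
    '};\n'
)


def render_source(env, name, transport, default_port):
    """Render a listener stanza, taking the port from the environment if set."""
    # EXAMPLE_LISTEN_<NAME>_<TRANSPORT>_PORT is a hypothetical variable name.
    var = f"EXAMPLE_LISTEN_{name.upper()}_{transport.upper()}_PORT"
    port = int(env.get(var, default_port))
    return SOURCE_TEMPLATE.substitute(name=name, transport=transport, port=port)


env = {"EXAMPLE_LISTEN_FIREWALL_TCP_PORT": "5514"}
stanza = render_source(env, "firewall", "tcp", default_port=514)
```

Because the rendered text is fixed once syslog-ng has started, a change to a listening port in this model means re-rendering and restarting, which is why these settings belong to the pre-instantiation step.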

Runtime Customization (Optional)

You may also customize SC4S based on other characteristics of arriving data in addition to the source port/TLS configuration outlined above. Data sources can be categorized (filtered) by hostname wildcard patterns (globs) as well as by the CIDR block(s) the traffic arrives from. Like the source port configuration above, these criteria can be used to differentiate a particular sourcetype. Lastly, any metadata can be overridden or amended in this customization step.
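For readers less familiar with globs and CIDR blocks, the Python sketch below shows the kind of test these criteria express. It illustrates the matching logic only and is not the syslog-ng syntax used in SC4S; the `fw-*` pattern, the `10.20.0.0/16` block, and the function names are hypothetical.

```python
import fnmatch
import ipaddress


def host_matches(hostname, patterns):
    """True if the hostname matches any glob pattern (case-insensitive)."""
    return any(fnmatch.fnmatchcase(hostname.lower(), p.lower()) for p in patterns)


def ip_in_networks(address, cidrs):
    """True if the source IP falls inside any of the CIDR blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(c) for c in cidrs)


def is_firewall(hostname, address):
    """Hypothetical rule: firewalls are named fw-* or live in 10.20.0.0/16."""
    return host_matches(hostname, ["fw-*"]) or ip_in_networks(address, ["10.20.0.0/16"])
```

A rule like `is_firewall("FW-edge01", "192.0.2.1")` matches on the hostname alone, while `is_firewall("rtr01", "10.20.3.4")` matches on the source network alone; either is enough in this sketch.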

Runtime customization differs from the Pre-Instantiation steps above in one important way: rather than modifying environment variables, a small snippet of the underlying syslog-ng config file is exposed to the administrator via a local “mount point” to the container. Care must be taken to provide proper syntax (which will be checked prior to startup); this is the one place where rudimentary knowledge of syslog-ng config file syntax is helpful. You will quickly learn that copying existing code and modifying it to suit your needs is often the best route to success!

Custom Log Path Development (Optional)

The local mount point includes a full directory tree that allows custom log paths and filters to be developed on site and made accessible to the container via the local mount. We will dive deep into this process in Part 3 of this series, but in the meantime there are documented examples in the local directory to help, and the full suite of SC4S pre-defined filters, rewrites, and parsers is available. Again, looking at the samples, copying and pasting, and obeying the comments inside is the best route to success!

This is also an area in which we’re looking forward to help from the community. Ask on the Slack channel whether the community has already built a log path or filter for your device, or whether a similar one can be adapted. If you develop a custom log path or filter for a device that fulfills a general need, we encourage you to submit the enhancement via the GitHub repo as a PR for inclusion in a future version of SC4S.


Splunk Connect for Syslog Community

Splunk Connect for Syslog is fully Splunk supported and is released as Open Source. We hope to drive a thriving community that will help with feedback, enhancement ideas, communication, and especially log path (filter) creation! We encourage active participation via the git repos, where formal requests for feature inclusion (especially log paths/filters), bug tracking, etc. can be conducted. Over time, we envision far less “local filter” activity as more and more of the community’s efforts are encapsulated in the container’s OOTB configs.

Splunk Connect for Syslog Resources

There are many resources available to enable your success with SC4S, in addition to the main repo and documentation.

We wish you the best of success with SC4S. Get involved, try it out, ask questions, contribute new data sources, and make new friends!

Posted by Mark Bonsack

Mark Bonsack is a Principal Sales Engineer at Splunk, and is responsible for Strategic Accounts in the Southwest US region. During his 9-year Splunk career, he has developed a particular interest in data acquisition (AKA "GDI") and has guided Splunk's largest customers in this area. He is a “Brady Bunch” dad: 2 girls of his own, 2 by marriage; all the same age like the Bradys. His professional beach volleyball career didn’t work out, but he is usually on the winning team during competitions at Splunk team-building events...
