I am new to Splunk. How do I configure a source type for RFC 5424 compliant syslog messages?
You don't need to. There is already one (actually several versions) defined in Splunk. If you're sending syslog directly, you simply have to configure a listening port.
Also, if you simply feed log files of that format to Splunk without specifying the source type, it is likely to guess the type correctly.
I tried this before posting my original message. If it had worked "automagically" I wouldn't have posted the question.
Splunk doesn't seem to completely understand RFC 5424 syslog messages. While Splunk has several syslog formats to choose from, none of them matches the syslog specification. Splunk seems able to interpret the name/value pairs of structured data elements, but it has no idea what the structured data IDs are, nor can it interpret the fixed fields that are present in the syslog header.
Can you be more specific as to what you are expecting Splunk to do? The famously incomplete RFC you reference does not require syslog data to have structured data ids nor have fixed fields. If you can let us know specifically what it is you want Splunk to do, we can make it happen for you.
Wow. Your comment is so incorrect I have to believe you have never read the RFC. Please review http://tools.ietf.org/html/rfc5424.
"The famously incomplete RFC" - if it is so famously incomplete, why don't I find any comments like that with Google? Furthermore, if this document is so incomplete, why are there other RFCs, such as 5674 and 5676, that build upon it?
"you reference does not require syslog data to have structured data ids" - The RFC clearly states that if structured data is not present, a NIL value must appear in its place. One or more structured data elements can appear, and each must have an ID, the format of which is very well defined. While the SD-ID must appear, the structured data element might not contain any key/value pairs.
"nor have fixed fields" - The header is nothing but fixed fields. See below.
For reference I've pasted the ABNF below. How is that not well defined?
Current issues with Splunk and RFC 5424:
1. It doesn't know what the hostname is.
2. It doesn't know what the appname is.
3. It doesn't know what the procid is.
4. It doesn't know what the msgid is.
5. It doesn't know how to deal with the structured data IDs.
6. Although it does parse the keys and values in the structured data, it doesn't associate them with the structured data ID.
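To make items 5 and 6 concrete, here is a rough Python sketch (not Splunk configuration) of keeping each key/value pair attached to its SD-ID, following the STRUCTURED-DATA rules in the RFC 5424 ABNF. The names `parse_structured_data`, `NAME`, `VALUE` and the sample IDs are made up for illustration, and the patterns are a best-effort reading of the grammar rather than a validated implementation.

```python
import re

# SD-NAME: 1 to 32 printable US-ASCII characters except '=', SP, ']' and '"'.
NAME = r'(?:(?![=\]"])[!-~]){1,32}'
# PARAM-VALUE: '"', '\' and ']' must be backslash-escaped inside the value.
VALUE = r'(?:\\.|[^"\\\]])*'

SD_PARAM = re.compile(' (' + NAME + ')="(' + VALUE + ')"')
SD_ELEMENT = re.compile(r'\[(' + NAME + ')((?: ' + NAME + '="' + VALUE + '")*)' + r'\]')
# Only \" \\ and \] are real escapes; any other backslash stays literal.
UNESCAPE = re.compile(r'\\(["\\\]])')


def parse_structured_data(sd: str) -> dict:
    """Return {sd_id: {param_name: param_value}}; the NILVALUE "-" gives {}."""
    if sd == "-":
        return {}
    result = {}
    pos = 0
    while pos < len(sd):
        match = SD_ELEMENT.match(sd, pos)
        if match is None:
            raise ValueError(f"malformed structured data at offset {pos}")
        sd_id, params = match.group(1), match.group(2)
        # An element may legitimately carry an ID and no parameters at all.
        result[sd_id] = {
            name: UNESCAPE.sub(r"\1", value)
            for name, value in SD_PARAM.findall(params)
        }
        pos = match.end()
    if not result:
        raise ValueError("structured data must be '-' or at least one element")
    return result


example = parse_structured_data(
    '[exampleSDID@32473 iut="3" eventSource="Application"][examplePriority@32473 class="high"]'
)
```

With the parameters nested under their SD-ID like this, a filter can distinguish `class` in one element from a `class` parameter in another, which is the association item 6 is asking for.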
FWIW - syslog-ng has no problem with any of this, but it doesn't do reporting. It only allows the data to be used for filtering.
What I expect Splunk to be able to do is allow me to specify filters so that I only view specific sets of records. These filters should be able to use any of the well-defined fields in the record.
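As an illustration of the kind of filtering meant here, the following Python sketch (again, not Splunk syntax) splits the header on spaces and selects records by field value. It relies on the fact that none of the header fields can contain a space; `split_header`, `select`, the field names and the sample lines are assumptions for the example only.

```python
from typing import Dict, Iterable, Iterator

# Header fields in wire order; "rest" holds STRUCTURED-DATA plus the optional MSG.
FIELDS = ("pri_version", "timestamp", "hostname", "app_name", "procid", "msgid", "rest")


def split_header(line: str) -> Dict[str, str]:
    """Split one message into its space-separated header fields (no validation)."""
    parts = line.split(" ", 6)
    if len(parts) != len(FIELDS):
        raise ValueError("too few fields for an RFC 5424 message")
    return dict(zip(FIELDS, parts))


def select(lines: Iterable[str], **wanted: str) -> Iterator[Dict[str, str]]:
    """Yield only the records whose fields equal every requested value."""
    for line in lines:
        record = split_header(line)
        if all(record.get(name) == value for name, value in wanted.items()):
            yield record


SAMPLE = [
    "<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - 'su root' failed for lonvick on /dev/pts/8",
    "<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - %% It's time to make the do-nuts.",
]

su_records = list(select(SAMPLE, app_name="su"))
```

A call such as `select(SAMPLE, hostname="192.0.2.1", procid="8710")` combines several header fields in one filter.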
From RFC 5424 -
The syslog message has the following ABNF [RFC5234] definition:
SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]
HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                  SP APP-NAME SP PROCID SP MSGID
PRI             = "<" PRIVAL ">"
PRIVAL          = 1*3DIGIT ; range 0 .. 191
VERSION         = NONZERO-DIGIT 0*2DIGIT
HOSTNAME        = NILVALUE / 1*255PRINTUSASCII
APP-NAME        = NILVALUE / 1*48PRINTUSASCII
PROCID          = NILVALUE / 1*128PRINTUSASCII
MSGID           = NILVALUE / 1*32PRINTUSASCII
TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
DATE-FULLYEAR   = 4DIGIT
DATE-MONTH      = 2DIGIT ; 01-12
DATE-MDAY       = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on
                         ; month/year
FULL-TIME       = PARTIAL-TIME TIME-OFFSET
PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                  [TIME-SECFRAC]
TIME-HOUR       = 2DIGIT ; 00-23
TIME-MINUTE     = 2DIGIT ; 00-59
TIME-SECOND     = 2DIGIT ; 00-59
TIME-SECFRAC    = "." 1*6DIGIT
TIME-OFFSET     = "Z" / TIME-NUMOFFSET
TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE
STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID           = SD-NAME
PARAM-NAME      = SD-NAME
PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                               ; ']' MUST be escaped.
SD-NAME         = 1*32PRINTUSASCII
                  ; except '=', SP, ']', %d34 (")
MSG             = MSG-ANY / MSG-UTF8
MSG-ANY         = *OCTET ; not starting with BOM
MSG-UTF8        = BOM UTF-8-STRING
BOM             = %xEF.BB.BF
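The HEADER rule above translates fairly directly into a regular expression. The Python sketch below is one possible reading of it, not a tested parser: `parse_header` and the group names are invented for the example, the timestamp pattern checks only the shape (not value ranges such as month 01-12), and the facility/severity split relies on the RFC's definition of PRIVAL as facility * 8 + severity.

```python
import re

HEADER = re.compile(
    r"<(?P<prival>\d{1,3})>"            # PRI = "<" PRIVAL ">"
    r"(?P<version>[1-9]\d{0,2}) "       # VERSION = NONZERO-DIGIT 0*2DIGIT
    r"(?P<timestamp>-|\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}"
    r"(?:\.\d{1,6})?(?:Z|[+-]\d{2}:\d{2})) "
    r"(?P<hostname>[!-~]{1,255}) "      # PRINTUSASCII is %d33-126, so no spaces
    r"(?P<app_name>[!-~]{1,48}) "
    r"(?P<procid>[!-~]{1,128}) "
    r"(?P<msgid>[!-~]{1,32}) "          # trailing SP precedes STRUCTURED-DATA
)


def parse_header(line: str) -> dict:
    """Extract the fixed header fields; NILVALUE ("-") is returned as None."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError("line does not start with an RFC 5424 header")
    fields = {k: (None if v == "-" else v) for k, v in match.groupdict().items()}
    prival = int(fields.pop("prival"))
    if prival > 191:
        raise ValueError("PRIVAL outside the range 0 .. 191")
    fields["facility"], fields["severity"] = divmod(prival, 8)
    fields["version"] = int(fields["version"])
    # Everything after the header: STRUCTURED-DATA and the optional MSG.
    fields["rest"] = line[match.end():]
    return fields


record = parse_header(
    "<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - "
    "%% It's time to make the do-nuts."
)
```

The same pattern, with its named groups, could in principle be adapted as a search-time field extraction, though the exact Splunk configuration for that is outside what this thread establishes.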
After digging around - and discussing this with you in person - I suspect that the "standard" you were thinking of was CEE, http://cee.mitre.org/. That one seems to have been going on for years yet there is still nothing available for public review.
Nope, I was confusing 5424 with the one that it superseded:
http://www.faqs.org/rfcs/rfc3164.html
I'm still not very familiar with the new RFC, as I don't have any current customers adhering to this new standard, but I am looking forward to it if it can help simplify log management.