Documentation: 3.0.2
This page last updated: 01/14/08 01:01pm

Mask sensitive data in an event

You may want to mask sensitive personal data that goes into logs. Credit card numbers and social security numbers are two examples of data that you may not want to index in Splunk. This page shows how to mask part of a confidential field so that privacy is protected while enough of the data remains to trace events.

In this example, an application server log contains a SessionId and a Ticket number that we want to mask. We will mask every character of each ID except the last 4.

An example of the desired output:
SessionId=########7BEA&Ticket=########96EE

A sample input:

{{"2006-09-21, 02:57:11.58",  122, 11, "Path=/LoginUser Query=CrmId=ClientABC&ContentItemId=TotalAccess&SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&SessionTime=25368&ReturnUrl=http://www.clientabc.com, Method=GET, IP=209.51.249.195, Content=", ""}}
{{"2006-09-21, 02:57:11.60",  122, 15, "UserData:<User CrmId="clientabc" UserId="p12345678"><EntitlementList></EntitlementList></User>", ""}}
{{"2006-09-21, 02:57:11.60",  122, 15, "New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc&UserId=p12345678&AccountId=&AgentHost=man&AgentId=man, MANUser: Version=1&Name=&Debit=&Credit=&AccessTime=&BillDay=&Status=&Language=&Country=&Email=&EmailNotify=&Pin=&PinPayment=&PinAmount=&PinPG=&PinPGRate=&PinMenu=&", ""}}

Configuration

To mask the data you will need to modify your props.conf and transforms.conf files in your $SPLUNK_HOME/etc/bundles/local/ directory.

props.conf

Edit $SPLUNK_HOME/etc/bundles/local/props.conf and add the following:

    [<spec>]
    TRANSFORMS-anonymize = session-anonymizer, ticket-anonymizer

<spec> can be:
1. <sourcetype>, the sourcetype of an event
2. host::<host>, where <host> is the host for an event
3. source::<source>, where <source> is the source for an event

session-anonymizer and ticket-anonymizer are TRANSFORMS class names whose actions are defined in transforms.conf. For your data, use the class names you create in transforms.conf.

transforms.conf

In $SPLUNK_HOME/etc/bundles/local/transforms.conf, add your TRANSFORMS:

    [session-anonymizer]
    REGEX = (?m)^(.*)SessionId=\w+(\w{4}[&"].*)$
    FORMAT = $1SessionId=########$2
    DEST_KEY = _raw

    [ticket-anonymizer]
    REGEX = (?m)^(.*)Ticket=\w+(\w{4}&.*)$
    FORMAT = $1Ticket=########$2
    DEST_KEY = _raw

REGEX specifies the regular expression that matches the string in the event you want to anonymize.
Note: By default, ^ and $ match only at the start and end of the whole event, so these patterns would fail on multi-line events. Put (?m) at the start of the regular expression so that ^ and $ match at the start and end of each line within the event.
FORMAT specifies the masked value that is written in place of the match. $1 is the first capture group (all the text before SessionId= or Ticket=) and $2 is the second capture group (the last 4 characters of the ID and all the text after it). The ######## between them is a literal string that is written as-is.
DEST_KEY = _raw writes the value from FORMAT to the raw value in the log, thus modifying the event.
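
Before deploying the transforms, it can help to try the regular expressions outside Splunk. The following is a minimal sketch in Python, not Splunk code: the TRANSFORMS list and the anonymize function are hypothetical names made up for this illustration, re.sub stands in for writing FORMAT to _raw, and Python spells the back-references \g<1> and \g<2> rather than $1 and $2. It assumes a single-line event and only approximates what Splunk does.

```python
import re

# Hypothetical stand-ins for the [session-anonymizer] and [ticket-anonymizer]
# stanzas: each pair is (REGEX, FORMAT), with $1/$2 written as \g<1>/\g<2>.
TRANSFORMS = [
    (re.compile(r'(?m)^(.*)SessionId=\w+(\w{4}[&"].*)$'),
     r'\g<1>SessionId=########\g<2>'),
    (re.compile(r'(?m)^(.*)Ticket=\w+(\w{4}&.*)$'),
     r'\g<1>Ticket=########\g<2>'),
]


def anonymize(raw: str) -> str:
    """Apply each transform in order; text that does not match is left unchanged."""
    for pattern, fmt in TRANSFORMS:
        raw = pattern.sub(fmt, raw)
    return raw


# A shortened version of the third sample event (assumption: single-line event).
event = 'New Cookie: SessionId=3A1785URH117BEA&Ticket=646A1DA4STF896EE&CrmId=clientabc'
print(anonymize(event))
# prints: New Cookie: SessionId=########7BEA&Ticket=########96EE&CrmId=clientabc
```

Note that the ticket pattern requires an & after the ID, while the session pattern accepts either & or a double quote. A Ticket value at the very end of a quoted string would therefore not be masked by the pattern as written.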
