Splunk software enables you to collect data and easily search it, looking for patterns, optimizations, and opportunities that can help your organization or business succeed. Gathering data is essential for just about any use case you pursue with the Splunk platform, but it's important to remember that there are people on the other side of the data that you collect.
Protecting privacy is important for everyone, and it matters even more when you are the one collecting data. Splunk software gives you several ways to protect privacy as you collect data.
If you promise to protect the personal information of the people whose data you collect, you have a responsibility to protect their privacy. If you're dealing with data about consumers, the Federal Trade Commission (FTC) enforces that responsibility in the United States, but even when there is no legal or financial motive for protecting user privacy, it's just a good idea.
The best way to protect privacy is to collect only the data you need about people. The more data you collect about people, the harder it is to protect their privacy. Make sure there is a clear business case for collecting data before adding it to the Splunk platform. If you understand why you are adding data to the Splunk platform, and what insights you want to get out of it, you can more easily protect privacy at the same time.
If you have a good business case for collecting data about users, take these precautions to protect their privacy:
- Anonymize the data that you collect. Make it hard to identify the people whose information is added to the Splunk platform. See Anonymize data for some ways to anonymize data when you add it to the Splunk platform.
- Keep data only as long as you need it. Set index retention policies, and don't keep sensitive or potentially identifiable information for longer than you need it. See Set a retirement and archiving policy to learn more about the attributes involved in index retention.
- Control access to data. Limit access to data about people to those who have a business need to know that information. Keep track of who has access and periodically validate that those with access still need it. You can control access in Splunk software by using capabilities, restricting access to specific indexes, applying search filters, and more. See Use access control to secure Splunk data.
- Encrypt everything. If you're sending valuable data to Splunk software, make sure no one can eavesdrop on it while you're sending it. Consider encrypting the data after it reaches the Splunk indexers too, by using an encrypted file system. See About securing data from forwarders.
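To make the first precaution concrete, here is a minimal sketch of masking identifiers in an event before it is sent anywhere. It is not the mechanism described in Anonymize data, which works inside the Splunk platform. It is a standalone illustration, and the function names, the regular expressions, the `user_` prefix, and the sample event are all assumptions made up for the example. It replaces email addresses with a keyed hash and zeroes the last octet of IPv4 addresses.

```python
import hashlib
import hmac
import re

# Hypothetical key for illustration only; a real key belongs in a secrets
# store, not in source code.
SECRET_KEY = b"replace-with-a-secret-key"

# Deliberately simple patterns (assumptions); real data may need stricter ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Map a value to a stable token using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:16]


def anonymize_event(event: str, key: bytes = SECRET_KEY) -> str:
    """Replace email addresses with tokens and truncate IPv4 addresses."""
    event = EMAIL_RE.sub(lambda m: pseudonymize(m.group(0), key), event)
    # Keep the first three octets and zero the last one.
    event = IPV4_RE.sub(r"\1.\2.\3.0", event)
    return event


if __name__ == "__main__":
    raw = "user=alice@example.com src=192.168.10.57 action=login"
    print(anonymize_event(raw))
```

Because the same input always maps to the same token, you can still count events per user without storing the address itself. Keep in mind that this is pseudonymization rather than full anonymization: anyone who holds the key can recompute the tokens, so the key needs the same access control as the data.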
I was inspired to write this post after attending a panel by the Electronic Frontier Foundation (EFF) about how to protect your privacy as a consumer or as a provider of consumer-used software. Thanks to Amul and Erica at the EFF. I also recently read The Circle by Dave Eggers, which reminded me of the importance of protecting privacy when it comes to big data.