This is the first of a two-part blog series about a data scientist's first experience at RSA Conference from guest blogger Lauren Deason, lead data scientist at PUNCH Cyber Analytics.
Read part two here.
I was recently afforded the opportunity to attend one of the biggest cybersecurity conferences in the world, courtesy of Splunk's Splunk4Good initiative to increase representation of underrepresented groups at such events. Thanks to their generous support, I was able to travel to San Francisco and attend a full week of talks, trainings, and other events at this year's RSA Conference. This included a trip to Splunk's headquarters where I was able to meet and talk with Splunk employees in various disciplines.
As a data scientist working in cybersecurity, attending RSA was extremely valuable both in terms of content and networking opportunities. Oh, and SWAG. Lots and lots of swag.
I enjoyed and got a lot out of most of the talks I attended, which largely fell into two categories:
- Machine learning applications to cybersecurity problems
- Deep dives into specific exploits/attacks
Now, my level of expertise in these two categories is highly imbalanced; I have been working on and researching machine learning for cybersecurity for a little over two years as a contractor on DARPA's Network Defense project, while my experience as a hacker is limited to some introductory online coursework and an affinity for wearing hoodies.
The machine learning talks I attended provided the most value in terms of highlighting high-level problems and ideas that other people are working on in this space, without getting into the technical details (for anyone looking for a conference that does cater to those interested in these details, check out CAMLIS). These talks also provided a great opportunity to connect with some of the other researchers and chat about ideas.
As for the talks detailing specific attack vectors, the high-level nature of the talks was perfect for someone at my level; I am very interested in learning the details of different vulnerabilities and exploits (partly because of the relevance to my work, but also because it's just fun!), but I don't yet have the background in cybersecurity or networking to be able to follow highly technical, narrow talks full of obscure terminology.
Probably my favorite talk at RSA was the presentation by the (very snappily dressed) CrowdStrike co-founders, who went through five specific exploits, explained at a high level how each worked, and gave a live demo of each one. Such concrete examples are extremely useful in developing the intuition that I rely on when brainstorming algorithms that can be applied to log data to detect various types of malicious activity. Seeing each line of code executed in real time allows me to think about what traces of the given activity may be captured in different logs, which helps to guide feature selection and engineering. I would argue that this feature work tends to be the most difficult and important part of constructing machine learning algorithms that generate relevant, actionable security alerts in the cyber realm.
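To make the feature engineering idea a bit more concrete, here is a minimal sketch in Python of turning raw log events into per-host features. Everything in it is an assumption for illustration: the event fields, the host names, and the two "suspicious" features (an Office application spawning a shell, and a PowerShell command line using an encoded-command flag) are hypothetical and are not taken from the talk or from any real log schema.

```python
from collections import defaultdict

# Hypothetical process-creation events; the field names are illustrative only.
EVENTS = [
    {"host": "ws-01", "parent": "explorer.exe", "process": "winword.exe",
     "cmdline": "winword.exe report.docx"},
    {"host": "ws-01", "parent": "winword.exe", "process": "powershell.exe",
     "cmdline": "powershell.exe -enc SQBFAFgA"},
    {"host": "ws-02", "parent": "explorer.exe", "process": "chrome.exe",
     "cmdline": "chrome.exe"},
]

# Assumed lists of process names; a real deployment would tune these.
OFFICE_PARENTS = {"winword.exe", "excel.exe"}
SHELLS = {"powershell.exe", "cmd.exe"}
ENCODED_FLAGS = {"-enc", "-encodedcommand"}


def host_features(events):
    """Aggregate simple per-host counts from a list of event dicts."""
    feats = defaultdict(
        lambda: {"n_events": 0, "office_spawned_shell": 0, "encoded_cmd": 0}
    )
    for event in events:
        f = feats[event["host"]]
        f["n_events"] += 1

        # Feature 1: an Office application launching a command shell.
        if (event["parent"].lower() in OFFICE_PARENTS
                and event["process"].lower() in SHELLS):
            f["office_spawned_shell"] += 1

        # Feature 2: a command line that passes an encoded-command flag.
        tokens = event["cmdline"].lower().split()
        if any(token in ENCODED_FLAGS for token in tokens):
            f["encoded_cmd"] += 1
    return dict(feats)


features = host_features(EVENTS)
```

The resulting dictionary has one row of counts per host, which is the sort of table that could then be fed to a model or a simple threshold rule. Watching an exploit run live is what suggests which columns are worth building in the first place.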
Hacker for a Day, Hoodie Included
Following this desire to expand my hacking chops, I took advantage of some of the training sessions available through the conference. I attended one capture-the-flag-style tutorial that walked through some tools that could be used to detect various exploits in a Windows environment. As with the demos described above, learning to use the same tools that analysts within a SOC would use to search for suspicious indicators provides me with real examples of the types of actions that I am ultimately trying to expand upon and automate in my work. Such automation of human analysis is currently in high demand, as the scale of log data produced by most organizations' networks tends to far outpace what can feasibly be analyzed by humans.
Unfortunately, this human couldn't even keep up with the scale of her two-hour tutorial, so I expect to get most of the value from this particular course when I go back through the exercises on my own. Luckily, the materials for this training (and others, one of which was already at capacity when I arrived) are generally available online post-conference, as are all the slides from talks. I have a fair amount of homework yet to do, and I have also been able to read through the slides of talks that I couldn't make in person.