This summer, the Splunk4Good and Splunk University Recruiting teams hosted their second annual Splunktern classes, educating interns on our product and on applying Splunk to open data sources. Participating Splunkterns formed teams, identified social issues, and then used Splunk to promote change for social good. Final projects were presented to a panel of executives, and the winning team was awarded a $1,000 charitable donation for their project’s cause.
In this series, we’ll hear how Splunkterns selected their project topics and how this impacted their intern experience at Splunk. Check out the first post "Using Splunk for Good: The Splunktern Way," and stay tuned to learn who was crowned the winning team.
This post comes from guest authors Vandita Anand (Sales Engineer Intern), Sareena Kazla (Customer Marketing Intern), and Jessica Guo (Data Analytics Intern).
One day, our teammate Vandita told us a story about her first time in San Francisco. She was meeting a friend and decided to walk to her destination alone instead of paying for transportation. Halfway through the trip, she heard glass breaking from across the street and saw a group of men running past her. She soon realized that she had witnessed a robbery.
While not everyone will have an experience like Vandita’s, crime in San Francisco is a reality all three of us have witnessed, and it has made us feel unsafe. For this reason, we wanted our Splunk4Good project to help increase awareness about the safety of San Francisco neighborhoods.
To evaluate safety in different neighborhoods, we used a dataset of crimes reported in San Francisco in 2017. However, we felt it wasn’t enough to simply count the crimes in each district, because some offenses, such as assault or homicide, are far more dangerous than littering or loitering. To account for this, we scored each crime on a severity scale from 0-5 and weighted each crime by its severity. By combining these scores with geographic data for the city and the population density of each neighborhood, we were able to create a dashboard showing the most and least dangerous areas, with the ability to normalize by population density or filter by day vs. night and by neighborhood.
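For readers who want to see the idea in code, here is a minimal Python sketch of severity weighting with optional normalization by population density. It is not the Splunk search behind our dashboard: the `SEVERITY` values, the `weighted_scores` function, and the sample neighborhoods "A" and "B" are all hypothetical, and dividing by density is just one plausible way to normalize.

```python
from collections import defaultdict

# Hypothetical severity weights on the 0-5 scale described above.
# The actual per-crime scores used in the project are not listed in this post.
SEVERITY = {"homicide": 5, "assault": 4, "burglary": 3, "loitering": 1, "littering": 0}


def weighted_scores(incidents, density=None):
    """Sum severity weights per neighborhood.

    incidents: iterable of (neighborhood, crime_category) pairs.
    density:   optional dict of neighborhood -> population density; when given,
               each total is divided by it (an assumed normalization).
    """
    totals = defaultdict(float)
    for neighborhood, category in incidents:
        # Categories missing from the table contribute nothing in this sketch.
        totals[neighborhood] += SEVERITY.get(category, 0)
    if density is not None:
        return {n: score / density[n] for n, score in totals.items()}
    return dict(totals)


# Made-up incidents for two placeholder neighborhoods.
sample = [
    ("A", "assault"), ("A", "littering"),
    ("B", "homicide"), ("B", "loitering"), ("B", "burglary"),
]
raw = weighted_scores(sample)
normalized = weighted_scores(sample, density={"A": 2.0, "B": 3.0})
```

With these made-up inputs, neighborhood B scores higher than A on the raw totals even though it only has one more incident, because its incidents are more severe.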
We wanted to make these crime statistics even more relevant to people by combining them with online walk scores. Walk scores measure “walkability” on a scale from 0-100 based on walking routes to destinations such as grocery stores, schools, parks, restaurants, and retail. While this provides helpful information, a major shortfall is that it does not account for safety. Therefore, we created a second dashboard with adjusted walk scores, in hopes of finding a final rating that would give an accurate “livability” score to each area in San Francisco.
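One way such an adjustment could work is sketched below in Python. The formula is an assumption made for illustration, not the one used in our dashboard: each neighborhood's danger score is rescaled to the range 0-1, and the walk score is reduced by up to a hypothetical `max_penalty` fraction for the most dangerous area. The function name, the penalty value, and the sample data are all invented.

```python
def adjusted_walk_scores(walk, danger, max_penalty=0.5):
    """Scale each walk score down according to relative danger.

    walk:        dict of neighborhood -> walk score (0-100).
    danger:      dict of neighborhood -> severity-weighted crime score.
    max_penalty: assumed fraction removed from the walk score of the most
                 dangerous neighborhood (hypothetical tuning knob).
    """
    lowest, highest = min(danger.values()), max(danger.values())
    span = highest - lowest
    adjusted = {}
    for neighborhood, score in walk.items():
        # Min-max rescale danger to [0, 1]; if all areas are equal, apply no penalty.
        relative = (danger[neighborhood] - lowest) / span if span else 0.0
        adjusted[neighborhood] = round(score * (1 - max_penalty * relative), 1)
    return adjusted


# Made-up walk scores and danger scores for three placeholder neighborhoods.
walk = {"A": 90, "B": 80, "C": 60}
danger = {"A": 10.0, "B": 0.0, "C": 5.0}
result = adjusted_walk_scores(walk, danger)
```

In this toy example, the most walkable neighborhood (A) ends up tied with the least walkable one (C) once safety is taken into account, which is the kind of reordering an adjusted score is meant to surface.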
In completing this project, we found a lot of interesting insights about crime and safety in San Francisco. For instance, despite the common belief that more crime happens at night, the data showed that a similar amount of crime occurred during the day and at night. Also, although the SOMA neighborhood, a large hub for technology offices, spans only ~1.3% of San Francisco’s area, it had the most crime of any neighborhood, accounting for ~10.8% across all of 2017.
We hope that our tool can empower people to make more informed decisions when choosing where to live and to keep safety at the forefront of their minds. With more time and data, we believe this tool could be applied to cities across the US, and our adjusted walk scores could be made available on real estate sites. We could even use the Splunk Machine Learning Toolkit to predict up-and-coming neighborhoods. Until then, though, you may want to stay alert around SOMA, and consider spending some more time by Monterey Heights.