Using Splunk for Good: The Splunktern Way

This summer, the Splunk4Good team hosted their second annual Splunktern classes, educating interns on our product and applying Splunk to open data sources. Participating Splunkterns formed teams, identified social issues, and then used Splunk to promote change for social good. Final projects were presented to a panel of executives and the winning team was awarded a $1,000 charitable donation for their project’s cause.

In this series, we’ll hear how Splunkterns selected their project topics and how this impacted their intern experience at Splunk. Stay tuned to learn who was crowned the winning team.

This first post comes from Priyanka Nayak (Software Engineer intern), Anisha Dangoria (Financial Planning & Analysis intern), and Yiran Jia (Software Engineer intern).

When thinking about a topic for our project, the three of us agreed on one thing that could be improved: the Bay Area public transit system. Public transportation is an important asset for cities, especially those with dense populations and high business development like San Francisco. The efficiency public transportation provides to commuters can have a large social and economic impact in the Bay Area, which is why we wanted to analyze this topic further for our project.

We leveraged Splunk capabilities to understand current traveler patterns for targeted spending, illustrate the correspondence between Ford GoBike and BART usage, and extrapolate broader trends in commuting growth to forecast future BART ridership. Our research showed a 112% increase over the last decade in SF super commuters, people whose daily commute time exceeds 90 minutes. With over 265,000 people commuting into SF each day for work, we believe that scaling public transportation will greatly improve the commuter experience and reduce pollution in the future.

We first employed Splunk visualization and statistics features to evaluate the three busiest stations (Embarcadero, Powell, and Montgomery) and how trip patterns vary by time of day. Then, we related current BART ridership to bike share usage by computing the hourly Bike-to-BART ratio for each station. We plotted the ratio for Montgomery station from 5 AM to 11 PM, where the ratio is the number of passengers using a Ford GoBike at Montgomery divided by the total number of people getting off BART at Montgomery in that same hour.
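To make the ratio concrete, here is a minimal Python sketch of the calculation described above. It is not the Splunk search we ran; the function name `bike_to_bart_ratio` and all of the counts are hypothetical and are only there to illustrate the arithmetic.

```python
def bike_to_bart_ratio(bike_trips_by_hour, bart_exits_by_hour):
    """Return {hour: bike trips / BART exits} for one station.

    Hours with zero BART exits are skipped to avoid dividing by zero.
    """
    ratios = {}
    for hour, exits in bart_exits_by_hour.items():
        if exits > 0:
            ratios[hour] = bike_trips_by_hour.get(hour, 0) / exits
    return ratios


# Hypothetical hourly counts for a single station (made-up numbers).
bike_trips = {8: 120, 12: 30, 17: 150}
bart_exits = {8: 4000, 12: 1500, 17: 5000}

ratios = bike_to_bart_ratio(bike_trips, bart_exits)
for hour in sorted(ratios):
    print(f"{hour:02d}:00  ratio = {ratios[hour]:.3f}")
```

Comparing these hourly ratios across stations is what lets you spot stations where bike share usage is low relative to BART traffic.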

From the data, we made three major observations:

  1. A strong indication that bike share usage corresponds to BART ridership during peak transit hours

  2. Possibly a high percentage of people transferring from bike to BART and vice versa for the daily commute

  3. A difference in Bike-BART ratio among Bay Area stations

Based on these discoveries, we suggested that bike-share companies could increase the availability of bikes at peak times and at several key locations, as well as allocate more marketing resources to stations with a low Bike-to-BART ratio in order to raise awareness of their service. We also proceeded to predict future BART usage and discuss how this may impact the bike share industry, using linear regression to estimate how BART ridership is growing over the years and time series analysis to forecast how ridership will change in the future.

We also analyzed how Ford GoBike can allocate bikes accordingly to increase usage and profit. After aggregating the passenger-wise transit data into yearly station totals, we utilized the Splunk Machine Learning Toolkit to identify a general trend of ridership growth.
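For readers who want to see the idea behind the trend estimate, the sketch below fits an ordinary least-squares line to yearly totals in plain Python. It is an illustration only, not the Splunk Machine Learning Toolkit workflow we used; `fit_line` is a hypothetical helper and the yearly totals are made-up numbers.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical yearly ridership totals for one station (arbitrary units).
years = [2014, 2015, 2016, 2017]
totals = [100.0, 110.0, 120.0, 130.0]

slope, intercept = fit_line(years, totals)
forecast_2018 = slope * 2018 + intercept
print(f"growth per year: {slope:.1f}, 2018 estimate: {forecast_2018:.1f}")
```

A straight line captures the overall direction of growth but none of the weekly or seasonal swings, which is why a time series model is the better fit for day-to-day forecasting.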

To more accurately model and predict BART ridership, we used a time series analysis, which takes into account cycles, trends, seasonality, and random noise. Our time series model matches the weekly ups and downs of commuters almost perfectly for each station, and from the plot, we could also discern subtle monthly patterns. Using this result, both BART and Ford GoBike could prepare for future growth in ridership and respond to changes in real time.
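The weekly pattern is the easiest part of such a model to illustrate. The sketch below is a deliberately simplified stand-in, not the model we built in Splunk: it estimates one multiplicative factor per weekday from past daily counts and reuses those factors to project the following week. It ignores trend, monthly seasonality, and noise; the function names and the daily counts are hypothetical.

```python
def weekday_profile(daily_counts):
    """Estimate the average level and one seasonal factor per weekday.

    Assumes daily_counts[0] falls on a Monday and that the list covers
    whole weeks, so every weekday has the same number of observations.
    """
    level = sum(daily_counts) / len(daily_counts)
    factors = []
    for day in range(7):
        same_weekday = daily_counts[day::7]
        factors.append((sum(same_weekday) / len(same_weekday)) / level)
    return level, factors


def forecast(level, factors, n_days, start_index):
    """Project n_days ahead by repeating the weekly factors."""
    return [level * factors[(start_index + i) % 7] for i in range(n_days)]


# Two weeks of hypothetical daily station exits, Monday through Sunday.
week = [100, 110, 110, 105, 95, 40, 30]
history = week + week

level, factors = weekday_profile(history)
next_week = forecast(level, factors, 7, len(history))
print([round(v, 1) for v in next_week])
```

A fuller model would add a trend term and longer seasonal cycles on top of this weekly profile.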

Overall, by optimizing the operation of our public transportation, our project could potentially help scale the bike-sharing market and support daily operations of the BART system. Given the powerful capabilities of Splunk indexing and search, our solution could be scaled across multiple levels. Most importantly, however, doing this Splunk4Good project allowed us to see how society can benefit from public transportation, and how it can lead to a more sustainable future.

To learn more about Splunk’s commitment to research, education and community service, visit the Splunk Pledge website.

As a member of the Social Impact team, Patricia manages our global employee engagement programs and community partnerships. Prior to joining Splunk, she supported global corporate responsibility programs developing charitable giving strategies, nonprofit partnerships, and sustainability, diversity, and volunteer initiatives. Patricia has created social impact programs for media, cybersecurity, HR, and other technology companies. She is a native of the Pacific Northwest and holds a B.S. in Anthropology from Santa Clara University.
