Splunk Answers President’s Call for Open Data with eRegulations Insights

Open data has become all the rage, and while governments at the local and state level have aggressively moved to release vast quantities of data, the White House is clearly leading the charge. With more than 100,000 data sets, data.gov has become the mother lode of open data. But how to make sense of all of this data? Why, by Splunking it, of course.

eRegulations Insights, announced this morning, was developed in response to President Obama’s call for technology leaders to help harness the power of open data through collaboration and public-private partnerships.

Public data projects are a big part of what we do, and naturally we try to find interesting ways to drill into open data to make it more useful to both policy makers and the public. Brought to you by Splunk and Splunk4Good – the corporate social responsibility program for Splunk – eRegulations Insights was conceived and designed by the inquiring minds of our very own Splunkers (think chief minds and Splunk ninjas) using Splunk Enterprise, and is available now at http://eregulations.splunk4good.com.

eRegulations Insights is a set of public dashboards and visualizations designed to help decipher the tone of public response to regulatory proposals. Users can explore different agencies, proposals, the volume of public comment by agency and regulation, issues of concern addressed in public responses, and even primary community influencers that are mobilizing public engagement around a proposal.


eRegulations Insights taps into nearly 1.2 million comments received by federal agencies since the beginning of 2012.


The dashboards are completely open to the public, and are designed to bring greater insight and visibility to the black box of public comment. This site taps into nearly 1.2 million comments received by federal agencies since the beginning of 2012 – and helps to make that information more useful and easier to understand. In short, we wanted to improve how the information could be used to ensure that it gets used.

Some other interesting findings:

  • A small number of agencies drive the vast majority of public comment. While thousands of comments are received by Regulations.gov, we found that nearly one-third of all comments since 2012 were driven by just three issues. Not surprisingly, these are three of the most contentious issues in the news: the Affordable Care Act (also known as Obamacare), the Keystone Pipeline, and the political activities of tax-exempt organizations. In 2014, the last two of these have accounted for more than 70 percent of all comments.
  • Sentiment analysis can provide better insight into the tone of public conversations, scoring responses based on common phrases and language used, and even assigning heat scores driven by excessive profanity. While we aren’t reading every comment ourselves, this approach helps improve the accessibility of the information, as it allows millions of comments to be aggregated and scored based on intricate calculations that discover and identify interesting patterns. The National Park Service receives the highest sentiment scores. And at the bottom? US Citizenship and Immigration Services.
  • Cluster analysis shows that the number of comments may not necessarily reflect citizen priorities. Analyzing the number of responses that are either identical or substantially similar shows that even significant interest in a proposal may not be what it appears. For example, more than 92,000 identical comments were submitted in response to a notice about the national interest of the Keystone XL Pipeline. Each of these comments was submitted by BKM Strategies, a Washington, DC-based lobbying firm.
  • Networks of online influencers can drive significant traffic. The influencers section shows that public comments may be driven by some unexpected online influencers. Activity so far in 2014 reveals a number of them: some are direct influencers, and some indicate coordinated campaigns via Facebook and by organizations such as the American Families Council.
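To make the cluster analysis finding above more concrete, here is a minimal Python sketch of how identical or near-identical comments might be grouped. It is not how eRegulations Insights does it (that work was done in Splunk Enterprise); the function names, the sample comments, and the choice of text normalization plus Jaccard word overlap are all assumptions made purely for illustration.

```python
import hashlib
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so that
    trivial edits do not hide a copied form letter."""
    text = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(text.split())


def fingerprint(text: str) -> str:
    """Hash the normalized text; identical form letters share a fingerprint."""
    return hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()


def duplicate_clusters(comments):
    """Count how many comments fall under each fingerprint."""
    return Counter(fingerprint(c) for c in comments)


def unique_share(comments):
    """Fraction of comments that are distinct after normalization."""
    if not comments:
        return 0.0
    return len(duplicate_clusters(comments)) / len(comments)


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two comments, from 0.0 to 1.0.
    A rough stand-in for 'substantially similar'."""
    words_a, words_b = set(normalize(a).split()), set(normalize(b).split())
    union = words_a | words_b
    if not union:
        return 1.0  # two empty comments are treated as identical
    return len(words_a & words_b) / len(union)


# Hypothetical sample data, not taken from Regulations.gov.
SAMPLE_COMMENTS = [
    "Please approve the project.",
    "please approve the project",
    "PLEASE  APPROVE THE PROJECT!!!",
    "I worry about water quality near my farm.",
]

if __name__ == "__main__":
    clusters = duplicate_clusters(SAMPLE_COMMENTS)
    print("distinct comments:", len(clusters))
    print("largest cluster:", max(clusters.values()))
    print("unique share:", unique_share(SAMPLE_COMMENTS))
```

Exact fingerprints only catch copies that match after normalization. A pairwise measure such as `jaccard` can flag lightly edited variants, though comparing every pair gets expensive at the scale of a million comments, so a real pipeline would need a smarter approach.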


Public comments can be driven by unexpected influencers, including coordinated online campaigns via Facebook and other sites.


What’s next? What does this all mean?

While we used the open data API from Regulations.gov to Splunk public comment from federal agencies and to build eRegulations Insights, this is far from the only application. This type of approach – real-time analytics on public comment and other public discourse – could have very real implications at all levels of government. Public officials and regulators constantly grapple with the realities of public participation and how to improve engagement beyond a determined few. Public meetings frequently feature the same cast of characters, comments are submitted by those with a vested interest, and ultimately public comment is limited to those with the time to attend public meetings or the means to influence legislation.

But what if legislators and regulators had real-time access to public comment, and the ability to digest the data in a way that helps them make better use of it? Perhaps they could expand on the very concept and use public comment as a way to actively inform public debate, leverage channels such as social media, and improve civic engagement on important public issues.

In effect, applications such as eRegulations Insights can help provide public officials and citizens alike with the resources they need to make governments more efficient, more responsive, and more accessible, and many of those resources might already be in use in their organizations. Tools such as Splunk – in use for other purposes in hundreds of public-facing organizations – could be repurposed to help make the public process more open. As local and state governments grapple with the challenge of making public meetings more accessible, officials often hesitate to open online public comment for fear of being inundated with information without the means to constructively process the dramatically increased volume. Visualizations and analysis of public sentiment, influencers, and even social media could help remove organizational obstacles to precisely the openness and transparency that people are clamoring for.

Government agencies can use this information to gauge people’s interest in an issue, see who is driving traffic and activity, and decipher the tone of the public conversation, all in real time and 100 percent available to the public. And they could use that information to improve the transparency and quality of important public discussions. Let’s leverage the wealth of open data we already have, which in combination with these new technologies can dramatically improve the quality of our public processes.


Special thanks to our friends at Regulations.gov

None of this project would have been possible – nor any of the insights achieved – if it weren’t for the fine folks at Regulations.gov. This amazing repository of open data magic is expertly managed by the Environmental Protection Agency, and contains a wealth of information. The Data to Knowledge to Action initiative instigated this conversation, and our friends at the National Science Foundation and White House Office of Science and Technology Policy have been active supporters in the development of this site. Thank you!

Posted by Corey Marshall

