Defense and Intelligence Agencies

The Power of Big Data for Defense and Intelligence Agencies

Big data is used every day to protect frontline command and control infrastructures and the Internet-enabled supply chain. It is also used to drive efficiencies in data center operations, improve security operations, protect IT infrastructure touch points between public sector buyers and private sector suppliers, and watch for patterns that may indicate insider threats.

In addition to maintaining the security of U.S. national intelligence, agencies are tasked with collecting, analyzing and storing tremendous amounts of data to watch for patterns and outliers and create correlations. These agencies must also use systems that can handle extremely granular role-based access controls (RBAC) so that only those with a need to know have access to the right data at the right time.

Unlike Hadoop-based products, Splunk software lets you detect patterns and find anomalies across terabytes of raw data in real time without specialized skills, fixed schemas or months of development. With over 7,000 customers, Splunk is widely accepted as the commercial off-the-shelf (COTS) standard for big data analysis. For agencies that have an existing Hadoop distribution and struggle to extract value from it, Splunk offers Hunk™ for Hadoop and NoSQL Data Stores. Hunk lets you combine the power of Splunk with the batch storage capabilities of Hadoop. This provides a unified view of your data--whether it's in a traditional database, in Hadoop or in Splunk.

Traditional perimeter-based defense approaches are ill-equipped to handle today's sophisticated threats. Splunk's platform for big data is ideal for detecting patterns and discovering malicious behavior and attacks not seen by signature- and rule-based systems.

Many defense and intelligence agencies rely on Splunk technology for their security solution, whether they are replacing their current security information and event management (SIEM) system or augmenting it. When agencies look to augment their SIEM, it usually means they need Splunk to help accelerate their response time to cyber events. Incident response and forensics teams use Splunk to get to the root cause faster in the face of higher data volumes and more data types.

Splunk helps them to capture security and operations log data from mission-critical custom applications, where the data doesn't fit neatly into a predetermined schema.

When agencies replace their SIEM with Splunk, it is out of a larger need to move from a reactive to a proactive stance on security events and incidents. This move also indicates they need wider coverage from a single security solution, including protection against insider threats and fraud. A key driver is usually a data breach that occurred because the SIEM was configured to correlate event data from security point products but did not perform statistical anomaly detection.

Splunk customers realize the most value and the fastest incident response times when they capture data from traditional security point solutions and credentialed user-to-machine interactions and combine it with IT operations data for additional context.

The need to detect insider threats has driven agencies to look for ways to understand complex user behavior. Employee and contractor behavior that appears harmful may be an innocent mistake in some cases and a willful act in others. Telling the difference means understanding when user activity is normal and when it is an outlier in the context of other employees' behavior. Splunk can index tens of terabytes of data per day. It lets you apply statistical analysis to baseline data, watch for outlier behavior and link this to other data to understand context--all key to detecting insider threats.
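The idea of baselining activity against peers and flagging outliers can be sketched in a few lines of Python. This is an illustration only, not Splunk's implementation: the function name, the sample user names and counts, and the choice of a median-based (modified z-score) test with a 3.5 cutoff are all assumptions made for the example.

```python
from statistics import median

def flag_outliers(activity_by_user, cutoff=3.5):
    """Return users whose activity is unusually high compared with their peers.

    Uses the modified z-score (median and median absolute deviation), which
    stays stable even when one extreme value would distort a mean-based baseline.
    """
    values = list(activity_by_user.values())
    if len(values) < 3:
        return []  # too few peers to form a baseline
    baseline = median(values)
    mad = median(abs(v - baseline) for v in values)
    if mad == 0:
        return []  # all peers behave identically; nothing stands out
    return sorted(
        user
        for user, value in activity_by_user.items()
        if 0.6745 * (value - baseline) / mad > cutoff
    )

# Hypothetical counts of files downloaded per user in one day.
downloads = {
    "alice": 12, "bob": 9, "carol": 11, "dave": 10,
    "erin": 13, "frank": 8, "grace": 240,
}
print(flag_outliers(downloads))  # ['grace']
```

A flagged user is only a starting point. As the paragraph above notes, the outlier still has to be linked to other data before anyone can say whether it reflects a mistake or intent.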

Intelligence Community Standard Number 700-2 (ICS 700-2) is a good guideline for any agency looking to use log data to detect insider threats. The standard spells out the advantage clearly. Using this data:

"Enable[s] the identification and evaluation of suspicious, unauthorized or anomalous activity that may indicate intent to bypass or defeat security safeguards, disseminate information to unauthorized recipients or otherwise adversely affect the national security by users accessing information resources of the IC; and Support damage assessments related to espionage, unauthorized use, either intentional or unintentional, or unauthorized disclosures by insiders."

Identifying activities that are unauthorized or suspicious can best be done through the use of Splunk statistical analysis commands and the ability to time-index very large data volumes. Damage assessments can be created by understanding traffic and communication patterns in network log data and proxy data combined with packet capture data from emails and other file transfer methods and protocols.

Splunk can access any data via web services or direct database access. Customs and border patrol data can reveal foreign travel; facility access data can be loaded into Splunk from the databases that hold it; and financial disclosure data can be obtained through public-facing credit services. All of this data can provide context for data-driven events inside the organization.

Individuals who have access to sensitive government information should be monitored for IT system activities that might be construed as malicious in the context of other external information.

Unauthorized travel, wild fluctuations in credit scores, major relationship changes and starting a business are activities that can be tracked in IT systems, time management systems, agency business systems or via APIs to other systems that record this information.

Splunk can perform statistical analysis on IT system usage, watching for anomalous behaviors, and perform on-demand correlations with other data sources inside and outside the agency. This approach helps you understand the difference between an accidental policy violation and someone with malicious intent.

Internet of Things and the Data-Driven Battlefield

Gartner estimates that the Internet of Things will contain more than 26 billion connected devices by 2020. There are many tactical advantages to having everything connected and correlating sensor data against this information on what will become the data-driven battlefield. In 2015, the Army will start testing TALOS (Tactical Assault Light Operator Suit) in the field for deployment in 2018. Sensor data from the suit can provide information on the operating status of suit hydraulics and batteries. Next steps often discussed include monitoring soldier vital signs and hydration.

This data can be correlated with GPS data, weapon performance data and soldier health information to provide location and condition information for any unit. Units can be monitored in near real-time and proactively resupplied as data collected from RFID-tagged equipment is added to the mix. RFID data can be used to track inventory to support supply chain management and notify suppliers to restock. Weapon and ordnance sensor data can be used to monitor ordnance performance and, with a look-up to manufacturer data, identify which lot number from which manufacturer may be underperforming.
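The look-up described above, joining sensor readings to manufacturer lot data to find underperforming lots, might look like the following Python sketch. Everything in it is hypothetical: the field names (`serial`, `performed_ok`), the manufacturer and lot identifiers, and the 5 percent failure threshold are placeholders chosen for illustration, not a real data model.

```python
from collections import defaultdict

def underperforming_lots(sensor_events, lot_lookup, max_failure_rate=0.05):
    """Map each (manufacturer, lot) to its failure rate if above the threshold.

    sensor_events: iterable of dicts with a 'serial' and a 'performed_ok' flag.
    lot_lookup:    dict from serial number to a (manufacturer, lot) tuple.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for event in sensor_events:
        lot = lot_lookup.get(event["serial"])
        if lot is None:
            continue  # no manufacturer record for this serial; skip it
        totals[lot] += 1
        if not event["performed_ok"]:
            failures[lot] += 1
    rates = {lot: failures[lot] / totals[lot] for lot in totals}
    return {lot: rate for lot, rate in rates.items() if rate > max_failure_rate}

# Hypothetical manufacturer look-up table and sensor events.
lots = {
    "S1": ("Acme", "L-100"),
    "S2": ("Acme", "L-100"),
    "S3": ("Acme", "L-200"),
    "S4": ("Borealis", "L-7"),
}
events = [
    {"serial": "S1", "performed_ok": True},
    {"serial": "S2", "performed_ok": False},
    {"serial": "S3", "performed_ok": True},
    {"serial": "S3", "performed_ok": True},
    {"serial": "S4", "performed_ok": True},
]
print(underperforming_lots(events, lots))  # {('Acme', 'L-100'): 0.5}
```

Grouping by the (manufacturer, lot) pair rather than by lot number alone avoids mixing up lots from different manufacturers that happen to share a number.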

Once these correlations are supported in a big data system, the data can be used to drive visualizations that give commanders near real-time insights into strategic deployments and battlefield simulations.