Splunk proudly welcomes the new Splunk Firehose Nozzle for Pivotal Cloud Foundry! This release builds on our previous work with Pivotal and moves the Nozzle from a private beta to an official public release.
What is Pivotal Cloud Foundry?
Pivotal Cloud Foundry (PCF) is one of the world’s most powerful cloud-native platforms to rapidly develop and run modern and legacy applications at startup speeds. In addition, Pivotal Cloud Foundry is the only application development platform that runs on any cloud infrastructure—across public, private and managed clouds.
Cloud Foundry is used by over half of the Fortune 500 and a rapidly growing portion of the Fortune 2000. Pivotal’s customers have experienced a 2,000 percent increase in developer productivity, as well as a 50 percent reduction in IT costs due to platform automation. To learn more, visit the Pivotal Cloud Foundry website.
What is the Splunk Firehose Nozzle for Pivotal Cloud Foundry?
Pivotal Cloud Foundry consolidates application logs and platform components’ metrics to a PCF component known as Loggregator. To get events out of PCF and into your Splunk environment, you need a Nozzle that attaches to the Loggregator Firehose.
This is where the Splunk Firehose Nozzle for Pivotal Cloud Foundry comes in!
Splunk Firehose Nozzle connects to the Loggregator Firehose Endpoint and streams all available events into your Splunk environment via the HTTP Event Collector (HEC).
Use the Splunk Firehose Nozzle for PCF to select, buffer, and transform your events. The extra metadata you add to your events can be used in Splunk correlation searches for data loss tracking and event filtering, among other use cases.
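To make the HEC hand-off concrete, here is a minimal Python sketch of how an event with extra metadata might be packaged for the HTTP Event Collector. It is not the Nozzle's own code. The URL, token, and field names below are placeholders chosen for illustration; the only parts taken from HEC itself are the "Splunk <token>" authorization header and the "event" and "fields" keys of the JSON body. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_hec_request(hec_url, token, event, extra_fields):
    """Build (but do not send) an HTTP request carrying one event to HEC.

    hec_url, token, and extra_fields are illustrative placeholders.
    """
    payload = {
        "event": event,          # the PCF event itself
        "fields": extra_fields,  # extra metadata attached to the event
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        hec_url,
        data=body,
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
request = build_hec_request(
    "https://splunk.example.com:8088/services/collector",
    "00000000-0000-0000-0000-000000000000",
    {"msg": "app started", "origin": "rep"},
    {"environment": "dev"},
)
```

Passing the resulting request to urllib.request.urlopen would send it; that step is left out so the sketch can run without a Splunk environment.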
It’s What’s Inside That Counts
The Splunk Firehose Nozzle for PCF was developed as an application that runs on Pivotal Cloud Foundry. The Nozzle subscribes to the Loggregator endpoint and writes events to an external Splunk environment.
Figure 1 – High-Level System Integration Diagram (Splunk + Pivotal Cloud Foundry)
- The Splunk Firehose Nozzle for PCF collects events from the PCF Loggregator endpoint and streams them to Splunk via the HTTP Event Collector. The Nozzle uses in-memory queue buffers to increase reliability and parallel clients to scale out across multiple ingestion channels to HEC.
- The Splunk Firehose Nozzle for PCF can be deployed natively within a PCF environment and is available as a free tile from the PCF Marketplace.
- The Splunk HTTP Event Collector clients run concurrently, consuming events from the queue and enriching PCF events by attaching metadata fields. To scale out, add as many HEC instances as you need and place a load balancer in front of them.
- After data has been ingested into Splunk, it can be explored using the Splunk Add-on for Cloud Foundry, which parses data from any Cloud Foundry distribution and includes pre-built panels and pre-configured search parameters*.
*Note: Many configurable environment parameters are included in this release, which can modify the features discussed above as well as many others. For a closer look at these configurations, see this page on the Splunk Firehose Nozzle.
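The queue-and-clients design described in the list above can be sketched in a few lines of Python. This is an illustrative model only, not the Nozzle's implementation: the function names, the queue size, and the metadata values are all assumptions, and the "clients" here collect enriched events in a list instead of posting them to HEC.

```python
import queue
import threading


def enrich(event, metadata):
    """Return a copy of the event with the metadata fields attached."""
    enriched = dict(event)
    enriched.update(metadata)
    return enriched


def run_pipeline(events, metadata, num_clients=2, queue_size=100):
    """Buffer events in a bounded in-memory queue and drain it with
    several concurrent client threads (hypothetical stand-ins for HEC clients)."""
    buffer = queue.Queue(maxsize=queue_size)
    delivered = []
    lock = threading.Lock()

    def client_worker():
        while True:
            item = buffer.get()
            if item is None:  # sentinel: no more events for this client
                break
            with lock:
                delivered.append(enrich(item, metadata))

    workers = [threading.Thread(target=client_worker) for _ in range(num_clients)]
    for worker in workers:
        worker.start()
    for event in events:
        buffer.put(event)   # blocks while the buffer is full
    for _ in workers:
        buffer.put(None)    # one sentinel per client
    for worker in workers:
        worker.join()
    return delivered


sample_events = [{"n": i} for i in range(10)]
result = run_pipeline(sample_events, {"environment": "test"}, num_clients=3, queue_size=4)
```

Because the clients run concurrently, the order of the delivered events is not guaranteed to match the order in which they were queued.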
Searching PCF Data in Splunk
After ingesting Pivotal Cloud Foundry events, use Splunk’s search language (SPL) to configure visualizations and alerts on important PCF logs and metrics.
The following SPL returns the percentage of events that your Splunk deployment sees and indexes. This search can identify any data loss within the Splunk Firehose Nozzle, which you can use to trigger an investigation into your environment.
subscription-id=splunk-firehose uuid=b28978ba-f83d-4d2f-99c3-c18b1a3f8ebf | stats count as total_events, max(nozzle-event-counter) as max_number | eval total=(total_events/max_number) * 100 | table total
Figure 2 – SPL demonstrating percentage of successfully indexed events.
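The arithmetic behind this search can be illustrated with a short Python sketch. It assumes that nozzle-event-counter increases by one for every event the Nozzle sends, so its maximum value approximates the number of events sent; that reading is an assumption made for this example, and the function name and sample numbers are hypothetical.

```python
def indexed_percentage(total_events, max_counter):
    """Mirror the eval step of the search: (total_events / max_number) * 100."""
    if max_counter <= 0:
        raise ValueError("max_counter must be positive")
    return (total_events / max_counter) * 100


# Hypothetical numbers: 750 events indexed, highest counter value seen is 1,000.
percentage = indexed_percentage(750, 1000)
print(percentage)  # 75.0
```

Under that assumption, a result below 100 suggests that some events sent by the Nozzle were not indexed.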
Version 1.0.0 Release Highlights - Fast, Scalable, Reliable
Highlights in this release include:
- Scale-out capability. Deploy more than one nozzle to your Pivotal Cloud Foundry environment for high availability.
- Stream events efficiently and securely to Splunk via the HTTP Event Collector (HEC).
- Concurrent streaming into your Splunk environment with multiple clients sending to HEC for increased event throughput.
- Send custom event metadata along with PCF events to enrich data inside your Splunk environment.
- The log tracing feature enables correlation searches and improves results inside your Splunk environment.
Based on what we heard from our users, we built three core capabilities into v1.0.0:
- Fast – Throughput from PCF to Splunk is 10x higher than in the private beta release.
- Scalable – Deploy multiple Splunk Firehose Nozzle instances concurrently to scale out as your PCF environment grows.
- Reliable – Improved stability of the Splunk Firehose Nozzle by fixing issues seen and reported by users.
We tested the Splunk Firehose Nozzle running as a single deployed nozzle on the AWS instance type c4.4xlarge. This EC2 instance type has 8 CPU and 32 GB memory. Storage is EBS-only and has a dedicated EBS bandwidth of 2,000 Mbps.
Tests were performed with structured and unstructured data with two different event sizes—256 and 1,024 bytes. The following is a table showing average performance metrics:
Figure 3 – Test results
Note: These performance results are a guideline; actual results may vary across configurations and environments.
Want to Send Feedback or an Enhancement Request?
For those interested in assisting with the next release, this project is hosted on GitHub. Feel free to open pull requests and raise issues there. For technical feedback and questions, please reach out to Splunk directly here or raise any questions on Splunk Answers.
Principal Product Manager – Splunk Data Collection