Extended Berkeley Packet Filter (eBPF) is an exciting technology that provides secure, high-performance kernel programmability directly from the operating system. It can expose a wide range of applications and kernel telemetry that is otherwise unavailable.
But with operating systems frequently processing very large volumes of network data, even with an efficient framework and cheap eBPF program runs, costs can add up quickly.
eBPF helps maintain low overhead while enabling a real-time, high-granularity, no-sampling architecture that delivers network insights in seconds, which reduces mean time to detect (MTTD). This article explains eBPF, including:
- How it works
- Benefits it offers
- How to use it with the Flowmill Collector and OpenTelemetry
- Solutions for common challenges
What is eBPF?
The Extended Berkeley Packet Filter (eBPF) is a kernel technology that allows programs to run inside the kernel without requiring changes to the kernel source code or the addition of new modules. It's a sandboxed virtual machine (VM) inside the Linux kernel where programmers can run BPF bytecode that uses specified kernel resources.
eBPF reduces the need to alter kernel source code, making it simpler for software to build on existing layers. As a result, it's a strong technology that has the potential to change the way you deliver services.
Initially, eBPF's main use was increasing observability and security while filtering network packets. Today, its functionality has been extended to various use cases such as providing high-performance networking and load balancing in modern data centers and cloud-native environments. Its core capabilities include:
- Extracting granular security observability data with low overhead
- Assisting application developers in tracing applications
- Providing insights for performance troubleshooting and preventive application and container runtime security enforcement, among others
How eBPF works
eBPF lets programmers execute custom sandboxed bytecode within the kernel without having to change the kernel or load kernel modules — all by unlocking access to kernel-level events. It does this by:
- Verifying programs as they are loaded, before they are attached to hook points within the kernel that are triggered by specific events.
- Calling helper functions to manipulate program data efficiently.
- Using key-value maps to share data between user space and kernel space.
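The flow described in these steps can be modelled in a few lines of plain Python. This is a toy user-space simulation, not real eBPF: `ToyKernel`, `attach`, `fire`, and `map_increment` are hypothetical names invented for illustration, and the `"tcp_connect"` hook name is only an example label.

```python
from collections import defaultdict


class ToyKernel:
    """Hypothetical stand-in for the kernel side: hook points plus one shared map."""

    def __init__(self):
        self.hooks = defaultdict(list)   # hook name -> attached programs
        self.counts = defaultdict(int)   # key-value map readable from "user space"

    def attach(self, hook, program):
        # A real kernel would run the verifier here before accepting the program.
        self.hooks[hook].append(program)

    def map_increment(self, key):
        # Plays the role of a helper function: the only way programs touch the map.
        self.counts[key] += 1

    def fire(self, hook, event):
        # An event at a hook point triggers every program attached to it.
        for program in self.hooks[hook]:
            program(self, event)


def count_connects_by_pid(kernel, event):
    # The "eBPF program": count events per process ID in the shared map.
    kernel.map_increment(event["pid"])


kernel = ToyKernel()
kernel.attach("tcp_connect", count_connects_by_pid)
for pid in (101, 101, 202):
    kernel.fire("tcp_connect", {"pid": pid})

# "User space" reads the aggregated result from the shared map.
connects = dict(kernel.counts)
print(connects)
```

The point of the sketch is the division of labour: the program runs where the event happens, and only the map contents are read back by the user-space side.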
Benefits of eBPF
eBPF is typically used to trace user-space processes from within the Linux kernel and to improve security and observability in networking. It offers a safe way to strengthen each of the following areas.
Security
eBPF combines visibility into, and control over, all aspects of the system, which makes it possible to build security systems that are more context-aware and have a higher level of control. Programs are effectively sandboxed, which means kernel source code is safe and unaltered. The verification phase makes sure that resources aren't tied up by programs that run indefinitely.
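To make the idea of rejecting programs that could run indefinitely concrete, here is a deliberately simplified Python sketch. It is not the kernel's verifier, which performs far more analysis. The instruction format, the `MAX_INSNS` limit, and the `toy_verify` function are all assumptions made for this example. The toy rule is: only forward, in-range jumps are allowed and the last instruction must be `exit`, so execution always moves forward and terminates.

```python
MAX_INSNS = 64  # hypothetical size limit for this toy model


def toy_verify(program):
    """Return (ok, reason) for a list of (opcode, operand) tuples."""
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSNS:
        return False, "program too long"
    for index, (opcode, operand) in enumerate(program):
        if opcode == "jmp":
            # Jump offsets are relative to the next instruction.
            target = index + 1 + operand
            if operand < 0:
                return False, f"backward jump at {index}"
            if target >= len(program):
                return False, f"jump out of range at {index}"
    if program[-1][0] != "exit":
        return False, "missing exit"
    return True, "ok"


straight_line = [("load", 0), ("jmp", 1), ("add", 1), ("exit", 0)]
looping = [("load", 0), ("add", 1), ("jmp", -2), ("exit", 0)]

print(toy_verify(straight_line))
print(toy_verify(looping))
```

The first program is accepted because it can only move forward; the second is refused before it ever runs because its backward jump could loop forever.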
Networking
eBPF adds programmability and increases network efficiency. Because the code runs directly in the kernel, packet processing is optimized without extra parsers and logic layers.
Observability & monitoring
eBPF provides a single accessible framework interface for collection and in-kernel aggregation of custom metrics, which:
- Provides in-depth visibility and a central monitoring dashboard of events’ metrics from a wide range of sources.
- Significantly reduces the overall system overhead.
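A rough way to see why in-kernel aggregation lowers overhead is to count how many records have to leave the kernel in each approach. The Python sketch below is only an illustration with made-up traffic; the addresses and byte sizes are invented, and a real eBPF program would keep the totals in a BPF map rather than a Python `Counter`.

```python
from collections import Counter

# Invented sample traffic: (source, destination, bytes) per packet.
events = (
    [("10.0.0.1", "10.0.0.2", 1500)] * 1000
    + [("10.0.0.1", "10.0.0.3", 400)] * 500
)

# Per-event export: every single event is handed to the collector.
per_event_records = len(events)

# Aggregation: sum bytes per (source, destination) and export only the totals.
bytes_by_flow = Counter()
for src, dst, size in events:
    bytes_by_flow[(src, dst)] += size
aggregated_records = len(bytes_by_flow)

print(per_event_records, aggregated_records)
```

In this made-up case the aggregated path exports one record per flow instead of one per packet, while the byte totals stay exact because nothing is sampled.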
Tracing & profiling
eBPF provides a single, powerful and easy-to-use framework for unified profiling and program tracing. Attaching eBPF programs to tracepoints in both user and kernel space gives unprecedented visibility into application runtime behavior, which can generate insights for troubleshooting.
The introspection provides enough sample data for internal visibility and performance improvement.
Using eBPF, Flowmill Collector & OpenTelemetry for observability
By guaranteeing that the kernel layer is monitored, eBPF improves observability, allowing for greater visibility, context and accuracy in your data and infrastructure.
One way is with Flowmill Collector, an agent that uses the eBPF technology to collect low-level data directly from the Linux kernel. It accomplishes this with very little expense in terms of CPU and network resources by leveraging open-source eBPF infrastructure to help create robust low-overhead observability.
Network observability is vital when solving system complexity challenges. Modern deployments are complex, some having hundreds to thousands of loosely coupled microservices written in multiple languages, with application frameworks running across an ephemeral compute infrastructure. This complexity makes problems difficult to diagnose, and deployments change day to day as services evolve.
What you might get from observability is a real-time map of the network and its dependencies, including where each service is running. It also provides metrics on how services and their dependencies are performing, regardless of the programming languages and application frameworks that the services are built with, by analyzing data from a defined set of important events. Because network telemetry is so granular, it is possible to drill down to an individual pod or host.
In this video, Jonathan Perry and the Splunk architect team explain some challenges they faced when building the Flowmill Collector and how OpenTelemetry solves them:
Challenge 1: More data for value-add context
One challenge of building a collector was that collecting information only from sockets was insufficient. One of the major advantages of network monitoring with eBPF is that you can see not only IP addresses but also the context of the communication: the process, the container, and the host associated with the traffic. Providing that context requires additional sources:
- Information from the cloud provider
- Information about containers from Docker and the orchestrator
- Information about network address translation and the mapping of external addresses to names that the users understand
The solve: The Flowmill Collector contains all this instrumentation, and the key advantage is that much of this instrumentation is reusable for other types of eBPF observability. For example, the metadata continues to be useful if you want to…
- Monitor context switches
- Collect profiling information
- Monitor files instead of sockets
- Monitor system calls
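The kind of enrichment described in this challenge amounts to joining a raw socket record with metadata from other sources. The Python sketch below shows the idea only; the lookup tables, field names, and the `enrich` function are hypothetical and do not reflect the Flowmill Collector's actual data model.

```python
# Hypothetical lookup tables. A real collector would fill these from the
# cloud provider, the container runtime or orchestrator, and NAT tracking.
container_by_pid = {4312: "checkout-7d9f"}
service_by_address = {"10.0.0.12": "payments"}


def enrich(flow):
    """Return a copy of a raw flow record with context fields added."""
    enriched = dict(flow)
    enriched["container"] = container_by_pid.get(flow["pid"], "unknown")
    # Fall back to the raw address when no friendly name is known.
    enriched["dst_service"] = service_by_address.get(flow["dst"], flow["dst"])
    return enriched


raw = {"pid": 4312, "src": "10.0.0.7", "dst": "10.0.0.12", "bytes": 2048}
enriched = enrich(raw)
print(enriched)
```

Because the lookup tables are independent of the event type, the same join would work for a file, system call, or profiling record that carries a process ID.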
Challenge 2: Controlling overhead
Another challenge is how to reduce overhead. In live systems, each container update comes with thousands of process and socket updates and hundreds of thousands of socket activity reports. If you encode container information on every socket report, you could spend a lot of CPU time encoding and decoding container metadata.
The solve: The Flowmill Collector sends container, process, and socket metadata only as updates when that metadata changes. Those updates are cached, eliminating much of the redundant work. This is one of the design decisions that enable the collector to achieve low CPU and network overheads so users can get always-on granular reporting in production.
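The update-and-cache idea can be sketched in Python as follows. This is an assumption-laden illustration rather than the collector's actual protocol: `MetadataSender`, its message tuples, and the sample identifiers are invented for the example.

```python
class MetadataSender:
    """Hypothetical sender that forwards container metadata only when it changes."""

    def __init__(self):
        self.last_sent = {}   # container_id -> metadata already sent
        self.messages = []    # everything that would go over the wire

    def report_socket(self, socket_id, container_id, metadata, byte_count):
        # Send the (large) metadata once, then again only if it differs.
        if self.last_sent.get(container_id) != metadata:
            self.messages.append(("container", container_id, metadata))
            self.last_sent[container_id] = metadata
        # Socket reports stay small: they reference the container by ID.
        self.messages.append(("socket", socket_id, container_id, byte_count))


sender = MetadataSender()
meta = {"image": "shop/checkout:1.4", "pod": "checkout-7d9f"}
for socket_id in (1, 2, 3):
    sender.report_socket(socket_id, "c1", meta, 512)

container_updates = [m for m in sender.messages if m[0] == "container"]
socket_reports = [m for m in sender.messages if m[0] == "socket"]
print(len(container_updates), len(socket_reports))
```

Three socket reports produce a single metadata message here, because the receiver can cache the container record and resolve later reports by ID.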
Other challenges encountered include:
- Header fetching and caching
- Visibility into observability
- Causality when reading from multiple perf-rings
- Fast dev cycles when loading eBPF code from CLI
Splunk supports OpenTelemetry
Splunk is committed to improving observability efficiency through open-source projects like OpenTelemetry, including by donating the Flowmill Collector. Combined with eBPF technology, this telemetry will not only provide a platform for high-performance kernel programmability, but will also augment your observability data pipeline to give users in-depth, vital information about their distributed applications.
This article was written by Faith Kilonzi, a full-stack software engineer, technical writer, and a DevOps enthusiast, with a passion for problem-solving through implementation of high-quality software products. She holds a bachelor’s degree in Computer Science from Ashesi University. She has experience working in fin-tech, research, technology, and consultancy industries.
This posting does not necessarily represent Splunk's position, strategies or opinion.