eBPF (extended Berkeley Packet Filter) is transforming how we observe and secure systems. It allows running sandboxed programs in the Linux kernel without changing kernel source code or loading kernel modules. The implications for observability are profound.
Here’s how eBPF is changing the game.
What Is eBPF
The Concept
┌──────────────────────────────────────────────────────────────────┐
│ User Space                                                        │
│                                                                   │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐             │
│  │    App 1    │   │    App 2    │   │  eBPF Tool  │             │
│  └─────────────┘   └─────────────┘   └──────┬──────┘             │
│                                             │                     │
├─────────────────────────────────────────────┼─────────────────────┤
│ Kernel                                      │                     │
│                                             │                     │
│  ┌──────────────────────────────────────────────────────────┐    │
│  │                      eBPF Programs                        │    │
│  │ ┌─────────┐ ┌────────────┐ ┌──────────┐ ┌─────────┐       │    │
│  │ │ Tracing │ │ Networking │ │ Security │ │   XDP   │       │    │
│  │ └─────────┘ └────────────┘ └──────────┘ └─────────┘       │    │
│  └──────────────────────────────────────────────────────────┘    │
│                                │                                  │
│  ┌─────────────────────────────┼────────────────────────────┐    │
│  │                     Kernel Functions                      │    │
│  │  syscalls, network stack, scheduler, filesystems, etc.    │    │
│  └──────────────────────────────────────────────────────────┘    │
│                                                                   │
└──────────────────────────────────────────────────────────────────┘
Why It Matters
Traditional observability:
- Limited to kernel-provided metrics
- Custom monitoring requires kernel modules
- Kernel modules can crash the system
- Significant overhead for tracing
eBPF observability:
- Programmable kernel instrumentation
- Safe, sandboxed execution
- Near-zero overhead possible
- Access to everything in the kernel
Use Cases
System Tracing
# BCC tool: trace file opens
from bcc import BPF

# Kprobe arguments after ctx map positionally to do_sys_open(dfd, filename, flags, mode)
program = """
int trace_open(struct pt_regs *ctx, int dfd, const char __user *filename, int flags) {
    bpf_trace_printk("open called: %s\\n", filename);
    return 0;
}
"""

b = BPF(text=program)
b.attach_kprobe(event="do_sys_open", fn_name="trace_open")
b.trace_print()
Network Observability
// XDP program for packet counting
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

// One-entry array map holding the packet counter
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} packet_count SEC(".maps");

SEC("xdp")
int count_packets(struct xdp_md *ctx) {
    __u32 key = 0;
    __u64 *count = bpf_map_lookup_elem(&packet_count, &key);
    if (count) {
        __sync_fetch_and_add(count, 1);
    }
    return XDP_PASS;
}
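In a full deployment the map would be read from user space. As a rough sketch, the same counter can also be written with BCC, attached to an interface, and polled from Python; the interface name (eth0) and the one-slot map layout are illustrative assumptions:
from bcc import BPF
from time import sleep
import ctypes as ct

prog = """
#include <uapi/linux/bpf.h>

BPF_ARRAY(packet_count, u64, 1);

int count_packets(struct xdp_md *ctx) {
    u32 key = 0;
    u64 *count = packet_count.lookup(&key);
    if (count)
        __sync_fetch_and_add(count, 1);
    return XDP_PASS;
}
"""

b = BPF(text=prog)
b.attach_xdp("eth0", b.load_func("count_packets", BPF.XDP))
try:
    while True:
        sleep(1)
        # Print the cumulative packet count once per second
        print("packets so far: %d" % b["packet_count"][ct.c_int(0)].value)
finally:
    b.remove_xdp("eth0")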
Continuous Profiling
CPU profiling without application changes:
# Profile all CPUs at 99 Hz for 30 seconds (folded output for flame graphs)
profile -F 99 -f 30 > flamegraph.txt
# Profile a specific process
profile -F 99 -f -p $(pgrep myapp) 30 > myapp_profile.txt
Container Observability
// Track container network connections (libbpf-style; connection_t and the
// "events" perf buffer map are defined elsewhere in the program)
SEC("kprobe/tcp_connect")
int BPF_KPROBE(trace_connect, struct sock *sk) {
    struct connection_t conn = {};

    // Attribute the connection to a container via cgroup id and PID
    conn.pid = bpf_get_current_pid_tgid() >> 32;
    conn.cgroup_id = bpf_get_current_cgroup_id();

    // Read destination address and port from the socket (CO-RE reads)
    conn.daddr = BPF_CORE_READ(sk, __sk_common.skc_daddr);
    conn.dport = BPF_CORE_READ(sk, __sk_common.skc_dport);

    // Ship the event to user space through a perf ring buffer
    bpf_perf_event_output(ctx, &events, BPF_F_CURRENT_CPU, &conn, sizeof(conn));
    return 0;
}
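For the user-space half, a BCC version of the same idea keeps the kernel program and the consumer in one script. A minimal sketch; the struct fields and output format are illustrative assumptions, not any particular tool's schema:
from bcc import BPF
import socket, struct

bpf_text = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>

struct connection_t {
    u32 pid;
    u64 cgroup_id;
    u32 daddr;
    u16 dport;
};
BPF_PERF_OUTPUT(events);

int trace_connect(struct pt_regs *ctx, struct sock *sk) {
    struct connection_t conn = {};
    conn.pid = bpf_get_current_pid_tgid() >> 32;
    conn.cgroup_id = bpf_get_current_cgroup_id();
    conn.daddr = sk->__sk_common.skc_daddr;
    conn.dport = sk->__sk_common.skc_dport;
    events.perf_submit(ctx, &conn, sizeof(conn));
    return 0;
}
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="tcp_connect", fn_name="trace_connect")

# Print one line per outbound TCP connection attempt
def handle_event(cpu, data, size):
    event = b["events"].event(data)
    daddr = socket.inet_ntoa(struct.pack("I", event.daddr))
    print("pid=%d cgroup=%d dst=%s:%d" % (
        event.pid, event.cgroup_id, daddr, socket.ntohs(event.dport)))

b["events"].open_perf_buffer(handle_event)
while True:
    b.perf_buffer_poll()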
Tools and Frameworks
BCC (BPF Compiler Collection)
Python-friendly eBPF toolkit:
# Histogram of read() latencies
from bcc import BPF
from time import sleep

bpf_text = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32, u64);
BPF_HISTOGRAM(dist);

int trace_read_entry(struct pt_regs *ctx) {
    u64 ts = bpf_ktime_get_ns();
    u32 pid = bpf_get_current_pid_tgid();
    start.update(&pid, &ts);
    return 0;
}

int trace_read_return(struct pt_regs *ctx) {
    u64 *tsp, delta;
    u32 pid = bpf_get_current_pid_tgid();
    tsp = start.lookup(&pid);
    if (tsp != 0) {
        delta = bpf_ktime_get_ns() - *tsp;
        dist.increment(bpf_log2l(delta / 1000));  // nanoseconds -> microseconds
        start.delete(&pid);
    }
    return 0;
}
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="vfs_read", fn_name="trace_read_entry")
b.attach_kretprobe(event="vfs_read", fn_name="trace_read_return")

# Collect for 10 seconds, then print the latency distribution
sleep(10)
b["dist"].print_log2_hist("usecs")
bpftrace
High-level tracing language:
# One-liner: syscall counts
bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'
# File I/O latency by process
bpftrace -e '
kprobe:vfs_read { @start[tid] = nsecs; }
kretprobe:vfs_read /@start[tid]/ {
    @us[comm] = hist((nsecs - @start[tid]) / 1000);
    delete(@start[tid]);
}
'
# TCP connections by process
bpftrace -e '
kprobe:tcp_connect {
    @[comm] = count();
}
'
Cilium
eBPF-based Kubernetes networking:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
Pixie
Automatic application observability:
# Install Pixie
px deploy
# Query application metrics without instrumentation
px run px/service_stats
px run px/http_data
px run px/dns_data
Production Observability
Continuous Profiling
# Parca / Pyroscope / Polar Signals
setup:
  - Deploy eBPF agent to nodes
  - Automatic profiling of all processes
  - No code changes required
  - Minimal overhead (<1% CPU)
benefits:
  - Always-on profiling
  - Historical analysis
  - Compare before/after deployments
  - Find performance regressions
Network Flow Monitoring
# Hubble (Cilium's observability layer)
observability:
  network_flows:
    - Source and destination pods
    - Protocol and port
    - DNS queries
    - HTTP requests (L7)
  security_events:
    - Dropped packets
    - Policy violations
    - Connection attempts
  performance:
    - Latency histograms
    - Throughput metrics
    - Retransmission rates
Security Monitoring
# Falco / Tetragon for runtime security
events_captured:
  - Process execution (see the sketch below)
  - File access
  - Network connections
  - System calls
  - Container escapes
response_options:
  - Alert
  - Log
  - Kill process
  - Block network
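As a toy illustration of the kind of signal these tools collect (not how Falco or Tetragon are actually implemented), a few lines of BCC can stream every process execution from a tracepoint:
from bcc import BPF

prog = """
// Fires on every successful execve(); fields come from the tracepoint format
TRACEPOINT_PROBE(sched, sched_process_exec) {
    bpf_trace_printk("exec: pid %d\\n", args->pid);
    return 0;
}
"""

b = BPF(text=prog)  # tracepoint probes attach automatically
b.trace_print()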
Performance Considerations
Overhead
ebpf_overhead:
  well_designed:
    - Sub-microsecond per event
    - Near-zero when not triggered
    - Scales with event rate
  poorly_designed:
    - Excessive map lookups
    - Complex computations in kernel
    - Too many attached probes
  best_practices:
    - Filter early, in eBPF rather than user space (see the sketch below)
    - Use appropriate map types
    - Batch data transfer to user space
    - Monitor eBPF program overhead
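To make the first best practice concrete, here is a minimal BCC sketch of in-kernel filtering; the traced function (vfs_write) and the hard-coded PID are placeholders:
from bcc import BPF

TARGET_PID = 1234  # hypothetical PID we care about

prog = """
int trace_write(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    if (pid != TARGET_PID)
        return 0;  // filtered in the kernel: nothing crosses to user space
    bpf_trace_printk("write() by target pid\\n");
    return 0;
}
"""

b = BPF(text=prog.replace("TARGET_PID", str(TARGET_PID)))
b.attach_kprobe(event="vfs_write", fn_name="trace_write")
b.trace_print()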
Safety
verifier_guarantees:
  - No infinite loops
  - Bounded execution time
  - Memory safety (see the sketch below)
  - No kernel crashes
limitations:
  - Stack size limited (512 bytes)
  - Program size limited
  - Some kernel functions not callable
  - Verifier can reject valid programs
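One concrete example of what memory safety means here: an XDP program may only touch packet bytes it has explicitly bounds-checked, and the verifier refuses to load it otherwise. A minimal BCC-based sketch; the interface name is a placeholder:
from bcc import BPF

prog = """
#include <uapi/linux/bpf.h>
#include <linux/if_ether.h>

int parse_eth(struct xdp_md *ctx) {
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;

    // Without this bounds check the verifier rejects the program at load time
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;

    bpf_trace_printk("ethertype: 0x%x\\n", eth->h_proto);
    return XDP_PASS;
}
"""

b = BPF(text=prog)
b.attach_xdp("eth0", b.load_func("parse_eth", BPF.XDP))  # hypothetical interface
b.trace_print()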
Getting Started
Prerequisites
# Check kernel version (4.15+ for most features)
uname -r
# Install BCC tools
apt-get install bpfcc-tools linux-headers-$(uname -r)
# Or bpftrace
apt-get install bpftrace
Simple Examples
# List available tracepoints
bpftrace -l 'tracepoint:*'
# Count syscalls
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
# Block I/O latency histogram
biolatency
# TCP connection tracing
tcpconnect
# File open tracing
opensnoop
Key Takeaways
- eBPF enables programmable kernel observability without kernel modifications
- Safe, sandboxed execution in the kernel means no risk of crashing the system
- Near-zero overhead when designed well, since data is filtered in the kernel
- BCC and bpftrace provide accessible interfaces for custom tracing
- Cilium brings eBPF networking to Kubernetes with observability built-in
- Continuous profiling without code changes or restarts
- Security monitoring can observe syscalls, network, and file access
- eBPF is becoming the foundation for next-generation observability
- Start with existing tools (BCC, bpftrace) before writing custom programs
eBPF changes what’s possible for observability. Deep system visibility that was previously impossible or expensive is now routine.