Fixing eBPF Spinlock Issues in the Linux Kernel
Summary
A deep dive into system freezes caused by eBPF spinlock contention in the Linux kernel. The Superluminal CPU profiler was triggering periodic freezes, and this article documents the debugging journey that led into kernel internals.
Key Findings
- Root Cause: NMI (Non-Maskable Interrupt) sampling interrupts competing with eBPF ring buffer spinlocks
- Freeze Duration: Exactly 250ms, matching RES_DEF_TIMEOUT in the kernel spinlock code
- Minimal Repro: Only 20 lines of eBPF code needed to reproduce the issue
- Solution: Traced the freezes to a bug in Linux 5.17's rqspinlock implementation
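The 250ms figure is easier to trust with the arithmetic written out. The sketch below assumes that RES_DEF_TIMEOUT is defined as a quarter of NSEC_PER_SEC, which is how I recall the kernel source expressing it. Treat the exact expression as an assumption and check it against your kernel tree. The helper timeout_ms() is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's definitions; verify against your tree. */
#define NSEC_PER_SEC    1000000000ULL
#define NSEC_PER_MSEC   1000000ULL
#define RES_DEF_TIMEOUT (NSEC_PER_SEC / 4)

/* Hypothetical helper: convert a nanosecond timeout to whole milliseconds. */
static inline uint64_t timeout_ms(uint64_t timeout_ns)
{
    return timeout_ns / NSEC_PER_MSEC;
}

/* A quarter of a second is 250 ms, the observed freeze duration. */
static_assert(RES_DEF_TIMEOUT / NSEC_PER_MSEC == 250,
              "timeout should equal the observed 250 ms freeze");
```

A freeze that lasts exactly one timeout period, rather than a varying amount of time, suggests that a lock wait is expiring instead of the lock being released.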
Technical Details
The issue occurred because:
- eBPF programs use bpf_ringbuf_reserve(), which acquires a spinlock
- NMIs cannot be disabled (unlike regular interrupts)
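- When a sampling interrupt arrives on a CPU that already holds the spinlock, the lock attempt made from the interrupt cannot succeed and waits out the full 250ms timeout
- When multiple CPUs hit this at the same time, the whole system appears to freeze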
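The chain of events above can be imitated in userspace. The following is a minimal sketch, not the kernel's rqspinlock: it assumes a plain test-and-set lock with a deadline, and every name in it (demo_lock, fake_nmi_handler, simulate_nmi_during_reserve) is hypothetical. It shows only why a handler that runs on top of the lock holder stalls for the whole timeout: the holder cannot make progress until the handler returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the ring buffer spinlock. */
struct demo_lock { atomic_flag held; };

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Spin until the lock is taken or the deadline passes. */
static int demo_lock_acquire(struct demo_lock *l, uint64_t timeout_ns)
{
    uint64_t deadline = now_ns() + timeout_ns;

    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        if (now_ns() >= deadline)
            return -ETIMEDOUT;  /* give up instead of spinning forever */
    }
    return 0;
}

static void demo_lock_release(struct demo_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Stand-in for the sampling handler: it wants the same lock. */
static int fake_nmi_handler(struct demo_lock *l, uint64_t timeout_ns)
{
    int err = demo_lock_acquire(l, timeout_ns);

    if (err)
        return err;             /* the sample would be dropped here */
    demo_lock_release(l);
    return 0;
}

/* Returns how long the simulated CPU was stuck; *err gets the handler result. */
static uint64_t simulate_nmi_during_reserve(uint64_t timeout_ns, int *err)
{
    struct demo_lock l = { .held = ATOMIC_FLAG_INIT };
    int rc = demo_lock_acquire(&l, timeout_ns);  /* the reserve path takes the lock */
    uint64_t start, stalled;

    assert(rc == 0);
    (void)rc;
    start = now_ns();
    *err = fake_nmi_handler(&l, timeout_ns);     /* interrupt lands while the lock is held */
    stalled = now_ns() - start;
    demo_lock_release(&l);                       /* the holder resumes only after the handler returns */
    return stalled;
}
```

Calling simulate_nmi_during_reserve() with a 250ms timeout should report a stall of at least that long and a handler result of -ETIMEDOUT. The real kernel lock is more elaborate than this, so the sketch illustrates the symptom rather than the specific bug.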
Debugging Techniques
- Physical machine required (VM couldn't reproduce)
- Serial port for kernel debugging (gdb)
- Binary search through eBPF code to find minimal repro
- Reading kernel source to understand lock implementation
Why This Matters
This is a great example of how subtle kernel interactions can cause hard-to-debug issues. The fix involved working with the Linux kernel team to address the underlying rqspinlock problem.