Kernel Debugging Guide

This guide covers essential debugging techniques for Linux kernel development. Master these tools to diagnose kernel bugs efficiently.


Quick Reference

| Problem | Tool | Command |
|---|---|---|
| Kernel panic | dmesg | dmesg \| tail -50 |
| Module won’t load | dmesg | dmesg \| grep -i error |
| Function tracing | ftrace | echo function > current_tracer |
| Syscall tracing | strace | strace -e openat ./program |
| Performance issues | perf | perf top |
| Memory issues | KASAN | Enable CONFIG_KASAN |
| Step debugging | GDB+QEMU | See section below |

1. printk() - Your First Debugging Tool

Log Levels

printk(KERN_EMERG   "Emergency: %s\n", msg);   /* 0 - System unusable */
printk(KERN_ALERT   "Alert: %s\n", msg);       /* 1 - Action required */
printk(KERN_CRIT    "Critical: %s\n", msg);    /* 2 - Critical condition */
printk(KERN_ERR     "Error: %s\n", msg);       /* 3 - Error condition */
printk(KERN_WARNING "Warning: %s\n", msg);     /* 4 - Warning */
printk(KERN_NOTICE  "Notice: %s\n", msg);      /* 5 - Normal but significant */
printk(KERN_INFO    "Info: %s\n", msg);        /* 6 - Informational */
printk(KERN_DEBUG   "Debug: %s\n", msg);       /* 7 - Debug messages */
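
The level also decides whether a message appears on the console: the kernel prints it there only when its level number is lower than the console log level, which is the first field of /proc/sys/kernel/printk. The ring buffer read by dmesg records the message either way. The rule can be sketched in shell as follows; the function name `reaches_console` is hypothetical and exists only for this illustration.

```shell
# reaches_console LEVEL [CONSOLE_LOGLEVEL]
# Succeeds (exit status 0) when a message logged at LEVEL (0-7) would be
# printed on the console. CONSOLE_LOGLEVEL defaults to the first field of
# /proc/sys/kernel/printk on the running system.
reaches_console() {
    msg_level=$1
    if [ -n "${2:-}" ]; then
        console_level=$2
    else
        read -r console_level _ < /proc/sys/kernel/printk
    fi
    # Lower numbers are more severe; only levels below the threshold print.
    [ "$msg_level" -lt "$console_level" ]
}
```

For example, `reaches_console 7 && echo shown || echo hidden` reports whether KERN_DEBUG messages currently reach the console.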

Modern Macros (Preferred)

pr_info("Module loaded\n");
pr_err("Failed to allocate memory\n");
pr_debug("Debug: value = %d\n", val);   /* Compiled out unless DEBUG or CONFIG_DYNAMIC_DEBUG is set */

/* Dynamic debug - can be enabled at runtime */
pr_debug("count = %d\n", count);
/* Enable: echo 'module mymod +p' > /sys/kernel/debug/dynamic_debug/control */
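
Toggling dynamic debug by hand gets repetitive, so a small wrapper can help. The sketch below assumes debugfs is mounted at /sys/kernel/debug and that the kernel has CONFIG_DYNAMIC_DEBUG; the function name `dyndbg` and its optional third argument (an alternate control file, handy for dry runs) are hypothetical.

```shell
# dyndbg MODULE on|off [CONTROL_FILE]
# Enables (+p) or disables (-p) every pr_debug() call site in MODULE.
dyndbg() {
    dd_module=$1
    case $2 in
        on)  dd_flag=+p ;;
        off) dd_flag=-p ;;
        *)   echo "usage: dyndbg MODULE on|off [CONTROL_FILE]" >&2
             return 2 ;;
    esac
    dd_control=${3:-/sys/kernel/debug/dynamic_debug/control}
    # The control file accepts one query per write.
    printf 'module %s %s\n' "$dd_module" "$dd_flag" > "$dd_control"
}
```

Run as root, `dyndbg mymod on` has the same effect as the echo command shown above, and `dyndbg mymod off` reverses it.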

Useful Format Specifiers

pr_info("Pointer: %px\n", ptr);      /* Raw address (plain %p prints a hashed value) */
pr_info("Symbol: %pS\n", func);       /* Kernel symbol name plus offset */
pr_info("dentry: %pd\n", dentry);     /* Dentry name */
pr_info("task: %s\n", current->comm); /* Current process name */
pr_info("PID: %d\n", current->pid);   /* Current PID */

Viewing Kernel Messages

# View kernel log
dmesg
dmesg | tail -20
dmesg -w                    # Follow (like tail -f)
dmesg -T                    # Human-readable timestamps
dmesg -l err,warn           # Filter by level
dmesg -c                    # Print, then clear the ring buffer (root)

# Persistent logs (systemd)
journalctl -k               # Kernel messages
journalctl -k -f            # Follow kernel log

2. GDB + QEMU Debugging

Setup

Step 1: Configure Kernel

# Essential config options for debugging
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_DWARF5=y
CONFIG_GDB_SCRIPTS=y
CONFIG_FRAME_POINTER=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y

# Disable for reliable debugging
# CONFIG_RANDOMIZE_BASE is not set  (Disable KASLR)

Step 2: Start QEMU

# Start with debug stub
qemu-system-x86_64 \
    -kernel arch/x86/boot/bzImage \
    -initrd initramfs.cpio.gz \
    -append "console=ttyS0 nokaslr" \
    -nographic \
    -s -S    # -s = GDB on :1234, -S = pause at start

Step 3: Connect GDB

gdb vmlinux
(gdb) target remote :1234
(gdb) continue

Essential GDB Commands

# Breakpoints
break start_kernel              # Break at function
break kernel/fork.c:1234        # Break at line
break copy_process if pid == 0  # Conditional breakpoint
delete 1                        # Delete breakpoint 1
info breakpoints                # List breakpoints

# Execution
continue (c)                    # Continue execution
next (n)                        # Step over
step (s)                        # Step into
finish                          # Run until function returns

# Inspection
print task->pid                 # Print variable
print/x addr                    # Print in hex
print *ptr                      # Dereference pointer
x/10gx $rsp                     # Examine 10 64-bit hex words at RSP
x/s str                         # Examine string

# Stack
backtrace (bt)                  # Show call stack
frame 3                         # Select frame 3
info locals                     # Show local variables
info args                       # Show function arguments

# Threads/CPUs
info threads                    # List CPUs/threads
thread 2                        # Switch to thread 2 (QEMU numbers threads from 1, so this is vCPU 1)

# Kernel-specific
lx-dmesg                        # Show kernel log
lx-ps                           # Show processes
lx-lsmod                        # Show modules
lx-symbols                      # Reload symbols after module load

Debugging a Kernel Module

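# 1. Load module in the QEMU guest shell (not the QEMU monitor)
insmod mymodule.ko

# 2. Find module load address
(gdb) lx-lsmod
# Or, in the guest as root: cat /sys/module/mymodule/sections/.text

# 3. Add symbol file (use the address found in step 2; this one is an example)
(gdb) add-symbol-file mymodule.ko 0xffffffffa0000000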

# 4. Set breakpoints
(gdb) break my_function
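
Passing only the .text address leaves the module's global variables unresolved. GDB's add-symbol-file also accepts `-s SECTION ADDRESS` pairs, so the full command can be assembled from the sysfs section files. The sketch below is meant to run in the guest as root (the section files are readable by root only); the function name `module_symbol_cmd` and the optional directory argument are hypothetical.

```shell
# module_symbol_cmd MODULE_KO [SECTIONS_DIR]
# Prints an add-symbol-file command covering .text plus any of
# .data, .bss and .rodata that the loaded module exposes.
module_symbol_cmd() {
    ko_path=$1
    # Module names in sysfs use underscores where file names may use hyphens.
    mod_name=$(basename "$ko_path" .ko | tr '-' '_')
    sec_dir=${2:-/sys/module/$mod_name/sections}
    gdb_cmd="add-symbol-file $ko_path $(cat "$sec_dir/.text")"
    for sec in .data .bss .rodata; do
        if [ -r "$sec_dir/$sec" ]; then
            gdb_cmd="$gdb_cmd -s $sec $(cat "$sec_dir/$sec")"
        fi
    done
    printf '%s\n' "$gdb_cmd"
}
```

Paste the printed line at the (gdb) prompt on the host. With CONFIG_GDB_SCRIPTS enabled, lx-symbols does the same work automatically.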

Common Scenarios

Debugging a Crash

# After panic, examine the crash
(gdb) bt
(gdb) info registers
(gdb) print $rip
(gdb) list *$rip

Finding Where Memory Corruption Occurred

# Set hardware watchpoint
(gdb) watch *(int*)0xffff88007c0d1234
(gdb) continue
# GDB stops when value changes

3. ftrace - Kernel Function Tracer

Basic Usage

cd /sys/kernel/debug/tracing

# Enable function tracer
echo function > current_tracer
echo 1 > tracing_on
cat trace_pipe

# Disable tracing
echo 0 > tracing_on
echo nop > current_tracer

Trace Specific Functions

# Filter functions
echo 'vfs_*' > set_ftrace_filter
echo function > current_tracer

# Exclude functions
echo '!vfs_read' >> set_ftrace_filter

# Clear filters
echo > set_ftrace_filter
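
The enable, filter, and disable steps above are easy to leave half-done, which keeps tracing overhead active. One possible wrapper runs a bounded session and restores the defaults afterwards. The function name `trace_funcs` is hypothetical, and the optional third argument (an alternate tracefs directory) is an assumption added so the function can be tried against a scratch directory.

```shell
# trace_funcs PATTERN SECONDS [TRACEFS_DIR]
# Traces functions matching PATTERN for SECONDS, prints the captured
# trace, then resets the tracer and the filter. Needs root on a real system.
trace_funcs() {
    tf_pattern=$1
    tf_seconds=$2
    tf_dir=${3:-/sys/kernel/debug/tracing}

    echo 0 > "$tf_dir/tracing_on"                  # start from a stopped state
    echo "$tf_pattern" > "$tf_dir/set_ftrace_filter"
    echo function > "$tf_dir/current_tracer"
    echo 1 > "$tf_dir/tracing_on"
    sleep "$tf_seconds"
    echo 0 > "$tf_dir/tracing_on"

    cat "$tf_dir/trace"                            # static snapshot of the buffer

    echo nop > "$tf_dir/current_tracer"            # restore defaults
    echo > "$tf_dir/set_ftrace_filter"
}
```

For example, `trace_funcs 'vfs_*' 2 > vfs.trace` captures two seconds of VFS activity. Reading `trace` rather than `trace_pipe` means the call returns instead of blocking for more data.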

Function Graph Tracer

echo function_graph > current_tracer

# Output shows call graph with timing:
#  0)               |  vfs_read() {
#  0)   0.123 us    |    rw_verify_area();
#  0)               |    __vfs_read() {
#  0)   0.456 us    |      new_sync_read();
#  0)   0.789 us    |    }
#  0)   1.234 us    |  }

Tracepoints

# List available tracepoints
cat available_events

# Enable syscall tracing
echo 1 > events/syscalls/sys_enter_openat/enable
cat trace_pipe

# Filter events by process name
echo 'comm == "myprogram"' > events/syscalls/sys_enter_openat/filter

4. perf - Performance Analysis

Basic Profiling

# Count events
perf stat ./program
perf stat -e cycles,instructions,cache-misses ./program

# Sample call stacks
perf record -g ./program
perf report

# Live top-like view
perf top
perf top -p <pid>

Kernel Profiling

# Profile kernel functions
sudo perf top -a

# Record kernel activity
sudo perf record -a -g sleep 10
sudo perf report

# Trace specific kernel function
sudo perf probe vfs_read
sudo perf record -e probe:vfs_read -a sleep 5
sudo perf probe -d vfs_read  # Remove probe

Flamegraphs

# Generate data
sudo perf record -a -g sleep 30
sudo perf script > out.perf

# Create flamegraph (requires FlameGraph tools)
stackcollapse-perf.pl out.perf > out.folded
flamegraph.pl out.folded > flamegraph.svg
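
The folded file is also useful without the SVG. Each line is a semicolon-separated stack followed by a sample count, so a short awk pass can rank the functions that were on-CPU most often (the leaf frames). The following is a sketch, with `hot_leaves` as a hypothetical helper name.

```shell
# hot_leaves [N] < out.folded
# Prints the N (default 10) leaf functions with the most samples,
# one "count function" pair per line, highest first.
hot_leaves() {
    awk '{
        count = $NF                       # last field is the sample count
        stack = $0
        sub(/ [0-9]+$/, "", stack)        # drop the count, keep the stack
        n = split(stack, frames, ";")
        total[frames[n]] += count         # frames[n] is the leaf frame
    }
    END { for (f in total) printf "%d %s\n", total[f], f }' |
        sort -k1,1nr -k2 | head -n "${1:-10}"
}
```

Usage: `hot_leaves 20 < out.folded`. The result is a flat profile, so it complements the flamegraph rather than replacing it.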

5. strace - System Call Tracing

Basic Usage

strace ./program
strace -e openat,read,write ./program   # Filter syscalls
strace -f ./program                      # Follow forks
strace -c ./program                      # Summary statistics
strace -T ./program                      # Show time in syscall
strace -o trace.log ./program            # Output to file

Tracing Running Process

strace -p <pid>
strace -p <pid> -e read,write

Useful Patterns

# What files does it open?
strace -e openat ./program 2>&1 | grep -v ENOENT

# What's it waiting on?
strace -e poll,select,epoll_wait ./program

# Network activity
strace -e socket,connect,sendto,recvfrom ./program
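
When a trace is long, a summary of which calls failed and with which errno is often the quickest lead. The sketch below post-processes a log written with `strace -o trace.log`. It assumes the default line format, optionally prefixed by a PID as produced by -f, and the name `strace_failures` is hypothetical.

```shell
# strace_failures < trace.log
# Counts failed system calls grouped by syscall name and errno,
# printing "count syscall ERRNO" lines, most frequent first.
strace_failures() {
    awk '/ = -1 E[A-Z0-9]+ / {
        line = $0
        sub(/^\[pid +[0-9]+\] /, "", line)    # prefix added by strace -f
        sub(/^[0-9]+ +/, "", line)            # prefix added by strace -f -o
        name = line
        sub(/\(.*/, "", name)                 # keep text before the first "("
        err = $0
        sub(/.* = -1 /, "", err)              # keep text after the return value
        sub(/ .*/, "", err)                   # keep only the errno name
        count[name " " err]++
    }
    END { for (k in count) printf "%d %s\n", count[k], k }' |
        sort -k1,1nr -k2
}
```

Usage: `strace -f -o trace.log ./program; strace_failures < trace.log`.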

6. Memory Debugging

KASAN (Kernel Address Sanitizer)

Detects use-after-free and out-of-bounds accesses.

# Enable in config
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y

# KASAN will print detailed reports on violations:
# BUG: KASAN: use-after-free in my_function+0x42/0x100

KMSAN (Kernel Memory Sanitizer)

Detects uninitialized memory reads.

CONFIG_KMSAN=y

KMEMLEAK

Detects memory leaks.

CONFIG_DEBUG_KMEMLEAK=y

# Scan for leaks
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
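
A kmemleak report can run to hundreds of entries. Grouping them by the task that made the allocation shows where to look first. The sketch below assumes the usual report layout, in which an "unreferenced object ... (size N):" line is followed by a line starting with comm "name"; the function name `kmemleak_summary` is hypothetical.

```shell
# kmemleak_summary [REPORT_FILE]
# Prints "comm N objects M bytes" per allocating task.
# Reads /sys/kernel/debug/kmemleak by default (root only).
kmemleak_summary() {
    awk '/^unreferenced object/ {
        size = $0
        sub(/.*\(size /, "", size)        # text after "(size "
        sub(/\).*/, "", size)             # up to the closing parenthesis
    }
    /^ *comm "/ {
        comm = $0
        sub(/^ *comm "/, "", comm)
        sub(/".*/, "", comm)              # task name between the quotes
        objs[comm]++
        bytes[comm] += size
    }
    END {
        for (c in objs)
            printf "%s %d objects %d bytes\n", c, objs[c], bytes[c]
    }' "${1:-/sys/kernel/debug/kmemleak}" | sort
}
```

Saving the report first (`cat /sys/kernel/debug/kmemleak > leaks.txt`) and running `kmemleak_summary leaks.txt` keeps the data stable between scans.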

SLUB Debugging

# Boot with debug options
slub_debug=FZPU

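# Per-cache debugging: restrict the flags to one cache at boot
slub_debug=F,<cache>

# Older kernels also allow toggling a check at runtime
# (the sysfs attribute is read-only on recent kernels)
echo 1 > /sys/kernel/slab/<cache>/sanity_checks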

7. Lockdep - Lock Debugging

Enable Lock Debugging

CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_LOCKDEP=y

What Lockdep Detects

  • Deadlocks (circular dependencies)
  • Lock ordering violations
  • Incorrect lock usage (e.g., acquiring a sleeping lock such as a mutex in atomic context)
  • Recursive (double) acquisition of the same lock

Lockdep Output Example

======================================================
WARNING: possible circular locking dependency detected
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
 (&mm->mmap_lock){++++}-{3:3}, at: do_page_fault+0x123/0x456

but task is already holding lock:
 (&sb->s_type->i_mutex_key#5){++++}-{3:3}, at: vfs_read+0x78/0x90

which lock already depends on the new lock.

8. Common Debugging Scenarios

Scenario 1: Module Won’t Load

# Check for errors
dmesg | tail -20

# Common issues:
# - Version mismatch: rebuild against the running kernel's headers
# - "Unknown symbol": a required module or CONFIG option is missing
# - "Invalid module format": vermagic or architecture mismatch
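
A version mismatch can be confirmed before loading by comparing the module's vermagic string with the running kernel release. The sketch below takes the output of `modinfo -F vermagic`, whose first word is the release the module was built for; the function name `vermagic_matches` is hypothetical.

```shell
# vermagic_matches VERMAGIC [KERNEL_RELEASE]
# Compares the release embedded in a module's vermagic string with
# KERNEL_RELEASE (default: the output of uname -r).
vermagic_matches() {
    built_for=${1%% *}                 # first word of the vermagic string
    running=${2:-$(uname -r)}
    if [ "$built_for" = "$running" ]; then
        echo "OK: module built for $built_for"
    else
        echo "MISMATCH: module built for $built_for, running $running"
        return 1
    fi
}
```

Usage: `vermagic_matches "$(modinfo -F vermagic mymodule.ko)"`. This checks the release only; the remaining vermagic words (SMP, preempt, modversions) must also agree for the module to load.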

Scenario 2: Kernel Panic

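# Get panic info (after a panic the system is halted, so search the
# captured serial-console or pstore log rather than a live dmesg)
grep -A 20 "Kernel panic" panic.log

# Decode stack trace (needs a vmlinux built with debug info)
scripts/decode_stacktrace.sh vmlinux < panic.log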

# With GDB
(gdb) list *0xffffffff81234567  # Address from trace
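
Trace entries are printed as symbol+offset/size, and GDB can resolve that form directly with `list *(symbol+offset)`. A small filter can turn a saved log into ready-to-paste commands. This is only a sketch: the name `trace_to_gdb` is hypothetical, and symbols that contain dots (compiler-generated clones) are skipped because GDB would not parse them as expressions.

```shell
# trace_to_gdb < panic.log
# Emits one GDB "list" command per distinct symbol+offset in the log.
trace_to_gdb() {
    grep -oE '[A-Za-z_][A-Za-z0-9_]*\+0x[0-9a-f]+/0x[0-9a-f]+' |
        sed -E 's|^(.*)\+(0x[0-9a-f]+)/.*$|list *(\1+\2)|' |
        awk '!seen[$0]++'              # drop repeats, keep first-seen order
}
```

Usage: `trace_to_gdb < panic.log > cmds.gdb`, then `gdb -x cmds.gdb vmlinux`. For frames inside a module, load the module's symbols first.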

Scenario 3: Soft Lockup

# Soft lockup = CPU stuck in kernel for too long
# Look for "watchdog: BUG: soft lockup" in dmesg

# Often caused by:
# - Infinite loop
# - Spinlock held too long
# - Preemption disabled for too long (interrupts disabled for too long
#   is reported as a hard lockup instead)

Scenario 4: Memory Corruption

# Enable debugging
CONFIG_SLUB_DEBUG=y            # CONFIG_DEBUG_SLAB=y on older kernels built with the SLAB allocator
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_KASAN=y

# Use GDB watchpoints
(gdb) watch *(int*)<address>   # Substitute the address being corrupted

Scenario 5: Performance Issue

# Profile with perf
sudo perf record -a -g sleep 10
sudo perf report

# Check for lock contention
sudo perf lock record
sudo perf lock report

# Check last-level cache misses (a rough proxy for memory traffic)
sudo perf stat -e LLC-load-misses,LLC-store-misses ./program

9. Debugging Checklist

Before asking for help, verify:

  • dmesg output checked for errors
  • Module builds without warnings (make W=1)
  • Correct kernel version (uname -r matches build)
  • Debug config options enabled
  • Latest kernel log captured
  • Reproduction steps documented

Information to Include in Bug Reports

1. Kernel version: uname -a
2. Module/code version
3. Exact steps to reproduce
4. Expected vs actual behavior
5. Full dmesg output
6. Config options (if relevant)
7. Hardware info (if relevant)
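
Several of these items can be gathered in one step. The sketch below writes the kernel version, the kernel log, and a few debug-related config options to a single file. The function name `collect_bug_report`, the default file name, and the choice of config options are assumptions made for illustration.

```shell
# collect_bug_report [OUTPUT_FILE]
# Collects basic report data into OUTPUT_FILE (default: bug-report.txt).
collect_bug_report() {
    report=${1:-bug-report.txt}
    {
        echo "== Kernel version =="
        uname -a
        echo "== Kernel log =="
        dmesg 2>/dev/null || echo "(dmesg unavailable; rerun as root)"
        echo "== Debug config options =="
        # The config is exposed in /proc only with CONFIG_IKCONFIG_PROC;
        # many distributions ship a copy under /boot instead.
        if [ -r /proc/config.gz ]; then
            gzip -dc /proc/config.gz
        elif [ -r "/boot/config-$(uname -r)" ]; then
            cat "/boot/config-$(uname -r)"
        fi | grep -E '^CONFIG_(DEBUG_INFO|KASAN|PROVE_LOCKING|DEBUG_KMEMLEAK)=' ||
            echo "(none set, or config not readable)"
    } > "$report"
    echo "wrote $report"
}
```

Reproduction steps, expected behavior, and hardware details still have to be written by hand. Review the log before sharing it, since it can contain hostnames and addresses.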

10. Additional Resources

Books

  • Linux Kernel Debugging by Kaiwan N. Billimoria
  • Linux Device Drivers, 3rd Ed - Chapter 4 (Debugging)


Quick Troubleshooting

| Symptom | Likely Cause | Debug Steps |
|---|---|---|
| Instant reboot | Kernel panic early | Boot with earlycon, check serial |
| System hang | Deadlock or infinite loop | Enable lockdep, use NMI watchdog |
| Random crashes | Memory corruption | Enable KASAN, SLUB debug |
| Slow performance | Lock contention | Use perf lock, check /proc/lock_stat |
| Module fails to load | Symbol/version mismatch | Check dmesg, rebuild module |

LKP: (Advanced) Linux Kernel Programming (Spring 2026) - Huaicheng Li