Project 2: Scheduler Profiling

  • Release Date: 2/26/2025
  • Due:
    • Part 1: 3/7/2025 11:59pm
    • Part 2: 3/23/2025 11:59pm
    • Part 3: 3/30/2025 11:59pm

Introduction

The goal of this project is to design a CPU profiling tool as a single kernel module. Once loaded, the module will track statistics for each task running on the system, including total CPU time and the number of times a task was scheduled in/out.

The profiling results will be displayed via the proc file system. This project is structured in multiple parts, but you will continuously update the same kernel module created in Part 1 throughout Project 2.


Part 1: Counting Per-Task Scheduling (15 points)

Part 1.1: Setting Up kprobe (5 points)

Prerequisite: Write a kernel module named perftop. perftop should create a /proc file named perftop. You can reuse part of the proc code in Ex1.

The profiler results should be displayed using the /proc file system. Your first task is to set up a /proc file for the profiler. Next, we will use kprobe to count how many times cat /proc/perftop is invoked.

Tasks

  • Understand the Kprobe API.
  • Set up a kprobe that triggers every time cat /proc/perftop is executed.
  • The event handler should increment a counter.
  • The counter value should be displayed when running cat /proc/perftop.
  • Take a screenshot of all three invocations in the same terminal window.

Part 1.2: Counting Task Scheduling by PID (10 points)

Now, we will count the number of times each PID has been scheduled.

Tasks

  • Set up a hash table with PIDs as keys and scheduling counts as values.
  • Attach a kprobe hook to pick_next_task_fair().
  • Use the x86_64 calling convention to retrieve the task_struct pointer.
  • Extract the PID from task_struct.
  • Update the hash table:
    • If the PID exists, increment its count.
    • Otherwise, add a new entry with an initial count of 1.
  • Modify the proc file’s open() function to display PIDs and their scheduling counts.
  • Take a screenshot of all three invocations in the same terminal window.

Submission

Submit ${YourPID}-lkp25-p2-part1.tar.gz on Canvas.

  • Source code: .c, .h, Makefile
  • A screenshot of the output (p2-part1-1.jpg, p2-part1-2.jpg, …).

Part 2: Tracking the 20 Most Scheduled Kernel Call Stacks (30 points)

Part 2.1: Storing Kernel Stack Traces (10 points)

Modify your module to track task stack traces instead of PIDs. The hash table key will now be a hash of the stack trace, and the value will store the scheduling count.

Tasks

  • In the kprobe event handler:
    • Capture the stack trace for a kernel task using stack_trace_save().
    • Capture the stack trace for a user task using save_stack_trace_user().
  • Modify the hash table to store stack traces as keys instead of PIDs.
  • Increment the schedule count for each unique stack trace.
  • Modify proc file output to display stack traces and their scheduling counts.
  • Take a screenshot of the output.

Part 2.2: Measuring CPU Time for Each Task (10 points)

Modify the module to track the CPU time spent by each scheduled task.

Tasks

  • Modify the kprobe event handler to measure task execution time:
    • A task gets scheduled out, and a new task gets scheduled in.
    • Calculate the elapsed CPU time for the scheduled-out task.
    • Store cumulative time per task in the hash table.
    • Use rdtsc for time measurement.
  • Modify the proc file output to display stack traces along with their cumulative CPU time.
  • Take a screenshot of the output.

Part 2.3: Using an RB-Tree for the 20 Most Scheduled Tasks (10 points)

Modify the module to maintain an RB-tree that tracks the top 20 most scheduled tasks.

Tasks

  • Store cumulative CPU time as the RB-tree key.
  • Store stack traces as the RB-tree values.
  • When a task is scheduled out:
    • Remove its previous entry from the RB-tree.
    • Insert its updated cumulative time and stack trace.
  • Modify the proc file output to display the top 20 tasks:
    • Print the rank, Jenkins hash of the stack trace, total CPU time, and the stack trace (max depth of 4 frames).

Submission

Submit ${YourPID}-lkp25-p2-part2.tar.gz on Canvas.

  • Source code: .c, .h, Makefile
  • Screenshot of the outputs (p2-part2-1.jpg, p2-part2-2.jpg, …).

Part 3: Printing Function Names for the 20 Most Scheduled Tasks (55 points)

Modify the module to display function names instead of raw return addresses for kernel tasks.

Tasks

  • Update the proc file output to:
    • Print function names instead of raw addresses.
    • Display the top 20 most scheduled tasks with function names.
    • Take a screenshot of the output.

Note: Assume you already have the virtual address for the functions, use %pB to print out the corresponding function name with offset in stack. Check out Documentation/printk-formats.txt.

Don’t over think about it. Part 3 is a low-hanging fruit if part 2 is done correctly (literally tens of lines of code). The 55% points assigned to part 3 are to motivate you to make part 2 work well.

Submission

Submit ${YourPID}-lkp25-p2-part3.tar.gz on Canvas.

  • Source code: .c, .h, Makefile
  • Screenshot of the outputs (p2-part3-1.jpg, p2-part3-2.jpg, …).

Good luck! 🚀


Back to top

LKP: (Advanced) Linux Kernel Programming (Spring 2025) - Huaicheng Li