Project 2: Scheduler Profiling
- Release Date: 2/26/2025
- Due:
- Part 1: 3/7/2025 11:59pm
- Part 2: 3/23/2025 11:59pm
- Part 3: 3/30/2025 11:59pm
Introduction
The goal of this project is to design a CPU profiling tool as a single kernel module. Once loaded, the module will track statistics for each task running on the system, including total CPU time and the number of times a task was scheduled in/out.
The profiling results will be displayed via the proc file system. This project is structured in multiple parts, but you will continuously update the same kernel module created in Part 1 throughout Project 2.
Recommended Background Reading
- Kprobe documentation, Examples
- x86_64 calling conventions
- Stack trace: Source code, Example
- Spinlock: API
- Jenkins hash: API
- Time measurement (
rdtsc): API - Symbol lookup: API
Part 1: Counting Per-Task Scheduling (15 points)
Part 1.1: Setting Up kprobe (5 points)
Prerequisite: Write a kernel module named perftop. perftop should create a /proc file named perftop. You can reuse part of the proc code in Ex1.
The profiler results should be displayed using the /proc file system. Your first task is to set up a /proc file for the profiler. Next, we will use kprobe to count how many times cat /proc/perftop is invoked.
Tasks
- Understand the Kprobe API.
- Set up a
kprobethat triggers every timecat /proc/perftopis executed. - The event handler should increment a counter.
- The counter value should be displayed when running
cat /proc/perftop. - Take a screenshot of all three invocations in the same terminal window.
Part 1.2: Counting Task Scheduling by PID (10 points)
Now, we will count the number of times each PID has been scheduled.
Tasks
- Set up a hash table with PIDs as keys and scheduling counts as values.
- Attach a
kprobehook topick_next_task_fair(). - Use the x86_64 calling convention to retrieve the
task_structpointer. - Extract the PID from
task_struct. - Update the hash table:
- If the PID exists, increment its count.
- Otherwise, add a new entry with an initial count of
1.
- Modify the proc file’s
open()function to display PIDs and their scheduling counts. - Take a screenshot of all three invocations in the same terminal window.
Submission
Submit ${YourPID}-lkp25-p2-part1.tar.gz on Canvas.
- Source code:
.c,.h,Makefile - A screenshot of the output (
p2-part1-1.jpg,p2-part1-2.jpg, …).
Part 2: Tracking the 20 Most Scheduled Kernel Call Stacks (30 points)
Part 2.1: Storing Kernel Stack Traces (10 points)
Modify your module to track task stack traces instead of PIDs. The hash table key will now be a hash of the stack trace, and the value will store the scheduling count.
Tasks
- In the
kprobeevent handler:- Capture the stack trace for a kernel task using
stack_trace_save(). - Capture the stack trace for a user task using
save_stack_trace_user().
- Capture the stack trace for a kernel task using
- Modify the hash table to store stack traces as keys instead of PIDs.
- Increment the schedule count for each unique stack trace.
- Modify proc file output to display stack traces and their scheduling counts.
- Take a screenshot of the output.
Part 2.2: Measuring CPU Time for Each Task (10 points)
Modify the module to track the CPU time spent by each scheduled task.
Tasks
- Modify the
kprobeevent handler to measure task execution time:- A task gets scheduled out, and a new task gets scheduled in.
- Calculate the elapsed CPU time for the scheduled-out task.
- Store cumulative time per task in the hash table.
- Use
rdtscfor time measurement.
- Modify the proc file output to display stack traces along with their cumulative CPU time.
- Take a screenshot of the output.
Part 2.3: Using an RB-Tree for the 20 Most Scheduled Tasks (10 points)
Modify the module to maintain an RB-tree that tracks the top 20 most scheduled tasks.
Tasks
- Store cumulative CPU time as the RB-tree key.
- Store stack traces as the RB-tree values.
- When a task is scheduled out:
- Remove its previous entry from the RB-tree.
- Insert its updated cumulative time and stack trace.
- Modify the proc file output to display the top 20 tasks:
- Print the rank, Jenkins hash of the stack trace, total CPU time, and the stack trace (max depth of 4 frames).
Submission
Submit ${YourPID}-lkp25-p2-part2.tar.gz on Canvas.
- Source code:
.c,.h,Makefile - Screenshot of the outputs (
p2-part2-1.jpg,p2-part2-2.jpg, …).
Part 3: Printing Function Names for the 20 Most Scheduled Tasks (55 points)
Modify the module to display function names instead of raw return addresses for kernel tasks.
Tasks
- Update the proc file output to:
- Print function names instead of raw addresses.
- Display the top 20 most scheduled tasks with function names.
- Take a screenshot of the output.
Note: Assume you already have the virtual address for the functions, use %pB to print out the corresponding function name with offset in stack. Check out Documentation/printk-formats.txt.
Don’t over think about it. Part 3 is a low-hanging fruit if part 2 is done correctly (literally tens of lines of code). The 55% points assigned to part 3 are to motivate you to make part 2 work well.
Submission
Submit ${YourPID}-lkp25-p2-part3.tar.gz on Canvas.
- Source code:
.c,.h,Makefile - Screenshot of the outputs (
p2-part3-1.jpg,p2-part3-2.jpg, …).
Good luck! 🚀