Project 4: Research Project (5xxx)
- Release Date: 2/26/25
Due dates
- Project proposal: 3/7/25
- Midterm report: 4/8/25
- Final report and presentation: 5/16/25
Introduction
The goal of this final research project is to further explore Linux kernel research that interests you. You should choose one of the research topics listed below. You are expected to understand the current design in the Linux kernel and alternative designs, and to propose a new design. Ideally, the final report could be submitted to a top-tier systems conference or workshop (e.g., SOSP/OSDI, ATC, HotStorage, APSys) after some extension. Since time is limited, this goal is hard to reach; students who aim high will be rewarded even if they do not completely succeed. However, you should make sure that some aspects of your work are completely done. This is a solo project. A project MUST involve one or more kernel components.
Project Topics
You are supposed to choose one of the following research topics. Ideally, the topic will be related to your current research interests. However, if there is an overlap between this course project and your other ongoing projects, you MUST disclose the overlap at the proposal stage and get written approval (email) from the instructor.
The project and its report should cover the following four aspects:
- Analysis: Analyze an existing approach in Linux Kernel and compare it with other approaches (mostly in research papers)
- Design: Design a novel extension of a Linux kernel component for a new problem, task, or application. Although you are supposed to extend a Linux kernel component, it is okay to design and implement in userspace if you are using kernel extension features, such as userfaultfd, fuse, and eBPF.
- Implementation: The proposed design should be implemented and the tarball of the implementation should be submitted.
- Evaluation: Experimentally compare your approach against the stock Linux kernel approach and other existing approaches.
Topic 1. Memory Management
Memory now accounts for a significant portion of server cost, so making the best use of memory has become critical in Linux kernel design. In particular, MGLRU, DAMON, and memory tiering (using NUMA, NVM, or CXL-attached remote memory) have been actively discussed in the Linux kernel community. In this research topic, you are supposed to analyze the recently proposed advanced memory management techniques in Linux and extend them. Here are some starting points:
- MGLRU, https://docs.kernel.org/admin-guide/mm/multigen_lru.html
- Hot page sampling, https://www.kernel.org/doc/html/v5.17/vm/damon/index.html
- Linux NUMA Balancing and Tiering, NUMA_BALANCING_MEMORY_TIERING, Patches, Presentation
- Research papers:
Topic 2. Scheduling
Making the best use of the CPU is one of the primary responsibilities of an OS. While CFS/EEVDF has been fairly successful for a while, there is still room for improvement and customization. In this research topic, you are supposed to analyze the recently proposed advanced scheduling techniques in Linux (e.g., sched_ext) and extend them. Here are some starting points:
- Linux Extensible Scheduler Class
- OS Scheduling with Nest: Keeping Tasks Close Together on Warm Cores
- The Benefits and Limitations of User Interrupts for Preemptive Userspace Scheduling
- Developing Process Scheduling Policies in User Space with Common OS Features
- Enoki: High Velocity Linux Kernel Scheduler Development
- Achieving Microsecond-Scale Tail Latency Efficiently with Approximate Optimal Scheduling
- Syrup: User-Defined Scheduling Across the Stack
- Caladan: Mitigating Interference at Microsecond Timescales
- SFS: Smart OS Scheduling for Serverless Functions
Topic 3. Storage management
Over the past decade, storage devices have shifted almost completely from HDDs to SSDs. Today’s SSDs provide very high bandwidth and extremely low latency. As storage devices become faster, the software overhead of the Linux kernel accounts for a larger share of I/O time. io_uring has been proposed to address the Linux kernel overhead in I/O by adopting a FlexSC-like system call approach. Moreover, new SSD standards, such as Flexible Data Placement (FDP) and Zoned Namespace SSD (ZNS), have recently been proposed, promising higher performance at a lower price. Thus, a lot of work has been going on to better support FDP/ZNS SSDs. In this research topic, you are supposed to analyze the recently proposed advanced storage management techniques in Linux (e.g., io_uring and FDP/ZNS-optimized filesystems) and extend them. Here are some starting points:
- Towards Efficient Flash Caches with Emerging NVMe Flexible Data Placement SSDs, EuroSys’25
- Large block sizes (LBS)
- Ringing in a new asynchronous I/O API
- The rapid growth of io_uring
- What’s new with io_uring
- Zonefs: Features Roadmap
- SSDFS: ZNS SSD ready file system with zero GC overhead
- Improving data placement for Zoned Linux File systems
Topic 4. Safe Customization of Linux Kernel with eBPF
The promise of a general-purpose operating system, like the Linux kernel, is getting harder to achieve. If one size does not fit all, the kernel should provide a safe mechanism – eBPF (extended Berkeley Packet Filter) – to specialize its behavior (e.g., policies) for a certain class of workloads. Already many advanced kernel features, such as MGLRU and ghOSt, have started adopting eBPF and allow users to provide their own policies. In this research topic, you are supposed to write (or extend) one useful eBPF extension (not a toy example). The following research papers will be a starting point for your exploration:
- eBPF, https://ebpf.io/
- DINT: Fast In-Kernel Distributed Transactions with eBPF
- Fast, Flexible, and Practical Kernel Extensions
- XRP: In-Kernel Storage Functions with eBPF
- eBPF-based FUSE
- Revisiting eBPF Seccomp Filters
- Overview of the BPF networking hooks and user experience in Meta
Topic 5. Make Linux Kernel Secure
A single memory bug (e.g., heap overflow) in the Linux kernel can crash the entire computer or allow an attacker to secretly compromise it. Recently there have been several efforts to make the Linux kernel more secure. These include 1) integrity checks of code pointers (CFI: control flow integrity), 2) address space isolation within a monolithic kernel (ASI), and 3) writing (a part of) the kernel in a memory/type-safe language, Rust. In this research topic, you are supposed to extend one of these approaches (or write useful in-kernel Rust code).
- Linux Kernel Control-Flow Integrity Support
- Address Space Isolation in the Linux Kernel
- Mitigating speculative execution attacks with ASI
- A pair of Rust kernel modules
- Using Rust for kernel development
- Rust for Linux: Status Update
- Linux Rust NVMe Driver Status Update
Project Milestones
Project Proposal (10 points)
You will need to turn in a two-page project proposal. This proposal should clearly contain the following information:
- What is the problem being addressed by the project?
- Why is this problem important?
- What is the end goal of your project?
- How will you solve this problem?
- How will you evaluate your project (e.g., performance)?
- What are deliverables of your project (e.g., kernel code, user library)?
- What is the role of each project member?
Project Mid Report (25 points)
You will need to turn in a five-page interim report on the current status of your research and a tarball of the source code that you are developing. The report should reiterate your proposed goals, explicitly discuss the progress you have made to date on those goals, and give your timeline for accomplishing the rest of them by the end of the semester. The progress report should clearly contain the following information:
- A well-filled-out background and related work section citing the appropriate work from the literature.
- An overview of the development status of your project as related to the goals discussed in the initial proposal. The implementation progress should be explicitly described.
- Any information about whether your original plans have changed and an explanation as to why.
Final Report (50 points)
You will need to turn in an eight-page final report and a tarball of the source code that you developed. The final report should contain the following information:
- The problem, motivation, results, conclusions, and possible future work.
- A re-iteration of your proposed goals, with explicit discussion about what progress you have made to date on those goals.
- An explicit description of your implementation status.
- A discussion of the experimental results that you collected to evaluate your implementation.
- An outline of concrete tasks for future work to expand or improve your implementation.
Presentation (15 points)
You will need to turn in presentation slides and a recorded presentation video (15 minutes). The presentation video must include a demo of your project.
Formatting Guidelines
All reports should be written using the following format guidelines:
- U.S. letter-sized pages
- Two-column format
- Reasonable margins
- 10-point Times Roman or similar type on 12-point leading (single-spaced)
Please use the LaTeX template here, or Overleaf if you prefer.