Projects:
The effect of kernel prefetching on the file system buffer cache
Designing effective block replacement algorithms to
minimize file system buffer cache misses is a challenging task.
Despite the well-known interactions between prefetching and caching,
almost all buffer cache replacement algorithms have been proposed and
studied comparatively without taking into account file system
prefetching, which exists in all modern operating systems. In this
project we studied the effect of such kernel prefetching and showed
(SIGMETRICS'05) that it can significantly affect the relative
performance, measured in the number of actual disk I/Os, of many
well-known replacement algorithms: it can not only narrow the
performance gap but also change the relative performance benefits of
different algorithms. The goal of the project is to demonstrate the
importance of taking file system prefetching into consideration in
buffer caching research.
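As a rough illustration of why prefetching matters when comparing cache behavior, the sketch below counts disk I/Os for an LRU cache with and without a simple sequential readahead. It is not the simulator used in the project: the function name `count_disk_ios`, the fixed `prefetch_window`, and the rule that one miss fetches the whole window in a single I/O are assumptions made for this example only.

```python
from collections import OrderedDict


def count_disk_ios(trace, cache_size, prefetch_window=0):
    """Count disk I/Os for an LRU cache over a trace of block numbers.

    Assumption for this sketch: on a miss for block b, blocks
    b+1 .. b+prefetch_window are read in the same disk I/O as b.
    """
    cache = OrderedDict()  # keys are block numbers, ordered from LRU to MRU
    disk_ios = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)  # hit: mark as most recently used
            continue
        disk_ios += 1  # one I/O brings in the block and its readahead window
        readahead = list(range(block + 1, block + prefetch_window + 1))
        # Insert the demanded block last so it ends up most recently used.
        for b in readahead + [block]:
            cache[b] = True
            cache.move_to_end(b)
            while len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used
    return disk_ios


sequential = list(range(8))
looping = [0, 10, 0, 10]
print(count_disk_ios(sequential, cache_size=4))                     # no prefetching
print(count_disk_ios(sequential, cache_size=4, prefetch_window=3))  # with readahead
print(count_disk_ios(looping, cache_size=4))
print(count_disk_ios(looping, cache_size=4, prefetch_window=3))
```

In this toy setup readahead helps the sequential trace but hurts the looping one, because prefetched blocks evict blocks that would have been reused. Real kernel prefetching is adaptive and more involved than this fixed window.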
More information on this project is available at the
AccuSim webpage.
Predicting program behavior using the program counter
Borrowing
from ideas in computer architecture research, we determined
that the program counter can also serve as an indication of program
behavior for the operating system kernel. This insight was leveraged
in two applications: power management of hard disks (IEEE TC'06) and file system
buffer cache management (OSDI'04), with promising results. The
approach is implemented in the Linux kernel.
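The general idea of keying predictions on the program counter can be sketched as follows for the disk power management case. This is only an illustration, not the algorithm from the cited papers or the kernel implementation: the class `PCIdlePredictor`, the "last observed idle period" rule, and the threshold are all assumptions of this example.

```python
class PCIdlePredictor:
    """Toy predictor: remembers, per program counter, the idle period that
    followed the most recent I/O issued from that program counter."""

    def __init__(self, long_idle_threshold):
        # Hypothetical threshold (in seconds) above which spinning the
        # disk down is assumed to be worthwhile.
        self.long_idle_threshold = long_idle_threshold
        self.last_idle_by_pc = {}

    def predict_long_idle(self, pc):
        """Predict whether the I/O issued at `pc` will be followed by a long idle period."""
        return self.last_idle_by_pc.get(pc, 0.0) >= self.long_idle_threshold

    def update(self, pc, observed_idle_seconds):
        """Record the idle period actually observed after an I/O issued at `pc`."""
        self.last_idle_by_pc[pc] = observed_idle_seconds


predictor = PCIdlePredictor(long_idle_threshold=5.0)
save_pc = 0x400A10  # made-up address of an I/O call site
print(predictor.predict_long_idle(save_pc))  # nothing learned yet
predictor.update(save_pc, observed_idle_seconds=12.0)
print(predictor.predict_long_idle(save_pc))  # same call site seen again
```

The point of the sketch is that the same call site tends to be followed by similar behavior, so a small table indexed by program counter can carry history from one execution of that code to the next.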
More information about the project can be found at the
PCOS webpage.
Peer-to-peer resource management
Modern resource sharing
systems comprise thousands of resources, and peer-to-peer (p2p)
approaches can be used to provide resource self-organization in the
presence of failures. We designed two projects based on this
concept. First, we used p2p mechanisms to manage Condor pools
(SC'03, JPDC 66:1). Condor is a distributed system that allows sharing of
resources within an administrative domain. We developed an automatic
collaboration framework that allowed remote pools to discover each
other, and therefore enabled resource sharing across administrative
domains. Second, we applied the p2p approach to harness idle disk
space on nodes within academic and corporate setups (SC'04, JoGC'06). We
developed a distributed file system by extending the Network File
System (NFS), which allows sharing of idle disk space in a transparent
manner.
More information about the project can be found
here.
Ensuring fairness in resource sharing
We observed that in resource sharing
systems some users tend to only utilize resources without contributing
resources to the system. This creates an imbalance and results in the
eventual collapse of the system. We developed a DHT-based
accountability and feedback mechanism that allows users in the
system to determine the "credit-worthiness" of other users (VM'04,
PPoPP'05, SC'05, JoGC'06).
This information can then be used to decide whether or not
to allow exchange of resources with a particular user. The project
solves the practical problem of fairness in sharing by providing a
distributed accountability mechanism. Greedy users can be quickly
identified and isolated, resulting in a robust system.
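A minimal sketch of a credit-worthiness check is shown below. It is not the project's actual mechanism: a plain in-memory dictionary stands in for the DHT, and the names `Ledger`, `min_ratio`, and `grace_units`, as well as the contributed-to-consumed ratio rule, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Account:
    contributed: float = 0.0  # resource units this user has provided to others
    consumed: float = 0.0     # resource units this user has used from others


class Ledger:
    """Toy accountability ledger; a dict stands in for the DHT storage."""

    def __init__(self, min_ratio=0.5, grace_units=10.0):
        self.min_ratio = min_ratio      # hypothetical required contributed/consumed ratio
        self.grace_units = grace_units  # newcomers may consume this much before being judged
        self.accounts = {}

    def record(self, provider, consumer, units):
        """Record that `consumer` used `units` of resources supplied by `provider`."""
        self.accounts.setdefault(provider, Account()).contributed += units
        self.accounts.setdefault(consumer, Account()).consumed += units

    def is_credit_worthy(self, user):
        account = self.accounts.get(user, Account())
        if account.consumed <= self.grace_units:
            return True
        return account.contributed / account.consumed >= self.min_ratio


ledger = Ledger()
ledger.record(provider="alice", consumer="bob", units=20.0)
print(ledger.is_credit_worthy("alice"))  # contributes, has consumed nothing
print(ledger.is_credit_worthy("bob"))    # only consumes, beyond the grace amount
```

A node would consult such a check before agreeing to exchange resources with a given user.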
More information about the project can be found at the
GridCop webpage.
Query caching in peer-to-peer networks
We observed that p2p
query traffic exhibits temporal locality and can benefit from
caching. In the first part of this project, a query caching proxy was
installed at the boundary of an organization and queries originating
from inside the organization were cached. We refer to this as
forward-caching. Next, we cached the queries originating from outside
the organization and forwarded to the inside (WCW9). We refer to this as
reverse-caching. We found that if the cache capacity is fixed in terms
of the number of cached query replies, forward and reverse query
caching are equivalent in both hit ratio and bandwidth
savings. The project provided insight into caching for reducing p2p
traffic (which is now the most prevalent traffic on the Internet), and
improving bandwidth utilization.
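The caching proxy idea can be sketched as a small cache of query replies whose capacity is counted in replies, as in the comparison above. The LRU eviction policy, the class name `QueryReplyCache`, and the `forward` callback are assumptions of this example, not details taken from the project.

```python
from collections import OrderedDict


class QueryReplyCache:
    """Toy proxy cache holding at most `capacity` query replies (LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.replies = OrderedDict()  # query string -> cached reply
        self.hits = 0
        self.lookups = 0

    def lookup(self, query, forward):
        """Return the cached reply, or call `forward(query)` on a miss and cache the result."""
        self.lookups += 1
        if query in self.replies:
            self.hits += 1
            self.replies.move_to_end(query)  # mark as most recently used
            return self.replies[query]
        reply = forward(query)  # miss: the query crosses the organization boundary
        self.replies[query] = reply
        if len(self.replies) > self.capacity:
            self.replies.popitem(last=False)  # evict the least recently used reply
        return reply

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


forwarded = []


def forward(query):
    forwarded.append(query)  # stands in for sending the query over the network
    return "reply:" + query


cache = QueryReplyCache(capacity=2)
for q in ["a", "b", "a", "c", "b"]:
    cache.lookup(q, forward)
print(cache.hit_ratio(), forwarded)
```

Every hit is a query that does not have to be forwarded, which is where the bandwidth savings come from when the query stream has temporal locality.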
More information about the project can be found
here.
Designing computational grid portals
Computational
grids provide computing power by sharing resources across
administrative domains. This sharing, coupled with the need to execute
untrusted code from arbitrary users, introduces security hazards. Grid
environments are built on top of platforms that control access to
resources within a single administrative domain, at the granularity of
a user. In wide-area multi-domain grid environments, the overhead of
maintaining user accounts is prohibitive, and securing access to
resources via user accountability is impractical. Typically, these
issues are handled by implementing checks that guarantee the safety of
applications, so that they can run in shared user accounts. We showed
(JPDC 63:10, IPDPS'02) that safety checks -- language-based,
compile-time, link-time or load-time -- currently implemented in most
grid environments are either inadequate or limit the allowed grid users
and applications. We surveyed various grid systems to
highlight the problems and limitations of current grid
environments, and proposed a runtime process monitoring
technique. The approach allows setting up an execution environment that
supports the full legitimate use allowed by the security policy of a
shared resource.