CS 4604 Course Project: Step 1
General Description
The goal of the class project is to implement a database system
application. This includes the following activities that
will be spread over the entire semester:
- Identifying an application area for which database systems would
prove beneficial,
- Determining the functionalities of the application,
- Data Modeling (Identifying the classes, objects, roles, relationships,
constraints etc.),
- Designing and perfecting the database schema,
- Populating the database and
- (most importantly) Writing the code needed to embed the database system in
the application
Projects should be done in groups of 2-5 students. No more than 5 members are
allowed per project, although fewer than 2 is acceptable. By the end of the
semester, every project is required to have a web-based interface (that will
also make the projects cooler). That is also part of the reason why you will do this project as a group:
to enable cute systems. :-) You are free to choose your own project members;
if you would like the instructor to assign you to a group, meet him during
his office hours.
Project Ideas
These are just a sampler. You are free to propose your own ideas.
Also realize that these are not complete descriptions; just themes. You
need to work on these more and develop your ideas more concretely.
Furthermore, don't
get intimidated by the examples that are linked from this web page. They
are there just to give you a feel for the application domain. It is up to you
to narrowly define the scope of the application within the time-frame
of a semester-long project. Also,
don't forget that you are supposed to have fun! :-)
- Nobel Awards Database: There actually exists one that is
web-enabled here.
The goal is to model and populate information about the awards made in
the various fields (Physics, Chemistry, Physiology or Medicine, Literature,
Peace and the Economic Sciences), the recipients, their countries,
their year of birth etc. Your system should be able to answer questions
such as "When was the first time an Asian won an award for the economic
sciences?" and such cute things (the answer to this particular question is
1998). You could also work on variants of this idea; such as the
recipients of the ACM awards (unfortunately, there is not too much information online about this). Interesting
queries then could be "Name people who have won at least two different
awards" (the answer would include Knuth, Thompson, Ritchie, Engelbart etc.)
Or the people "who were ACM Fellows before becoming Turing Award Winners"
and so on.
- Books Database: This is yet another popular domain. You need
only look at barnesandnoble.com or
amazon.com for
excellent examples. You could model entities such as books, their authors, and topics (which may form a complex hierarchy). You may also model various attributes of
the authors, the institutions they belong to, etc. You can support a buy/sell service for used books or for books used in specific university courses. A personal
profile of people (and the books they like) can be built and your
database application could form the basis for a "recommender system",
such as those supported by the commercial sites. The goal here is to
"cluster" similar preferences together and the system can then make
recommendations: "Since you liked Pride and Prejudice, I recommend
that you try Sense and Sensibility, too".
- Movies Database: There are several beautiful movie
resources on the web, such as the
hollywood.com movies site
or the Internet Movie Database.
You could model entities such as movies, their actors,
directors, genres, playing times, reviews. There are several sources
on the web
from which you could get data to populate such a database. You can support
various queries such as finding specific playing times, finding movies in
Blacksburg
directed by a given director. You can also support updates to the reviews
section of the database (e.g., viewers giving their own opinions). Another
functionality is to provide personal profiles of people (i.e., the movies they
like) and then try to recommend movies to them based on profiles of viewers
with similar tastes. Or you could move on to Oscar awards data or
Golden Globe nominations etc. ("Find all the sitcoms that have been nominated
three times in a row").
- Apartment Homes: Our friendly neighborhood web guide is
here. This domain would require modeling apartments and their attributes, areas of town and their various characteristics (e.g., BT bus lines, crime rate, distance from various
landmarks). You would provide an interface for offering apartments for rent, finding apartments based on various requirements ("gas heating + pets allowed +
rent less than 500 + close to campus + BEV modem facility").
- Research Literature: This domain involves modeling
research publications. You need to identify the title of the publication,
the forum it was published in, the authors, topics, keywords and related
subtopic areas. This is a big business now (under the name of digital
libraries). For example, the ACM digital
library provides a beautiful searchable index (and retrievable
repository, but that is beyond our scope) of nearly all of the
publications of ACM (isn't that amazing?). If you
use this domain, then there are a lot of available resources for you
to use. The ACM computing classification system provides a convenient hierarchical meta-index
that you can use to organize your class hierarchy etc. If you
are interested in a smaller domain, then the DBLP Bibliography Site
provides a searchable facility for publications related to the database
and programming communities.
At the end of the day, you could identify papers written by a particular
person at a particular place or ones in a narrowly defined area.
- Web Sites: How do you think web search engines such as
Google
model their
domain? You could think of them as a glorified database system where the
basic entities modeled are web sites. You could then model their
various properties: Topic, URL, domain name, other sites they link to,
color of their background etc. Retrieval could be for sites that
have similar characteristics and properties.
- Others: Of course, there are a whole host of other
ideas such as bank accounts, student records, NBA data, election
results, Florida ballots, Chad-databases,
senate demographics, car rentals, auto insurance,
calling plans (sifting through the confusing 10-10-220 and all that stuff),
consumer products, courses at Virginia Tech, Hokie statistics,
"match-making services" (:-)) and so on.
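To make one of these themes more concrete, here is a minimal sketch of how the awards query "Name people who have won at least two different awards" might look once a database exists. Everything in it is an assumption made for illustration: the table names (person, award_win), the column names, and the placeholder rows ("Person A", "Award X", and so on) are hypothetical and are not real award data. It uses Python's built-in sqlite3 module only because it needs no setup; your project may use a different database system.

```python
import sqlite3


def winners_of_multiple_awards(conn, min_awards=2):
    """Return names of people who hold at least `min_awards` distinct awards."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM person AS p
        JOIN award_win AS w ON w.person_id = p.id
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT w.award) >= ?
        ORDER BY p.name
        """,
        (min_awards,),
    ).fetchall()
    return [name for (name,) in rows]


def build_demo_db():
    """Create an in-memory database with a hypothetical schema and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE award_win (
            person_id INTEGER NOT NULL REFERENCES person(id),
            award     TEXT NOT NULL,
            year      INTEGER NOT NULL
        );
        """
    )
    # Placeholder people, not real award recipients.
    conn.executemany(
        "INSERT INTO person (id, name) VALUES (?, ?)",
        [(1, "Person A"), (2, "Person B"), (3, "Person C")],
    )
    conn.executemany(
        "INSERT INTO award_win (person_id, award, year) VALUES (?, ?, ?)",
        [
            (1, "Award X", 1990),
            (1, "Award Y", 1995),  # Person A holds two different awards
            (2, "Award X", 1991),
            (2, "Award X", 1999),  # same award twice: only one distinct award
            (3, "Award Y", 2000),
        ],
    )
    return conn


if __name__ == "__main__":
    print(winners_of_multiple_awards(build_demo_db()))
```

Counting distinct award names, rather than rows, is what separates "two different awards" from "the same award twice". Which reading you want is exactly the kind of decision your proposal should spell out.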
Deliverables
Assigned: 08/30/2002
Due Date: 09/06/2002, in class, beginning of class
- Form a team first and decide on a good application. It is
preferable that every group focuses on a different application.
To avoid conflicts, I recommend that you send an email to the TA
(fmin@vt.edu)
about your application; he will maintain a dynamic list of
team members and their respective domains at this web page.
Before
sending him an email, make sure that your application is not
already taken!
- (100 points) Write a two page project proposal/description in the following
format: (i) Name of the project and list of team members,
(ii) What is the domain, (iii) What are the application specifications (i.e.,
what functionality will your completed system provide),
(iv) What will be modeled by your system (and what will not), (v) What is the
role of each project member in the project, (vi) What other "value-added"
facilities could your system support (but that you will not
build explicitly). The goal is for us to mutually agree on a do-able project.
If you have questions, meet the instructor or the TA during their office
hours.
Frequently Asked Questions
- Since this is a preliminary project proposal, how can we be sure
what the role of each project member will be? How detailed should
we get into this in our writeup?
Answer: This is a good question. You don't have to "commit"
to anything now; what we are looking for in this question is
to see if any of the group members brings special talents/experiences
to bear upon the project. For example, if you are building a botanical
database and there is a student from the biology
department in your group, (s)he can help identify bad design choices from
a biology point of view. If one of you has experience in web-based software
development, then that would be a good thing to mention, and so on.
- Can we turn in the proposal as a Microsoft WORD document via
email?
Answer: No, please. I am glad you asked that before sending it
in. I do not use Windows or Macs. I use a UNIX workstation. I cannot read
documents in Word, Powerpoint, or one of those things. Please either give me
a hardcopy or send the document in a neutral/friendly format (such as postscript
or PDF or text or HTML).
- Do we have to draw some nice E/R modeling diagram or
give ODL listing for our application?
All within one week?
Answer: No. We just need an English document that says what you will
do and some details of what the functionality of the final system is.
Believe us, you will have ample opportunity to do E/R, and all that stuff
later on in the semester! :-)
- Can you explain what you mean by "value added facilities"?
Answer: For example, in the books domain, this would mean the
"recommender system" that makes selections of books for potential customers
based on buying trends. In other words, the recommender system is a facility
that will be enabled by the presence of a database system. This question
is intended to get you thinking on a larger scope and out of the box, and to
see if you can explain the potential advantages of databases from a user point
of view. If you are an IS person for a corporate organization, you will
frequently need to "justify" investing extra resources into developing
something like a database/web-system etc. One way to do that is for you
to think of the kinds of applications that a DB can "enable".
- In the part, "what will be modeled (and what will not)", what is
expected?
Answer: For example, in the books domain, you can write something
like "We will model books, authors, publishers, and printers, but not
reviews, bookstores, and sales figures". This is just to be clear on what
the final outcome will be.