CS 4604 Homework #6

(the fun homework)

Date Assigned: Nov 13, 2002
Date Due: Nov 22, 2002, in class, before class starts
  1. (10 points) An indulgent (and appears-to-be-worthless) pasttime in the movie industry is to make "connections" among actors and actresses via the movies they have acted in (together). For example, to make a connection from Tom Hanks to Kevin Bacon, it just takes one step (Both of them have acted in "Apollo 13", for instance). But to make a link from Meg Ryan to Kevin Bacon, it takes two steps (they have to be "routed" through Tom Hanks, because they haven't acted in a movie together). So from Ryan, you get to Sleepless-in-Seattle, from there you go to Hanks, then you take Apollo 13 to Bacon. In fact, some UVA students have mined the Internet Movie database to create an "Oracle of Bacon"! Similar procedures exist in the academic world (Erdos number, where people are linked by the common publications they author) etc. Academicians and researchers then proudly flaunt their erdos number! Popular fables in India claim that with "six intermediaries", one can reach anybody else in the world! (So Kings' "power" used to be measured with this number; myth has it that nobody reached 6, because the gods ensured that no king became so powerful :-)) A full blown article on this appeared in the Dec 1998 issue of DISCOVER magazine.

    Anyways, back to CS 4604. Not to be outdone, we intend to use database technology to identify connections among people. Assume that you are given the following tuples in a two-column relation called Acts(star,movie):


    (Uma Thurman, Mad Dog and Glory)
    (Robert De Niro, Cape Fear)
    (Woody Harrelson, Natural Born Killers)
    (Dan Aykroyd, The Actor)
    (Judge Reinhold, Beverly Hills Cop)
    (Jamie Lee Curtis, True Lies)
    (Woody Harrelson, Palmetto)
    (Nick Nolte, 48 Hours)
    (James Russo, Beverly Hills Cop)
    (Eddie Murphy, 48 Hours)
    (Jamie Lee Curtis, Perfect)
    (Tommy Lee Jones, Natural Born Killers)
    (Dan Aykroyd, North)
    (Tara King, 48 Hours)
    (Robert De Niro, Sleepers)
    (Nick Nolte, Cape Fear)
    (Juliette Lewis, Natural Born Killers)
    (Jessica Lange, Cape Fear)
    (Eddie Murphy, Beverly Hills Cop)
    (Robert De Niro, Mad Dog and Glory)
    (Robert Downey Jr., Natural Born Killers)
    (Eddie Murphy, Trading Places)
    (Juliette Lewis, Cape Fear)

    Your task is to make a connection between Woody Harrelson and Eddie Murphy, using *ONLY* the data shown above. Of course, you realize that we are not really worried about how these two are related (in the totality of real-life data), but to see if you can effect the connection via an SQL query. So, use the diagram below; your goal is to fill out the missing two star names and three movie names. To do that, you need to write an SQL query and run it only on the data shown above (again, we are not interested in "complete and real" data, only the results on the data shown above).

    For full credit, submit your SQL query and the result that you get. To do this exercise, you need to actually sit in front of the computer, load the data, write the query and report on the answer (HINT: There is only one solution).

    FAQ: I found a "shorter path" between these two actors!!! Is that okay?

    Answer: No, that is not okay. Pl. realize that our goal is not to mine the relationships among actors but to learn some database query writing. Your answer should contain exactly three movies and two actors.

    FAQ: I found the answer, just by looking at it. Is that okay?

    Answer: [Sigh..] No, again. We need the SQL query.

  2. (20 points; from the Spring 2001 final) This exercise is to help you think out-of-the-box on the use of database programming for solving everyday problems. You are given the cryptarithmetic puzzle:
    
       SEND
     + MORE
    -------
      MONEY
    -------
    
    The goal of the puzzle is to substitute numbers (from zero to 9) for letters, so that the addition works out. The same number should be used for a given letter, throughout. So, if you guess "5" for the letter E, then E should get the value "5" everywhere. Explain how you will solve this puzzle by creating some database tables and writing a query. For full credit, show (i) what your tables look like (actually, only one table is enough), (ii) the SQL query that you write, and (iii) the answer you get (for the puzzle). Write enough to indicate that you know what you are talking about.

  3. (20 points) Consider the following example instance of the AKnowsB relation: it is a veritable list of who-knows-who.
    ------------------------------------
    this person knows | this other person
    ------------------------------------
    (Little Jimmy     | Bill Clinton)
    (Tommy Hanks      | Bill Clinton)
    (Bill Clinton     | Bill Clinton)
    (Tommy Hanks      | Kofi Annan)
    (Little Jimmy     | Kofi Annan)
    (Bill Clinton     | Kofi Annan)
    (Kofi Annan       | Bill Clinton)
    (Tommy Hanks      | Little Jimmy)
    ------------------------------------
    
    Thus, the data means that Little Jimmy knows Bill Clinton, Tommy Hanks knows Bill Clinton and so on. Notice that sometimes people are listed as knowing themselves. For example, Bill Clinton knows Bill Clinton. Sometimes they are not listed like this (like Kofi Annan doesn't seem to know himself).

    A celebrity is defined as somebody who is known by everybody other than himself/herself. So, Bill Clinton is an example of a celebrity since he is known by everybody else (Little Jimmy, Tommy Hanks, and Kofi Annan). Kofi Annan is also a celebrity since he is known by everybody else (Little Jimmy, Tommy Hanks, and Bill Clinton). Never mind that he doesn't know himself (that is not involved in the definition of a celebrity). But Little Jimmy is not a celebrity since he is known only by Tommy Hanks.

    Write a query in SQL (using only constructs given in Sections 6.1 and 6.2 and perhaps VIEWS) to find all the celebrities in a given instance of AKnowsB. Be careful about the condition when some people are listed as knowing themselves and some are not listed as knowing themselves (we don't know in advance who these people are).

    The above is just example data; your query should work for all possible inputs.

    Also, we are only looking at "direct knowings". So, if P1 knows P2 and P2 knows P3, this doesn't mean that P1 knows P3 (unless this is listed separately).

    FAQ: Is there a master list of "all people" that I can use as a reference?

    Answer: Not really. You can take apart the two columns of the AKnowsB relation and union them together to get a master list.

    FAQ: You say that a celebrity is somebody who is known by everybody other than himself (or herself). So, the definition of the celebrity would be different for everybody (for Bill Clinton, it means "known by everybody other than Bill Clinton"; for Little Jimmy, it means "known by everybody other than Little Jimmy", and so on). How can we express this in SQL? Do we have to write different constructs for each person?

    Answer: You don't need to and you shouldn't. There is a way to write the query so that it works for everybody, without hardwiring the names of anybody.


Return Home