Program Testing

Problem

Develop an effective suite of test cases for a program that inputs a "sentence" and outputs a diagram of the sentence. A diagram of a sentence is a textual representation of the grammatical structure of the sentence, including parts such as the subject, verb, and object. A simple grammar is used by the program. This grammar describes an infinite number of simple sentences using a very limited vocabulary and a restricted set of grammatical constructions. The effectiveness of the suite of test cases is judged by two criteria:

While not an explicit criterion, a "good" suite of test cases is also minimal, meaning that valid, but redundant, test cases are not present.

Grammar

The extended BNF grammar for simple English sentences used by the sentence diagramming program is as follows:

<sentence>    --> <subject> <verb_phrase> <object>
<subject>     --> <noun_phrase>
<verb_phrase> --> <verb> | <verb> <adv>
<object>      --> <noun_phrase>
<verb>        --> lifted | saw | found
<adv>         --> quickly | carefully | brilliantly
<noun_phrase> --> [<adj_phrase>] <noun> [<prep_phrase>]
<noun>        --> cow | alice | book
<adj_phrase>  --> <adj> | <adj> <adj_phrase>
<adj>         --> green | lean | mean
<prep_phrase> --> <prep> <noun_phrase>
<prep>        --> of | at | with

This grammar can generate an infinite number of sentences. One sample is:

    mean cow saw carefully green alice with book

(For simplicity, we ignore articles, punctuation, and capitalization, including Alice's name or the first word of the sentence.)
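
To get a feel for the variety of sentences the grammar allows, it can help to generate a few at random. The following Python sketch is only an illustration and is not part of the assignment: the function names, the 50% probabilities, the cap of two adjectives, and the depth limit are all arbitrary assumptions chosen so that generation terminates.

```python
import random

# Vocabulary copied from the grammar above.
VERBS = ["lifted", "saw", "found"]
ADVS = ["quickly", "carefully", "brilliantly"]
NOUNS = ["cow", "alice", "book"]
ADJS = ["green", "lean", "mean"]
PREPS = ["of", "at", "with"]


def noun_phrase(rng, depth):
    """<noun_phrase> --> [<adj_phrase>] <noun> [<prep_phrase>]"""
    words = []
    # <adj_phrase> is one or more adjectives; this sketch picks 0, 1, or 2.
    for _ in range(rng.randint(0, 2)):
        words.append(rng.choice(ADJS))
    words.append(rng.choice(NOUNS))
    # Recurse into <prep_phrase> only while depth remains (assumed limit).
    if depth > 0 and rng.random() < 0.5:
        words.append(rng.choice(PREPS))
        words.extend(noun_phrase(rng, depth - 1))
    return words


def sentence(rng, depth=2):
    """<sentence> --> <subject> <verb_phrase> <object>"""
    words = noun_phrase(rng, depth)      # <subject>
    words.append(rng.choice(VERBS))      # <verb>
    if rng.random() < 0.5:
        words.append(rng.choice(ADVS))   # optional <adv>
    words.extend(noun_phrase(rng, depth))  # <object>
    return " ".join(words)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(sentence(rng))
```

Random sentences are a way to explore the grammar, not a substitute for choosing test cases deliberately.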

Program Input and Output

The sentence diagramming program reads "candidate sentences", one per line, and attempts to interpret the line's contents (a whitespace-separated list of words) as a sentence according to the program's grammar (see above). 

The program produces a "diagrammed" version of the input string, which means a sentence in properly parenthesized form. "Properly parenthesized" means that each nonterminal appearing in the input string now has parentheses around it (omitting all redundant parentheses). For instance, the input string:

    alice found mean green book

would be parenthesized as

    (("alice") ("found") (("mean" "green") "book"))

The parentheses show that "mean" and "green" form an adjective phrase (<adj_phrase>), which together with "book" forms a noun phrase (<noun_phrase>) that serves as the object (<object>). The parentheses also show that "alice" is the subject (<subject>) of the sentence through the sequence of grammar rules (<noun> --> <noun_phrase> --> <subject>). Similarly, "found" is the verb phrase (<verb_phrase>) of the sentence.

A more complicated example is

    mean cow saw carefully green alice with book

which returns

    ((("mean") "cow") ("saw" "carefully") (("green") "alice" ("with" ("book"))))
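
When writing expected output by hand, it is easy to misplace a parenthesis. The Python sketch below shows one way the parenthesized form could be produced, so you can see where each pair of parentheses comes from. It is a guess at the behavior based only on the two examples above, not the actual program: the function names are hypothetical, and it assumes that a run of adjectives is written flat inside one pair of parentheses (as in ("mean" "green")). It returns None for any line it cannot diagram.

```python
VERBS = {"lifted", "saw", "found"}
ADVS = {"quickly", "carefully", "brilliantly"}
NOUNS = {"cow", "alice", "book"}
ADJS = {"green", "lean", "mean"}
PREPS = {"of", "at", "with"}


def _noun_phrase(words, i):
    """Diagram a <noun_phrase> starting at words[i].

    Returns (text, next_index), or None if no noun phrase starts there.
    """
    parts = []
    adjs = []
    while i < len(words) and words[i] in ADJS:
        adjs.append(f'"{words[i]}"')
        i += 1
    if adjs:
        # Assumption: the whole adjective run shares one pair of parentheses.
        parts.append("(" + " ".join(adjs) + ")")
    if i >= len(words) or words[i] not in NOUNS:
        return None
    parts.append(f'"{words[i]}"')
    i += 1
    if i < len(words) and words[i] in PREPS:
        prep = words[i]
        inner = _noun_phrase(words, i + 1)
        if inner is None:
            return None
        text, i = inner
        parts.append(f'("{prep}" {text})')
    return "(" + " ".join(parts) + ")", i


def diagram(line):
    """Return the parenthesized form of line, or None if it does not parse."""
    words = line.split()
    subject = _noun_phrase(words, 0)
    if subject is None:
        return None
    subj_text, i = subject
    if i >= len(words) or words[i] not in VERBS:
        return None
    verb_parts = [f'"{words[i]}"']
    i += 1
    if i < len(words) and words[i] in ADVS:
        verb_parts.append(f'"{words[i]}"')
        i += 1
    obj = _noun_phrase(words, i)
    if obj is None:
        return None
    obj_text, i = obj
    if i != len(words):  # leftover words after the object
        return None
    return f'({subj_text} ({" ".join(verb_parts)}) {obj_text})'


if __name__ == "__main__":
    print(diagram("alice found mean green book"))
```

Treat the output of a sketch like this as a cross-check only; the expected output in your test cases should come from your own reading of the grammar.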

In addition, there are two distinct error conditions. First, if a given string does not consist of valid tokens, the program produces the output:

    Input has invalid tokens.

Second, if the input consists of valid tokens but it is not a legitimate sentence according to the grammar, the program produces the output:

    Input is not a sentence.

Note that the "invalid tokens" message takes priority over the second message; the "not a sentence" message can only be issued if the input string consists entirely of valid tokens.
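
The priority between the two messages can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's real implementation: the names CATEGORY, SENTENCE, and check are hypothetical, and words are assumed to match the vocabulary exactly as spelled in the grammar. The sketch relies on the observation that, for this particular grammar, a noun phrase is always "adjectives, noun, then zero or more (preposition, adjectives, noun) groups", so the shape of a sentence can be checked with a regular expression over one-letter word categories.

```python
import re

# Map each vocabulary word to a one-letter category:
# V = verb, D = adverb, N = noun, A = adjective, P = preposition.
CATEGORY = {}
for letter, vocab in {
    "V": "lifted saw found",
    "D": "quickly carefully brilliantly",
    "N": "cow alice book",
    "A": "green lean mean",
    "P": "of at with",
}.items():
    for word in vocab.split():
        CATEGORY[word] = letter

# <noun_phrase> flattened into categories: A* N (P A* N)*
NP = r"A*N(?:PA*N)*"
# <sentence>: noun phrase, verb, optional adverb, noun phrase.
SENTENCE = re.compile(rf"{NP}VD?{NP}")


def check(line):
    """Return the error message for line, or None if it is a sentence."""
    words = line.split()
    # The token check comes first, so it takes priority.
    if any(word not in CATEGORY for word in words):
        return "Input has invalid tokens."
    shape = "".join(CATEGORY[word] for word in words)
    # Note: an empty line lands in this branch here; the handout does not
    # say what the real program does with one.
    if not SENTENCE.fullmatch(shape):
        return "Input is not a sentence."
    return None


if __name__ == "__main__":
    for example in ["alice found mean green book",
                    "alice found purple book",
                    "alice book found"]:
        print(example, "->", check(example))
```

Notice that an input such as "purple found" is both ungrammatical and contains an unknown word, and the token check still wins.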

Test Case Format

Test cases are written in plain text in a stylized format. Each test case is made up of two main parts: one line of input and a corresponding line of output that a correct program is expected to produce.

The test file format uses "//" at the beginning of a line to identify lines with special meaning. Specifically, any line that starts with the character sequence "//==" denotes the start of a test case. Any text on the remainder of the line serves as a "name" or label for the test case (for the purposes of identification if that test case fails). Lines following this marker are input lines. A later line starting with "//--" marks the end of the series of input lines (for this assignment, each test includes only one input line) and the start of the corresponding output line(s). Any other lines starting with "//" are treated as comments and ignored.

For example, here is a simple test case based on the example above:

//== Testing the alice/book example.  Here is the input:
alice found mean green book
//-- This is the expected output:
(("alice") ("found") (("mean" "green") "book"))

This single test case contains one line of input, followed by one line of output. A test case can have as many (or as few) lines of input, and as many (or as few) lines of output as you would like. For this assignment, however, each test should contain exactly one line of input and one line of output.

Your test case file can have as many test cases as you like. Just place them one after another. Be careful of blank lines, however--the script will treat them as significant lines in the input or output section of some test case. If you wish to separate your test cases within the file, use // comment lines instead, since they will be ignored by the script.
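
If you want to sanity-check your file before submitting, the format rules above can be sketched as a small reader in Python. This is not the script Web-CAT uses; parse_tests is a hypothetical name, and ignoring any lines that appear before the first "//==" marker is an assumption, since the format description does not cover that case.

```python
def parse_tests(text):
    """Split test-file text into a list of test cases.

    Each test case is a dict with keys "name", "input", and "output".
    """
    tests = []
    current = None   # the test case being filled in, if any
    section = None   # "input" or "output"
    for line in text.splitlines():
        if line.startswith("//=="):
            # Start of a new test case; the rest of the line is its name.
            current = {"name": line[4:].strip(), "input": [], "output": []}
            tests.append(current)
            section = "input"
        elif line.startswith("//--"):
            # End of the input lines, start of the expected output.
            if current is not None:
                section = "output"
        elif line.startswith("//"):
            continue  # any other "//" line is a comment
        elif current is not None:
            # Every other line counts, including blank ones.
            current[section].append(line)
    return tests


if __name__ == "__main__":
    sample = "\n".join([
        "//== Testing the alice/book example.  Here is the input:",
        "alice found mean green book",
        "//-- This is the expected output:",
        '(("alice") ("found") (("mean" "green") "book"))',
    ])
    for test in parse_tests(sample):
        # For this assignment each test should have exactly one of each.
        ok = len(test["input"]) == 1 and len(test["output"]) == 1
        print(test["name"], "->", "ok" if ok else "check line counts")
```

A stray blank line shows up here as an extra empty string in the input or output list, which makes it easy to spot.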

Submitting Your Test Cases

Place your test cases in a single text file with a .txt extension. Submit that single text file electronically through Web-CAT for automatic evaluation and feedback. Be sure to include comments at the beginning of the file that identify all of the team members submitting this work.

To submit to Web-CAT, log in using your Virginia Tech PID and PID password here: http://web-cat.cs.vt.edu/login.html. After logging in, you will see your Web-CAT home page. This assignment should be listed there under "Assignments Accepting Submissions". On the right-hand side of the line listing this assignment you will see a submission icon. If you hover your mouse over this action icon, the tooltip says "submit to this assignment." Click this icon to go to the upload page for this assignment, and submit your file. Follow the prompts, and you will see the results of running your tests in your browser.

After making your first submission, you'll notice that Web-CAT will show a summary of the results from running your tests, indicating which (if any) have incorrect expected output. Fix those test cases. Web-CAT will also give you a measure of how thoroughly your test set covers all of the possibilities for a solution to this sentence diagramming problem. If your test coverage is less than 100%, that means there are some conditions in this problem for which you have not yet written tests. Look back at the grammar and think about what kinds of sentences--or what kinds of errors--you may have overlooked. Try to be systematic in covering all the possibilities. Don't be discouraged if it gets harder and harder to think of useful test cases: the first 85% is easy, but it gets progressively harder and harder as coverage gets close to 100%.
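
One simple way to be systematic is to keep a checklist of which vocabulary words your test inputs have used so far. The Python sketch below does only that. It is not Web-CAT's coverage measure, and unused_words is a hypothetical helper name; whether every word really needs its own test depends on how the program under test is written, so treat the result as a prompt for ideas rather than a target.

```python
# All words that appear in the grammar.
VOCABULARY = {
    "lifted", "saw", "found",
    "quickly", "carefully", "brilliantly",
    "cow", "alice", "book",
    "green", "lean", "mean",
    "of", "at", "with",
}


def unused_words(inputs):
    """Return the vocabulary words that appear in none of the input lines."""
    seen = set()
    for line in inputs:
        seen.update(line.split())
    return sorted(VOCABULARY - seen)


if __name__ == "__main__":
    inputs = [
        "alice found mean green book",
        "mean cow saw carefully green alice with book",
    ]
    print(unused_words(inputs))
```

The same idea extends to grammatical constructions (for example, a subject with a prepositional phrase, or a verb without an adverb) and to the two error messages.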