Here are some more clarifications on Project 1.
How to deal with analyzing the libraries?
To ease the confusion over the different answers that have been
given to the question "how to deal with libraries?", let's decide
to do an incomplete RTA in project 1, and not to analyze the
libraries. If you execute SOOT without the option "analyze-context"
then this is what occurs because the library code is not made
available to your analysis by SOOT. If you do not analyze libraries
then the call graph output will be considerably smaller (we will see
how much smaller in project 2), and thus much easier to grade.
What are the SOOT options?
Type java soot.Main to see all the possible options for running SOOT.
How to deal with constructors?
Constructors will appear in the Jimple as functions with the name
<init>. To get this function name into XML you cannot use
the angle brackets because they are interpreted by XML as
metacharacters with their own meaning. Instead you should put the
characters &lt; (ampersand, l, t, semicolon) into the println string argument
because an XML parser will read them back as "<". Similarly, you should
put the characters &gt; into the println string argument for ">".
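As a minimal sketch of the escaping described above, the helper below replaces the XML metacharacters before a name is printed. The class name XmlNames and the method escape are made up for this illustration; they are not part of SOOT or of the project's required output format.

```java
class XmlNames {
    // Replace XML metacharacters with their entity references.
    // '&' is handled first so the entities produced by the later
    // replacements are not escaped a second time.
    static String escape(String name) {
        return name.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(escape("<init>"));    // prints &lt;init&gt;
        System.out.println(escape("<clinit>"));  // prints &lt;clinit&gt;
    }
}
```

Running every method name through one helper like this means the special names never have to be treated as separate cases in the printing code.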
How to deal with native code?
You cannot analyze native code with your RTA algorithm. In an actual
compiler either stubs are created for native code with the appropriate
behaviors modelled, or worst case assumptions about the dataflow being
computed have to be made. We are going to ignore native code.
How to deal with class initializers?
A static class initializer is code that runs when a class is touched for
the first time (e.g., one of its methods is called or one of its fields is
referenced). This initializer represents an implicit entry point into the
Java code (besides your main()). It appears with the name
<clinit> in the Jimple and must be added to the call graph
for the same reason we add the constructors, that is, other user methods may
be called by these methods. Note: when you add these functions to your
call graph, the graph may become disconnected, because these functions
are not necessarily reachable from main().
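To make the implicit entry point concrete, here is a small illustrative program. All names in it (ClinitDemo, Config, computeLimit) are invented for this sketch. The method computeLimit() is never called from main() directly; it is called only from the static initializer of Config, which the compiler turns into the method <clinit>.

```java
class ClinitDemo {
    // Records the order in which code runs.
    static final StringBuilder log = new StringBuilder();

    static class Config {
        static int limit;

        // This block is compiled into Config.<clinit>.
        static {
            limit = computeLimit();  // user method reached only via <clinit>
        }

        static int computeLimit() {
            log.append("computeLimit ");
            return 10;
        }
    }

    public static void main(String[] args) {
        log.append("main ");
        int x = Config.limit;  // first touch of Config triggers Config.<clinit>
        System.out.println(log.toString().trim() + " -> " + x);
        // prints: main computeLimit -> 10
    }
}
```

There is no call statement in main() that names <clinit> or computeLimit(), so a call graph built only from explicit call sites would not connect main() to computeLimit(), which is why the graph can end up disconnected.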
Why can't I reproduce the RTA answer on the in-class example?
The functions used to inquire about Jimple calls in SOOT actually also sometimes optimize the method calls. The SOOT system is able to tell if only one object reaches a call site (discovered as part of the translation process of bytecode to Jimple) and therefore the system can resolve that call site more precisely than we expect from RTA. This happens for the RTA example in the class notes where we had two functions f() and g() each of which contained calls to foo(). The call in g() only calls C.foo() according to the Jimple functions although by RTA (as explained in class) the call could also resolve to B.foo(). We will just have to live with this anomaly.
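For comparison, the resolution that plain RTA performs at a virtual call site can be sketched as follows. This is only an illustration under assumptions: the hierarchy (B and C both extend A, and each declares foo()) is hypothetical and is not claimed to match the class notes, the string-based tables stand in for whatever SOOT data structures you use, and interfaces are left out to keep the sketch short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RtaSketch {
    // child class -> parent class (hypothetical hierarchy)
    static final Map<String, String> parent = new HashMap<String, String>();
    // class -> names of the methods it declares
    static final Map<String, Set<String>> declares = new HashMap<String, Set<String>>();

    static {
        parent.put("B", "A");
        parent.put("C", "A");
        declares.put("A", new HashSet<String>(Arrays.asList("foo")));
        declares.put("B", new HashSet<String>(Arrays.asList("foo")));
        declares.put("C", new HashSet<String>(Arrays.asList("foo")));
    }

    // True if cls is ancestor itself or inherits from it.
    static boolean isSubtype(String cls, String ancestor) {
        for (String c = cls; c != null; c = parent.get(c)) {
            if (c.equals(ancestor)) return true;
        }
        return false;
    }

    // Walk up from cls to the nearest class that declares the method.
    static String dispatch(String cls, String method) {
        for (String c = cls; c != null; c = parent.get(c)) {
            Set<String> m = declares.get(c);
            if (m != null && m.contains(method)) return c + "." + method;
        }
        return null;
    }

    // RTA: only instantiated subtypes of the declared receiver type count.
    static Set<String> resolve(String declaredType, String method,
                               Set<String> instantiated) {
        Set<String> targets = new TreeSet<String>();
        for (String cls : instantiated) {
            if (isSubtype(cls, declaredType)) {
                String target = dispatch(cls, method);
                if (target != null) targets.add(target);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Set<String> both = new HashSet<String>(Arrays.asList("B", "C"));
        System.out.println(resolve("A", "foo", both));  // prints [B.foo, C.foo]
        Set<String> onlyC = new HashSet<String>(Arrays.asList("C"));
        System.out.println(resolve("A", "foo", onlyC)); // prints [C.foo]
    }
}
```

With both B and C instantiated somewhere in the program, RTA reports both targets for a call through a receiver of type A, even when a more precise analysis can show that only a C object reaches that particular call site.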
What do I include in my documentation?
If you make any
assumptions while doing the project that are not covered by these
clarifications, you should include them in your project
documentation.
You will receive an email about how to obtain the Java Spec data that you need to run your RTA analysis on.
To clarify what is meant by the March 5th note about libraries: if a class extends a library class, then you will, of course, have to include that library class in your call graph. However, if a library method is called from a user method, and there is a call chain of some length into the library, you should not report those calls in your call graph. This is simply to make the grading of the assignment easier to accomplish.
You will need to include the following in your tar file for project 1:
Please include only user code in the call graphs you are determining with your Project 1 implementation. This means you can ignore library functions, constructors (some of which are called implicitly), class initializers, etc. (Note: a true call graph constructor could not ignore these functions.) Remember that you need to consider Java interfaces as well as classes in your RTA algorithm. The original RTA algorithm was developed for C++ and therefore did not mention interfaces.
The XML format for your output can be described by two files: 1.xml gives an example of an output created from a small Java program, and callgraphformat1.xsd defines the rules that govern the tags in this XML file. For example, these rules forbid parallel edges in the resulting call graph. The output in 1.xml can be seen as a tagged listing of methods, showing for each method the other methods it can call. Notice that each method has a unique ID assigned by your program. (This XML schema was written with the help of the program XMLSPY, which is available for trial running online.) These files should be tested and ready for class use on or before March 6, 2003.
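One possible way to satisfy the two constraints mentioned above (a unique ID per method, no parallel edges) is to keep the callees of each method in a set. The sketch below is only an illustration: the class CallGraphSketch and its methods are invented names, the ID numbering scheme is an assumption, and the code deliberately prints no XML because the tag names are defined by callgraphformat1.xsd, not here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CallGraphSketch {
    // method signature -> unique ID, assigned in order of first appearance
    private final Map<String, Integer> ids = new LinkedHashMap<String, Integer>();
    // caller ID -> set of callee IDs; a set cannot hold duplicates,
    // so parallel edges are impossible by construction
    private final Map<Integer, Set<Integer>> edges = new TreeMap<Integer, Set<Integer>>();

    // Return the ID for a method, assigning a fresh one on first use.
    int idOf(String signature) {
        Integer id = ids.get(signature);
        if (id == null) {
            id = ids.size() + 1;
            ids.put(signature, id);
            edges.put(id, new TreeSet<Integer>());
        }
        return id;
    }

    // Add an edge; returns false if the same edge was already present.
    boolean addEdge(String caller, String callee) {
        int from = idOf(caller);
        int to = idOf(callee);
        return edges.get(from).add(to);
    }

    // Total number of distinct edges in the graph.
    int edgeCount() {
        int n = 0;
        for (Set<Integer> callees : edges.values()) n += callees.size();
        return n;
    }

    public static void main(String[] args) {
        CallGraphSketch g = new CallGraphSketch();
        g.addEdge("Main.main", "B.foo");
        g.addEdge("Main.main", "B.foo");   // duplicate, ignored
        System.out.println(g.edgeCount()); // prints 1
    }
}
```

Because the duplicate check happens when the edge is added, the printing code can simply walk the map and never needs to filter repeated edges itself.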
Click here for clarification of our RTA plus inheritance discussion in class.