Notes on SAFE Thread-specific, Flow-insensitive, Context-insensitive Points-to analysis B.G. Ryder, January 2006 As in the previous note (points-to-unsafe), we will be coding a flow-insensitive, context-insensitive Andersen-style points-to analysis using an on-the-fly call graph, derived from our analysis results. We will be doing this analysis using a worklist algorithm to drive the entire process and restricting our attention to reference assignments that occur within those methods reachable from the initialization of a specific thread. The difference between this safe analysis and the previously described unsafe one is that we will be accounting for the possible effects of other threads on the points-to relations of objects accessed by the thread in focus in our analysis. The difference shows up on how we process the reference assignment statements affecting the points-to relations as discussed below. Intuitively, we have to keep information about whether or not the object may escape the current thread, in which case we need to use its global points-to set (calculated by a whole program version of the Andersen-style analysis) rather than our computed thread-specific points-to. Details: First we have to describe how to estimate whether or not an object may escape a thread.If we assign an object to a static field, then that object 'escapes' (as do all the objects reachable from it along points-to edges in the final points-to solution). If we assign an object O to a field of an escaping object, then object O (and all objects accessible from it) can escape. Note that any object of Runnable type (or any objects pointed to by that object) may escape. Also, the receiver thread object at the topmost level of execution (and any object it may point to) may escape. Second, we assume that we do a whole-program Andersen-style points-to analysis before starting our thread-specific analysis; we will call this 'global points-to'. Third, we need to describe how the rules for points-to effects of each assignment statement are different when there might be escaping objects. So during our algorithm, objects will be marked as "escaped" as we discover this information. We have to deal with 4 types of reference assignment statements 1. p = new X(); 2. p = r; 3. p = r.f; 4. r.f = p; By using temporaries, we can convert r.f=p.g into one of these forms. Let's discuss the points-to side effects of each of these statements. 1. p = new X(); this creates a new object OX. in the points-to graph, OX corresponds to a node and p corresponds to a node (these nodes need to be created if they are not already in the graph). after this statement, the node for p points to the node for OX. 2. p = r; this causes the node representing p to point to all the nodes that the node representing r points to. 3. p = r.f; this causes a lookup of all the objects currently pointed to by the node representing r to be put into set S. For each object in S, if it is labelled "escaped", then use the global points-to relation. Follow the edge labelled 'f' from that object and make the node representing p point to the target of this edge. If it is no labelled "escaped", then use the thread-specific points-to relation, follow the edge labelled 'f' from that object, and make the node representing p point to the target of this edge. 4. r.f = p; this causes a lookup of all the objects currently pointed to by the node representing r, set R. we also must lookup all the objects pointed to by the node representing p, set P. now we must make each element of R, OR, have an outgoing points-to edge labelled 'f', that points to each element of P, OP. Note: if OR is marked "escaped", then OP (and any object reachable from OP) must also be marked "escaped". Recall that when there is a call involving reference arguments, then pseudo assignments are setup between each reference argument and formal parameter in the target method. These are then processed the same as described above.