Capturing the Reference Behavior
of Linked Data Structures

Gregory Kulczycki
Murali Sitaraman
Clemson University
gregwk,murali@cs.clemson.edu



William F. Ogden
The Ohio State University
ogden@cis.ohio-state.edu



Joseph E. Hollingsworth
Indiana University Southeast
jholly@ius.edu




Copyright ©2002 by the authors. All rights reserved.1

May 3, 2002

Abstract:

The use of references in programs can complicate reasoning and break modularity. Language designers need to identify the predominant reasons for employing references in high-level languages and offer safer and more manageable alternatives. One area where references are used is in the implementation of linked data structures such as lists and trees. This paper presents a concept--Location_Linking_Template--that captures the behavior and performance benefits of traditional pointers and is especially useful for implementing linked data structures. The concept is used in conjunction with a simple component design methodology to enable programmers to isolate the complex reasoning about reference behavior to a small portion of an overall program while preserving program modularity. Through this concept, the conceptual model of pointers is presented in a new way that students and programmers may find easier to grasp than traditional explanations of pointers.

Introduction

References in high-level languages are useful for implementing software efficiently, but the mechanisms that provide reference behavior in popular programming languages are subject to abuse and routinely break modular reasoning.

A partial solution to this problem involves an efficient mechanism for data movement and parameter passing that does not introduce aliasing, such as swapping [3,17]. Swapping eliminates the need to reason about aliasing for the vast majority of components in a software system, but there are a few situations where the ability to use aliasing is necessary for efficient computing despite the added complexity. In particular, the implementation of linked data structures such as lists, trees, and graphs requires the use of aliasing for efficiency. The Location_Linking_Template introduced in this paper provides an alternative to implementing these data structures with traditional pointers.

Section 2 of this paper describes the problem that has continued to face language designers since references were first introduced into high-level languages. It explains the traditional benefits of references and also discusses why they present problems. Section 3 offers a two part solution to the problem. The first part--providing mechanisms for efficient data movement that do not introduce aliasing--is discussed only briefly and the reader is referred to other papers for an depth explanation. The second part is the main focus of this paper. It involves a general strategy for component construction centered on the Location_Linking_Template, a component that effectively captures the reference behavior of linked data structures. Section 4 gives an informal introduction to the component and section 5 covers its formal specification in some detail. Finally, section 6 gives an example of its use in the implementation of a List component. Appendix A contains the specification for Location_Linking_Template, Appendix B contains the List_Template and its realization, and Appendix C contains a realization for List_Template that presents possible syntactic sugar for the operations in Location_Linking_Template.

Problem

Ever since their introduction into high-level languages over 30 years ago, there has been a steady stream of warnings about the dangers of references. Tony Hoare called their introduction ``a step backward from which we may never recover'' [4] and presented a detailed list of reasons for that claim in his paper on recursive data structures [5]. Around the same time, Dick Kieburtz continued to explain why we should be programming without reference variables [8], and Steve Cook published his seminal paper on the soundness and relative completeness of Hoare logic [2] in which he identified aliasing of arguments to calls (even in a language without reference variables) as the key technical impediment to modular verification. There have been recent papers [13,14] showing how it is technically possible to overcome such problems, but apparently only at the cost of even further complicating the programming model that a language presents to a software engineer.

Benefits of References

Despite these warnings, references continue to be used routinely in today's popular programming languages. Why? One reason may be the notorious claim that ``Any problem in Computer Science can be solved with another level of indirection.''2 This catchy saying has become a rule of thumb for many programmers, and since references represent a prime source of indirection in computer programs, the adage seems to imply that references are the primary mechanism for solving tricky programming problems. But vague claims such as this offer little help in deciding what specific problems references help to solve. Certainly at higher levels of abstraction any so-called indirection in a system will be a product of the system design and would have nothing to do with whether the implementation language supported references.

Specific areas in which high-level imperative languages benefit from references almost always involve efficiency--efficient data movement, efficient parameter passing, and efficient implementations of linked data structures.3 Reference variables provide efficient data movement by allowing a programmer to copy an object's reference rather than its value, since value copying is expensive for non-trivial objects. Call-by-reference parameter passing is more efficient than call-by-value for the same reason. When implementing linked data structures, references allow a programmer to insert or remove objects efficiently from arbitrary locations in the structure.

Problems with References

References tend to be confusing for students to learn and difficult for programmers to use. This is partly why Andrew Koenig and Barbara Moo argue for a rethinking of how C++ is taught in [9]. Koenig's introductory C++ textbook introduces vectors, strings, and structures before even mentioning pointers because ``pointers are a slippery subject to master, and beginners have a much easier time with the value-like classes.''

The difficulty in reasoning about references informally is no doubt related to the difficulty in reasoning about them formally. The problem is not with references themselves, but with the aliasing they can introduce. A simple description of the problem may be found in [6]. Consider the Hoare triplet

{x = true} y := false {x = true}
At first glance it seems trivial to prove that if x is true before the assignment to y, x will still be true after the assignment to y. But this reasoning is sound only if x is not aliased to y. If aliasing is permitted it forces the programmer to consider that modifying a variable y may potentially modify any other variable of the same type. If these other variables are not local to the procedure or even the facility where the modification occurred, the programmer must possess knowledge of the global program state in order to understand the effects of the change. When a programmer must reason globally about a change that occurred locally, modular reasoning becomes impossible.

References are beneficial because they allow us to implement software efficiently, but aliasing can complicate reasoning and break modularity. Is there a way to get the efficiency of references without the reasoning complexity commonly introduced by aliasing? The next section attempts to answer this question.

Position

The section above describes three distinct areas of imperative programming languages that can benefit from the efficiency of references: data movement, parameter passing, and the implementation of linked data structures. Can we get the efficiency of references without the reasoning complexity that aliases introduce in each of these areas? For data movement and parameter passing the answer is yes, because there are mechanisms that can move data efficiently without introducing aliases. In particular, the swapping operation is described in [3] as an efficient alternative to assignment that does not introduce aliasing, and call-by-swapping is described in [17] as an efficient alternative to call-by-reference that does not introduce aliasing.

Unfortunately, the answer to the aliasing question as it applies to implementations of linked data structures is no--we cannot get efficient implementations without complicating reasoning. This is because the efficiency in implementing linked data structures depends on a programmer's ability to work with aliases. The good news is that the reference behavior needed to implement linked data structures can be encapsulated in a single component. The fact that aliasing may be desirable in programs would probably not surprise most programmers, but the fact that aliasing is desirable so rarely might surprise them. The big picture looks hopeful: The vast majority of software components can be implemented efficiently without introducing aliases, and only a small minority of components need programmers to use aliases for efficient implementations. If one further consider that this small minority of components will be comprised of such basic and well understood data structures as lists, trees, graphs and graph algorithms, the outlook for programs with a diminished alias presence is even more hopeful, for these components need only be implemented once and distributed as part of a language's component library. As an example, the Java component library already includes a list component that programmers can use instead of implementing their own. It is not unreasonable to assume that complex and efficient software systems can be built using a sufficiently rich component library without any of the programmers ever having to worry about effects of aliasing.

The remainder of this paper describes the Location_Linking_Template, the component that captures the reference behavior of linked data structures.

Informal Introduction

The concept Location_Linking_Template provides all the functionality necessary for implementing linked data structures such as lists, trees, graphs, and networks. As a prelude to introducing its specification, we will informally describe the organization, the properties, and the actions permitted in an arbitrary system of linked locations. The informal description given here differs from traditional explanations of pointers, and may be easier to grasp for students new to the notion of references, but it is also straightforward to programmers already comfortable with pointers and linked data structures.

Organization

A system of linked locations is made up of a finite number of locations and one-way connections between them called links. Locations contain information and a fixed number of links. The kind of information and the precise number of links the locations contain is determined by the client when she creates the system. Each location is either free or taken. Locations start off as free and are taken by the client one at a time as she needs them. The client can only manage locations that she has taken; free locations are only of use in that they are available to be taken. When a client no longer needs a (taken) location she can abandon it, at which time it is returned to the pool of free locations. Since there are only a finite number of locations available, a smart client will be careful to abandon the locations that she no longer intends to use.

The client manipulates locations through her workers. All workers reside at some location in the system and serve as representatives of the location they occupy. If the client wishes to alter some aspect of a location, she must do so through a worker. For example, the client may wish to redirect the third link at Mark's location toward Gary's location. Rather than change the link herself, she must direct Mark to change it.

Mark, redirect the third link at your location toward Gary's location.
The practical result of this command structure is that even if a location has been taken, the client cannot modify its information or links unless the location is occupied by a worker. Therefore, the smart client is careful to ensure that she can get a worker to any location she wants to update. A location that can be reached by a worker is said to be accessible.

Every system has a special location named Void that serves as a default location. The default location is perpetually free--it cannot be taken by the client. Unlike other free locations, however, the Void location is a useful part of the system. When a client recruits a new worker, he is automatically located at the Void location. Furthermore, the links of every location always point to Void until the client modifies them.

Figure 1: A system of linked locations. Small circles are locations, arrows are links, Greek letters are information, and roman letters are workers. The circle with a slash through it is the Void location, and the large circle is the pool of free locations.


\resizebox*{4.5in}{!}{\includegraphics{link-system}}

Figure 1 shows a diagram of an example system where Greek letters are the information and where each location has exactly one link. The seven locations in the circle on the left are free, the six locations on the right with Greek letters are taken, and the location with the slash through it is the Void location. This figure is slightly inaccurate because all locations--even free ones--have information and links. The information in free locations is the default information (represented here by the Greek letter \( \phi \)), and the links of free locations always point to Void. We omit these details in the inerest of making the diagram less cluttered.

Actions

The actions presented here are those that a client may perform on a system. They include administrative actions, such as establishing the system and recruiting workers, and management actions, such as changing information and redirecting links. Examples of each action are presented followed by a short description and comments.

Establish a system of linked locations that hold account information and have 1 link. Before the client can manage her system she must create it, and this action establishes the specified system. As previously stated, the information and links in free locations are not observable by the client. There are a finite number of locations but the client has no control over the number.

Recruit Mark and Gary. The client manages locations through workers, so she needs a means to recruit workers for the system. Here she is recruiting two workers who--like all newly recruited workers--begin their careers at the Void location.

The following actions enable the client to manage the system. Since locations are managed through the worker, the actions will typically take the form of commands to a worker or workers.

Mark, take a new location. A client must take a location before she can manipulate it, but the taking can only be performed under certain conditions. First, Mark must not occupy a taken location. If Mark is a new recruit, this is not a problem because he resides at the perpetually free location Void. Second, there must be at least one free location to take. Once Mark takes the location, its status changes to taken. A taken location remains taken until the client abandons it.

Mark, abandon your location. For this action to be performed Mark must occupy a taken location. After Mark abandons the location, he relocates to Void and the location is added to the pool of free locations. Recall that all free locations have default information and default lins, so a side effect of this action is that the information in the location is changed back to the default and all links in the location are redirected to Void. A problem may occur if two workers reside at the same location when the client abandons it. For example, suppose Mark and Gary reside at the same location. Both workers are representatives of the location, so the client can direct either of them to abandon it. If she tells Mark to abandon the location, Mark will relocate to Void, but Gary will find himself occupying a free location. If the client does sloppy bookkeeping she may try to manipulate Gary's location, causing unpredictable results.

Mark, relocate to Gary's location. This is one way a client can move Mark from one location to another. In this case the new location must already have a worker at it, but the locations may be free or taken. Thus, this action may be used to move workers to or from Void. A companion action to this enables the client to ask Mark if he and Gary are colocated.

Mark, follow your location's third link. If Mark is at a taken location, the client can move him to a new location by directing him to follow any of the links there. A location is accessible if it is occupied by a worker or if it can be reached by a worker who follows a series of links. For example, if Mark occupies a location that contains a link to a second location, both locations are accessible regardless of whether the second location is occupied by a worker, because Mark can reach the second location by following the link from his original location. When a location is first taken it is accessible because a worker resides there, but actions that relocate a worker can potentially change the accessibility of the system. In an ideal setting a client would like to ensure that all taken locations remain accessible. A taken but inaccessible location presents a problem: The location is not free so the client cannot take it, and the location is not accessible so the client cannot use it. The total number of locations is finite so if enough locations fall into this state the client may find herself trying to take a new location from an empty pool. Note that it is technically possible for the Void location to be inaccessible, but practically the client need only recruit a new worker and Void becomes accessible.

Mark, exchange the information at your location with messenger Bob's information. The client needs a way to modify the information contained in locations and this action provides it. Implicit in this action is an assumption that the client has a way to manage information, including the ability to recruit messengers who carry that information. This action can only be performed on taken locations.

Mark, redirect the third link at your location toward Gary's location. If Mark occupies a taken location, any of its links can be redirected toward Gary's location. Gary's location need not be taken, so a link at Mark's location can be directed toward Void. This action can potentially change the accessibility of the system.

Formal Specification

This section describes the formal specification of the Location_Linking_Template which appears in Appendix A. Most of the relationships between the mathematical objects in the concept and the notions introduced in the system above are straight forward.

Locations

The Defines clauses at the beginning of Location_Linking_Template indicate that any implementation of this concept must provide definitions for the mathematical type Location, the mathematical object Void, and the mathematical object Taken_Location_Displacement. Though we expect that objects of type Location will somehow be tied to a machine's memory addresses, we don't want to presume how the implementer will model them. In particular, the specification is flexible enough so that Locations that are free or inaccessible (or both) need not correspond to real memory locations.

Objects of type Location in the concept correspond to the notion of locations in the system of linked locations described above. In the concept, the type parameter, Info, indicates the type of information that a location contains, while the second parameter, k, indicates the number of links that a location contains. The three conceptual variables near the beginning of the concept are functions that take locations as arguments: Contents(q) returns the information at a given location q, Target(q,on i) returns the location targeted by the i-th link of q, and Is_Taken(q) returns true if q is taken and false if q is free.

The facility constraints ensure that the distinguished location Void is always free, and that all free locations have default information and default links. The facility initialization clause ensures that immediately after this concept is instantiated all locations are free.

Displacement

The value of Taken_Location_Displacement is the amount of space that a newly taken location occupies in memory. This value depends on the type Info, k, and the implementation.

The last part of the facility constraints ensures that the total number of locations is at least as large as the total amount of memory in the system divided by the amount of memory that a newly taken location occupies. In other words, the number of mathematical locations is always greater than the number of locations that can be stored in real memory. The operation Take_New_Location has as part of its requirement that the amount of real memory available (Remaining_Memory_Capacity) is greater than the memory that must be allocated to a newly taken location. The client has the ability to check this condition through the operation Location_Size supplied by this module and a global operation that returns the amount of available memory.

A discussion of comprehensive performance specifications can be found in [10]. For complete performance specifications--those encompassing both duration and displacement--we envision a separate performance profile for each implementation. The displacement information described here merits special treatment because it represents displacement in the system that is permanent until it is explicitly deallocated.

Positions

The concept exports the programming type Position. Position variables are the workers described in the system above. The declaration

Var Mark, Gary: Position;
corresponds to the action Recruit Mark and Gary described above. Note that although position variables correspond to workers, their mathematical values correspond to locations. For example, the initialization ensures clause asserts that p = Void. Since the symbol p that occurs here is the mathematical value of 'the programming variable p' rather than 'the programming variable p' itself, the assertion is interpreted as The location of the worker named p is Void, or simply Worker p resides at Void.

The predicate Is_Accessible is a variable rather than a constant (like most defined objects) because its value depends not only on the value of its parameter, q, but on the state of the program--the same location may be accessible in one program state and inaccessible in the next. The definition states that a location is accessible if there is a series of links to that location starting from a location represented by an in-scope Position variable. The predicate Position.Has_Active_Variable(q) indicates whether the location q is represented by a Position variable that is currently in scope.

Operations

The management actions described above correspond directly to the operations given in the concept and its enhancements. One important primary operation that was not described above it Swap_Locations. The operation is similar to Redirect_Link except it also relocates the worker from the new target location to the old target location (see Figure 2). In effect, the specified link and the specified worker are swapping locations. This operation has the desirable property that it does not effect the accessibility of the system. Furthermore, using Swap_Locations and Relocate one can implement both Redirect_Link and Follow_Link, effectively making them secondary operations.

Figure 2: The procedure call Swap_Locations(p, 1, q) redirects the first link of p's location toward q, and relocates q to the link's original target.


\resizebox*{3.5in}{!}{\includegraphics{swap-loc}}

Through its operations, the Location_Linking_Template provides the reference behavior needed to efficiently implement linked data structures. In particular, the client can reap the benefits of aliasing by positioning two or more workers at the same location. But the concept also allows the client to fall into the traditional traps involving references: dangling references and memory leaks. A dangling reference occurs when a location is free but remains accessible, and a memory leak occurs when a location is taken but not accessible.

Dangling References

The following code segment creates a dangling reference.

Var x, y: Position;
Take_New_Location(x);
Relocate(y, x);
Abandon_Location(x);
When x abandoned his location, the location's status changed from taken to free. Though x was relocated to Void, y remained at the location, so the location continues to be accessible. Position variables are effectively bound to the type of Info during instantiation, so there is no danger of inadvertently modifying (through the dangling reference) the contents of a memory location that is being used by another variable somewhere else in the program. Real memory locations on a machine are limited, so the specification permits implementations that can reclaim memory even if a dangling reference existed for them.

The Is_Usable operation (provided as an extension to the concept) effectively tells the client whether a worker is a dangling reference. Since a worker resides at the location in question, the location is accessible. If the location is taken, it is usable by the client. But if the location is free, the client cannot affect it.

Memory Leaks

The following code segment creates a memory leak.

Var x, y: Position;
Take_New_Location(x);
Relocate(x, y);
The location that was taken by x continues to have a taken status but has become inaccessible. Real memory locations are limited, so a proliferation of memory leaks is a serious problem. There are traditionally two ways to deal with memory leaks. The first is to avoid them by keeping a careful accounting of the locations in the system and explicitly abandoning a location before it becomes a leak. The second is to do periodic garbage collection. The operation that performs garbage collection, Abandon_Inaccessible, is provided in an extension to the concept.

Extensions

An extension differs from an enhancement in that its operations cannot in principle be implemented as a secondary operation. Extensions are separate from the main concept because they cannot be implemented without costly performance penalties. Default realizations do not implement extensions, but custom realizations may. Both Abandon_Inaccessible and Is_Usable are extensions. A garbage collection implementation of Location_Linking_Template would additionally provide a procedure for the Abandon_Inaccessible operation. A client may then choose to ignore the Abandon_Location operation and periodically invoke the Abandon_Inaccessible operation instead.

Application

Using Location_Linking_Template to implement linked data structures will be familiar to anyone who has implemented a linked list in a language with pointers such as C or Pascal. This section presents a List concept and its implementation using Location_Linking_Template. The specification for the list--List_Template--is given in the appendix. The mathematical model for the type List consists of two strings--a left string and a right string. When an item is inserted into the list it is inserted at the beginning of the right string; when an item is removed from the list it is removed from the beginning of the right string.

Figure 3: The first system depicts an empty list and the second system depicts a list with three items $\alpha $, $\beta $, and $\gamma $ where the insertion point is between $\beta $ and $\gamma $.


\resizebox*{3.5in}{!}{\includegraphics{list-examples2}}

The implementation of List_Template (Location_Linking_Based) appears after List_Template in Appendix B. The list is implemented as a record with three Position fields and two Integer fields. The field head perpetually resides at a dummy location at the beginning of the list. This location serves as a sentinel--it will not be used to hold information, but it allows the implementer to ignore certain boundary conditions. The field pre is always located at the end of the left string; if the left string is empty, pre is colocated with head. The field last is always located at the end of the right string; if the right string is empty, last is colocated with pre.

The conventions in the type declaration assert invariants that must hold before and after each procedure. The first three conjunctions in this convention describe how the fields of the record (head, pre, last, left_length, and right_length) are related. From these conjunctions (along with the correspondence) we can show that Void is accessible from all locations that hold items in the list. The last conjunction effectively prohibits dangling references and memory leaks.

Figure illustrates how a system is updated during the Insert operation.

Figure 4: The Insert operation can be thought of as consisting of four main sections. Recruiting a worker named post to reside at the location targeted by pre, creating a new location with the desired information (the left diagram), redirecting links so that the new location is positioned correctly in the list (the right diagram), and ensuring that the worker last resides at his correct location in the updated system.


\resizebox*{4.5in}{!}{\includegraphics{list-insert2}}

Related Work and Conclusion

The problems involving references in high-level languages has been around for a long time and therefore a good deal has been written on the subject. Hoare critiqued them early on [4,5] and others followed suit [8,2]. Specifiers were quick to single out aliasing as the main problem caused by references that breaks modular reasoning [2,6]. Modern practical programmers like Koenig have described the difficulty that students have understanding pointers [9], and Stepanov cites value semantics (along with efficiency) as one of the key principles behind the development of the C++ Standard Template Library [15]. An overview of the problem with reference behavior as it relates to reusable components is given in [11].

The desire for both efficiency and value semantics is what led Harms to consider swapping as an alternative form of data movement in [3], and the same mechanism is applied to parameter passing in [17]. Ownership types are introduced in [1] as a way to control aliasing, and dynamic dispatch is offered as way to eliminate aliasing altogether during parameter passing in [12].

Components that capture the functionality and performance of references were suggested by Hollingsworth in [7]. The Location_Linking_Template described in this paper has gone through many iterations, some of which are presented in [16].

Pointers and reference variables present a problem in high-level languages because of their tendency to introduce aliasing, which makes reasoning more complex and often breaks modularity. Though aliasing can often be avoiding by the use of efficient data movement mechanisms such as swapping, high-level languages also need mechanisms that allow programmers to use aliasing for those rare occasions when it cannot be avoided. Such mechanisms should not break modular reasoning. The Location_Linking_Template introduced in this paper satisfies these conditions.

Along with this mechanism, a method of software development should be employed that ensures the need for aliasing will be rare. Language designers must recognize that many linked data structures and their algorithms are understood well enough that they can and should be provided to programmers as integral parts of a rich and reusable component library.

Appendix A

Concept Location_Linking_Template(type Info; evaluates k: Integer);
       uses String_Theory, Std_Boolean_Facility, Std_Integer_Fac;

       Defines Location: \( \mathbf{S} \)et;
       Defines Void: Location;
       Defines Taken_Location_Displacement: \( \mathbf{N}^{+}; \)

       Var Target: \( \mathrm{Location}\times [1..\mathrm{k}]\rightarrow \mathrm{Location} \);
       Var Contents: \( \mathrm{Location}\rightarrow \mathrm{Info} \);
       Var Is_Taken: \( \mathrm{Location}\rightarrow \mathbf{B} \);

       Facility Constraints \( \neg \textrm{Is}\_\textrm{Taken}(\textrm{Void}) \) and
             \( \forall \textrm{q}:\textrm{Location} \), \( \textrm{Is}\_\textrm{Taken}(\textrm{q}) \) and \( \textrm{Info}.\textrm{Is}\_\textrm{Initial}(\textrm{Contents}(\textrm{q})) \) and
             \( \vert\vert\textrm{Location}\vert\vert\geq \mathbf{Total}\_\mathbf{Memory}\_\mathbf{Capacity}/\textrm{Taken}\_\textrm{Location}\_\textrm{Displacement} \);
       Facility Initialization ensures \( \forall \mathrm{q}:\mathrm{Location},\, \, \neg \mathrm{Is}\_\mathrm{Taken}(\mathrm{q}) \);

       Type Family Position is modeled by Location;
             exemplar p;
             initialization ensures p = Void;
             Definition Variable Is_Accessible(q: Location): \( \mathbf{B} \) =
                   \( \exists \mathrm{p}:\mathrm{Position},\, \, \exists \mathrm{h}:[1..\mathrm{k}],\, \, \exists \rho :\mathrm{Str}(\mathrm{Location}\times [1..\mathrm{k}])\ni \)
                   \( \mathrm{Position}.{\mathrm{Has}\_\mathrm{Active}\_\mathrm{Variable}}(\textrm{p}) \) and \( \langle (\textrm{p},\textrm{h})\rangle \, \, \textrm{Is}\_\textrm{Prefix}\, \, \rho \) and
                   \( \forall \mathrm{u},\mathrm{v}:\mathrm{Location},\, \, \forall \mathrm{i},\mathrm{j}:[1..\mathrm{k}] \), if \( \langle (\textrm{u},\textrm{i})\rangle \circ \langle (\textrm{v},\textrm{j})\rangle \, \, \textrm{Is}\_\textrm{Substring}\, \, \rho \) then
                   \( \mathrm{Target}(\mathrm{u},\mathrm{i})=\mathrm{v} \) and \( \langle (\textrm{q},1)\rangle \, \, \textrm{Is}\_\textrm{Suffix}\, \, \rho \);
             finalization updates Is_Accessible;

       Operation Take_New_Location(updates p: Position);
             updates Is_Taken, Is_Accessible;
             requires \( \neg \mathrm{Is}\_\mathrm{Taken}(\mathrm{p}) \) and
                   \( \textrm{Taken}\_\textrm{Location}\_\textrm{Displacement }\leq \mathbf{Remaining}\_\mathbf{Memory}\_\mathbf{Capacity} \);
             ensures \( \neg \char93 \textrm{Is}\_\textrm{Taken}(\textrm{p}) \) and \( \neg \char93 \textrm{Is}\_\textrm{Accessible}(\textrm{p}) \) and
                   \( \forall \textrm{j}:[1..\textrm{k}],\, \, \textrm{Target}(\textrm{p},\textrm{j})=\textrm{Void} \) and
                   \( \forall \mathrm{q}:\mathrm{Location}, \) if \( \textrm{q }\neq \textrm{p} \) then \( \textrm{Is}\_\textrm{Taken}(\textrm{q})=\char93 \textrm{Is}\_\textrm{Taken}(\textrm{q}) \);

       Operation Abandon_Location(clear p: Position);
             updates Target, Contents, Is_Taken, Is_Accessible;
             requires Is_Taken(p);
             ensures \( \forall \mathrm{q}:\mathrm{Location}, \)
                   \( \left( \textrm{Is}\_\textrm{Taken}(\textrm{q})=\left\{ \begin{array}{ll}
\mat...
...(\mathrm{q}) & {\mathbf{otherwise}}
\end{array}\right. \right) {\mathbf{and}} \)
                   if \( \textrm{Is}\_\textrm{Taken}(\textrm{q}) \) then \( \textrm{Contents}(\textrm{q})=\char93 \textrm{Contents}(\textrm{q}) \) and
                   \( \forall \mathrm{n}:[1..\mathrm{k}],\, \, \mathrm{Target}(\mathrm{q},\mathrm{n})=\char93 \mathrm{Target}(\mathrm{q},\mathrm{n}) \);

       Operation Relocate(updates p: Position; restores new_location: Position);
             ensures p = new_location;

       Operation Check_Colocation(preserves p, q: Position;
                                                 replaces are_colocated: Boolean);
             ensures if ((Is_Taken(p) or p = Void) and (Is_Taken(q) or q = Void))
                   then are_colocated = ( p = q );

       Operation Swap_Locations(preserves p: Position; evaluates i: Integer;
                                                 updates new_target: Position);
             updates Target;
             requires \( 1\leq \mathrm{i}\leq \mathrm{k} \) and \( \textrm{Is}\_\textrm{Taken}(\textrm{p}) \);
             ensures \( \forall \mathrm{q}:\mathrm{Location},\, \, \forall \mathrm{j}:[1..\mathrm{k}], \)
                   \( \left( \textrm{Target}(\textrm{q},\textrm{j})=\left\{ \begin{array}{ll}
\char...
...,\mathrm{j}) & {\mathbf{otherwise}}
\end{array}\right. \right) {\mathbf{and}} \)
                   \( \mathrm{new}\_\mathrm{target}=\char93 \mathrm{Target}(\mathrm{p},\mathrm{i}) \);

       Operation Swap_Contents(preserves p: Position; updates I: Info);
             updates Contents;
             requires Is_Taken(p);
             ensures \( \mathrm{I}=\char93 \mathrm{Contents}(\mathrm{p}) \) and \( \forall \textrm{q}:\textrm{Location} \),
                   \( \mathrm{Contents}(\mathrm{q})=\left\{ \begin{array}{ll}
\char93 \mathrm{I} & ...
...har93 \mathrm{Contents}(\mathrm{q}) & {\mathbf{otherwise}}
\end{array}\right. \);

       Operation Is_At_Void(preserves p: Position): Boolean;
             ensures Is_At_Void = ( p = Void );

       Operation Location_Size(): Integer;
             ensures Location_Size = ( Taken_Location_Displacement );

end Location_Linking_Template;

Enhancement Other_Operations for Location_Linking_Template;

       Operation Redirect_Link(preserves p: Position; evaluates i: Integer;
                                           preserves new_target: Position);
             updates Target, Is_Accessible;
             requires \( 1\leq \mathrm{i}\leq \mathrm{k} \) and \( \textrm{Is}\_\textrm{Taken}(\textrm{p}) \);
             ensures \( \forall \mathrm{q}:\mathrm{Location},\, \, \forall \mathrm{j}:[1..\mathrm{k}], \)
                   \( \mathrm{Target}(\mathrm{q},\mathrm{j})=\left\{ \begin{array}{ll}
\char93 \mat...
...thrm{Target}(\mathrm{q},\mathrm{j}) & {\mathbf{otherwise}}
\end{array}\right. \);

       Operation Follow_Link(update p: Position; preserves i: Integer);
             updates Is_Accessible;
             requires \( \mathrm{Is}\_\mathrm{Taken}(\mathrm{p}) \) and \( 1\leq \textrm{i}\leq \textrm{k} \);
             ensures \( \mathrm{p}=\mathrm{Target}(\char93 \mathrm{p},\mathrm{i}) \);

end Other_Operations;

Extension Usability_Checking_Capability for Location_Linking_Template;

       Operation Is_Usable(preserves p: Position): Boolean;
             ensures Is_Usable = ( Is_Taken(p) );

end Usability_Checking_Capability;

Extension Cleanup_Capability for Location_Linking_Template;

       Operation Abandon_Inaccessible();
             updates Is_Taken, Contents, Target;
             ensures \( \forall \mathrm{q}:\mathrm{Location}, \)
                   \( \mathrm{Is}\_\mathrm{Taken}(\mathrm{q})=(\char93 \mathrm{Is}\_\mathrm{Accessible}(\mathrm{q}) \) and \( \char93 \textrm{Is}\_\textrm{Taken}(\textrm{q})) \) and
                   if \( \textrm{Is}\_\textrm{Taken}(\textrm{q}) \) then \( \textrm{Contents}(\textrm{q})=\char93 \textrm{Contents}(\textrm{q}) \) and
                   \( \forall \mathrm{i}:[1..\mathrm{k}],\, \, \mathrm{Target}(\mathrm{q},\mathrm{j})=\char93 \mathrm{Target}(\mathrm{q},\mathrm{j}) \);

end Cleanup_Capability;

Appendix B

Concept List_Template(type Entry);

       Defines List_Unit_Displacement: \( \mathbf{N}^{+} \);

       Type Family List is modeled by Cart_Prod
                   Left: Str(Entry);
                   Right: Str(Entry);
             end;
             exemplar S;
             initialization ensures \( \vert\mathrm{S}.\mathrm{Left}\vert=0 \) and \( \vert\textrm{S}.\textrm{Right}\vert=0 \);

       Operation Insert(alters E: Entry; updates S: List);
             requires List_Unit_Displacement \( \leq \) Remaining_Memory_Capacity;
             ensures \( \mathrm{S}.\mathrm{Left}=\char93 \mathrm{S}.\mathrm{Left} \) and \( \textrm{S}.\textrm{Right }=\langle \char93 \textrm{E}\rangle \circ \char93 \textrm{S}.\textrm{Right} \);

       Operation Remove(replace R: Entry; updates S: List);
             requires \( \vert\mathrm{S}.\mathrm{Right}\vert>0 \);
             ensures \( \mathrm{S}.\mathrm{Left}=\char93 \mathrm{S}.\mathrm{Left} \) and \( \char93 \textrm{S}.\textrm{Right}=\langle \textrm{R}\rangle \circ \textrm{S}.\textrm{Right} \);

       Operation Advance(updates S: List);
             requires \( \vert\mathrm{S}.\mathrm{Right}\vert>0 \);
             ensures \( \mathrm{S}.\mathrm{Left}\circ \mathrm{S}.\mathrm{Right}=\char93 \mathrm{S}.\mathrm{Left}\circ \char93 \mathrm{S}.\mathrm{Right} \) and
                   \( \vert\mathrm{S}.\mathrm{Left}\vert=\vert\char93 \mathrm{S}.\mathrm{Left}\vert+1 \);

       Operation Advance_To_End(updates S: List);
             ensures \( \vert\mathrm{S}.\mathrm{Right}\vert=0 \) and \( \textrm{S}.\textrm{Left }=\char93 \textrm{S}.\textrm{Left }\circ \char93 \textrm{S}.\textrm{Right} \);

       Operation Reset(updates S: List);
             ensures \( \vert\mathrm{S}.\mathrm{Left}\vert=0 \) and \( \textrm{S}.\textrm{Right }=\char93 \textrm{S}.\textrm{Left }\circ \char93 \textrm{S}.\textrm{Right} \);

       Operation Swap_Rights(updates S1, S2: List);
             ensures \( \mathrm{S}1.\mathrm{Left}=\char93 \mathrm{S}1.\mathrm{Left} \) and \( \textrm{S}2.\textrm{Left }=\char93 \textrm{S}2.\textrm{Left} \) and
                   \( \mathrm{S}1.\mathrm{Right}=\char93 \mathrm{S}2.\mathrm{Right} \) and \( \textrm{S}2.\textrm{Right }=\char93 \textrm{S}1.\textrm{Right} \);

       Operation Left_Length(restores S: List): Integer;
             ensures Left_Length = \( (\, \, \vert\mathrm{S}.\mathrm{Left}\vert\, \, ) \);

       Operation Right_Length(restores S: List): Integer;
             ensures Right_Length = \( (\, \, \vert\mathrm{S}.\mathrm{Right}\vert\, \, ) \);

       Operation Unit_Size(): Integer;
             ensures Unit_Size = ( List_Unit_Displacement );

end List_Template;

Realization Location_Linking_Realiz for List_Template;
       uses Location_Linking_Template;

       Facility Std_Pointer_Fac is Location_Linking_Template(Entry, 1)
            realized by Std_Realiz;

       Definition List_Unit_Displacement = Taken_Location_Displacement;

       Definition Variable Next(p: Location): Location = \( \mathrm{Target}(\mathrm{p},1) \);

       Type List = Record
                   head, pre, last: Position;
                   left_length, right_length: Integer;
             end;
             conventions \( \mathrm{S}.\mathrm{left}\_\mathrm{length}\geq 0 \) and \( \textrm{S}.\textrm{right}\_\textrm{length }\geq 0 \) and
                   \( \mathrm{S}.\textrm{pre}=\left\{ \begin{array}{ll}
\textrm{S}.\textrm{head} & ...
...rm{length}}(\textrm{S}.\textrm{head}) & \mathbf{otherwise}
\end{array}\right. \)and
                   \( \mathrm{S}.\textrm{last}=\left\{ \begin{array}{ll}
\textrm{S}.\textrm{pre} & ...
...hrm{length}}(\textrm{S}.\textrm{pre}) & \mathbf{otherwise}
\end{array}\right. \)and
                   \( \textrm{Next}(\textrm{S}.\textrm{last})=\textrm{Void} \) and
                   \( \forall \textrm{q}:\textrm{Location},(\, \, \textrm{Is}\_\textrm{Accessible}(...
...trm{q})\, \, \mathbf{iff}\, \, \textrm{Is}\_\textrm{Taken}(\textrm{q})\, \, ) \);
             correspondence
                   Conc.S.Left = $\mathrm{\displaystyle\prod_{k = 1}^{S.left\_length} \langle Contents(Next^{k}(S.head)) \rangle }$ and
                   Conc.S.Right = $\mathrm{\displaystyle\prod_{k = 1}^{S.right\_length} \langle Contents(Next^{k}(S.pre)) \rangle }$;

             initialization
                   Take_New_Location(S.head);
                   Relocate(S.pre, S.head);
                   Relocate(S.last, S.head);
             end;

       Procedure Insert(alters E: Entry; updates S: List);
             Var post, new: Position;
             Relocate(post, pre);
             Follow_Link(post, 1);
             Take_New_Location(new);
             Swap_Contents(new, E);
             Redirect_Link(S.pre, 1, new);
             Redirect_Link(new, 1, post);
             If S.right_length = 0 then
                   Follow_Link(S.last, 1);
             end;
             S.right_length := S.right_length + 1;
       end Insert;

       Procedure Remove(replace R: Entry; updates S: List);
             Var p: Position;
             Relocate(p, pre);
             Follow_Link(p, 1);
             Follow_Link(p, 1);
             Swap_Locations(S.pre, 1, p);
             Swap_Contents(p, R);
             Abandon_Location(p);
             If S.right_length = 1 then
                   Relocate(S.last, S.pre);
             end;
             S.right_length := S.right_length - 1;
       end Remove;

       Procedure Advance(updates S: List);
             Follow_Link(S.pre, 1);
             S.left_length := S.left_length + 1;
             S.right_length := S.right_length - 1;
       end Advance;

       Procedure Advance_To_End(updates S: List);
             Relocate(S.pre, S.last);
             S.left_length := S.left_length + S.right_length;
             S.right_length := 0;
       end Advance_To_End;

       Procedure Reset(updates S: List);
             Relocate(S.pre, S.head);
             S.right_length := S.right_length + S.left_length;
             S.left_length := 0;
       end Reset;

       Procedure Swap_Rights(updates S1, S2: List);
             Var post: Position;
             If S.right_length \( \neq \) 0 then
                   Relocate(post, S1.pre);
                   Follow_Link(post, 1);
                   Swap_Locations(S1.pre, 1, post);
                   Swap_Locations(S2.pre, 1, post);
                   S1.last :=: S2.last;
             end;
       end Swap_Rights;

       Procedure Left_Length(restores S: List): Integer;
             Left_Length = S.left_length;
       end;

       Procedure Right_Length(restores S: List): Integer;
             Right_Length = S.right_length;
       end;

       Procedure Unit_Size(): Integer;
             Unit_Size = Location_Size();
       end;

end Location_Linking_Realiz;

Appendix C

The realization below makes use of special syntax provided for the Location_Linking_Template. Workers in the system will be of type Node. A Node is a Position for a system in which locations contain information of type Entry and a single link, labeled next. An up-arrow (\( \uparrow \)) before a worker indicates that he is taking a new location, and a down-arrow (\( \downarrow \)) in front of a worker indicates that he is abandoning his location. An asterisk (*) in front of a worker denotes the information at that worker's location. A hat (^) between a location and a link name denotes the link at that location, and an arrow (\( \rightarrow \)) between locations denotes movement of a worker or link (at the foot of the arrow) to a new location (at the head of the arrow). So x \( \rightarrow \) y is syntactic sugar for Relocate(x, y), x \( \rightarrow \) x^next is syntactic sugar for Follow_Link(x, 1), and x^next \( \rightarrow \) y^next is syntactic sugar for a secondary operation that redirects the first link at x's location to the location pointed to by the first link of y's location. The double arrow (\( \leftrightarrow \)) indicates simultaneous movement in both directions.

Realization Location_Linking_Based for List_Template;
       uses Std_Pointer_Fac;

       (* Facility already imported through the uses clause. *)
       (* Definitions are the same as above. *)

       Type Node = ^Entry(next);

       Type List = Record
                   head, pre, last: Node;
                   left_length, right_length: Integer;
             end;
             (* Conventions and correspondence are the same as above. *)
             initialization
                   \( \uparrow \)S.head;
                   S.pre \( \rightarrow \) S.head;
                   S.last \( \rightarrow \) S.head;
             end;

       Procedure Insert(alters E: Entry; updates S: List);
             Var new: Node;
             \( \uparrow \)new;
             *new :=: E;
             new^next \( \rightarrow \) S.pre^next;
             S.pre^next \( \rightarrow \) new;
             If S.right_length = 0 then
                   S.last \( \rightarrow \) S.last^next;
             end;
             S.right_length := S.right_length + 1;
       end Insert;

       Procedure Remove(replace R: Entry; updates S: List);
             Var old: Node;
             old \( \rightarrow \) S.pre^next;
             S.pre^next \( \rightarrow \) S.pre^next^next;
             *old :=: R;
             \( \downarrow \)old;
             If S.right_length = 1 then
                   S.last \( \rightarrow \) S.pre;
             end;
             S.right_length := S.right_length - 1;
       end Remove;

       Procedure Advance(updates S: List);
             S.pre \( \rightarrow \) S.pre^next;
             S.left_length := S.left_length + 1;
             S.right_length := S.right_length - 1;
       end Advance;

       Procedure Advance_To_End(updates S: List);
             S.pre \( \rightarrow \) S.last;
             S.left_length := S.left_length + S.right_length;
             S.right_length := 0;
       end Advance_To_End;

       Procedure Reset(updates S: List);
             S.pre \( \rightarrow \) S.head;
             S.right_length := S.right_length + S.left_length;
             S.left_length := 0;
       end Reset;

       Procedure Swap_Rights(updates S1, S2: List);
             If S.right_length \( \neq \) 0 then
                   S1.pre^next \( \leftrightarrow \) S2.pre^next;
                   S1.last \( \leftrightarrow \) S2.last;
             end;
       end Swap_Rights;

       (* Remaining procedures are the same as above. *)

end Location_Linking_Based;

Bibliography

1
D. G. Clarke, J. M. Potter, and J. Noble.
Ownership types for flexible alias protection.
In OOPSLA '98 Conference Proceedings, pages 48-64. ACM Press, 1998.

2
S. A. Cook.
Soundness and completeness of an axiom system for program verification.
SIAM Journal of Computing, 7(1):70-90, 1978.

3
D. E. Harms and B. W. Weide.
Copying and swapping: Influences on the design of reusable software components.
IEEE Transactions on Software Engineering, 17(5):424-435, May 1991.

4
C. A. R. Hoare.
Hints on programming language design.
In C. A. R. Hoare and C. B. Jones, editors, Essays in Computing Science. Prentice Hall, New York, 1989.

5
C. A. R. Hoare.
Recursive data structures.
In C. A. R. Hoare and C. B. Jones, editors, Essays in Computing Science. Prentice Hall, New York, 1989.

6
J. Hogg, D. Lea, A. Wills, D. deChampeaux, and R. Holt.
The Geneva Convention on the treatment of object aliasing.
OOPS Messenger, 3(2):11-16, 1992.

7
J. E. Hollingsworth and B. W. Weide.
Engineering "unbounded" reusable Ada generics.
In Proceedings of the 10th Annual National Conference on Ada Technology, pages 82-97. ANCOST, 1992.

8
R. B. Kieburtz.
Programming without pointer variables.
In Proceedings of the SIGPLAN '76 Conference on Data: Abstraction, Definition, and Structure. ACM Press, 1976.

9
A. Koenig and B. Moo.
Teaching standard C++.
JOOP, 11(7):11-17, 1998.

10
J. Krone, W. F. Ogden, and M. Sitaraman.
Modular verification of performance correctness.
In OOPSLA 2001 SAVCBS Workshop Proceedings, 2002.
http://www.cs.iastate.edu/~leavens/SAVCBS/papers-2001/index.html.

11
G. Kulczycki.
Efficient reusable components with value semantics.
In ICSR 2002 Young Researchers Workshop, 2002.
http://www.info.uni-karlsruhe.de/~heuzer/ICSR-YRW2002/program.html.

12
G. T. Leavens and O. Antropova.
ACL--eliminating parameter aliasing with dynamic dispatch.
Technical Report 98-08a, Department of Computer Science, Iowa State University, 1998.

13
K. R. M. Leino and G. Nelson.
Data abstraction and information hiding.
Technical Report 160, Compaq SRC, 2000.

14
P. Müller and A. Poetzsch-Heffter.
Modular specification and verification techniques for object-oriented software components.
In G. T. Leavens and M. Sitaraman, editors, Foundations of Component-Based Systems. Cambridge University Press, Cambridge, United Kingdom, 2000.

15
D. R. Musser, G. J. Derge, and A. Saini.
STL Tutorial and Reference Guide.
Addison-Wesley, Boston, 2nd edition, 2001.

16
W. F. Ogden.
The Proper Conceptualization of Data Structures.
The Ohio State University, Columbus, OH, 2000.

17
M. Sitaraman, G. W. Kulczycki, W. F. Ogden, B. W. Weide, and G. T. Leavens.
Understanding and minimizing the impact of reference-value distinction on software engineering.
Technical report, Department of Computer Science, Clemson University, 2002.

About this document ...

Capturing the Reference Behavior
of Linked Data Structures

This document was generated using the LaTeX2HTML translator Version 99.2beta8 (1.42)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -split 0 -white main_paper

The translation was initiated by Gregory W. Kulczycki on 2002-05-10


Footnotes

... reserved.1
This research funded in part by NSF grant CCR-0113181.
... indirection.''2
This saying has been attributed to Butler Lampson.
... structures.3
References are also used for memory management, but a discussion of this topic is beyond the scope of this paper.

Gregory W. Kulczycki 2002-05-10