Larry Latour, Ling Huang, and Tom Wheeler
Software Engineering Laboratory, Dept. of Computer Science
University of Maine
222 Neville Hall
Orono, Maine 04469 USA
Phone: +1 207 581 2260
Fax: +1 207 581 4977
We say that interface "pragmatics" are the unstated assumptions about the way a module interface and its implementation are used, assumptions that must nevertheless be understood in order to use that module properly. Such pragmatics can range from simple restrictions placed on the module to complex architectural assumptions about how the module can and must be used. Program integration is the process of combining modules in such a way that inconsistencies between module assumptions are avoided, and pragmatic information can go a long way toward supporting this process. We look specifically at formalizing pragmatic information through the capture and analysis of patterns of use, first by using regular expressions to capture interface patterns, and then by using CSP (Communicating Sequential Processes) [Hoare85] to capture and analyze interaction patterns between using and providing modules.
Keywords: Program composition, program integration, interface pragmatics, regular expressions, CSP
Paper Category: technical paper
Experience teaches us that the cornerstone of good software engineering practice is the concept of separation of concerns. This, along with its companion support mechanisms of information hiding and modularization, drives the program development and understanding process, allowing us to manage the complexity of a large system. Each concern is encapsulated and modeled with an interface, and these interfaces are then combined into a working system using one of a variety of existing composition techniques.
Unfortunately, problems arise when systems are recomposed in this manner. We argue that many of these problems are due to the incomplete information typically provided by interfaces.
According to David Parnas [Parnas77], an interface is the set of "assumptions that a using module makes of a providing module" and that "a providing module makes of a using module". But we need to analyze and elaborate such assumptions. They are not captured by the types and accessing operations alone, as is usually taken for granted. The code written to deal with unstated assumptions about ordering, multiple module interactions, obligations, etc. shows that something else is needed besides the operation definitions and the function of the module(s) being written. The goal of this research is to explore what this "something else" is.
If a module (or an object of an ADT) is thought of as defining a language then the thought pattern of "syntax, semantics, pragmatics" can be applied to module interfaces:
For our purposes, pragmatics are the unstated assumptions about the way an interface is used, assumptions that must still be understood in order to use that interface properly. In natural language understanding they are best shown by an example: "I'm beside myself". Pragmatics make natural language more natural and smooth, but they actually seem to be necessary in formal languages.
The following is a definition of Semantics and Pragmatics, taken from the Natural Language section of an online AI course [Cawsey94]:
"… semantics and pragmatics, are concerned with getting at the meaning of a sentence. In the first stage (semantics) a partial representation of the meaning is obtained based on the possible syntactic structure(s) of the sentence, and on the meanings of the words in that sentence. In the second stage, the meaning is elaborated based on contextual and world knowledge."

Interface pragmatics can range from simple restrictions placed on abstract data type operations to complex architectural assumptions about how an interface can and must be used.
If we make an effort to enumerate and formalize unstated assumptions about the use of a module, we might be able to use the results to:
We've developed a number of examples using both regular expressions and CSP (Communicating Sequential Processes) [Hoare85]. It seems that this technique shows promise for formalizing and analyzing patterns of use, and as a result provides more information to the user of a component about how to properly integrate that component into a system.
Let's begin with a simple file interface, with operations open, read, write, close, and endfile. As straightforward as this abstraction is, there are still ordering restrictions that must be obeyed in order for a using program to work properly. For example, typically open must be used before reading or writing, the file "must" be closed (either explicitly or implicitly), and endfile must be checked before reading. We can represent these ordering restrictions with a regular expression, and then show three simple using programs, one that matches the expression, and two that don't.
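This ordering restriction can be made concrete with a short sketch. The single-letter encoding and the exact expression below are our illustrative choices, not part of the original interface: open first, then any mix of writes and endfile-checked reads, then close.

```python
import re

# One plausible encoding of the file-interface ordering restriction.
# Each operation is abbreviated to one letter so a trace becomes a string.
OPS = {"open": "o", "endfile": "e", "read": "r", "write": "w", "close": "c"}

# open, then any mix of (endfile-check then read) or write, then close.
PATTERN = re.compile(r"o(er|w)*c")

def trace_is_legal(trace):
    """Return True if a sequence of operation names matches the pattern."""
    encoded = "".join(OPS[op] for op in trace)
    return PATTERN.fullmatch(encoded) is not None

# A using program that matches the expression...
good = ["open", "endfile", "read", "write", "close"]
# ...and two that do not: one never opens the file, and one
# reads without first checking endfile.
bad_no_open = ["read", "close"]
bad_unchecked_read = ["open", "read", "close"]
```

Checking a trace against the expression is then a mechanical matter, which is what makes this "high level" information usable by tools rather than only by careful readers.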
What we typically consider as the interface here is really a "low level" interface - the specification of each independent operation in the interface. What we're missing are the generic, "higher level" patterns of use of the combined operations, or what we'll call "high level" interfaces. These generic patterns can either be derived from restrictions based on the interface specification of the operations or on implied restrictions based upon an implementation of the operations. For example, in figure 1 above, the regular expression describing the pattern of use of the file operations can be derived as a theorem from the specifications of the file interface operations, whereas in other cases patterns of use might be semantically correct but more or less efficient dependent on the implementation choices of the operations.
Consider the following simple example of an iterator class:
Class iterator ...

In the case of this iterator, the forced pattern of use, or the high level interface of the iterator, is:

    getFirst(item)
    process the item
    getNext(item), process it, and so on until the items are exhausted

It is these patterns of use that caused us to specifically partition the low level interface into getFirst and getNext to begin with.
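A minimal sketch of such an iterator and its forced pattern of use follows; the class body and the `done` operation are our assumptions, since the paper names only getFirst and getNext.

```python
class Iterator:
    """Sketch of an iterator whose interface forces a usage pattern.

    The backing list and the done() operation are illustrative; the
    point is the getFirst/getNext partition of the low level interface.
    """
    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def get_first(self):
        self._pos = 0
        return self._items[0] if self._items else None

    def get_next(self):
        self._pos += 1
        if self._pos < len(self._items):
            return self._items[self._pos]
        return None

    def done(self):
        return self._pos >= len(self._items)

# The forced pattern of use: getFirst, then process/getNext until done.
def process_all(it, process):
    item = it.get_first()
    while not it.done():
        process(item)
        item = it.get_next()
```

Any using program that calls get_next before get_first, or processes past done, violates the high level interface even though every individual call is type-correct.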
While we typically say low level interfaces are composed into a higher level module, we might better say that high level interfaces are integrated into that higher level module. The term "integration" better suggests that a good deal of important activity is going on during the integration process. It is the instantiation and integration of high level interfaces that is the value added in the higher level module.
1. There is no explicit intermediate step of instantiating a generic "formal" high level interface with an "actual" high level usage pattern. The programmer implicitly moves directly to an integrated pattern that instantiates two or more high level interfaces together. Our solution: there needs to be both a formal language for describing a high level interface and an explicit instantiation of this interface to an actual usage pattern before the step of integrating high level patterns together.

2. Many times usage patterns from high level interfaces "tangle" together in a way that leads to undue code complexity and subsequent confusion. Gregor Kiczales's Aspect Oriented Programming [Kiczales97] also addresses this type of problem. Our solution: the tangling rules between usage patterns derived from high level interfaces need to be formalized and analyzed using an integration language.

3. Often high level interfaces are incompatible and need to be redesigned. Dave Garlen's Architectural Mismatch paper [Garlen95] discusses the ramifications of this. Our solution: this incompatibility needs to be expressed in terms of mutual exclusion restrictions between elements of the offending usage patterns, derived from the offending underlying high level interfaces.

One of the mindsets that contributes to these problems is that we tend to view system concerns on a single level. For example, if we separate concerns in a system using OO methods, we might come up with the decomposition in figure 3:
The arrows represent the "uses" relation, i.e., which objects call which operations of which other objects.
Unfortunately, a concern is not completely abstracted away in an object. It is replaced by not only its low level interface but, more critically, its high level interface. A using module has to deal with the restrictions placed on it by the high level interfaces of each of the objects it uses. The concerns of integrating these high level interfaces are meta-concerns, and tend to be more global/integral in nature. For example, consider the concerns of modules B, C, and D integrated in A in figure 4.
A more complex example is the module interface of a FIFO TCP/IP network "Connection". Typically it provides only specifications of individual operations, such as:
Socket: define the type of communication protocol
Bind: assign a name to a socket
Listen: notice of willingness to receive a connection
Accept: wait for an actual connection
Close: close a connection
Send: send data across a connection
Receive: receive data across a connection
The patterns of use / restrictions on the TCP/IP operations can be described informally as:
All "legal" programs must match this regular expression pattern, including the following simple client program:
int sockfd, newsockfd;
...
if (bind(sockfd, 5) < 0)
...
doit(newsockfd); /* process request with sends/receives */

Note that the actual usage pattern of the TCP/IP operations in this case does match the regular expression. Note also that we used an informal argument to derive the regular expression pattern of use from the informal specifications of each of the TCP/IP operations. This is one of those cases where the pattern of use is a theorem that can be proven from the specifications of the individual TCP/IP operations, and we can thus benefit from a more formal specification of these operations.
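The same mechanical check used for the file interface applies here. The following sketch encodes one plausible server-side ordering we infer from the operation descriptions above (socket, bind, listen once, then any number of accepted connections, each serviced by sends/receives and closed); the letters and the exact expression are our illustration, not the paper's formal pattern.

```python
import re

# One-letter encoding of the TCP/IP operations from the interface above.
OPS = {"socket": "S", "bind": "B", "listen": "L",
       "accept": "A", "send": "s", "receive": "r", "close": "C"}

# Assumed server-side ordering: setup once, then repeatedly
# accept a connection, exchange data, and close it.
SERVER_PATTERN = re.compile(r"SBL(A(s|r)*C)*")

def server_trace_is_legal(trace):
    """Return True if a sequence of operation names obeys the ordering."""
    encoded = "".join(OPS[op] for op in trace)
    return SERVER_PATTERN.fullmatch(encoded) is not None
```

A trace such as socket, bind, listen, accept, receive, send, close matches; a program that calls bind before socket, or accepts without ever closing, does not.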
A number of efforts have been made to formalize these high level usage patterns of modules and their interconnections, and a body of work of particular interest to us is that of Allen, Garlen, and Ivers [Allen98]. Specifically, we borrow a number of concepts from their work extending CSP (Communicating Sequential Processes) to deal with architectural patterns. Their resulting formal language, Wright, allows for the specification and analysis of module interactions by formalizing and verifying the interaction patterns.
We utilize their ideas in a slightly different way. Our premise is that we can use CSP processes to define both the formal and actual patterns of use of a module interface. We can then use formal techniques to verify that the "actual parameter" using pattern is consistent with the "formal parameter" providing pattern of the high level module interface. One can view the formal and actual usage patterns as two communicating sequential processes that must properly synchronize in order to make progress. We can also view the two usage patterns as formal expressions whose consistency might be checked with a formal tool such as FDR (Failures, Divergence, Refinement) [FDR97].
From figure 5 we can see that Allen, Garlen, and Ivers take an architectural point of view, defining client and server roles and then "connecting" them together with a glue event process, all written in CSP. Our view is to provide formal patterns of use in the high level interface of the TCP/IP module. In both cases the actual client and server are verified for correctness, but in our view correctness has already been established through the verification of the TCP/IP specification. Note that the TCP/IP module interface (figure 5) provides two patterns of use, one for a simple client and one for a simple server. These patterns are assumed to be correct when used by the actual client and server modules.
A simple server formal usage pattern might look like:
The above CSP expressions can be thought of either as a single sequential process or as an event pattern, the way we would like to view them. The CSP process x -> P says "execute x and then behave like process P". So the expression Server_Pattern begins by executing CreateSocket and then behaving like the remainder of the expression. The remaining expression in turn executes Bind and then behaves like Listen. When executed in tandem with another CSP process, like operations must be executed together in order to collectively make progress. We will see how this plays out when the formal CSP event pattern is "executed" together with the actual pattern derived from the actual server program described earlier.
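The prefix operator can be modeled in a few lines. In this sketch a process is either STOP or a pair (event, continuation); the chain of prefixes for Server_Pattern, including the Accept continuation, is our assumption about how the full pattern would read.

```python
# A tiny model of the CSP prefix operator "x -> P": a process is
# either STOP or a pair (event, continuation).
STOP = None

def prefix(event, process):
    """Build the process "event -> process"."""
    return (event, process)

# Server_Pattern as a chain of prefixes (the Accept step and the
# eventual STOP are assumed for illustration).
server_pattern = prefix("CreateSocket",
                 prefix("Bind",
                 prefix("Listen",
                 prefix("Accept", STOP))))

def trace(process):
    """Unroll a prefix chain into its event trace."""
    events = []
    while process is not STOP:
        event, process = process
        events.append(event)
    return events
```

Unrolling server_pattern yields the event trace CreateSocket, Bind, Listen, Accept, which is exactly the "pattern of use" reading of the process.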
Here is what a simple client formal usage pattern might look like:
From the legal program we can derive an actual usage pattern in CSP:
Now, to verify that this actual pattern "matches" the formal client usage pattern, we symbolically execute the formal and actual CSP event patterns. If they are consistent then they will "synchronize" and both always "make progress".
Here is the synchronized "execution":
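The check itself can be sketched as lockstep execution of two event lists, a deliberate simplification of CSP synchronization (no choice or interleaving); the event names below are illustrative.

```python
# Run the formal and the actual event patterns in lockstep.  They make
# progress only while they agree on the next event; a mismatch (the
# processes deadlock) or leftover events is a failure to synchronize.
def synchronize(formal, actual):
    for f, a in zip(formal, actual):
        if f != a:
            return False  # deadlock: the two processes disagree here
    return len(formal) == len(actual)

formal_server = ["CreateSocket", "Bind", "Listen", "Accept", "Close"]
actual_ok     = ["CreateSocket", "Bind", "Listen", "Accept", "Close"]
actual_bad    = ["CreateSocket", "Bind", "Accept", "Close"]  # skips Listen
```

Under this simplification, actual_ok synchronizes with the formal pattern and actual_bad does not; a real check over CSP expressions with choice would use a tool such as FDR rather than simple list comparison.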
As with our earlier example, the formal and actual server and client patterns can actually be proven as a theorem of the underlying "low level" TCP/IP interface. This is a subject for further study: the formulation and proof of higher level event patterns from interface specifications. But what about those event patterns that are not derived explicitly from interface restrictions, but that influence how the interface is designed or its implementation chosen? For example, when using graph algorithms we typically consider their relationship with the interface and implementation of their underlying graph abstract data type. In this case we can think of the algorithm and its underlying graph services existing symbiotically in a system. One is not thought of as "coming before" the other. We can actually say that the collection of application event patterns influences one's understanding of the underlying services they use. In this case, patterns of use of the using program, while they are not strictly part of the high level interface of the providing module, are part of the pragmatics of that module. In order to truly understand the meaning of the providing module, one needs to understand the collection of applications that cause it to be the way it is.
The origins of this work come from a paper by Garlen, Allen, and Ockerbloom, "Architectural Mismatch: Why reuse is so hard" [Garlen95]. While that paper enumerates a number of fascinating issues concerning the mismatch between COTS components in a complex system development project, it ultimately raises more questions than answers. A very nice followup paper by Allen, Garlen, and Ivers, "Formal Modeling and Analysis of the HLA Component Integration Standard" [Allen98] gave us the idea of using CSP [Hoare85] to model the relationship between formal interface descriptions and actual interface usage. We borrowed their idea of using event patterns to model component connectors from a "top-down", architectural perspective in order to model interface obligations from a "bottom-up", component point of view. In both cases event patterns are used to verify the correct interconnection between two components, but in our case we're more interested in verifying that a using program uses the services of a providing module correctly.
In order to analyze the "correctness" of their module interconnections, Allen, Garlen, and Ivers use a CSP model checker called FDR [FDR97]. Although we haven't explored formal analysis to any great depth, this looks like a promising direction to explore.
We propose that each component consist of both a low level, traditional interface and a high level interface consisting of those pragmatics that can be formalized and analyzed. Patterns of use are one of the formal objects of this high level interface. Furthermore, some of these patterns are properties of the traditional interface, whereas other patterns might be user level patterns that have influenced that traditional interface to provide the services it provides.
We looked at a number of formalisms for capturing interface patterns of use, specifically mentioning here regular expressions and CSP. CSP seems to be a good candidate for capturing patterns of use because (1) it is a formal notation, (2) it can be symbolically executed to show that a using and providing module are consistent, and (3) theorem provers can be used to verify correctness.
We would like to compare our notion of interface pragmatics to the techniques other researchers have used to try to formalize "higher level" architectural information, especially from the point of view of a component. It seems to us that the exploration of this "stuff in the ether floating above a component" is what will help us to get a better handle on system integration issues.