Larry Latour, Ling Huang, and Tom Wheeler
Software Engineering Laboratory, Dept. of Computer Science
University of Maine
222 Neville Hall
Orono, Maine 04469 USA
Phone: +1 207 581 2260
Fax: +1 207 581 4977
We say that interface "pragmatics" are the unstated assumptions about the way a module interface and its implementation are used, assumptions that must nevertheless be understood in order to use that module properly. Such pragmatics can range from simple restrictions placed on the module to complex architectural assumptions about how the module can and must be used. Program integration is the process of combining modules in such a way that inconsistencies between module assumptions are avoided, and pragmatic information can go a long way toward supporting this process. We look specifically at formalizing pragmatic information through the capture and analysis of patterns of use, first by using regular expressions to capture interface patterns, and then by using CSP (Communicating Sequential Processes) [Hoare85] to capture and analyze interaction patterns between using and providing modules.
Keywords: Program composition, program integration, interface pragmatics, regular expressions, CSP
Paper Category: technical paper
Experience teaches us that the cornerstone of good software engineering practice is the concept of separation of concerns. This, along with its companion support mechanisms of information hiding and modularization, drives the program development and understanding process, allowing us to manage the complexity of a large system. Each concern is encapsulated and modeled with an interface, and these interfaces are then combined into a working system using one of a variety of existing composition techniques.
Unfortunately, problems arise when systems are recomposed in this manner. We argue that many of these problems are due to the incomplete information typically provided by interfaces.
According to David Parnas [Parnas77], an interface is the set of "assumptions that a using module makes of a providing module" and that "a providing module makes of a using module". But we need to analyze and elaborate such assumptions. They are not captured by the types and accessing operations alone, as is usually taken for granted. The code written to deal with unstated assumptions about ordering, multiple module interactions, obligations, etc. shows that something else is needed besides the operation definitions and the function of the module(s) being written. The goal of this research is to explore what this "something else" is.
If a module (or an object of an ADT) is thought of as defining a language then the thought pattern of "syntax, semantics, pragmatics" can be applied to module interfaces:
For our purposes, pragmatics are the unstated assumptions about the way an interface is used, assumptions that must still be understood in order to use that interface properly. In natural language understanding they are best shown by an example: "I'm beside myself". Pragmatics make natural language more natural and smooth, but they actually seem to be necessary in formal languages.
The following is a definition of Semantics and Pragmatics, taken from the Natural Language section of an online AI course [Cawsey94]:
"… semantics and pragmatics, are concerned with getting at the meaning of a sentence. In the first stage (semantics) a partial representation of the meaning is obtained based on the possible syntactic structure(s) of the sentence, and on the meanings of the words in that sentence. In the second stage, the meaning is elaborated based on contextual and world knowledge."

Interface pragmatics can range from simple restrictions placed on abstract data type operations to complex architectural assumptions about how an interface can and must be used.
If we make an effort to enumerate and formalize unstated assumptions about the use of a module, we might be able to use the results to:
We've developed a number of examples using both regular expressions and CSP (Communicating Sequential Processes) [Hoare85]. It seems that this technique shows promise for formalizing and analyzing patterns of use, and as a result provides more information to the user of a component about how to properly integrate that component into a system.
Let's begin with a simple file interface, with operations open, read, write, close, and endfile. As straightforward as this abstraction is, there are still ordering restrictions that must be obeyed in order for a using program to work properly. For example, typically open must be used before reading or writing, the file "must" be closed (either explicitly or implicitly), and endfile must be checked before reading. We can represent these ordering restrictions with a regular expression, and then show three simple using programs, one that matches the expression, and two that don't.
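This ordering restriction can be made concrete with a short sketch. The single-letter encoding and the exact expression below are our illustrative choices, not part of the original interface: open first, then any mix of writes and endfile-checked reads, then close.

```python
import re

# One plausible encoding of the file-interface ordering restriction.
# Each operation is abbreviated to one letter so a trace becomes a string.
OPS = {"open": "o", "endfile": "e", "read": "r", "write": "w", "close": "c"}

# open, then any mix of (endfile-check then read) or write, then close.
PATTERN = re.compile(r"o(er|w)*c")

def trace_is_legal(trace):
    """Return True if a sequence of operation names matches the pattern."""
    encoded = "".join(OPS[op] for op in trace)
    return PATTERN.fullmatch(encoded) is not None

# A using program that matches the expression...
good = ["open", "endfile", "read", "write", "close"]
# ...and two that do not: one never opens the file, and one
# reads without first checking endfile.
bad_no_open = ["read", "close"]
bad_unchecked_read = ["open", "read", "close"]
```

Checking a trace against the expression is then a mechanical matter, which is what makes this "high level" information usable by tools rather than only by careful readers.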
What we typically consider as the interface here is really a "low level" interface - the specification of each independent operation in the interface. What we're missing are the generic, "higher level" patterns of use of the combined operations, or what we'll call "high level" interfaces. These generic patterns can either be derived from restrictions based on the interface specification of the operations or on implied restrictions based upon an implementation of the operations. For example, in figure 1 above, the regular expression describing the pattern of use of the file operations can be derived as a theorem from the specifications of the file interface operations, whereas in other cases patterns of use might be semantically correct but more or less efficient dependent on the implementation choices of the operations.
Consider the following simple example of an iterator class:
Class iterator ...

In the case of this iterator, the forced pattern of use, or the high level interface of the iterator, is:

    getFirst(item)
    process the item
    getNext(item), process it, and so on until the items are exhausted

It is these patterns of use that caused us to specifically partition the low level interface into getFirst and getNext to begin with.
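A minimal sketch of such an iterator and its forced pattern of use follows; the class body and the `done` operation are our assumptions, since the paper names only getFirst and getNext.

```python
class Iterator:
    """Sketch of an iterator whose interface forces a usage pattern.

    The backing list and the done() operation are illustrative; the
    point is the getFirst/getNext partition of the low level interface.
    """
    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def get_first(self):
        self._pos = 0
        return self._items[0] if self._items else None

    def get_next(self):
        self._pos += 1
        if self._pos < len(self._items):
            return self._items[self._pos]
        return None

    def done(self):
        return self._pos >= len(self._items)

# The forced pattern of use: getFirst, then process/getNext until done.
def process_all(it, process):
    item = it.get_first()
    while not it.done():
        process(item)
        item = it.get_next()
```

Any using program that calls get_next before get_first, or processes past done, violates the high level interface even though every individual call is type-correct.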
While we typically say low level interfaces are composed into a higher level module, we might better say that high level interfaces are integrated into that higher level module. The term "integration" better suggests that a good deal of important activity is going on during the integration process. It is the instantiation and integration of high level interfaces that is the value added in the higher level module.
1. There is no explicit intermediate step of instantiating a generic "formal" high level interface with an "actual" high level usage pattern. The programmer implicitly moves directly to an integrated pattern that instantiates two or more high level interfaces together. Our solution: there needs to be both a formal language for describing a high level interface and an explicit instantiation of this interface to an actual usage pattern before the step of integrating high level patterns together.

2. Many times usage patterns from high level interfaces "tangle" together in a way that leads to undue code complexity and subsequent confusion. Gregor Kiczales's Aspect Oriented Programming [Kiczales97] also addresses this type of problem. Our solution: the tangling rules between usage patterns derived from high level interfaces need to be formalized and analyzed using an integration language.

3. Often high level interfaces are incompatible and need to be redesigned. Dave Garlen's Architectural Mismatch paper [Garlen95] discusses the ramifications of this. Our solution: this incompatibility needs to be expressed in terms of mutual exclusion restrictions between elements of the offending usage patterns, derived from the offending underlying high level interfaces.

One of the mindsets that contributes to these problems is that we tend to view system concerns on a single level. For example, if we separate concerns in a system using OO methods, we might come up with the decomposition in figure 3:
The arrows represent the "uses" relation, i.e., which objects call which operations of which other objects.
Unfortunately, a concern is not completely abstracted away in an object. It is replaced by not only its low level interface but, more critically, its high level interface. A using module has to deal with the restrictions placed on it by the high level interfaces of each of the objects it uses. The concerns of integrating these high level interfaces are meta-concerns, and tend to be more global/integral in nature. For example, consider the concerns of modules B, C, and D integrated in A in figure 4.
A more complex example is the module interface of a FIFO TCP/IP network "Connection". Typically it provides only specifications of individual operations, such as:
Socket: define the type of communication protocol
Bind: assign a name to a socket
Listen: notice of willingness to receive a connection
Accept: wait for an actual connection
Close: close a connection
Send: send data across a connection
Receive: receive data across a connection
The patterns of use / restrictions on the TCP/IP operations can be described informally as:
All "legal" programs must match this regular expression pattern, including the following simple client program:
int sockfd, newsockfd;
...
if (bind(sockfd, 5) < 0)
...
doit(newsockfd); /* process request with sends/receives */

Note that the actual usage pattern of the TCP/IP operations in this case does match the regular expression. Note also that we used an informal argument to derive the regular expression pattern of use from the informal specifications of each of the TCP/IP operations. This is one of those cases where the pattern of use is a theorem that can be proven from the specifications of the individual TCP/IP operations, and we can thus benefit from a more formal specification of these operations.
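The same mechanical check used for the file interface applies here. The following sketch encodes one plausible server-side ordering we infer from the operation descriptions above (socket, bind, listen once, then any number of accepted connections, each serviced by sends/receives and closed); the letters and the exact expression are our illustration, not the paper's formal pattern.

```python
import re

# One-letter encoding of the TCP/IP operations from the interface above.
OPS = {"socket": "S", "bind": "B", "listen": "L",
       "accept": "A", "send": "s", "receive": "r", "close": "C"}

# Assumed server-side ordering: setup once, then repeatedly
# accept a connection, exchange data, and close it.
SERVER_PATTERN = re.compile(r"SBL(A(s|r)*C)*")

def server_trace_is_legal(trace):
    """Return True if a sequence of operation names obeys the ordering."""
    encoded = "".join(OPS[op] for op in trace)
    return SERVER_PATTERN.fullmatch(encoded) is not None
```

A trace such as socket, bind, listen, accept, receive, send, close matches; a program that calls bind before socket, or accepts without ever closing, does not.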
A number of efforts have been made to formalize these high level usage patterns of modules and their interconnections, and a body of work of particular interest to us is that of Allen, Garlen, and Ivers [Allen98]. Specifically, we borrow a number of concepts from their work extending CSP (Communicating Sequential Processes) to deal with architectural patterns. Their resulting formal language, Wright, allows for the specification and analysis of module interactions by formalizing and verifying the interaction patterns.
We utilize their ideas in a slightly different way. Our premise is that we can use CSP processes to define both the formal and actual patterns of use of a module interface. We can then use formal techniques to verify that the "actual parameter" using pattern is consistent with the "formal parameter" providing pattern of the high level module interface. One can view the formal and actual usage patterns as two communicating sequential processes that must properly synchronize in order to make progress. We can also view the two usage patterns as formal expressions whose consistency might be checked with a formal tool such as FDR (Failures, Divergence, Refinement) [FDR97].
From figure 5 we can see that Allen, Garlen, and Ivers take an architectural point of view, defining client and server roles and then "connecting" them together with a glue event process, all written in CSP. Our view is to provide formal patterns of use in the high level interface of the TCP/IP module. In both cases the actual client and server are verified for correctness, but in our view correctness has already been established through the verification of the TCP/IP specification. Note that the TCP/IP module interface (figure 5) provides two patterns of use, one for a simple client and one for a simple server. These patterns are assumed to be correct when used by the actual client and server modules.
A simple server formal usage pattern might look like:
The above CSP expressions can be thought of either as a single sequential process or as an event pattern, the way we would like to view them. The CSP process x -> P says "execute x and then behave like process P". So the expression Server_Pattern begins by executing CreateSocket and then behaving like the remainder of the expression. The remaining expression in turn executes Bind and then behaves like Listen. When executed in tandem with another CSP process, like operations must be executed together in order to collectively make progress. We will see how this plays out when the formal CSP event pattern is "executed" together with the actual pattern derived from the actual server program described earlier.
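The prefix operator can be modeled in a few lines. In this sketch a process is either STOP or a pair (event, continuation); the chain of prefixes for Server_Pattern, including the Accept continuation, is our assumption about how the full pattern would read.

```python
# A tiny model of the CSP prefix operator "x -> P": a process is
# either STOP or a pair (event, continuation).
STOP = None

def prefix(event, process):
    """Build the process "event -> process"."""
    return (event, process)

# Server_Pattern as a chain of prefixes (the Accept step and the
# eventual STOP are assumed for illustration).
server_pattern = prefix("CreateSocket",
                 prefix("Bind",
                 prefix("Listen",
                 prefix("Accept", STOP))))

def trace(process):
    """Unroll a prefix chain into its event trace."""
    events = []
    while process is not STOP:
        event, process = process
        events.append(event)
    return events
```

Unrolling server_pattern yields the event trace CreateSocket, Bind, Listen, Accept, which is exactly the "pattern of use" reading of the process.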
Here is what a simple client formal usage pattern might look like:
From the legal program we can derive an actual usage pattern in CSP:
Now, to verify that this actual pattern "matches" the formal client usage pattern, we symbolically execute the formal and actual CSP event patterns. If they are consistent then they will "synchronize" and both always "make progress".
Here is the synchronized "execution":
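The check itself can be sketched as lockstep execution of two event lists, a deliberate simplification of CSP synchronization (no choice or interleaving); the event names below are illustrative.

```python
# Run the formal and the actual event patterns in lockstep.  They make
# progress only while they agree on the next event; a mismatch (the
# processes deadlock) or leftover events is a failure to synchronize.
def synchronize(formal, actual):
    for f, a in zip(formal, actual):
        if f != a:
            return False  # deadlock: the two processes disagree here
    return len(formal) == len(actual)

formal_server = ["CreateSocket", "Bind", "Listen", "Accept", "Close"]
actual_ok     = ["CreateSocket", "Bind", "Listen", "Accept", "Close"]
actual_bad    = ["CreateSocket", "Bind", "Accept", "Close"]  # skips Listen
```

Under this simplification, actual_ok synchronizes with the formal pattern and actual_bad does not; a real check over CSP expressions with choice would use a tool such as FDR rather than simple list comparison.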
As with our earlier example, the formal and actual server and client patterns can actually be proven as a theorem of the underlying "low level" TCP/IP interface. This is a subject for further study: the formulation and proof of higher level event patterns from interface specifications. But what about those event patterns that are not derived explicitly from interface restrictions, but that influence how the interface is designed or its implementation chosen? For example, when using graph algorithms we typically consider their relationship with the interface and implementation of their underlying graph abstract data type. In this case we can think of the algorithm and its underlying graph services existing symbiotically in a system. One is not thought of as "coming before" the other. We can actually say that the collection of application event patterns influences one's understanding of the underlying services they use. In this case, patterns of use of the using program, while they are not strictly part of the high level interface of the providing module, are part of the pragmatics of that module. In order to truly understand the meaning of the providing module, one needs to understand the collection of applications that cause it to be the way it is.
The origins of this work come from a paper by Garlen, Allen, and Ockerbloom, "Architectural Mismatch: Why reuse is so hard" [Garlen95]. While that paper enumerates a number of fascinating issues concerning the mismatch between COTS components in a complex system development project, it ultimately raises more questions than answers. A very nice followup paper by Allen, Garlen, and Ivers, "Formal Modeling and Analysis of the HLA Component Integration Standard" [Allen98] gave us the idea of using CSP [Hoare85] to model the relationship between formal interface descriptions and actual interface usage. We borrowed their idea of using event patterns to model component connectors from a "top-down", architectural perspective in order to model interface obligations from a "bottom-up", component point of view. In both cases event patterns are used to verify the correct interconnection between two components, but in our case we're more interested in verifying that a using program uses the services of a providing module correctly.
In order to analyze the "correctness" of their module interconnections, Allen, Garlen, and Ivers use a CSP model checker called FDR [FDR97]. Although we haven't explored formal analysis to any great depth, this looks like a promising direction to explore.
We propose that each component consist of both a low level, traditional interface and a high level interface consisting of those pragmatics that can be formalized and analyzed. Patterns of use are one of the formal objects of this high level interface. Furthermore, some of these patterns are properties of the traditional interface, whereas other patterns might be user level patterns that have influenced that traditional interface to provide the services it provides.
We looked at a number of formalisms for capturing interface patterns of use, specifically mentioning here regular expressions and CSP. CSP seems to be a good candidate for capturing patterns of use because (1) it is a formal notation, (2) it can be symbolically executed to show that a using and providing module are consistent, and (3) theorem provers can be used to verify correctness.
We would like to compare our notion of interface pragmatics to the techniques other researchers have used to try to formalize "higher level" architectural information, especially from the point of view of a component. It seems to us that the exploration of this "stuff in the ether floating above a component" is what will help us to get a better handle on system integration issues.