Design Issues Toward an Object Oriented RESOLVE

Roy Patrick Tan
Dept. of Computer Science
Virginia Tech
660 McBryde Hall, MS 0106
Blacksburg, VA 24061 USA

rtan@vt.edu

Abstract

Object-oriented programming is the most popular software development paradigm today. As such, the development of object- oriented support in RESOLVE may benefit the general programmer. As a first step, this paper surveys the various issues that must be considered in the creation of an object-oriented version of RESOLVE.

Keywords

Object-oriented design, RESOLVE, language issues

Paper Category: technical paper
Emphasis: research

1 Introduction

Object oriented programming is probably the most popular software development paradigm today. Although component-based, RESOLVE is not an object-oriented language. An object- oriented version of RESOLVE may benefit the programmer by giving a better mapping between the RESOLVE discipline and popular modern object-oriented languages, and making RESOLVE more accessible to those who are familiar with object-oriented software development.

To redesign RESOLVE as an object oriented language, careful consideration must be given to all issues regarding this transition. We need to explore issues surrounding questions such as:

What are classes, and do they relate to types and modules in RESOLVE?
Should an OO RESOLVE use reference or value semantics?
What is inheritance, and how is it used?
How should RESOLVE implement specification inheritance properly?
How should an OO RESOLVE support implementation inheritance?
What about multiple inheritance, should we support it fully, partially, or not at all?
How will object-orientation affect type checking?
How will exceptions be handled?

Fortunately, many people have traveled similar roads before, and we can learn their experiences. This paper presents a survey of pertinent issues that might arise in evolving RESOLVE to become an object-oriented language.

2. Classes, Types and Objects

2.1 What are Classes Anyway?

Classes are the primary level of data abstraction in mainstream object-oriented languages. In many object-oriented languages, classes are used to define two things at the same time: an encapsulation boundary (a module), and a type [Meyer98]. Typically, classes contain both the definition of a data structure, and functions or procedures (methods) related to the data structure. Thus a class defines a boundary for encapsulation -- similar to a RESOLVE module. Because every class typically defines only one data structure, the term class in most OO languages is also equivalent to the term type.

The addition of inheritance adds a new dimension: A class or type can be thought of either as one specific type, or as that specific type plus all other subtypes derived from it. Ada distinguishes between these two uses: a specific type T is what is properly called a "type" in Ada. In contrast, a "class" means "the union of all types in the tree of derived types rooted at T" [Wheeler97].

While RESOLVE currently does not fully support inheritance in the object-oriented sense, RESOLVE has in common with Eiffel and Ada a built in support for genericity. Parameterized types are often also called generic types (or generic classes), to distinguish them from regular types or classes.

2.2 Should RESOLVE Modules be Classes?

Because the class is the basic level of encapsulation in normal object-oriented languages, there is no distinction in most of such languages between type and module. That is, a class as viewed in a language like Java is like a RESOLVE module that exports only one type, and both module and type are called the same thing.

To implement OO-RESOLVE, we might want to follow the mainstream OO languages and restrict modules to export one type. In fact, in [Sitaraman94], it is already mentioned that every specification normally exports only one type, as it is usually possible to decouple two types into two separate components. However, it is also hinted that there are situations where having more than one type per component is preferable.

It is also possible to retain the module/type distinction, and allow multiple types (classes) to be defined within a module. Ada 95, for example, allows the definition of more than one type within a package.

2.3 Objects and Variables: Value or Reference?

Value versus reference semantics refer to what happens when one variable is assigned to another or is passed as a parameter. Scalar types in C and Java use value semantics assignments and parameter passing are always done by copying the contents of one variable to another. Copying is necessarily expensive when passing large amounts of data, so languages with value semantics like C may require the programmer to explicitly pass pointers instead.

Many object-oriented languages have reference semantics. Variables in a language with reference semantics hold only a reference to an object, rather than the object itself. In a sense, variables always contain pointers, and only the pointers get copied through assignment or parameter passing. With reference semantics, in a variableÕs lifetime, a variable can refer to more than one object. Another consequence of reference semantics is that two variables may refer to the same object at the same time.

One unique feature of RESOLVE is that its semantics are neither value semantics nor reference semantics. The primary method of data-transfer in RESOLVE is swapping. Because swapping allows programmers to think in terms of value semantics regardless of the actual implementation, the use of swapping in RESOLVE solves a number of problems that assignment (both by reference and by value) has [Harms91].

A related issue is object identity. Some language experts believe that for a language to be fully object oriented, there must be a mechanism for determining between two variables whether they pertain to two different objects or not. In the case of RESOLVE, it shares an implicit mechanism with languages with value semantics: at any point in time, two different variables will always refer to two different objects. This is potentially a problem, since it is merging the concepts of name and identity [Khoshafian86].

3 Inheritance

3.1 Many Kinds of Inheritance

Inheritance is at the heart of object-oriented programming languages. Perhaps one reason that object-oriented programming is popular is that it mimics strongly how people think about everyday objects. That is, one can describe an object by how different it is from another, a kind of programming by difference. Inheritance however, is used in many different, often conflicting ways [Edwards93]. One problem with inheritance is that it is a single mechanism that encodes different kinds of component relationships [Edwards97]. In fact, Meyer defines a taxonomy of 12 different uses of inheritance, but fails to see how splitting the single inheritance mechanism into several mechanisms would help [Meyer98].

Figure 1. Some different uses of inheritance

In typical OO systems, the inheritance mechanism usually makes a class inherit two potentially separable things. The first is interface inheritance where a subclass declares that it inherits all the method signatures of the superclass. The other type of inheritance is implementation inheritance. This is where a subclass not only inherits the interface of its superclass, it also inherits the code that implements the interface. Smalltalk derived languages such as C++ and Java share the same feature that implementation and interface inheritance are coupled together. However, RESOLVE need not follow their example, since there is a strong distinction in RESOLVE between interface (concepts) and implementation (realizations). It is thus possible to think of interface inheritance as inheritance between concepts (and from concepts to realizations), and implementation inheritance as inheritance between realizations.

3.2 Relationships Between Components

One important issue in inheritance is how derived types are related to their parent types. In strongly typed languages, it is commonly held that the "is a" relationship should hold: the derived type "is a" parent type. Under one interpretation, this relationship means that a variable declared as type T should be able to hold any object of the type T, or any other types derived from T. This is called the "Liskov substitution principle" [Liskov88].

The Liskov substitution principle is especially important in a specification language like RESOLVE. This is because if the "is a" relationship is to be enforced, it means that a componentÕs specifications must also hold for any component that derives from it. That is, to enforce the "is a" relationship the derived component must be a behavioral subtype of the parent [Liskov94]. However, inheritance is typically used for more than just behavioral subtyping.

In [Edwards97], three general component relationships for which inheritance is (sometimes) used are outlined: implements, uses, and extends. RESOLVE has some support for these component relationships Ð realizations implement concepts and use other components, and enhancements extend other concepts. However, these three relations are supported only in a limited, non object- oriented fashion, i.e. in extension, new operations can be specified, but no additional specification can be made for existing operations. Similarly, with the uses relationship, there is no possibility of reusing publicly inaccessible (i.e. private) data structures.

3.3 Specification Inheritance

One way to enforce behavioral subtyping is through specification inheritance. That is, if a concept B is a behavioral subtype of another concept A, B does not need to respecify its pre- and post-conditions and invariants. Instead only new constraints are added to B, then a few rules are applied to get the total constraint. Dhara and Leavens describe a method of specification inheritance [Dhara97].

In general, preconditions can be weakened, but postconditions can only be made stronger. So if component B is a behavioral subtype of A, then the precondition of a method in B is composed of a disjunction of the precondition defined at A, and the precondition defined at B. However, the postcondition of a method in B is a conjunction of two implications. The first is that the precondition defined in A implies the postcondition defined in A. The second is that the precondition defined in B implies the postcondition defined in B. Invariants are conjoined [Dhara97].

[Dhara97] also defines a form of weak behavioral subtyping, in which, with proper restrictions on the language, history constraints of the supertype need not be enforced. This allows, for example, mutable arrays to be a behavioral subtype of immutable arrays.

3.4 Contravariance and Covariance

A subset of the above problem of enforcing behavioral subtyping is the question of having covariant or contravariant parameter passing modes. The contravariance vs. covariance debate has been heatedly discussed between the Sather and Eiffel languages. The Eiffel FAQ gives a pretty clear explanation of the issues involved:

Consider the following situation: we have two classes PARENT and CHILD. CHILD inherits from PARENT, and redefines PARENT's feature 'foo'.
class PARENT
feature
    foo (arg: A) is ...
end

class CHILD
inherit PARENT redefine foo end
feature
    foo (arg: B) is ...
end
The question is: what restrictions are placed on the type of argument to 'foo', that is 'A' and 'B'? (If they are the same, there is no problem.)

Here are two possibilities:

B must be a child of A (the covariant rule - so named because in the child class the types of arguments in redefined routines are children of types in the parent's routine, so the inheritance "varies" for both in the same direction)

B must be a parent of A (the contravariant rule)

Eiffel uses the covariant rule. However, to be fully compliant with the Liskov principle, recall that for any variable of type T, the variable can hold any of TÕs derived classes. So if in the above example, we have a variable of type PARENT that holds an object of type CHILD, we must be able to call foo with an argument of type A. This is not possible if B is a derived type of A.

Sather uses the contravariant rule, that is all parameter types (aside from self) are contravariant, and the return value is covariant [Szyperski93]. Since the procedure interface is really a part of the behavioral specification of a component, this scheme is better for RESOLVE. More generally, in-mode parameter types should be contravariant, out-mode parameter types should be covariant (the return value being a special case of an out-mode parameter), and in/out mode parameter types must be invariant.

3.5 Implementation Inheritance

A second use of inheritance is inheritance between realizations. In many cases, implementation and behavioral inheritance are tightly coupled. However, some have argued that behavioral subtyping should be separate from implementation inheritance. For example, a doubly-ended queue can be a behavioral subtype of stack, but it is more convenient for stack to inherit its implementation from doubly-ended queue [Snyder86].

RESOLVE's strong separation of concepts from realizations may provide an opportunity to decouple implementation inheritance from behavioral subtyping. This leads to many more questions such as whether it is necessary to put restrictions on which realizations other realizations can inherit from.

Currently, RESOLVE enhancements provide some form of implementation inheritance by allowing the enhancement to inherit all the public interface of the original module, as well as the implementation for it. However, all new functionality must be in terms of the publicly callable functions of the original component. This type of inheritance is called public inheritance, or black-box inheritance.

There may be cases, however, when private or white-box inheritance is desirable. Perhaps some the underlying data structure of a realization might be useful for another component reusing it. Java and C++ implement this with the protected modifier, allowing subclasses to access otherwise restricted data members or methods. Edwards presents a case for this type of inheritance with the example of extending a list type with a swap_rights operator. In this case, the performance of the swap_rights operator is significantly improved if white-box reuse is allowed. The paper also describes a method of representation inheritance whereby white-box inheritance can be allowed while retaining formally specified properties [Edwards96].

3.6 Multiple Inheritance

Another important issue is multiple inheritance. Sometimes, a mechanism is needed whereby a class is derived from two classes instead of one. There are several approaches taken by different languages. Some languages like Smalltalk disallow multiple inheritance altogether, sacrificing flexibility for simplicity. Other languages like C++ and Eiffel allow multiple inheritance, which brings problems of its own, such as naming conflicts and the diamond problem discussed below.

Other approaches allow a limited form of multiple inheritance. Java, for example, allows classes to inherit implementation from only one class. However, using Java interfaces, a class may derive from and implement multiple interfaces. Another alternative to multiple inheritance are mix-ins, in which common methods across multiple classes can be defined.

3.6.1 The Diamond Problem

The classic example for the diamond problem involves the types Person, Student, Teacher, and Teaching_assistant. The Student and Teacher classes both derive from Person. The Teaching_assistant class, however, derives from both student and teacher. Several problems arise from this arrangement.

For common fields inherited from Person, should there be only one copy of field, or two? If a method is redefined in either the Student or the Teacher classes, which method does Teaching_assistant use? The solution lies in renaming, for example, common fields are shared by default but if replication is necessary, then the fields are renamed, so no naming clashes occur. [Meyer98] offers a thorough discussion of the problem of repeated inheritance, and offers solutions such as undefining one or more inherited methods.

3.6.2 Mix-ins

Frequently, a set of operations are applicable to a wide set of classes. Take for example any class that implements a "compare" method, such that when A.compare(B) is called, the method returns a value that distinguishes whether A is less than B, A equals B, or A is greater than B. Clearly an additional set of operations (greater_than, less_than, greater_than_equal_to, etc.) can be defined that will work in essentially the same way regardless of the class that implements the compare method. Mix-ins allow these methods to be defined in a separate module, and inherited when appropriate.

A mixin can be thought of as an abstract subclass that can be applied to may superclasses [Bracha90], or as specifying an extension, but deferring what type the extension is for. In C++, a mixin class can be implemented by having the superclass as the parameter of a template. ie: [Smaragdakis98]

//C++ mixin example
template <class SuperClass>

class Mixin : SuperClass {
    ... /* mixin body */
}

The question of how this approach might be useful for RESOLVE may be an interesting area to explore.

4 Other Issues

4.1 Runtime Type Issues and Casting

It will have to be decided whether it is necessary for RESOLVE to check for type errors at runtime or not. Variables are dynamically typed in languages such as Smalltalk and thus type errors must always be checked at runtime. Other languages such as Eiffel and Sather try to catch as much type errors at compile time. However, even in these languages there still are ways to abuse the type system, and thus runtime type checking is still done.

One of the most common reasons for runtime type errors in languages such as Java is downward casting. In turn, downward casts are rampant in Java because of the lack of genericity. For example, container types in Java all store items of the root Object type. When extracting an item from a container therefore, a Java program must subsequently cast the item down from Object to whatever the original type of the item is.

RESOLVE has ample facilities for generics, and thus will not require downward casts in this sense. However, RESOLVEÕs swapping semantics requires a facility for downward casts. For example, if A is a variable of type Parent and B is a variable of type Child, then what happens when we swap them? The upward cast from B to A is not a problem, but the downward cast from A to B, must be handled such that type errors can be statically checked. Downcasting may also be needed when there is a legitimate need to store objects of sibling types in a container.

An alternative to swap as data movement may be assignment by consumption. In the above example, we can do A := B. Where after the assignment, A contains the old value of B, and B is reinitialized. This method retains some of the qualities of swapping, but removes the necessity of a downward cast. There might be implications for memory management though, since the old value of A is "lost".

One possible method for doing a safe downcast is to use an approach inspired by ALGOL type unions [McGettrick78]. Recall the Ada notion of a class C as a union of all types in the type hierarchy rooted at C. Thus, C can be treated as a type union, and so to "downcast" C into one of its subtypes, we can use a variation of the ALGOL case statement. For example, we have a type Bag, with subtypes List and Stack. We can write something similar to:

case bag_var in
    (List) x: //do something with x
    (Stack) y: //do something with y
end case

This can ensure that the program is statically type-checkable, but perhaps making the progammer work a little bit more.

4.2 Exceptions

Most mainstream object oriented languages have some exception mechanism for cases where a method fails because of an exceptional condition. Exceptions are any unusual event that may require special processing. Exceptions may be used to detect errors due to precondition violations (such as the divide operator throwing a division-by-zero exception when the divisor is 0). However, exceptions are also used for unusual events that are not necessarily precondition violations. One example is I/O failure. I/O failures may not be expected in the normal course of processing, but might need special processing if they do occur [Sebesta02]. If RESOLVE is to have exception handling, it must be able to specify these exceptional conditions.

One formal specification language that supports exceptions is the Java Modeling Language (JML). JML works with exceptions by having a specification for normal behavior, and another specification for exceptional behavior. A correct implementation of a component must be able to satisfy both behaviors. That is if the precondition for the normal behavior applies, the implementation must satisfy the specification for the normal behavior, and if the precondition for the exceptional behavior applies, then the implementation must satisfy the specification for the exceptional behavior. [Leavens99]

VDM++ similarly extends VDM. Interestingly, VDM++ separates representation inheritance and method inheritance -- one can actually choose which superclass methods are inherited by the subclass. VDM++ also introduces the interesting construct called traces, which specifies the order in which methods may be called [Durr92].

An object-oriented RESOLVE would be different from other OO specification languages in the same way RESOLVE is different from non-OO specification languages. It is both a specification language and an implementation language. As such, it is in a unique position to disallow unsafe object-oriented development practices at the level of the implementation language.

Eiffel is an object-oriented language with facilities for generic types. EiffelÕs design by contract is provides a mechanism for invariant and pre- and post-condition checking. There are, however limitations to its type system, for example, the covariant/contravariant issue above. Sather is closely related to Eiffel, but is contravariant, and has a distinction between "subtype" (behavior) and "subclass" (implementation) inheritance. Both languages suffer from having the assertion as part of the implementation language, and thus discouraging the use of assertions that might be computationally expensive.

6 Conclusion

This paper surveys the many design issues to be considered for designing an object-oriented version of RESOLVE. While several issues remain contentious, the experiences of many people can guide the design process. Of course not all of the solutions deal exactly within the framework and unique perspective of RESOLVE.

While meant to touch on the important topics, the issues presented by this paper were not meant to be comprehensive. There are sure to be other problems and issues that need to be confronted as the new language is fleshed out. Taking up the endeavor of designing and implementing an object-oriented language based on RESOLVE promises yield many interesting research topics.

Acknowledgments

This research is funded in part by NSF grant CCR-0113181.

Bibliography

[Bracha90]: Gilad Bracha and William Cook. Mixin-based inheritance. In Proc. of the Joint ACM Conf. on Object-Oriented Programming, Systems, Languages and Applications and the European Conference on Object-Oriented Programming, Oct 1990.
[Dhara97]: Krishna Kishore Dhara and Gary T. Leavens. Forcing behavioral subtyping through specification inheritance. Proceedings of the 18th International Conference on Software Engineering (ICSE'18), Berlin, Germany, Mar 1996.
[Durr92]: E. H. Durr and J. van Katwijk. VDM++ --- a formal specification language for object- oriented designs. In Computer Systems and Software Engineering, Proceedings of CompEuro'92, pages 214--219. IEEE Computer Society Press, 1992.
[Edwards93]: Stephen H. Edwards. Inheritance: One mechanism, many conflicting uses. In Proceedings of the Sixth Annual Workshop on Software Reuse, Nov 1993
[Edwards96]: Stephen H. Edwards. Representation inheritance: A safe form of "white box" code inheritance. In Proceedings of the Fourth International Conference on Software Reuse, IEEE Computer Society Press, April, 1996
[Edwards97]: Stephen H. Edwards, David S. Gibson, Bruce W. Weide, and Sergey Zhupanov. Software component relationships. In Proceedings of the Eighth Annual Workshop on Software Reuse, March, 1997.
[Harms91]: Douglas E. Harms and Bruce W. Weide, Copying and Swapping: Influences on the Design of Reusable Software Components. IEEE Transactions on Software Engineering, 17, 5, May 1991.
[Khoshafian86]: Setrag N. Khoshafian and George P. Copeland. Object Identity. In Object-oriented programming systems, languages and applications: (OOPSLA) conference proceedings, SiIGPLAN notices, New York, 1986.
[Leavens99]: Gary T. Leavens, Albert L. Baker, and Clyde Ruby. JML: A Notation for Detailed Design. In Haim Kilov, Bernhard Rumpe, and Ian Simmonds (editors), Behavioral Specifications of Businesses and Systems, chapter 12, Kluwer, 1999.
[Liskov88]: Barbara Liskov, Data Abstraction and Hierarchy. SIGPLAN Notices 23,5, May 1988.
[Liskov94]: Barbara Liskov and Jeanette M. Wing. A Behavioral Notion of Subtyping. ACM Trans. Programming Languages and Systems, 16(6):1811-1841, November 1994.
[McGettrick78]: Andrew D. McGettrick. Algol68 A first and second course. Cambridge University Press, 1978.
[Meyer98]: Bertrand Meyer, Object-Oriented Software Construction, Prentice Hall, 1998
[Sebesta02]: Robert W. Sebesta. Concepts of Programming Languages, Fifth Edition, Addison- Wesley, 2002.
[Sitaraman94]: Murali Sitaraman and Bruce W. Weide, editors. Special feature: Component- based software using RESOLVE. Software Engineering Notes, 19(4):21-67, Oct 1994.
[Smaragdakis98]: Yannis Smaragdakis and Don Batory. "Mixin-Based Programming in C++", University of Texas at Austin CS Tech. Report 98-27.
[Snyder86]: Alan Snyder, Encapsulation and inheritance in object-oriented programming. ACM SIGPLAN Notices 21, Nov 1986
[Wheeler97]: David A. Wheeler, Ada 95: The Lovelace Tutorial, Springer Verlag, 1997