Good News and Bad News
About Software Engineering Practice

Bruce W. Weide
Dept. of Computer and Information Science
The Ohio State University
Columbus, OH 43210-1277 USA

weide.1@osu.edu
Phone: +1 614 292 1517
Fax: +1 614 292 2911
URL: http://www.cis.ohio-state.edu/~weide

Abstract

Good news: some language/design features that support effective software engineering practices (including some of those advocated by the RESOLVE group) have been appearing incrementally in commercial software technologies. Bad news: none of the RESOLVE-specific innovations is yet among them. In fact, despite incrementally embracing some good ideas, commercial software technologies overall have regressed in the sense that they have become so complicated that the ill effects of their complexity easily outweigh the isolated benefits offered by these incremental improvements. Significant additional progress might have to wait for existing approaches to collapse under their own weight.

Keywords

Commercial software technologies, C++, CORBA, Java, .NET, RESOLVE

Paper Category: position paper
Emphasis: research

1. Introduction

The popular vision of the bright future for information technology relies heavily on Moore's Law of hardware improvement: that memory capacity, communication bandwidth, and raw computing power double roughly every eighteen months. Drowned in the wake of enthusiasm is "God's Law", expressed by William Buxton [Buxton02] as the fact that human capacity for understanding is essentially constant. Buxton was talking about the complexities faced by the human end-users of software systems, but the same principle applies to the human software engineers who design, build, and maintain them. If software engineers try to build systems that grow in complexity at anything like the pace of Moore's Law in an attempt to do something with all those available bytes and cycles (Buxton calls this tendency "Buxton's Law"), then those programs will quickly dwarf the abilities of software engineers to manage their scale.

The above observations imply that the technical barrier to "scaling up" in software engineering is that software design approaches must support compositional, or modular, reasoning techniques about software behavior. That is, a necessary condition for scalability is that the behavioral effects of software changes are localized and predictable [Weide95]. Put otherwise, component-based development is not necessarily scalable development. So, the overarching goal of the RESOLVE project has long been to provide a rigorous mathematical foundation for scalable software development, to put more science behind the engineering, in the hope of making it intellectually manageable for engineers to reason soundly about the behaviors of the software systems they build--even as those systems get bigger and bigger and offer more and more functionality and better and better performance.

The argument that ultimately it is necessary to be able to reason modularly about software system behavior has led Bill Ogden to characterize the key ideas underlying the RESOLVE work as "inevitable", i.e., virtually certain to be adopted by software engineering practitioners--eventually. An examination of current software engineering practice reveals that there is both good news and bad news on this front, which I discuss in support of the position stated in Section 2.

Buxton's analysis of Moore's Law vs. God's Law raises the question of whether we might already have exceeded the level of complexity that software engineers can be expected to deal with. The consequences of such a landmark event, especially in the case of embedded software and other mission-critical systems, are certain to be disastrous. Indeed I now believe that, regrettably, the only way we will see significant improvements in software engineering practice is if some truly catastrophic event(s) can be blamed on defective software. Yes, software engineering practice has improved over the years if you look at some of the details outlined in Section 2. But numerous "bad habits" that we try to teach our students to avoid also have become institutionalized in commercial software technologies (CSTs) such as Microsoft's .NET framework, the Java milieu, CORBA, OpenSource alternatives, etc. Moreover, there is just an astonishingly high level of overall intellectual complexity involved in dealing with CSTs. Microsoft, Sun, and apparently all their "competitors" are on the same bandwagon, obliging software developers to deal with new and greater complication at every turn. Here I see only bad news, as discussed in support of the position stated in Section 3.

2. We Have Had No Impact On SE Practice

Some of my friends have noted that my mood has been sort of "down" lately, and it's partly because of what a realistic analysis reveals about the impact so far of a research program lasting nearly 20 years:

None of the RESOLVE innovations has had the slightest influence on CSTs, even via their inevitability.

It is worth noting that several of the ideas that we have always touted in RESOLVE and its ancestors have now been more-or-less embraced by the purveyors of CSTs. We see clear evidence of the widespread recognition that these are good ideas. Unfortunately, we also see evidence that the problems addressed by these ideas and/or the recommended solutions are (to put it charitably) incompletely understood by those who have injected them into CSTs. The following table illustrates what I mean for several examples.

Idea	Evidence of Recognition of Value	Evidence of Incomplete Understanding
separating specifications from implementations	interfaces are first-class units in Java, C#, IDLs	interfaces are just signatures, with not even syntactic slots for behavioral specifications
allowing for multiple interchangeable implementations of a single specification	design-to-interfaces is recommended or even required practice in all modern CSTs; design patterns to address the multiple-implementation issue, especially the abstract factory pattern, are widely used	design-to-interfaces is not design-by-contract; design patterns are clumsy compared to relatively simple language mechanisms that could address the issue head-on
having a standard set of "horizontal", general-purpose, domain-independent components such as lists, trees, maps, etc., in a component library	STL and java.util include such components and they are widely used	designs of the components in the STL, java.util, etc., are subtly but importantly different from what we would have created, in that they do not support modular reasoning; no one has provided convincing empirical evidence that there is much value in not recreating such "simple" components from scratch, although any developer worth his salt would now readily testify to this despite the unaddressed reasoning problems
templates are a useful composition mechanism	C++ added a template mechanism and eventually it "worked" on most/all C++ compilers because the STL required it; Java is supposed to get a template mechanism soon	Java doesn't yet have templates, and when it does they will be a weak substitute for what is actually required; parameterized components aren't on the radar screen in the .NET literature
having value semantics is useful even for user-defined types	STL users are advised to override the C++ assignment operator to make a deep copy [Musser01]; "clone" is an integral part of Java and the .NET framework; .NET languages use "boxing" to try to eliminate ugly syntax associated with Java's wrappers for value types	no one except us has noticed that swapping is a much better alternative than deep copying to achieve value semantics; Java's approach to cloning was so hopelessly botched from the start that even the designers of the java.util classes never found a way to make it work; the .NET framework simply adopts the Java approach to cloning, flaws and all; eliminating ugly syntax in .NET languages does not eliminate ugly semantics, but rather makes it harder to notice that something funny is going on when combining value and reference types
reasoning about programs that use pointers/references is complicated and error-prone	early hype about Java proclaimed that "[p]ointers are one of the primary features that enable programmers to put bugs into their code... Thus the Java language has no pointers." [Gosling96]; authors of a mainstream C++ textbook [Koenig00] argue that students find values much easier to deal with than references and pointers, and advocate teaching the latter as late as possible	early Java hype was later recanted when someone realized that merely eliminating pointer syntax did not actually solve the fundamental problem with pointers [Weide01]; it's not known how many instructors have adopted the improved C++ pedagogy offered by [Koenig00], but it is clear that no one else has adopted the improved pedagogy offered by us [Sitaraman01]
problems related to storage management, such as memory leaks, are serious	Java and .NET try to eliminate developer concern by mandating garbage collection as the solution; it appears that even the current GNU C++ compiler is implemented on top of a garbage-collecting C++ substrate	reliance on garbage collection has some bad consequences for performance of interactive systems, renders mainstream CSTs useless or dangerous for building real-time applications, and seems to require the developer to know details of the garbage collector implementation in order to manage scarce resources other than memory
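
The value-semantics row of the table can be made concrete with a small Java sketch. It contrasts a component that keeps the caller's list by reference (creating an alias) with one that takes the list by swapping, so that no object is shared afterward. The names `Cell`, `AliasedRoster`, and `SwappingRoster` are hypothetical and exist only for this illustration; this is not the RESOLVE design itself, and the swapping version relies on the caller keeping no other reference to the list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for this sketch only; not part of any library. */
final class Cell<T> {
    private T value;

    Cell(T value) { this.value = value; }

    T get() { return value; }

    /** Exchanges contents with another cell in constant time; nothing is copied. */
    void swapWith(Cell<T> other) {
        T tmp = value;
        value = other.value;
        other.value = tmp;
    }
}

/** Keeps the caller's list by reference, so caller and roster share one object. */
final class AliasedRoster {
    private List<String> names = new ArrayList<String>();

    void setNames(List<String> incoming) { names = incoming; }

    int size() { return names.size(); }
}

/** Takes the caller's list by swapping; the caller's cell receives the old list. */
final class SwappingRoster {
    private final Cell<List<String>> names =
        new Cell<List<String>>(new ArrayList<String>());

    void exchangeNames(Cell<List<String>> incoming) { names.swapWith(incoming); }

    int size() { return names.get().size(); }
}
```

With `AliasedRoster`, a later `add` on the caller's list silently changes the roster, which is exactly the kind of non-local effect that defeats modular reasoning. With `SwappingRoster`, the same `add` goes to the list the caller now holds (the roster's former, empty one), and no deep copy was needed to get that separation.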

With the generally positive development that good ideas are making their way into CSTs, why is my first position so downbeat? To my knowledge, the RESOLVE work has never been cited by anyone responsible for introducing any of these ideas into CSTs; in fact, it's rarely been cited by anyone except us. So, what I mean by the first position is that, even if RESOLVE had never existed and even if we hadn't written a single paper about our work, CSTs would still be just what they are now.

One of the obvious problems we've always faced has been the all-or-none nature of RESOLVE. Almost any little part of our technology that you decide not to adopt is likely to result in the inability to do modular reasoning in some cases. The positive developments listed above were adopted incrementally, and--this is a key point--apparently without any explicit concern for whether they might improve support for modular reasoning. This is why I don't think any of the apparent progress is because inevitability has already kicked in. The inevitability argument isn't based on the notion that some of the ideas are "cool" in isolation, but that together they are indispensable for modular reasoning. This rationale hasn't sold at all.

3. CST Complexity Is Out Of Control

My second position also has a negative tone, I'm afraid:

The intellectual load imposed by current CSTs has already exceeded the ability of some software engineers (e.g., me) to cope with their complexity.

One of the most important perks of being a tenured professor is that (at some universities) you're eligible to take a sabbatical leave at reduced pay once every n years. Of course, the term "sabbatical" suggests that n = 7 is the appropriate choice. Ohio State chose n = 8 for some reason; but that's beside the point. I'm just glad there is a sabbatical program here and that "reduced pay" is still almost enough to live on (if you save during the other seven years and the markets are kind to you).

With seven out of every eight years spent exploring the state of the art in software engineering, especially in a formal-methods context as we do, it would be easy for me to get lost in the ivory tower and ignore what's happening in "the real world". I have therefore eagerly taken advantage of sabbatical leaves whenever I've been eligible in an attempt to avoid this hazard. Each time, an important personal objective has been to make sure that I have gained some understanding of the current state of the practice of software engineering. This has meant learning something about the CSTs of the day and actually using some of them. As noted above, today's CSTs include Microsoft's .NET framework and its COM/DCOM/COM+ predecessors, the Java language and libraries, CORBA, and OpenSource tools such as NetBeans (all of which became important only well after my last sabbatical). They also still include the C++ language and libraries (which were barely around eight years ago, as the STL was only marginally compilable at the time).

Well, this is one of those years--my third sabbatical. I almost titled this paper "What I Did On My Sabbatical". But, as you will see, this section is more about what I did not do.

On my first sabbatical in 1985-86, my CST experience involved working with Mike Stovsky (a graduate student at the time) to design and build a Macintosh application called MacSTILE, and a companion tool called the Part Protector. I wrote MacSTILE, and Stovsky wrote the Part Protector, which was a proof-of-concept for part of his Ph.D. dissertation work. We wrote these systems in C and used the (relatively new, at the time) "Macintosh toolbox". I considered this sabbatical successful in that I finished what I set out to accomplish and in the process learned in depth one of the most important CSTs of the day. Amazingly, MacSTILE and the Part Protector still run on the newest Macintoshes! The source code hasn't been touched in at least 12 years. Ah, the good old days.

On my second sabbatical in 1993-94, my CST experience involved doing some of the early development of the RESOLVE/C++ ideas that were pioneered by Steve Edwards and Sergey Zhupanov (graduate students at the time). I considered this sabbatical a success in that I learned a lot of details about object-oriented programming using one of the most important CSTs of the day. Although the software I wrote was subsequently replaced, by me and others, over the following couple years, the core of this project remains in place today and is used by about 1000 students each year in our CS1/CS2 sequence. The biggest problem we've faced recently involved upgrading to the new GNU C++ compiler and watching it collect its own garbage for minutes at a time while we were trying to compile a small program. (We believe this is the result of a compiler bug that is encountered on this particular program; at least, we hope so.) In summary, things were more complicated in C++ than in C, but I could still get my mind around them thanks to the expertise of Edwards and Zhupanov.

On my current sabbatical in 2001-2002, my CST experience has involved working with Paolo Bucci, Wayne Heym, and Tim Long on the next generation of the Software Composition Workbench tool. Bucci built a prototype version a few years ago as a Java application and it has been used by our students ever since for some of their CS1/CS2 assignments. The sabbatical year isn't over yet, but it is notable that we have made considerably less progress on this project than we had imagined we would. Why? I'm sure my colleagues will unselfishly blame themselves for some of the troubles we've endured, but this would be very unfair to them and would distract us from the real problem: the nearly unmanageable intellectual complexity of CSTs today.

We chose to build the new SCW using Java servlets. This was not a rash decision. We didn't want to select something that was too new and unstable (e.g., wait for .NET), or something that was too old and crusty although new since my last sabbatical (e.g., write a plain old Java application). I still don't think servlets were a bad choice in terms of the complexity of the particular CST compared to the alternatives. Especially after having attended a .NET workshop for the past two days, I am convinced that .NET wasn't what we needed; it is even more complicated!

What did we have to do to build a Web app?

Bucci had to install and configure Apache, the Tomcat servlet engine, and our chosen IDE, NetBeans. Fortunately, the rest of us didn't need to learn how to do this, and I still don't know how. Suffice to say that just from watching part of the process I could tell that it was a lot easier installing and configuring Symantec C on the Mac in 1985-86 or using emacs and gcc on Unix in 1993-94. Admittedly, all the pieces of this present-day CST are somewhat (in truth, only marginally) more "powerful" than the older technologies, but this also makes them more unwieldy. Witness, for example, the configurability profiles of Apache and Tomcat, both of which seem incomprehensible to someone like me who's not an expert in networking and security. And these are quite tame compared to NetBeans. Among programs I've used, only emacs is in the same league as NetBeans in terms of the complexity of configuration. I suppose that theoretically it's possible to use NetBeans "out of the box" just like emacs, but as a practical matter this doesn't work so well. For some reason, unless you turn off dozens of options ("modules") that you never use, NetBeans uses enough memory to put Microsoft Word to shame. And no matter what you do, NetBeans seems to garbage-collect so often and for so long on each occurrence that sometimes there is literally enough time to get yourself a cup of Java (er, coffee) before you can type in the next character.
We had to learn HTML at a much more serious level than we already knew it "by osmosis": frames, forms, hidden inputs, etc. Actually, this was the easy part. We also had to internalize the outrageously convoluted operational model of a web app, in which the servlet repeatedly handles a GET or POST request from an HTML form in the browser and sends back a new HTML page in response. The details of this interaction model are, to put it mildly, totally unnatural for someone like me who's used to reasoning about procedure calls. There seems to be no comprehensible way to write a web app that does even approximately what MacSTILE did rather easily in 1986 with the Macintosh toolbox, or what Bucci did with a standalone Java application with the prototype of the SCW tool. I believe designing the detailed structure of a web app is akin to writing assembly code for an instruction set with delayed branching, although I've not thought in depth about the possible parallels here.
There seems to be no standard way for web apps to interact with HTML forms that are even moderately complex and context-dependent, so we had to figure out how to do several different things that didn't appear to be at all straightforward: bring up new browser windows, conditionally close browser windows, etc. The only way to do this seems to be to embed JavaScript in the HTML responses from the servlet. This meant we had to learn JavaScript, too, and a lot of it because most of what we needed to do did not entail merely copying code out of a book. And the name notwithstanding, JavaScript is nothing like Java except in its ugly syntax. Even the object model is completely different.
Once we used JavaScript, we knew we were headed for testing trouble. Everything had to be tested under both Netscape and Internet Explorer, and many of the JavaScript features we thought we needed didn't work the same under both popular browsers.
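
The request/response model described in the list above can be sketched in a few lines of Java. The sketch deliberately does not use the real servlet API: `WizardHandler` and its `handle` method are hypothetical stand-ins, and the `step` field is an invented example of state carried in a hidden input. It is meant only to show why the control flow feels so unlike a sequence of procedure calls.

```java
import java.util.Map;

/** Hypothetical stand-in for a servlet; it does not use the real servlet API. */
final class WizardHandler {

    /**
     * Handles one request and returns the next HTML page.
     * Nothing is remembered between calls: all state arrives in params.
     */
    static String handle(String method, Map<String, String> params) {
        // A first GET carries no hidden "step" field, so the dialogue starts at 0.
        int step = "POST".equals(method)
            ? Integer.parseInt(params.getOrDefault("step", "0"))
            : 0;
        int next = step + 1;

        // The new page must carry the state forward, or the next request loses it.
        return "<form method=\"POST\">"
            + "<input type=\"hidden\" name=\"step\" value=\"" + next + "\">"
            + "<p>Step " + next + "</p>"
            + "<input type=\"submit\" value=\"Continue\">"
            + "</form>";
    }
}
```

Each call to `handle` starts from scratch, so what would be a local variable in an ordinary application has to be threaded through the browser and back on every interaction.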

Where this left us was, as of April 2002, not very far along. We decided to junk the web app idea and write a Java application. Fortunately, by following RESOLVE principles when designing the Java code of the servlet, it seems we will be able to use all the back-end components as-is and just unplug the web-based user interface code and plug in a new one using Java's Swing package. With luck, we'll be able to get a decent second prototype working Real Soon Now.
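
A minimal sketch of the kind of separation that makes such a swap of user interfaces possible is shown here. All names (`ComponentCatalog`, `CatalogView`, and the two view classes) are hypothetical and are not taken from the SCW code; the plain-text view merely stands in for a Swing front end so that the sketch runs without a display.

```java
import java.util.ArrayList;
import java.util.List;

/** Back-end component: knows nothing about how its contents are displayed. */
final class ComponentCatalog {
    private final List<String> names = new ArrayList<String>();

    void add(String name) { names.add(name); }

    /** Returns a copy so that no view can alias the catalog's internal list. */
    List<String> names() { return new ArrayList<String>(names); }
}

/** The only thing a front end has to implement. */
interface CatalogView {
    String render(ComponentCatalog catalog);
}

/** Web-style front end: renders the catalog as an HTML list. */
final class HtmlCatalogView implements CatalogView {
    public String render(ComponentCatalog catalog) {
        StringBuilder sb = new StringBuilder("<ul>");
        for (String name : catalog.names()) {
            sb.append("<li>").append(name).append("</li>");
        }
        return sb.append("</ul>").toString();
    }
}

/** Stand-in for a desktop front end: renders one name per line. */
final class PlainTextCatalogView implements CatalogView {
    public String render(ComponentCatalog catalog) {
        return String.join("\n", catalog.names());
    }
}
```

Because `ComponentCatalog` never mentions either view, replacing `HtmlCatalogView` with another `CatalogView` leaves the back end untouched.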

It might be argued that other CSTs would have caused fewer problems. I doubt it. My limited understanding of .NET as obtained from about twelve hours of instruction from an expert, for example, does nothing to instill any such confidence.

4. Conclusion

My railing against the complexity of CSTs should not be interpreted as unmitigated criticism of their purveyors. Of course, these folks have software to get out the door, they have competition (of sorts), and most important they have to make things at least partly backward-compatible with their previous offerings. Still, they've made some strange decisions because they just don't appear to understand God's Law. Yes, computing has changed. Some additional complexities over the way we used to do things are simply necessary to create a robust and general software technology for today. For example, I'd argue that having specifications is one necessary additional complexity that still is not in CSTs. Instead, behind nearly every new complexity that is in today's CSTs lies one or more of the following:

inadequate understanding of the problem to be solved;
ignorance of better solutions that had already been suggested; and/or
failure to elaborate the criteria for an acceptable solution--among which must be both support for modular reasoning, and understandability by reasonably competent software engineers.

Let me close on a more positive note. The incremental adoption of some important ideas in CSTs should give us some hope that other innovations we've been promoting will eventually appear in CSTs, too. And it could mean that our deeper understanding of some of the new CST features that practitioners now have to deal with (deriving from our explorations of their mathematical foundations and our experience with their use in combination with many other such features) might give us "hooks" into the practitioner's world that we might be able to leverage in the future to have some impact on software engineering practice. This suggests some questions for discussion at the workshop:

Which yet-to-be-adopted good ideas would we most like (or are we most likely) to see in CSTs, assuming that all such ideas are to be introduced incrementally?
What, if anything, can we do to better "market" RESOLVE ideas; e.g., can we show that any have value as incremental improvements even in the absence of arguments about modular reasoning?

About a decade ago, Joe Hollingsworth asked that we devote some RSRG meetings to developing a "business plan". It was a useful exercise, resulting (as I recall) in a plan to write what turned out to be the ACM Software Engineering Notes special section on RESOLVE [Sitaraman94], among other things. It's probably time to do that again, even though coming out of this sabbatical I now believe that no matter what we do, it is questionable whether we can really influence CSTs even via inevitability until the world experiences some serious software-caused disasters.

References

[Buxton02]

Buxton, W., "Less is More (More or Less)", in The Invisible Future, P. Denning, ed., McGraw-Hill, 2002, pp. 145-179.

[Gosling96]

Gosling, J., and McGilton, H. The Java Language Environment: A White Paper, Sun Microsystems, Inc., 1996; http://java.sun.com/docs/white/langenv/ viewed 8 May 2002.

[Koenig00]

Koenig, A., and Moo, B.E., Accelerated C++: Practical Programming by Example, Addison-Wesley, 2000.

[Musser01]

Musser, D.R., Derge, G.J., and Saini, A., STL Tutorial and Reference Guide, Second Edition, Addison-Wesley, 2001.

[Sitaraman94]

Sitaraman, M., and Weide, B.W., eds., "Special Feature: Component-Based Software Using RESOLVE", Software Engineering Notes 19, 4 (Oct. 1994), 21-67.

[Sitaraman01]

Sitaraman, M., Long, T.J., Weide, B.W., Harner, J., and Wang, C., "A Formal Approach to Component-Based Software Engineering: Education and Evaluation", Proceedings 2001 International Conference on Software Engineering, IEEE, 2001, 601-609.

[Weide95]

Weide, B.W., Heym, W.D., and Hollingsworth, J.E., "Reverse Engineering of Legacy Code Exposed", Proceedings 17th International Conference on Software Engineering, ACM Press, 1995, 327-331.

[Weide01]

Weide, B.W., and Heym, W.D., "Specification and Verification with References", Proceedings OOPSLA Workshop on Specification and Verification of Component-Based Systems, ACM, October 2001; http://www.cs.iastate.edu/~leavens/SAVCBS/papers-2001 viewed 8 May 2002.
