Integration and Conceptual Modeling


Thomas J Wheeler
University of Maine
Department of Computer Science
Orono, ME 04469
wheeler@umcs.maine.edu


Abstract

It is becoming increasingly common for research efforts, and the development of systems to support them, to be undertaken by multidisciplinary teams. In the life sciences, multidisciplinary research has become the norm rather than an innovation. There are several reasons for this trend. Insights from several points of view provide a richer understanding of issues and more opportunities for solutions. Also, insights in one discipline often come from thought patterns from another discipline. In many disciplines the research in part(s) of the domain has reached the stage where exploring issues and advances in adjoining parts, and in the interaction of parts, is warranted. In hierarchical systems, especially life, research at individual levels is different in kind from research at others, and integration across levels has become possible and desirable.

Multidisciplinary research presents a marvelous opportunity, but also creates serious problems. Merging of the disciplines' conceptualizations must occur, at least in the (separate) minds of the collaborators. To be effective, merging must leverage the expertise of individual discipline members, as well as that of general purpose designers.

Systems that support this type of research are complex systems, with significant semantic mismatch problems. The increasing use of computer databases for organizing disparate research leads to data integration problems. Models for each database or data source are designed independently, in accordance with a domain’s conceptual model. These models are further specialized to a particular research effort, then encoded using general purpose data models. The independence of development and the differing cultures of the fields cause incompatibilities between models and programming interfaces. The general purpose nature of the notation loses (filters out) insights and intuition from the domains' natural illustrations and explanations of key models.

Within the software engineering strategy focused on a designer's perspective of development, there are three classes of concerns that must be addressed in creating a system: developing a substantial understanding of a problem and its domain, designing a system concept, and architecting and realizing that system. This paper explores a mechanism and a methodology for all but realization, based on integration of multiple disciplines' models. It distills the inherent structure of each model, blends models to create the structure for the integrated domain and creates views of this blended structure for each participating discipline.

The approach has four aspects. “Natural” graphic depictions and explanations are integrated with general purpose models. The underlying structure of the natural models is extracted by analysis of the metaphorical underpinnings of those models. Models are blended using the character of one to underlie semantics taken from others. A framework for visualization of the blended domain is created using the natural depictions, explanations and underlying metaphors.

This technique provides a framework for understanding, organizing and supporting interdisciplinary work. It improves the conceptual modeling process by integrating more domain intuition and insight into the process. We will illustrate the mechanisms and a methodology for use with excerpts from interdisciplinary projects in molecular biology and ecology.
 

Keywords

Integration, Model, Model Blending, Natural Graphic, Multidiscipline, Multidisciplinary Research

Paper Category: technical paper
Emphasis: research

1.  Introduction

Research by multidiscipline teams provides insights that are richer than any single discipline can have, but introduces cross-disciplinary semantic problems. In addition, designing the complex systems needed to support multidisciplinary research surfaces a number of core informatics systems problems. The issues at the heart of these problems concern development of a unified conceptual model for the research, and integration of heterogeneous parts of systems developed by different disciplines. The first needs a strategy for merging individual disciplines' models into a coherent, multiple discipline model. This strategy must preserve the insights of the individual disciplines, and create a framework for insights in the (new) merged discipline. The second must use both general purpose formal models and discipline specific natural models. Integration requires general purpose (i.e. domainless) models, languages, and notation. Scientific depth and insight require leveraging the notations and thought patterns natural to specific disciplines, but these produce heterogeneous representations, even when encoded in general purpose notations.

Within the software engineering strategy focused on a designer's perspective of development, there are three classes of concerns that must be addressed in creating a system [Guttag,Horning]. First, designers need to develop a substantial understanding of a problem to be solved, the domain in which it resides and the community that works in that domain. Second, they must design a system concept, by a creative process whose success appears to be based on insights into the central concepts and activities of the domain and its community. Third, they need to architect and realize that system. The process of developing the understanding and the insight is called system/domain analysis; the development of the system concept and the architecture of the system is called design; and realization is called implementation. Intuition and insight in the analysis/design process come from discipline specific understanding and are captured using formal notations. This perspective on development has modeling as its central focus.

A number of concepts that have emerged in cognitive science over the past decade can help with the domain understanding and system concept design concerns. First, understanding the structure and conceptual metaphor basis of human cognitive models provides a framework for understanding and design. Models, on which understanding and design are based, are captured in a more natural way. Second, there is a cognitive process for developing new meaning for concepts when they are placed in different contexts. In the new analysis of this process, conceptual model "blends" provide a basis for developing new meaning. An analog of this can support the cross discipline and multidiscipline envisioning which these types of systems need to exhibit. Third, explanation and understanding are related, with explanation reifying an understanding, while from the other direction, explanation develops understanding. This interplay of explanation and understanding provides leverage during analysis. And fourth, the categorization concept of radial categories leads to a useful characterization of "natural" depictions used in explanations. This characterization places constraints on the amount and type of abstraction used in models derived from those explanations, providing guidelines for analysis.

The interoperation of heterogeneous data types requires transformation of data from its original representation to a standard representation (in a data warehouse architecture) or to a usage representation (in a federated architecture). The main technology developers addressing these types of issues come from the Computer Science/General Purpose Modeling communities [e.g. Roth, Davidson]. Valid integration, however, depends on the expertise of scientific curators, who understand the source(s), and of scientists who design and perform virtual experiments and analyses with the resulting merged information [Bult].

Central Ideas

Models underpin system design, perceptual interfaces, and programming interfaces to data sources, analysis programs and other subsystems. The idea here is that development of a unified conceptual model for the research should be based on analysis of explanations and accompanying natural depictions. Their capture and analysis are based on analogy to the development of "natural" cognitive models and the blending of models, by an individual, during creative insight.

To address architecture level concerns, general purpose formal and natural modeling languages must be combined. General purpose modeling languages and notations are needed as a basis for automation, but because they are general purpose, they must abstract away any discipline specific intuition and insight. General purpose modeling languages and notations are also formal. Scientists, however, understand and explain concepts and issues in their field using notation and language natural to their discipline. Depth and insight require the notations and thought patterns natural to specific disciplines. The models emerging from their explanations and depictions need to be combined with general purpose modeling languages and notations, such as XML and UML. The general purpose, formal models are structured by the natural notations for more valid models. The natural notation models are integrated into the general purpose notation models, to retain the discipline's insight into the domain.

The aim of this effort is to develop a mechanism and a methodology for integration of separate disciplines' models, based on distilling the inherent structure of each model, blending them to create the structure for the integrated domain and creating views of this blended structure for each participating discipline. This paper results from work on a number of life sciences projects, taking a systems biology approach, which is naturally interdisciplinary, in areas from genomics to ecology. It looks at this type of system from a point of view which uses component based architectures, providing data integration through interfaces. The focus of this work is on integrating ideas about cognitive models and model blending from cognitive science into a model based development process.
 

2.  The Problem

Background

It is becoming increasingly common for research efforts, and the development of systems to support them, to be undertaken by multidisciplinary teams. In the life sciences, for instance, multidisciplinary research has become the norm rather than an innovation. There are several reasons for this trend. The first is that insights from several points of view provide a richer understanding of issues and more opportunities for problem solutions. A related reason is that an insight in one discipline often comes from a thought pattern from another discipline. Third, in many disciplines, research in some parts has reached the stage where exploring issues and advances in adjoining parts, and in the interaction of parts, is warranted. Lastly, in hierarchical systems, especially life, research at individual levels is different in kind from research at others, and integration across levels has become possible and desirable.

While multidisciplinary team based research presents a marvelous opportunity, it also creates serious problems. In multidisciplinary research, blending of the disciplines' conceptualizations must occur, at least in the (separate) minds of the collaborators, but also in the resulting or supporting systems.

Research in the life sciences is increasingly taking on a multidisciplinary character, using the systems biology approach to understanding issues from molecular biology to ecology. Researchers are finding that integrative issues create a barrier to progress in molecular biology [Paton] and that a complex system approach is essential in ecological systems [Wu,Marceau]. The multidisciplinary character of systems biology is changing the landscape of life science research.

Problem

Development of the complex systems needed to support multidisciplinary research surfaces a number of core informatics systems issues at two levels. At the System Level, development of systems to support different disciplines needs to address the different thought patterns and cultures of the disciplines. Conceptual models underpin the design, understanding and use of systems; to present a model effectively to users from different disciplines, it must be expressed in terms fitting each discipline's model of the domain. At the Architecture/Component Level, development of parts of these systems by different disciplines leads to heterogeneity and distribution problems in data sets, processing strategy, information presentation, and analysis. Heterogeneity leads to conceptual model (“impedance”) mismatch, at semantic and pragmatic levels, between different parts of the system.

Conceptual model mismatch problems exhibit themselves in system level problems such as (mis)interpretation of results displayed by a system, difficulty in development using software from another discipline, and difficulty for a multidisciplinary team in developing complementary and integrated understandings of others' concepts. These problems come about because the designer's conceptual model creates the character of the system and its components, and that character is usually difficult to understand from the (different) point of view natural to the user.

At the architecture level, a significant mismatch problem comes from the increasing use of computer databases for organizing research and its results, which leads to data integration problems for multidisciplinary research and systems. The data model for each database, or other data source, is designed independently, in accordance with a domain’s conceptual model, specialized to a particular research effort, then encoded using general purpose data models. Because of the independence of development and the differing cultures of the fields, incompatibilities occur between models and at programming interfaces. Because of the general purpose nature of the notation for encoding the data models, the insight and intuition in each domain’s natural illustrations and explanations of its key models are lost.

A related problem occurs in the interplay of formal and natural notations and thought patterns. Models that underlie the interfaces of systems and subsystems start in the notations of, and are framed in terms of the thought patterns of, specific disciplines, but must be encoded in terms of general purpose formal notations. The integration or translation that occurs at system and subsystem interfaces requires the use of general purpose languages, models and notation, but scientific depth requires leveraging the notations and thought patterns of specific disciplines.

Motivation

There is considerable evidence that people remember, understand and use concepts, systems, etc. better when they have a memorable structure [Lakoff, Pinkler, Chomsky, MacEachern]. Understanding seems to take place in terms of metaphor based structures [Lakoff, Johnson, Mandler]. Organizing takes place through placing concepts into categories and structures, which enriches concepts and connects them to other related and helpful concepts [Fauconnier], making them easier to understand while making them more powerful and useful. Humans seem to organize concepts by logical, imagistic and procedural patterns [MacEachern]; by categorization [Lakoff], grouping things that are similar in some way; by fitting them into conceptual structures to provide larger concepts; and by classifying them as generalizations and specializations of some naturally understood "basic level" categories.

Everyday existence, thinking about and using familiar, commonplace objects and concepts is organized effectively by perceptual images and metaphor based mental models of the objects and their context [Fauconnier, Lakoff, Johnson, Mandler], so that humans can naturally deal with their everyday environment. These perceptions and concepts provide an organized understanding of the everyday environment. When one wants to function with similar ease and facility in an artificially created environment such as system development or interdisciplinary research, where one cannot have a naturally constructed framework in which to reason and act, one has to consciously create the organized understanding necessary for natural and effective action.

Software engineering has been a search for organization patterns for differing types of system concerns. Creating explicit organized representations for work products has provided guidance in organizing work as well as useful structuring of the results[Parnas, Guttag,Horning].  The effort reported on here is an effort to find and develop a set of organization patterns for system development in the situation where the concerns are complex and dissimilar in some way, and have different cultures, and the development participants, and their products, have to interact.

3.  The Position - Approach

Designers need to develop a substantial understanding of a problem to be solved (or an opportunity to be taken advantage of) [Guttag,Horning], the domain in which it resides and the community that works in that domain. They must also develop a system concept, by a creative process whose success appears to be based on insights into the central concepts and activities of the domain and its community, developed while building that understanding. Only after these do they need to architect and realize that system.

Our approach is based on using results from cognitive science research, within a framework provided by software engineering, to provide a basis for system design, by capturing the models on which the design is based in a more natural way. This is done by explicit analysis of the natural models of each discipline, capturing the essence of each in a formal model which is then used for system design. The structure and conceptual metaphors used in explanations of the discipline's concepts are captured and analysed for use in formal models in the system design.

We also combine the use of formal and natural models in developing a cross-discipline or multidiscipline understanding of particular domains. This is based on an analysis of the cognitive process which develops new meaning for a concept when placed in a different context, using blended cognitive models and metaphorical mappings. An analog of this process can provide a basis for supporting the cross discipline and multidiscipline envisioning which these systems need to support.

The concept of radial categories characterizing "natural" depictions is used in analysing explanations, and developing the abstraction used in models derived from those explanations. "Natural" semantic support for creative insight in multidiscipline research is provided by the emergent structure of these blended cognitive models and metaphorical mappings of meanings in the user's discipline. The models developed are used to design the system, the perceptual interfaces provided by systems, and the abstract interfaces to data sources, analysis programs and other subsystems.

Criteria

The criteria by which this work is judged address three areas:
1. The models and interfaces to software components must be valid with respect to scientific experiments. The models must accurately reflect the concepts used in the design of experiments. The computer data must reflect the results of the scientific experiments by the disciplines creating the data. The interfaces, both program and perceptual, must portray the data, and inferences from it, in accordance with the models of the disciplines.
2. The models and interfaces must structurally, semantically, and pragmatically conform to each discipline's conceptual models and the blended models. The models and interfaces must address the thought patterns and activities of each discipline.
3. The models and interfaces to software components must resonate with the intuitions and insights of each discipline and support development of new multidiscipline intuitions and creation of new multidiscipline insights.
 

4. Elaboration

Mechanism and Methodology

This paper develops a mechanism and a methodology for integration of separate discipline models, based on the process of distilling the inherent structure of each model and blending models to create the structure for the integrated domain. It addresses issues at the terminology, semantic, pragmatic and activity levels.

The mechanism consists of two parts: (1) analysis of idealized cognitive models and metaphorical semantics; and (2) synthesis supporting creativity, using cognitive model blending. Recent results show that people's (e.g. scientists' and developers') cognitive models appear to be based on idealized abstract cognitive models (Idealized Cognitive Models (ICMs) [Lakoff/Johnson], Schemata [MacEachern], Conceptual Structures [Jackendorf]) and structural mappings among their elements [Fauconnier]. Some of these are learned at an early age, common to people in general, and unconsciously applied. Others, specific to a (specialized) domain, are learned as an adult and shared among the members of the discipline; they are skills that are either applied unconsciously as a result of training and experience [MacEachern] or applied consciously [Fauconnier].

Cognitive Models: Metaphors and Maps

It appears that we develop an understanding of some thing, or domain, by forming a perceptual image based [Lakoff, Marr] conceptual model [Norman], or knowledge scheme [MacEachern, Pinkler], of it, which gives it form [Alexander], allowing us to mentally simulate [Fauconnier] properties and activities [Norman] of the thing or in the domain, and thus infer properties and behavior. The mental images that we form have a visual or kinesthetic character [Lakoff], allowing them to be directly understood [Lakoff] and to be spatially manipulated in the mind [Marr, MacEachern] so that relationships and interactions can be inferred. Use, activity and interaction of objects, relationships and concepts in a domain make sense via the structure of the mental models. They do this in terms of the functionality and constraints they afford [Norman], the mappings of these aspects of objects with those of other understood objects, and metaphors [Johnson] applied to them to give them meaning.

Metaphors provide a basis for compositional semantics of the natural and abstract world. Analysis of language use shows [Lakoff, Johnson, Lakoff&Johnson] the pervasiveness of conceptual metaphors for both primary concepts and for their composition. Primary metaphors become part of our cognitive unconscious automatically, beginning in infancy [Lakoff&Johnson], providing experiential semantics for abstract concepts and activities. Complex concepts are structured by structural metaphors and mappings [Johnson] to included or associated primary or complex metaphors. As an example of metaphorical semantics from molecular biology (Figure 1), consider the following sentence: "a strand of DNA consists of pairs of bases", where "strand" is metaphorically a path (or line) and "base pairs" are at the positions of steps (or points) along the path.

Mappings of various kinds between cognitive models appear to be at the heart of what we mean when we say we understand some concept [Fauconnier]. Projection mappings use the structure, and vocabulary, of one domain to understand some other domain. Function mappings structure correspondences, organizing the knowledge in a field. Schema mappings structure situations, transferring concepts into new contexts. In the example in Figure 1, there is a projection map onto the DNA strand from the domain of paths, a primary metaphor learned in infancy from (probably) crawling and/or actual observation of different things, animate and inanimate, moving along different paths.

Figure 1
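The projection mapping in this example can be sketched in code. The following is a minimal illustration, not part of the original work: the class name Path, the PROJECTION table and the helper strand_as_path are hypothetical names chosen here, and the base pairs are made-up sample data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    """Hypothetical source domain: a path is an ordered series of steps."""
    steps: Tuple[Tuple[str, str], ...]

    def distance_along(self, index: int) -> int:
        """How far along the path a given step lies (0-based)."""
        if not 0 <= index < len(self.steps):
            raise IndexError("step is not on the path")
        return index


# Projection mapping (illustrative): target-domain vocabulary (DNA)
# expressed in the vocabulary of the source domain (paths).
PROJECTION = {
    "strand": "path",
    "base pair": "step",
    "position": "distance along",
}


def strand_as_path(base_pairs):
    """View a strand as a path whose steps are its base pairs."""
    return Path(steps=tuple(base_pairs))


strand = strand_as_path([("A", "T"), ("T", "A"), ("G", "C")])
third_pair = strand.steps[strand.distance_along(2)]
```

Reasoning about "how far along the strand" then reuses the structure of the path domain directly, which is the point of a projection mapping.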

Cognitive Model Blending

Cognitive model blends use the structure and behavior of one conceptual model, or domain, (the source) to create insight about another concept, domain or situation (the target). Cognitive models can be pictured as structures composed of concepts as their components. The concepts are not atomic, but rather are either metaphorically structured entities or other complexes. The components and the model have complex and open ended semantics, based on the activities they (metaphorically) participate in, the situations they (metaphorically) are found in and the behavior and character they (metaphorically) exhibit. Thus any conceptual model has a discernible structure and an enormous number of emergent properties. The emergent properties come into the mind when the model is "run".

Figure 2

In a conceptual model blend (Figure 2), a person (say a scientist from domain 1) is trying to develop a conceptual model (model1 in Figure 2) of some subject matter. Another person (say a scientist from domain 2) explains the subject matter from her point of view, using a model (model2 in Figure 2) and terminology from her domain. There are some aspects of domain 2 and domain 1 which have a common, abstract semantic basis (modelg in Figure 2), and these serve to provide some abstract semantic anchors between the two people's concepts. In addition, some of the concepts in model1 and model2 have the same metaphorical basis underpinning them (modelb in Figure 2), allowing (causing) the models to form a blend. The first person can understand model2 in the context of domain 1 by use of the blend (modelb) and the vocabulary of model2.
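One way to picture the formation of the blend is as a pairing of concepts that rest on the same metaphor. The sketch below is only an illustration under simplifying assumptions: each model is reduced to a table from concept name to underpinning metaphor, and the function name blend and the two sample models are hypothetical.

```python
def blend(model1, model2):
    """Pair concepts from two models that rest on the same metaphor.

    Each model maps a concept name to the metaphor assumed to underpin it.
    Returns {metaphor: [(concept_in_model1, concept_in_model2), ...]}.
    """
    concepts_by_metaphor = {}
    for concept, metaphor in model2.items():
        concepts_by_metaphor.setdefault(metaphor, []).append(concept)

    blended = {}
    for concept, metaphor in model1.items():
        for counterpart in concepts_by_metaphor.get(metaphor, []):
            blended.setdefault(metaphor, []).append((concept, counterpart))
    return blended


# Illustrative models only; the concept lists are assumptions.
geography = {"stream": "path", "terrain": "container", "elevation": "vertical scale"}
molecular_biology = {"strand": "path", "genome": "container", "annotation": "label"}

model_b = blend(geography, molecular_biology)
```

Concepts with no shared metaphorical basis ("elevation", "annotation") stay outside the blend, which mirrors the statement that only some of the concepts anchor the two models to each other.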

Providing a view of a model, from one domain, in terms of a model in another domain is done by a similar technique. The semantics of the second domain are overlain on the information from the first domain. The interpretation in the second domain uses that domain's thought patterns, expanded to include data from the first. We refer to this process as model morphing.
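At the level of data, model morphing can be approximated as relabeling information from the first domain with the vocabulary of the second. This is a deliberately small sketch: the function name morph, the vocabulary table and the sample record are hypothetical, and real morphing would also carry over the second domain's semantics, not only its terms.

```python
def morph(record, vocabulary):
    """Relabel a record from one domain using another domain's terms.

    Fields without a counterpart keep their original name, so no data is lost.
    """
    return {vocabulary.get(name, name): value for name, value in record.items()}


# Hypothetical vocabulary from a blend: molecular biology term -> geography term.
BIOLOGY_TO_GEOGRAPHY = {
    "strand": "stream",
    "start_coordinate": "distance_from_source",
}

feature = {"strand": "+", "start_coordinate": 120, "feature_name": "gene1"}
view = morph(feature, BIOLOGY_TO_GEOGRAPHY)
```

The original record is left untouched, so several discipline-specific views can be derived from the same underlying data.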

As an elaboration of the example of metaphorical semantics from molecular biology above (Figure 1), consider the following further sentence: "The DNA 'zipper' (another metaphor) must attach itself to the gene in an area a certain distance upstream from the area to be 'unzipped', for transcription to take place another certain distance downstream" (Figure 3).

Figure 3

Here the geography metaphor is used to explain (and model) the process of transcription. The geography model is (something like) a map of some terrain with a number of paths, with the DNA path being specialized to a stream flowing downhill in a valley. The DNA and the stream have the same shape. In an area on the map "upstream" of the start of a distributary (overlain by the unzipping metaphor), the unzipping occurs. Following that (i.e. downstream from that place) transcription can take place, modeled as a distributary.
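The upstream/downstream reasoning that the stream metaphor affords can be sketched as ordering along a path. The names Site, is_upstream and distance_downstream are hypothetical, and the positions are made-up numbers used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    position: int  # distance along the strand; larger values lie further downstream


def is_upstream(a: Site, b: Site) -> bool:
    """True when site a lies upstream of site b."""
    return a.position < b.position


def distance_downstream(origin: Site, target: Site) -> int:
    """Signed distance from origin to target; negative means target is upstream."""
    return target.position - origin.position


# Made-up positions for the three places mentioned in the sentence.
attachment = Site("attachment", 100)
unzip_start = Site("unzip_start", 135)
transcription_start = Site("transcription_start", 160)
```

With these definitions, the constraint in the sentence ("a certain distance upstream ... another certain distance downstream") becomes a pair of comparisons on positions along the path.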

The methodology

The methodology has four aspects. The first consists of integrating the “natural” graphic depictions and explanations each discipline makes of its core concepts, with the general purpose models of their systems. The second extracts the underlying structure of the natural models based on analysis of the metaphorical underpinnings of the models. The third creates a blend of the models’ structures, using the character of one to underlie the semantics composed from elements from other models mapped onto slots of the core model. The fourth creates a framework for visualization of the blended domain by creating natural depictions and explanations from the blended structure of the blended domain.

Intuition and insight in the analysis process come from discipline specific understanding. This comes from direct or indirect experience in the domain. In systems which are primarily the product of an individual, the understanding comes from working in the field. In systems developed by a team, the understanding must be shared, through informal (conversations) or formal (meetings) verbal/visual interactions or documented representations; preferably all of these. The technique we describe here provides a framework for developing this understanding.
 

5. Example


A research project to look into the development of a database for the Genome Spatial Information System (GenoSIS) Project required an integrated genomic-spatial data model, which formalizes genomics (a computer analog of DNA molecular biology) along with metric, topological, and metrically uncertain properties and relationships among genome features. Such a genome spatial data model facilitates the powerful spatial reasoning and inferences that are part of spatial information science and thereby allows biologists to ask questions about the contextual and organizational significance of the spatial arrangement of genome features. These functional capabilities should, in turn, aid in the automation of repetitive analytical tasks associated with the mapping of genome features and drive the discovery of biologically significant aspects of genome organization and function.

We begin the analysis by attempting to characterize the biological processes we hope to model. We characterize the models and thought patterns in the domain both informally by working with the different disciplines and listening to their explanations; and formally by use of the mechanisms described in this paper. We formally characterize the models and thought patterns in the domain in two ways: by considering the natural graphics that are used within the domain among practitioners and by constructing a lexicon or ontology of the concepts which are essential in the domain. With these tools we develop a conceptual model, which can be formalized as the data model.

Figure 4

Some "natural" graphic depictions of the biological processes of interest to us are shown in Figure 4. First, in the lower left, a picture-like image grounds the conceptualization with a real(istic) image. There are a number of natural maps (natural isomorphisms) from that image. It is mapped onto a spirally wound tube, which is then unwound to produce a depiction as a ribbon with the 5' to 3' molecule strand on top. There are two further natural mappings, the upper one showing a simplified straight line depiction, with supplementary colored segments, and the other showing a blowup making the sequence of the individual bases apparent. These natural depictions are used to illustrate explanations of the primary concepts in genomics.

The depictions and the accompanying explanations are part of the raw material for the analyses described above. Another part of the raw material is an analysis of the vocabulary in the explanations and from glossaries or ontologies. An example of an explanation is as follows:
"An Organism is the largest category for our purposes here. We wish to compare different organisms in some analyses. Each organism has one or more Genomes. A genome is made up of one or more Chromosomes. The genome contains many Features, which we define to be recognizable functional elements. A feature may be simple or composite, that is, composed of other features making up a Feature Set. The genome and any feature within it are sequences of Base Pairs. The base pair sequence is the raw primary output of genome sequencing efforts. Features are determined by applying a number of algorithms, e.g. pattern matching, to the sequence. We indicate the Start and Stop positions of a feature as determined by the algorithm used to locate the feature. Since DNA is double stranded, for any feature on DNA we indicate which Strand contains the feature and how far along the strand it starts. Biologists interested in comparing organisms seek ... "

In this explanation, words denoting objects (concepts) in the model are boldfaced. Words signaling the use of a metaphor are italicized, and words useful in guiding the modeling are underlined. For instance, "contains" signals the container metaphor, "Strand" and "how far along" signal the path metaphor specialized to a strand, while "genome and any feature within it" signals the structure of a genome sequence.
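The marking of metaphor-signaling words can be approximated by a simple lookup of cue phrases. This is a rough sketch rather than the analysis procedure itself: the CUES table and the function find_metaphors are hypothetical, the cues "contains", "strand" and "how far along" come from the text above, and "made up of" as a part-whole cue is an added assumption. Plain substring matching is used, so "strand" also matches inside "stranded".

```python
# Cue phrase -> metaphor it is taken to signal.
CUES = {
    "contains": "container",
    "strand": "path",
    "how far along": "path",
    "made up of": "part-whole",  # assumption, not stated in the text
}


def find_metaphors(text):
    """Return {metaphor: sorted list of cue phrases found in the text}."""
    lowered = text.lower()
    found = {}
    for cue, metaphor in CUES.items():
        if cue in lowered:
            found.setdefault(metaphor, []).append(cue)
    return {metaphor: sorted(cues) for metaphor, cues in found.items()}


sentence = (
    "Since DNA is double stranded, for any feature on DNA we indicate which "
    "Strand contains the feature and how far along the strand it starts."
)
signals = find_metaphors(sentence)
```

A lookup like this only flags candidate metaphors; deciding which structure a cue actually contributes to the model remains an analyst's judgment.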

Figure 5

The UML model developed from the analysis is shown in Figure 5 (color coded to highlight the biological science parts and the computer, genomics parts).
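As a rough companion to the UML model, the object structure described in the quoted explanation can be sketched as a set of classes. This sketch is derived only from that explanation and is not a transcription of Figure 5; the attribute names, the "+"/"-" strand encoding and the inclusive coordinate convention in length() are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:
    name: str
    start: int   # Start position reported by the locating algorithm
    stop: int    # Stop position reported by the locating algorithm
    strand: str  # which strand holds the feature; "+" / "-" is an assumed encoding
    parts: List["Feature"] = field(default_factory=list)  # non-empty for a composite feature

    @property
    def is_composite(self) -> bool:
        return bool(self.parts)

    def length(self) -> int:
        """Number of base pairs covered, assuming inclusive coordinates."""
        return self.stop - self.start + 1


@dataclass
class Chromosome:
    name: str
    sequence: str  # base pair sequence, written as one strand's bases


@dataclass
class Genome:
    chromosomes: List[Chromosome]
    features: List[Feature] = field(default_factory=list)


@dataclass
class Organism:
    name: str
    genomes: List[Genome]


organism = Organism(
    name="example organism",
    genomes=[Genome(
        chromosomes=[Chromosome("chr1", "ATGCATGC")],
        features=[Feature("gene1", 2, 7, "+", parts=[Feature("exon1", 2, 4, "+")])],
    )],
)
gene = organism.genomes[0].features[0]
```

The containment relations (organism has genomes, genome has chromosomes and features, composite feature has parts) follow the container and part-whole cues in the explanation.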

A part of the formal Abstract Interface for the database, using this model, would look like the following, written in an (object oriented) "XML++" (;-)) syntax:
______________________________________________________________

<!ELEMENT feature    <!-- (2) -->
 (feature_type,start_coordinate,end_coordinate,strand,
  feature_name,feature_symbol?,comment?,time_stamp?,
  transcript*)>    <!-- !Semantics:Structure-->
        <!-- !Metaphor:Part-Whole-->
......

<!ELEMENT gene1 is_a feature (annotation_list) >
 <!-- !Semantics:is_a = Structure Addition-->
 <!-- !Semantics:(..) = Structure-->

______________________________________________________________

Here, the structure is given by an XML Element definition (instead of BNF), the formal semantics is given by (a formal model from) a collection of formal models, and the "natural" semantic basis is given by (a conceptual metaphor from) a collection of (ground or complex) conceptual metaphors.

The automation of the semantics would be accomplished by pattern matching at the formal model and the conceptual metaphor levels.
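
A minimal sketch of what such matching could look like at the annotation level is given below. It assumes that annotations are written as `<!-- !Layer:value -->` comments following the element declaration they describe, as in the appendix; the function names `annotations` and `elements_matching` and the regular expressions are hypothetical and are not part of the interface described here.

```python
import re

# Start of an element declaration, e.g. "<!ELEMENT start_coordinate ...".
DECL = re.compile(r"<!ELEMENT\s+(\w+)")
# An annotation comment, e.g. "<!-- !Semantics:Position_on_Line-->".
NOTE = re.compile(r"<!--\s*!(Model|Semantics|Metaphor)\s*:\s*(.*?)\s*-->")


def annotations(text):
    """Map each declared element to its (layer, value) annotations.

    Assumption: an annotation belongs to the most recent declaration
    that precedes it in the text.
    """
    result = {}
    current = None
    for line in text.splitlines():
        decl = DECL.search(line)
        if decl:
            current = decl.group(1)
            result.setdefault(current, [])
        for layer, value in NOTE.findall(line):
            if current is not None:
                result[current].append((layer, value))
    return result


def elements_matching(text, layer, value):
    """Names of the elements that carry the given annotation, sorted."""
    return sorted(
        name
        for name, notes in annotations(text).items()
        if (layer, value) in notes
    )


SAMPLE = """
<!ELEMENT start_coordinate (#PCDATA)>
    <!-- !Model:Nat Num -->
    <!-- !Semantics:Position_on_Line-->
<!ELEMENT end_coordinate (#PCDATA)>
    <!-- !Model:Nat Num -->
    <!-- !Semantics:Position_on_Line-->
<!ELEMENT feature_type (#PCDATA)>
    <!-- !Model:Name -->
    <!-- !Semantics:Symbol-->
"""

on_line = elements_matching(SAMPLE, "Semantics", "Position_on_Line")
```

Two elements from independently developed interfaces that carry the same formal model and the same metaphor would be candidates for alignment; exact string equality is only the simplest possible matching rule.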
 

6.  Related Work

This work is a relatively new endeavor, applying research work in cognitive science to software engineering. This effort is, in some sense, the development of an explicit "organized understanding" of the thinking, communication, activities, and documentation of a research or development project, in an attempt to provide a prescriptive approach to separating, and addressing, the concerns of such a project.

As this project is an outgrowth of software engineering research, its core concepts are about design/research organization; characterizing, making explicit and managing the work products of the design/research efforts; developing prescriptive methods for separating the concerns of the effort, and addressing interaction, interface and interoperation issues between disciplines and multi-domain software (sub)systems.

This work is related to some of the work in the reuse community [WISR, & ?] which addresses conceptual underpinnings of reuse and interoperability [Wileden, Porter, Simos, Capilla, Kiczales, Latour]. It is related to the software architecture community [Garlan&Shaw, ], whose work is one of the major sources of organizing, and working at, the Abstract Implementation level in the model presented below. It is also related to, but addresses a different aspect of collaboration than, the computer supported cooperative work community [CSCW, ECSCW], who focus mainly on computer-mediated interaction, whereas we focus on the perceptual, cognitive and (human) communication aspects of the problem.

7.  Conclusion

This paper presented a mechanism and a methodology for integrating separate disciplines' models into a multi-discipline model and for developing program component interfaces to heterogeneous information which conform to these models. The mechanism distills the inherent structure of each model by explicating the metaphors underlying the discipline's explanations and natural graphic depictions of its core concepts. Integration is by model blending to create the structure for the integrated domain. Presentation, both perceptual and programmatic, is accomplished by creating views of this blended structure for each participating discipline. The focus of this work is on integrating ideas about cognitive models and model blending from cognitive science into a model-based development process.

The technique creates models and interfaces to software components that are valid with respect to scientific experiments. They accurately reflect the concepts used in the design of experiments by capturing them from the most accurate and insightful representations available. They are structured and given semantics in terms of models isomorphic to those apparent in the minds of discipline members.

The models and interfaces structurally, semantically, and pragmatically conform to each discipline's conceptual models because they capture the essence of the discipline's explanations [Tufte]. The multi-discipline models are blended by the same mechanisms used by discipline members. The models and interfaces use the thought patterns and activities of each discipline.

Because they capture the essence of the discipline's explanations, the models and interfaces to software components should resonate with the intuitions of each discipline. Because they are blended by the same mechanisms used by discipline members, they should support development of new multi-discipline intuitions and creation of new multi-discipline insights.
 

References


C. Alexander, "Notes on the Synthesis of Form" Penguin Books, 1982

F. Belz, D. Suthers, and T. Wheeler, "Architecture Abstraction Hierarchy - Reference Model,"   IEEE Learning Technology (P1484) Guideline (P1484.1), 1997.

F. P. Brooks, The Mythical Man-Month, Reading, MA: Addison Wesley, 1975, 1996.

C. Bult, et al., "Mouse Genome Informatics in a New Age of Biological Inquiry," Bio-Informatics and Biomedical Engineering (BIBE2000), Arlington, VA, Nov. 2000.

N. Chomsky, “Linguistics and Adjacent Fields: A Personal View,” The Chomskyan Turn,(A. Kasher, ed.), New York: Blackwell, 1991.

G. M. Cooper, The Cell: A Molecular Approach, Washington, D.C.: ASM Press, 1997.

S. Davidson, "BioKlesli: a Digital Library for Biomedical Research" Intl. J. Digit. Lib. 1(1) 1997

G. Fauconnier, "Mappings in Thought and Language" Cambridge Univ. Press 1997

C. Gallistel, Organization of Learning, Cambridge, MA: MIT Press, 1993

D. Garlan, R. Allen, and J. Ockerbloom, “Architectural mismatch, or, why it’s hard to build systems out of existing parts,” 17th International Conference on Software Engineering, ICSE 95, April 1995.

J. Guttag, J. Horning, "Formal Specification as a Design Tool," Formal Specification Case Studies, MIT Press, 1989.

D. Hester, D. Parnas, and D. Utter, "Using Documentation as a Software Design Medium," Bell System Technical Journal, V60, 1981.

R. Jackendoff, "Cognitive Architecture of Language," MIT Press, 1984.

G. Kiczales, “Aspect-Oriented Programming,” Eighth Annual Workshop on Software Reuse, March 1997.

G. Lakoff, Women, Fire, and Dangerous Things-What Categories Reveal About the Mind, Chicago: University of Chicago Press, 1987.

G. Lakoff and M. Johnson, Philosophy in the Flesh-The Embodied Mind and Its Challenge to Western Thought, New York: Basic Books, 1999.

L. Latour, T. J. Wheeler,and B. Frakes, "Descriptive and predictive aspects of the 3C's model: SETA1 working group summary," First Symposium on Environments and Tools for Ada, Ada Letters, XI, 3, (Spring 1991).

J. Mandler "Preverbal Representation and Language" In Language and Space Bloom, Peterson, Nadel, Garrett Eds. MIT Press 1996

D. Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, San Francisco: W.H. Freeman, 1982.

A.L. MacEachren, How Maps Work-Representation, Visualization, and Design, New York: The Guilford Press, 1995.

D. Norman "The Design of Everyday Things" Penguin 1986

N. Paton, et al., "Conceptual Modelling of Genomic Information," Bioinformatics, V16, no. 6, 2000.

S. Pinker, The Language Instinct: How the Mind Creates Language, New York: William Morrow and Co., 1994.

M. I. Posner, ed., Foundations of Cognitive Science, Cambridge, MA: MIT Press, 1996.

M. Roth, F. Ozcan, L. Haas, "Don't Scrap it, Wrap it, A Wrapper Architecture for Legacy Data Sources" In Proc. VLDB Athens Greece Aug. 1997

M. Shaw and D. Garlan, Software Architecture: Perspectives on an Emerging Discipline, Upper Saddle River, NJ: Prentice Hall, 1996.

SIGSOFT, Fifth Symposium on Software Reusability, May 1999.

M. A. Simos, “Domain Envisioning: A Lightweight, Incremental Approach to Getting a Company Started with Systematic Reuse,” Ninth Annual Workshop on Software Reuse, January 1999.

J. F. Sowa, Knowledge Representation-Logical, Philosophical, and Computational Foundations, Cambridge, MA: Brooks/Cole, 2000.

R. E. Slavin, Cooperative learning: Theory, research, and practice. Englewood Cliffs, NJ: Prentice-Hall, 1990.

Spatial-Genomics Project, University of Maine

E. R. Tufte, Envisioning Information, Cheshire, CT: Graphics Press, 1990.

T. J. Wheeler and J. Richardson  "A Two Layered Interfacing Architecture," Journal of Standards & Interfaces, v.13, Elsevier-North Holland, 1991.

T. J. Wheeler, "Object Database Interface," DARPA Open Object Oriented Database Workshop, Dallas, Tx., 1992.

T. Wheeler, M. Dolan, and J. Richardson, A Framework for Interdisciplinary Collaboration Univ. of Maine CS Report, 2000

L. Wong, "Kleisli, its Exchange Format, Support Tools, and an Application in Protein Interaction Extraction" Bio-Informatics and Biomedical Engineering (BIBE2000) Arlington VA Nov. 2000

J. Wu, D. Marceau, "Modeling Complex Ecological Systems: an Introduction" Ecological Modelling 2002
 

Appendix

- Example Document Type Definition:

<!DOCTYPE organisms[
<!ELEMENT organisms (organism*)>                                        
                                                       <!-- !Model: Set -->
                                             <!-- !Semantics:Collection -->

<!-- ***************************************************************  -->

<!ELEMENT organism (kingdom,genus,species,subtype?,
                           common_name,comment?,genome+)>
                                                               <!-- (1) -->
                                                   <!-- !Model:Structure-->
                                 <!-- !Semantics:Whole/Part Construction-->
<!ATTLIST organism      id ID #REQUIRED         
                                              <!-- !Model:Unique Nat Num-->
                                      <!-- !Semantics:(Source-Path)-Goal-->

                        Name (#PCDATA)>
                                                       <!-- !Model:Name -->
                                                  <!-- !Semantics:Symbol-->
<!ELEMENT kingdom (#PCDATA)>
                                                       <!-- !Model:Name -->
                                                  <!-- !Semantics:Symbol-->
<!ELEMENT genus (#PCDATA)>          
                                                       <!-- !Model:Name -->
<!ELEMENT species (#PCDATA)>                                            
                                                       <!-- !Model:Name -->
<!ELEMENT subtype (#PCDATA)>                                            
                                                       <!-- !Model:Name -->
<!ELEMENT common_name (#PCDATA)>                                        
                                                       <!-- !Model:Name -->
<!ELEMENT comment (#PCDATA)>                                            
                                                       <!-- !Model:ch*  -->
                                          <!-- !Semantics:Points_on_Line-->

<!-- ***************************************************************  -->

<!ELEMENT feature (feature_type,start_coordinate,            
        end_coordinate,strand,feature_name,feature_symbol?,
        comment?,time_stamp?,transcript*)>       
                                                              <!-- (2) -->
                                                  <!-- !Model:Structure-->
                                           <!-- !Semantics:Construction-->
<!ATTLIST feature id ID #REQUIRED>       
                                             <!-- !Model:Unique Nat Num-->
                                     <!-- !Semantics:(Source-Path)-Goal-->
<!ATTLIST feature idref IDREF #REQUIRED>                        
                                                        <!-- !Model:REF-->
                                     <!-- !Semantics:Source-(Path-Goal)-->

<!ELEMENT feature_type (#PCDATA)>                        
                                                      <!-- !Model:Name -->
                                                 <!-- !Semantics:Symbol-->  
<!ELEMENT start_coordinate (#PCDATA)>       
                                                   <!-- !Model:Nat Num -->
                                       <!-- !Semantics:Position_on_Line-->
<!ELEMENT end_coordinate (#PCDATA)>                 
                                                   <!-- !Model:Nat Num -->
                                       <!-- !Semantics:Position_on_Line--> 
<!ELEMENT strand (#PCDATA)>       
                                     <!-- !Model:Name(="plus","minus") -->
<!ELEMENT feature_name (#PCDATA)>                        
                                                      <!-- !Model:Name -->
<!ELEMENT feature_symbol (#PCDATA)>                    
                                                      <!-- !Model:Name -->
<!ELEMENT DNA_SEQUENCE (#PCDATA)>                
                                                      <!-- !Model:ch*  -->
                                         <!-- !Semantics:Points_on_Line-->
<!ELEMENT comment (#PCDATA)>                          
                                                      <!-- !Model:ch*  -->
<!ELEMENT time_stamp (#PCDATA)>                  
                                                      <!-- !Model:Time -->
                                         <!-- !Semantics:Points_on_Line-->
                                                   
<!-- ***************************************************************  -->

<!ELEMENT gene1 is_a feature (transcript,annotation_list) >   <!-- (3) -->
                                  <!-- !Model: is_a = SubType (& deRef)-->
          <!-- !Semantics:Additional Construction & (Source-Path)-Goal -->
                                         <!-- !Model: (..) = Structure -->
                                          <!-- !Semantics:Construction -->

<!ELEMENT transcript (protein|enzyme) >         
                                                     <!-- !Model:Union -->
                                                 <!-- !Semantics:Choice-->
<!ELEMENT protein (sequence_length,amino_acid_sequence) >
                                                   <!-- !Model:Structure-->
                                            <!-- !Semantics:Construction-->
<!ELEMENT sequence_length (#PCDATA)>            
                                                    <!-- !Model:Nat Num -->
                                            <!-- !Semantics:Line_Segment-->
<!ELEMENT amino_acid_sequence (#PCDATA)>                         
                                                       <!-- !Model:ch*  -->
                                          <!-- !Semantics:Points_on_Line-->
<!ELEMENT annotation_list (annotation)*>      
                                                    <!-- !Model: annot* -->
                                          <!-- !Semantics:Points_on_Line-->
<!ELEMENT annotation (annot_type,annot_val)>
                                                  <!-- !Model:Structure-->
                                <!-- !Semantics:Part/Whole Construction-->
<!ELEMENT annot_type (#PCDATA)>                
                                                      <!-- !Model:Name -->
                                                 <!-- !Semantics:Symbol-->
<!ELEMENT annot_val (#PCDATA)>                 
                                                      <!-- !Model:ch*  -->
                                         <!-- !Semantics:Points_on_Line--> 
                                                    
<!-- ***************************************************************  -->

<!ELEMENT promoter is_a feature (annotation_list) >         
                                                              <!-- (4) -->
                                    <!-- !Model: is_a = SubType & deRef-->
          <!-- !Semantics:Additional Construction & (Source-Path)-Goal -->
                                         <!-- !Model: (..) = Structure -->
                                          <!-- !Semantics:Construction -->

<!ELEMENT annotation_list (annotation)*>      
                                                   <!-- !Model: annot* -->
                                         <!-- !Semantics:Points_on_Line-->
<!ELEMENT annotation (annot_type,annot_val)>
                                                  <!-- !Model:Structure-->
                                           <!-- !Semantics:Construction-->

<!ELEMENT annot_type (#PCDATA)>                
                                                      <!-- !Model:Name -->
                                                 <!-- !Semantics:Symbol-->
<!ELEMENT annot_val (#PCDATA)>                        <!-- !Model:ch*  -->
                                         <!-- !Semantics:Points_on_Line--> 

<!-- ***************************************************************  -->

<!ELEMENT gene2 is_a feature view_of(promoter, gene1) >     
                                                              <!-- (5) -->
                                    <!-- !Model: is_a = SubType & deRef-->
          <!-- !Semantics:Additional Construction & (Source-Path)-Goal -->
                                            <!-- !Model:view_of = View -->
                                            <!-- !Semantics:Surface_of -->
                                         <!-- !Model: (..) = Structure -->
                                          <!-- !Semantics:Construction -->

]>
<!-- ***************************************************************  -->
<!-- ***************************************************************  -->
<!-- INSTANCES:  -->
<!-- Eukaryota_Rodentia_Mus_musculus_GALT -->
                                                  <!--  dtd(1) -->
<ORGANISM Name="Mus musculus">
<KINGDOM>               Eukaryota               </KINGDOM>
<GENUS>                 Rodentia                </GENUS>
<SPECIES>               Mus musculus            </SPECIES>
<strain>                B6/CGAFIJ               </strain>
<db_xref>               taxon:10090             </db_xref>
<sex>                   female                  </sex>
<tissue_type>   liver                           </tissue_type>
<COMMON_NAME>   House Mouse                     </COMMON_NAME>
<annotation>    This reference sequence was provided by 
the Mouse Genome database (MGD).                </annotation>
<CHROMOSOME>    
<CHROMOSOME_NUMBER>             4               </CHROMOSOME_NUMBER>
<CHROMOSOME_NAME>               chromosome 4    </CHROMOSOME_NAME>
<CHROMOSOME_STRUCTURE>  linear                  </CHROMOSOME_STRUCTURE>
<STRAND>                                plus    </STRAND>
<Symbol>                                GALT    </Symbol>
<Feature_Name> galactose-1-phosphate uridyl transferase </Feature_Name>
<cM_Position>                   19.9                    </cM_Position>
<MGI_Accession_ID>              M:96265                 </MGI_Accession_ID>
<FEATURE>
<FEATURE_TYPE>          source                  </FEATURE_TYPE>
<START_COORDINATE>      1                       </START_COORDINATE>
<END_COORDINATE>        13731                   </END_COORDINATE>
<FEATURE_NAME>          GALT                    </FEATURE_NAME>
<DNA_SEQUENCE>
        1 ttcagggtgg gtgggcgggg ggagacatgg aatggggcgc tcaccttgtg taccttaggt
       61 caattcgtgt ggcctcacgt cgcatagcga cgcgatcctg agcagcgcca cgaggcttca
      121 gaggcggacc gatggcagcg accttccggg cgagcgaaca ccagcatatt cgctacaacc
      181 cgctccagga cgagtgggtg ttagtgtcgg ctcatcgcat gaagcggccc tggcaaggac
      241 aagtggagcc ccagcttctg aagacagtgc cccgccacga cccactcaac cctctgtgtc
      301 ccggggccac acgagctaat ggggaggtga atccccacta tgatggtacc tttctgtttg
      361 acaatgactt cccggctctg cagcccgatg ctccggatcc aggacccagt gaccaccctc
      421 tcttccgagc agaggccgcc agaggagttt gtaaggtcat gtgcttccac ccctggtcgg
      481 atgtgacgct gccactcatg tctgtccctg agatccgagc tgtcatcgat gcatgggcct
      541 cagtcacaga ggagctgggt gcccagtacc cttgggtgca gatctttgaa aataaaggag
      601 ccatgatggg ctgttctaac ccccatcccc actgccaggt ttgggctagc agcttcctgc
      661 cagatatcgc ccagcgtgaa gagcgatccc agcagaccta tcacagccag catggaaaac
      721 ctttgttatt ggaatatggt caccaagagc tcctcaggaa ggaacgtctg gtcctaacca
      781 gtgagcactg gatagttctg gtccccttct gggcagtgtg gcctttccag acacttctgc
      841 tgccccggcg gcacgtgcgg cggctacctg agctgaaccc cgctgagcgt gatctcgcct
      901 ccatcatgaa gaagctcttg accaagtacg acaatctatt tgagacatcc tttccctact
      961 ccatgggctg gcatggggct cccacgggat taaagactgg agccacctgt gaccactggc
     1021 agctccacgc ccactactac cccccacttc tgcgatccgc aactgtccgg aagttcatgg
     1081 ttggaccgtg tacactggca gctcacgccc actacctacc cccacttctc ggatccgcaa
     1141 ctgtctatga aatgcttgcc caggcccagc gtgacctcac tcccgaacag gccccagaaa
     1201 gattaagggc gcttcccgag gtacactatt gcctggcgca gaaagacaag gaaacggcag
     1261 gatcaccatt gcttgactgt gaccacatca gggccttgaa tctttgtacc tgacagacct
     1321 gggacctgga gttcgggcag atgtgacatc aataaaactg cgtctcacat ttt
</DNA_SEQUENCE>
</FEATURE>

<!-- ***************************************************************  -->

                                                          <!--  dtd(3) -->
<GENE1>                                                                 
<FEATURE_TYPE>          gene                    </FEATURE_TYPE>
<START_COORDINATE>      132                     </START_COORDINATE>
<END_COORDINATE>        1313                    </END_COORDINATE>
<Feature_Name>          GALT                    </Feature_Name>
<DNA_SEQUENCE>
      121             atggcagcg accttccggg cgagcgaaca ccagcatatt cgctacaacc
      181 cgctccagga cgagtgggtg ttagtgtcgg ctcatcgcat gaagcggccc tggcaaggac
      241 aagtggagcc ccagcttctg aagacagtgc cccgccacga cccactcaac cctctgtgtc
      301 ccggggccac acgagctaat ggggaggtga atccccacta tgatggtacc tttctgtttg
      361 acaatgactt cccggctctg cagcccgatg ctccggatcc aggacccagt gaccaccctc
      421 tcttccgagc agaggccgcc agaggagttt gtaaggtcat gtgcttccac ccctggtcgg
      481 atgtgacgct gccactcatg tctgtccctg agatccgagc tgtcatcgat gcatgggcct
      541 cagtcacaga ggagctgggt gcccagtacc cttgggtgca gatctttgaa aataaaggag
      601 ccatgatggg ctgttctaac ccccatcccc actgccaggt ttgggctagc agcttcctgc
      661 cagatatcgc ccagcgtgaa gagcgatccc agcagaccta tcacagccag catggaaaac
      721 ctttgttatt ggaatatggt caccaagagc tcctcaggaa ggaacgtctg gtcctaacca
      781 gtgagcactg gatagttctg gtccccttct gggcagtgtg gcctttccag acacttctgc
      841 tgccccggcg gcacgtgcgg cggctacctg agctgaaccc cgctgagcgt gatctcgcct
      901 ccatcatgaa gaagctcttg accaagtacg acaatctatt tgagacatcc tttccctact
      961 ccatgggctg gcatggggct cccacgggat taaagactgg agccacctgt gaccactggc
     1021 agctccacgc ccactactac cccccacttc tgcgatccgc aactgtccgg aagttcatgg
     1081 ttggaccgtg tacactggca gctcacgccc actacctacc cccacttctc ggatccgcaa
     1141 ctgtctatga aatgcttgcc caggcccagc gtgacctcac tcccgaacag gccccagaaa
     1201 gattaagggc gcttcccgag gtacactatt gcctggcgca gaaagacaag gaaacggcag
     1261 gatcaccatt gcttgactgt gaccacatca gggccttgaa tctttgtacc tga
</DNA_SEQUENCE>
<TRANSCRIPT>
<PROTEIN>
<protein_id>    AAA37658.1                      </protein_id>
<db_xref>               GI:193422               </db_xref>
<SEQUENCE_LENGTH>               109             </SEQUENCE_LENGTH>
<AMINO_ACID_SEQUENCE>
MAATFRASEHQHIRYNPLQDEWVLVSAHRMKRPWQGQVEPQLLKTVPRHDPLNPLCPG
ATRANGEVNPHYDGTFLFDNDFPALQPDAPDPGPSDHPLFRAEAARGVCKVMCFHPWS
DVTLPLMSVPEIRAVIDAWASVTEELGAQYPWVQIFENKGAMMGCSNPHPHCQVWASS
FLPDIAQREERSQQTYHSQHGKPLLLEYGHQELLRKERLVLTSEHWIVLVPFWAVWPF
QTLLLPRRHVRRLPELNPAERDLASIMKKLLTKYDNLFETSFPYSMGWHGAPTGLKTG
ATCDHWQLHAHYYPPLLRSATVRKFMVGPCTLAAHAHYLPPLLGSATVYEMLAQAQRD
LTPEQAPERLRALPEVHYCLAQKDKETAGSPLLDCDHIRALNLCT
</AMINO_ACID_SEQUENCE>
</PROTEIN>
</GENE1>



<!-- ***************************************************************  -->
                                                          <!--  dtd(5) -->
<GENE2>

<GENE1>
<FEATURE_TYPE>          gene                    </FEATURE_TYPE>
<START_COORDINATE>      132                     </START_COORDINATE>
<END_COORDINATE>        1313                    </END_COORDINATE>
<Feature_Name>          GALT                    </Feature_Name>
<DNA_SEQUENCE>
      121>          atggcagcg accttccggg cgagcgaaca ccagcatatt cgctacaacc
                                        ...

<PROMOTER>
<FEATURE_TYPE>          UAS                     </FEATURE_TYPE>
<START_COORDINATE>      13                      </START_COORDINATE>
<END_COORDINATE>        22                      </END_COORDINATE>
<DNA_SEQUENCE>  13> gggcgggggg                  </DNA_SEQUENCE>
</PROMOTER>
</GENE1>
</GENE2>

<!-- ***************************************************************  -->
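
The Position_on_Line semantics attached to start_coordinate and end_coordinate can be exercised with a small check. The Python sketch below parses a hand-cleaned, well-formed fragment whose coordinate values are copied from the instances above; the wrapping `FEATURES` element and the helper names `span` and `contains` are assumptions made for the illustration and do not appear in the DTD.

```python
import xml.etree.ElementTree as ET

# Coordinates copied from the instances above; FEATURES is a
# hypothetical wrapper added only to obtain a single root element.
FRAGMENT = """
<FEATURES>
  <FEATURE>
    <FEATURE_TYPE>     source </FEATURE_TYPE>
    <START_COORDINATE> 1      </START_COORDINATE>
    <END_COORDINATE>   13731  </END_COORDINATE>
  </FEATURE>
  <GENE1>
    <FEATURE_TYPE>     gene   </FEATURE_TYPE>
    <START_COORDINATE> 132    </START_COORDINATE>
    <END_COORDINATE>   1313   </END_COORDINATE>
  </GENE1>
  <PROMOTER>
    <FEATURE_TYPE>     UAS    </FEATURE_TYPE>
    <START_COORDINATE> 13     </START_COORDINATE>
    <END_COORDINATE>   22     </END_COORDINATE>
  </PROMOTER>
</FEATURES>
"""


def span(element):
    """Return (start, end) of a feature-like element as integers."""
    start = int(element.findtext("START_COORDINATE").strip())
    end = int(element.findtext("END_COORDINATE").strip())
    return start, end


def contains(outer, inner):
    """Container metaphor on a line: inner lies within outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


root = ET.fromstring(FRAGMENT)
spans = {child.tag: span(child) for child in root}

gene_in_source = contains(spans["FEATURE"], spans["GENE1"])
promoter_before_gene = spans["PROMOTER"][1] < spans["GENE1"][0]
```

Both checks hold for the values given: the gene lies inside the source feature, and the promoter ends before the gene starts.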