Flexible Common Schema Study
Frank Manola
Object Services and Consulting, Inc.
fmanola@objs.com
2 December 1998
Executive Summary
This report reviews the DARPA ISO Common Schema effort, and makes recommendations
regarding its direction. The report pays particular attention to
the modeling approach used in the Common Schema effort compared with object
modeling approaches used in other ISO activities (e.g., ALP and Genoa),
and related technologies for supporting interoperability while accommodating
a reasonable degree of modeling flexibility and heterogeneity.
The ISO Common Schema (also sometimes known as the Command and Control
(C2) Schema) supports DARPA ISO's Advanced Information Technology System
(AITS) Architecture, which, in turn, supports
a number of ISO programs. The Common Schema is viewed as a central mechanism
to ensure semantic interoperability among the distributed services and
applications of the architecture.
The Common Schema documentation reflects a general awareness of the
scope, magnitude, and potential difficulties of the task (even though the
program has, so far, not necessarily completely dealt with all of them),
and has identified a number of outstanding issues. These issues are fairly
typical of issues that have arisen in similar efforts within large organizations
to develop reusable class libraries in support of large scale information
systems, while simultaneously attempting to support and integrate ongoing
system development activities. Hence, the existence of these issues does
not necessarily reflect adversely on the Common Schema program, but rather
indicates the essential difficulty of the problem being addressed.
The report makes a number of observations and recommendations:
-
The idea of a Common Schema, in the sense of the definition of common domain
semantics to be shared and mutually understood by the components in a distributed
object architecture, is essential for achieving semantic interoperability.
The program directs attention to the problem of developing these domain
semantics, provides a vehicle for getting members of the ISO community
to think about these issues, and thus provides an important contribution.
The goals of the program could possibly be enhanced by more emphasis
in a number of specific areas, all of which are somewhat interrelated,
and which are described below. These areas have been addressed to a certain
extent already, and some of them are research topics in their own right,
but a renewed focus on these areas, based on the experience with the Common
Schema effort now available, would be rewarding.
-
The Common Schema program needs to distinguish semantics and implementation
more carefully, and clarify its intent at both levels. The Common
Schema program effectively has the task of both capturing common domain
semantics, and also reflecting these common semantics in terms of object
definitions that can be used in real software. However, these things seem
in some cases to be overly inter-mixed. Separating semantics from implementation
allows alternative representations of domain objects (or object types)
to be considered for use within the architecture, provided that common
semantics are maintained. This separation can also improve the stability
of the Common Schema definitions, in the sense that the semantics can remain
stable while more implementation aspects, like the language and object
modeling technologies used, may change independently. However, this separation
can also complicate implementation, and hence the tradeoffs involved need
to be carefully considered.
-
The Common Schema program should provide enhanced support for technologies
that accommodate changed or heterogeneous object definitions. On the surface,
the Common Schema program represents an attempt to support interoperability
by reducing heterogeneity in the data and objects to be handled, by defining
a common set of definitions that all participating components will use.
However, the program should also consider the need for flexibility in the
face of changed or specialized requirements. Hence, technologies that accommodate
changed or heterogeneous object definitions need to be carefully considered
as part of the context of the Common Schema program (even if they are not
directly addressed by that program). These techniques include:
-
schema evolution techniques to deal with changes in the Common Schema definitions
-
schema architectures (use of views or component schemas, as in federated
DBMSs or multi-tier object architectures) to allow components to specify
their own requirements, and provide a way to manage well-defined mappings
between common object definitions and those needed in particular applications
or component federations
-
more flexible object modeling techniques
-
more flexible messaging techniques that allow components to be more resilient
in adapting to changes in object definitions
The Common Schema program has explicitly recognized some of these issues.
However, it is not clear how much emphasis the program is placing on these
areas, and how well coordinated those areas are with the rest of the program.
-
Knowledge of the Common Schema activity, and serious interaction with it,
seems to be uneven within the ISO community. Obtaining the cooperation
of the various programs involved in the architecture, and active interaction
among the people involved, will be crucial to the success of the program.
One aspect of the program (and, in fact, of ISO programs in general) which
could be significantly improved is the ease of access to program information.
At the same time, the Common Schema activity needs to pay more attention
to DoD standards, terms and semantics used in other programs (and in practice),
and other DoD programs developing object model standards (e.g., DMSO simulation
data models), and establish relationships with these programs so as to
avoid both possible duplication of effort and inconsistent specifications.
1. Introduction
The purpose of this report is to review the DARPA ISO Common Schema effort,
as described on the Object Model Working Group's Web page <http://ics.les.disa.mil>,
and make recommendations regarding its direction. The review comments on
overall aspects of the Common Schema effort. However, a particular emphasis
is to comment on the modeling approach used in the Common Schema effort
compared with object modeling approaches used in other ISO activities (e.g.,
ALP and Genoa), and related technologies for supporting interoperability
while accommodating a reasonable degree of modeling flexibility and
heterogeneity.
In performing this review, I reviewed the program documentation and
technical literature cited in the reference section (as well as other program
documentation not explicitly referenced). I also accessed Common Schema
interfaces maintained in the Common Schema activity's ICS tool, as well
as interfaces in the DMSO HLA Object Model Data Dictionary System and Object
Model Library <http://triton.dmso.mil/hla/data_sup/>.
In addition, I had phone conversations and email interactions with the
following individuals, from whom I obtained
a great deal of valuable information:
-
John Anderson, MITRE (Common Schema technical lead) <janderso@mitre.org>
-
Greg Mack, Booz-Allen & Hamilton (Genoa) <gmack@bah.com>
-
Steve Milligan, GTE-BBN (Advanced Logistics Program) <milligan@bbn.com>
I gratefully acknowledge their help and insight.
Of course, the observations and recommendations in this report are mine.
Before proceeding with the meat of the review, it is necessary to state
a few caveats. First, I am neither a C4ISR domain
expert, nor (primarily) an object analysis and design methodologist. My
primary areas of expertise are in the technical details of object models
(in the sense of object definition technology), distributed object architectures,
and Web technology. In addition, this review has been performed within
a one-month time period, and so cannot hope to be comprehensive. The observations
and recommendations in this report must thus be evaluated with these caveats
in mind. I suspect that at least several of the observations and recommendations
made here have been raised before.
2. Overview
The ISO Common Schema (also sometimes known as the Command and Control
(C2) Schema) effort grew out of an effort to build a common schema for
the JTF-ATD Architecture. The Common Schema supports DARPA ISO's Advanced
Information Technology System (AITS) Architecture [McK98], which supports
a number of ISO programs, including:
-
Advanced Logistics Program (ALP)
-
Genoa
-
Dynamic Multi-user Information Fusion (DMIF)
-
Joint Force Air Component Commander (JFACC)
-
Joint Task Force (JTF) Advanced Technology Demonstration (ATD)
The Common Schema serves as a fundamental integrating technology within
the architecture, and is being defined and managed by an Object Model Working
Group (OMWG).
The Common Schema has two primary goals [CStutorial]:
- Interoperability
  - Allow servers and applications to have common data definitions for the
    sharing and exchange of data
  - Standardize naming, structure, and services for data within the ATD architecture
  - Pursue seamless exchange of information between architecture components
- Reuse
  - Support reuse of data methods and embedded services
  - Reduce dependence on static data representations to increase supportability
    and portability
  - Reduce development and testing time
In the short term, these goals apply within ISO programs; in the long term,
the vision is to support these goals more broadly among DoD programs.
The Common Schema is described using a subset of the Unified Modeling
Language (UML) [CSschema, CStutorial], a de facto (and now OMG) standard
language for describing object models. The structure of the Common Schema
is that of a single, deep class (inheritance) hierarchy. However, the Common
Schema is logically divided into separate areas (based roughly on the communities
with primary interest in those parts of the schema). Formal change control
procedures are defined to evaluate and introduce new or changed definitions
into the various Common Schema divisions. The Common Schema effort also
provides for the use of project-specific schemata, at least on a temporary
basis until the needs of the projects can be accommodated within the Common
Schema.
Several tools support the Common Schema effort. TASC's Interface Control
System (ICS) tool acts as a repository for the Common Schema interface
definitions. ICS accepts interface definitions in CORBA IDL, and stores
them in its own Object Definition Format. It can output interface definitions
in IDL, C++, or Java. SAIC has also developed a repository based on UML,
which provides interfaces to code generation tools. (It should be noted
that there are commercial tools that support UML modeling and mapping those
models to various languages; for example, Rational Rose <http://www.rational.com/>
supports mapping to C++, Java, IDL, and Ada.) The ICS and SAIC tools currently
are not integrated. The program also intended to use Ontolingua and Ontosaurus
as ontology development tools, with the resulting ontologies in turn
used to develop the Common Schema. However, the use of these ontology tools
has apparently been postponed.
Run-time support for accessing Common Schema information is also provided.
The Schema Server is a UML-based server which provides run-time access
to schema information and related metadata (such as information to tie
object definitions to databases from which state data to build object instances
can be obtained). In addition, a Dynamic Schema Service [CSdss] is intended
to support schema evolution, as well as mapping of application-specific
object definitions to Common Schema definitions.
Additional background can be obtained at the OMWG Web site <http://ics.les.disa.mil>
(an account is required, which may be obtained by contacting John Anderson
<janderso@mitre.org>).
[CSschema] identifies several hundred object classes, and the ICS tool
contains interface definitions for a substantial proportion of these classes.
Classes for additional aspects of the relevant application domains are
also being defined. In addition, class definitions representing binary
agreements between specific programs that need to interoperate are being
included in the repository. These definitions are not yet part of the Common
Schema, but are being made available for reuse by other programs that might
wish to use them. Given the scale of the complete C4ISR domain, it is safe
to say that the current Common Schema definitions do not represent a "complete"
set of class definitions for the C4ISR domain. However, it is difficult
to evaluate "how complete" the definitions are, since there are multiple
definitions possible, which would be more or less reasonable for different
purposes (in fact, the definitions may never really be "complete", since
the Common Schema represents definitions that must evolve over time). Section
3.4 considers the relationship of the Common Schema effort to other DoD
schema efforts.
The Common Schema documentation reflects a general awareness of the
scope, magnitude, and potential difficulties of the task (even though the
program has, so far, not necessarily completely dealt with all of them).
For example, Common Schema documentation has identified outstanding issues
such as:
Object Modeling
-
Subjectivity of design
-
Representation and management of semantic information
-
Resolution of conflicts between Common Schema and project schemata
-
Schema vs. instances/dynamic relationship representations (e.g., ALP Prototypes)
Object Management
-
Incompatibilities among language constructs
-
Evolving standards for object definition and storage
-
Language/platform-dependent tools and integration of separate tools
-
Runtime transparency to object/class version evolution
Programmatic Issues
-
Multiple interrelated initiatives with separate schedules and deliverables
-
Obtaining acceptance and use of definitions
These issues are fairly typical of issues that have arisen in similar efforts
within large organizations to develop reusable class libraries in support
of large scale information systems, while simultaneously attempting to
support and integrate ongoing system development activities. For example,
such efforts typically encounter problems reconciling the near-term needs
of individual development projects with longer-term reuse requirements:
success measures tend to be based on individual projects rather than on
the extent to which a coherent architecture is being developed; little
funding is specifically directed at developing an overall architecture;
and there are often inadequate concrete incentives for individual programs
to cooperate in achieving overall architectural goals. Such efforts also
typically encounter problems in defining the detailed structure of the
shared class library, such as the need to mediate between alternative object
classification approaches and other distinct requirements of the various
individual participating projects. Hence, the existence of these issues
does not necessarily reflect adversely on the Common Schema program, but
rather indicates the essential difficulty of the problem being addressed.
3. Observations/Recommendations
3.1 A Common Schema or Object Model
is essential.
Overall, the idea of a Common Schema, in the sense of the definition of
common domain semantics to be shared and mutually understood by the components
in a distributed object architecture, is essential for achieving
semantic interoperability, i.e., the ability of the components to mutually
understand the meanings of interactions with each other (either operation
invocations or data exchanges).
As it stands, the program directs attention to the problem, provides
a vehicle for getting members of the ISO community to think about these
issues, and thus provides an important contribution. Even the emergence
of problems associated with the program is useful, provided that the proper
lessons are learned from them. As noted above, the program has actually
identified a number of outstanding issues, which need to be the focus of
further effort. Overall, it must be recognized that defining the Common
Schema will be difficult, and will necessarily evolve through use and continued
interaction with the needs of specific applications.
The goals of the program could possibly be enhanced by more emphasis
in a number of specific areas, all of which are somewhat interrelated,
and which are described in the following sections. These areas have been
addressed to a certain extent in the program already, and some of them
are research topics in their own right, but a renewed focus on these areas,
based on the experience with the Common Schema effort now available, would
be rewarding.
For example, due to the extent of the C4ISR domain, and the need to
accommodate continuing change, techniques for creating and managing federations
and associated schema architectures also need to be provided. A
federation is a collection of components that interoperate with each other
using common definitions agreed on by the federation (a federation schema),
common definitions which are not necessarily shared (or shared completely)
throughout the entire architecture. Mappings can be defined between federation
schemas to govern interoperation between members of one federation and
those of another. A federation of this type may exist temporarily (e.g.,
until its definitions are more widely accepted and integrated into the
Common Schema) or for a more extended period (e.g., if the members of the
federation have less need to interoperate extensively with outside components).
The Common Schema program needs an approach that supports federation, and
also supports managing the evolution of federation-specified definitions
toward wider acceptance within the architecture where this is appropriate.
The Common Schema program recognizes the need for temporary agreements
(e.g., between pairs of programs) and certain forms of schema mappings,
but it is not clear how much emphasis the program is placing on this area.
This subject is discussed further in Section 3.3.1.
3.2 Distinguish Semantics and Implementation,
and Clarify Intent at Both Levels
The Common Schema program in principle incorporates the use of languages
and corresponding tools which represent usefully-different levels of abstraction:
-
Ontolingua, LOOM (Ontosaurus)--assertional semantics
-
UML (e.g., in the SAIC repository)--can be used at analysis, design, or
implementation levels, but generally considered an analysis and design
language
-
IDL, Java, C++ (ICS)--implementation level interfaces
However, in practice, the program does not seem to have taken full advantage
of the potential provided by this separation of levels. For example:
-
The ontology tools do not seem to have been used a great deal
-
The tools are not well integrated, nor do they appear to have been given
distinct roles, corresponding to the different levels of abstraction they
(potentially) represent
-
While UML has been used (e.g., in [CSschema]), only a subset of its modeling
capabilities appear to have actually been used, and it seems to have been
used primarily as just another way to define implementation-level classes.
In particular, features of UML such as multiple inheritance, multiple interfaces,
dynamic type changes, etc. [FS97] have not been used.
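To illustrate the kind of multiple classification that UML can express but a single-inheritance hierarchy cannot, consider the following sketch. Python is used purely for illustration, and all class and operation names here are hypothetical, not Common Schema definitions:

```python
# Hypothetical sketch: one domain object exposing multiple interfaces,
# as UML (but not a single-inheritance IDL hierarchy) can express.
# All names are illustrative, not actual Common Schema classes.

class Sensor:
    """Interface for the sensing role of a platform."""
    def detection_range_km(self):
        raise NotImplementedError

class Transporter:
    """Interface for the transport role of a platform."""
    def cargo_capacity_tons(self):
        raise NotImplementedError

class MaritimePatrolAircraft(Sensor, Transporter):
    """A platform classified in two ways at once (multiple classification)."""
    def detection_range_km(self):
        return 300
    def cargo_capacity_tons(self):
        return 9

p = MaritimePatrolAircraft()
# Clients written against either interface can use the same object:
assert isinstance(p, Sensor) and isinstance(p, Transporter)
```

A client needing only the sensing role can be written entirely against the `Sensor` interface, without knowing the object's other classifications.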
The Common Schema program effectively has the task of both capturing common
domain semantics, and also reflecting these common semantics in terms of
object definitions that can be used in real software. However, these things
should not become overly inter-mixed.
UML, possibly with some semantic extensions, seems to be a reasonable
choice for defining the Common Schema. However, UML can be used at multiple
levels of abstraction, to represent analysis, design, and implementation
levels. Ideally, it should be used mainly at an analysis or conceptual
design level, capturing agreements on vocabularies and element meanings
and groupings. Similarly, IDL (at least currently) is a reasonable choice
for implementation-level interface definition. However, if the use of UML
constructs is overly-constrained based on implementation-level considerations
(e.g., the ease with which the UML can be translated to IDL interfaces),
then:
-
this may limit the extent to which semantics can be captured that are not
readily captured in IDL by itself (e.g., that a given platform may be classified
in alternative ways, or have multiple interfaces)
-
there doesn't seem to be much advantage in using UML over using IDL by
itself, together with direct mappings from IDL to programming language
interface definitions
For example, the use of a deep, single-inheritance class hierarchy in the
Common Schema suggests an emphasis on static, compile-time type checking
as a means of ensuring interoperability among applications. This may or
may not be the actual intent. For example, it may not be the intent to
rule out more dynamic type systems that may be used in particular applications,
such as the ALP's Logical Data Model (discussed further in the next section).
Separating semantics from implementation allows alternative representations
of domain objects (or object types) to be considered for use within the
architecture, provided that common semantics are maintained. There are
several alternative representations within DARPA ISO projects that could
be examined in this connection (see Section 3.3). This separation can also
improve the stability of the Common Schema definitions, in the sense that
the semantics can remain stable while more implementation aspects, like
the language and object modeling technologies used, may change independently.
At the same time, however, using all of UML and generating complex mappings
to language class libraries adds complexity to the implementation layer.
The tradeoff between semantic richness and implementation complexity needs
careful consideration.
A related issue with the current Common Schema definitions is that in
many cases the classes seem to lack attribute definitions (this is certainly
true of the platform classes) and in most cases lack operation definitions.
Without attribute or operation definitions, type checking reduces to determining
that applications have used the proper class names, but this plays no significant
role in providing interoperability, and makes the need for a strict class
hierarchy questionable. In addition, the lack of operation definitions
raises the question of whether the definitions are really intended to be
those of distributed objects which could possibly be invoked remotely,
or whether the definitions are intended to be those of data records to
be exchanged between distributed applications.
3.3 Provide Enhanced Support for Technologies
that Accommodate Changed or Heterogeneous Object Definitions
On the surface, the Common Schema program represents an attempt to support
interoperability by reducing heterogeneity in the data and objects to be
handled, by defining a common set of definitions that all participating
components will use. However, the program should also consider the need
for flexibility in the face of changed or specialized requirements. Hence,
technologies that accommodate changed or heterogeneous object definitions
need to be carefully considered as part of the context of the Common Schema
program (even if they are not directly addressed by that program). These
techniques include:
-
schema evolution techniques to deal with changes in the Common Schema definitions
-
schema architectures (use of views or component schemas, as in federated
DBMSs or multi-tier object architectures) to allow components to specify
their own requirements, and provide a way to manage well-defined mappings
between common object definitions and those needed in particular applications
or component federations (see Section 3.3.1.)
-
more flexible object modeling techniques (see Section 3.3.2.)
-
more flexible messaging techniques that allow components to be more resilient
in adapting to changes in object definitions
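The last of these techniques, more flexible messaging, can be illustrated with a small sketch of tagged-data handling: a receiver extracts the fields it knows and ignores the rest, so adding a field to a message definition does not break existing receivers. Python is used purely for illustration, and all field names are hypothetical:

```python
# Hypothetical sketch of tagged-data messaging: each field is carried as a
# name/value pair, so a receiver built against an older definition can
# ignore fields it does not recognize rather than failing.

def extract(message, known_fields, defaults):
    """Keep known fields, fill missing ones with defaults, drop the rest."""
    result = dict(defaults)
    for name, value in message.items():
        if name in known_fields:
            result[name] = value
    return result

# A "newer" sender includes a field ("fuel_state") the receiver does not know:
msg = {"track_id": "T-17", "latitude": 36.1, "longitude": -5.4, "fuel_state": 0.62}
view = extract(msg, known_fields={"track_id", "latitude", "longitude"},
               defaults={"track_id": None, "latitude": 0.0, "longitude": 0.0})
assert "fuel_state" not in view and view["track_id"] == "T-17"
```

The cost of this resilience is that field agreement is checked at run time rather than compile time, which is one aspect of the tradeoffs discussed below.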
The Common Schema program has explicitly recognized some of these issues.
For example, [CSmanagement] briefly discusses long term interoperability
as a goal, and the Dynamic Schema Server (DSS) effort considers aspects
of some of the techniques identified above. However, it is not clear how
much emphasis the program is placing on these areas, and how well coordinated
those areas are with the rest of the program.
More specifically, the Common Schema effort explicitly recognizes the
need to support some forms of schema adaptability and heterogeneity. Two
general approaches are being taken to deal with this. The first, and more
"static", involves the use of formal change control procedures to assess
proposed changes to the schema, and include them if necessary. The more
"dynamic" approach is embodied in the Dynamic Schema Server (DSS) effort
[CSdss]. The DSS reflects the fact that systems using the Common Schema
definitions may require the ability to adapt dynamically to changes in
the schema interface definitions. For example, changes in Common Schema
definitions should not necessarily require recompilation of existing systems.
[CSdss] contains a good discussion of many of the issues involved. The
DSS provides facilities for recognizing different versions of schema interfaces,
and the concept of a special server that maps between different versions
of the same interface such that, e.g., a client can identify the particular
interface version it wants to use, and be insulated from the fact that
a particular server implements a different version. In supporting these
facilities, the DSS concept actually identifies a number of much broader
issues related to dealing with heterogeneity in general, and what types
of heterogeneity the architecture will be prepared to deal with. In particular:
-
[CSdss] describes an "Interface Association Language" (IAL) for defining
mappings between the various versions of a given interface. This is essentially
a form of support for "views" (in the database sense), and suggests the
need for a general schema architecture (in the federated database sense)
as part of the architecture (see Section 3.3.1).
-
[CSdss] discusses the potential use of tagged data (or possibly a hybrid
representation including such data) as a more flexible object/data representation
approach. This raises the general issue of possibly using more flexible
object models and implementation structures in the architecture (see Section
3.3.2.).
These techniques need to be carefully considered, and their relationships
established, both among themselves, and to the types of heterogeneity to
be dealt with by the architecture the Common Schema supports (since these
techniques are all, in essence, ways of coping with different types of
heterogeneity). The specific subjects of views and more flexible object
models are discussed in more detail below.
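The version-mapping facility described above can be made concrete with the following sketch of a mediator function that maps a newer record layout to the shape an older client expects. Python is used purely for illustration; the record layouts and names are hypothetical, not taken from [CSdss]:

```python
# Hypothetical sketch of the DSS idea: a mediator maps between two versions
# of "the same" interface, so a client bound to version 1 can be insulated
# from a server implementing version 2. Names and layouts are illustrative.

def v2_to_v1(obj_v2):
    """Map a version-2 record to the version-1 shape a legacy client expects."""
    return {
        "name": obj_v2["name"],
        # Suppose v2 split "location" into lat/lon; v1 clients expect one pair.
        "location": (obj_v2["lat"], obj_v2["lon"]),
        # v2-only fields (e.g., "status") are simply not delivered to v1 clients.
    }

server_record = {"name": "Unit-5", "lat": 33.9, "lon": 44.4, "status": "active"}
client_view = v2_to_v1(server_record)
assert client_view["location"] == (33.9, 44.4)
```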
3.3.1 views and schema architectures
Large scale distributed object systems increasingly are being designed
with 3- (or sometimes multi-) tier architectures [MGHH+98]. These architectures
involve the division of the system's components (and object definitions)
into functional tiers based on the different functional concerns they address.
For example, a typical 3-tier architecture has a tier for objects representing
user interface elements, a tier for business or application objects, and
a tier for database servers. The business object tier separates out the
common definitions of enterprise operations and semantics from the more
specialized concerns addressed in the other tiers. Similar ideas
are reflected in federated DBMS architectures [SL90], which typically define
several distinct types of schemas (sometimes called "views" in the DBMS
literature), including:
-
schemas representing the contents of the individual databases forming the
federation
-
schemas representing the requirements of individual applications or users
-
a canonical or global schema representing the complete information contents
of the federation, and to which the other schemas are mapped (this mapping
can also involve data model translation between the data model used for
the global schema and the data models used in the individual databases);
in some cases, there may be multiple federations each having its own canonical
schema
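The mapping from an individual database's schema to the canonical schema can be as simple as renaming and unit translation, as in the following illustrative sketch (Python; the field names, classes, and conversion are hypothetical, not drawn from any actual federation):

```python
# Hypothetical sketch of a federated-DBMS mapping: a component database's
# local schema is translated into the canonical (global) schema, including
# a unit conversion. All names and units are illustrative.

NM_PER_KM = 0.539957  # knots per km/h (nautical miles per kilometre)

def local_to_canonical(row):
    """Map a local 'vessel' row to the canonical 'Platform' shape."""
    return {
        "platform_name": row["vessel_name"],              # rename
        "max_speed_knots": row["speed_kmh"] * NM_PER_KM,  # unit translation
    }

canonical = local_to_canonical({"vessel_name": "HMS Example", "speed_kmh": 50.0})
assert canonical["platform_name"] == "HMS Example"
```

In a real federation such mappings would also handle structural and data model differences, not just field-level translation.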
Somewhat similar ideas are also reflected in the Model-View-Controller
concepts in object-oriented software development, the ANSI/SPARC three-schema
DBMS architecture [TK77], and the Federation Object Model and Simulation
Object Model concepts of DMSO's High Level Architecture [Lutz]. The basic
idea of these approaches is to address separation of concerns, or the need
to approach a complex problem by breaking it into more tractable pieces.
This is done by providing separate definitions of the modeling requirements
of distinct parts of the architecture, and addressing interoperability
by providing well-defined mappings between these parts, based on a central
definition of enterprise semantics. In the case of the ISO architecture,
the Common Schema provides the central definition of enterprise semantics,
but there is a need to address the other schema levels as well. Specialized
schemas represent a way of capturing alternative representations or the
requirements of distinct federations, and mappings between them, governed
by the Common Schema (as defining the common underlying semantics).
The DSS IAL provides what is effectively a view definition language
to deal with, e.g., screening out unnecessary object attributes when delivering
an object to an application. [CSmanagement] mentions as an issue the use
of "Schema Object Views" for "efficient schema use and expansion", and
for filtering large schema objects to provide only the specific attributes
and methods required by an application. [CSroadmap] mentions both multiple
ontological views of the repository (to provide tailored classification
structures) and repository views (for filtering schema objects based on
specific user requirements). These ideas should
be pursued, generalized, and made a more explicit part of the overall architecture
within which the Common Schema functions.
There is research on view creation in object databases (e.g., [AB91,
KKS92]), and view mapping in federated database systems (e.g., [FR97]),
that could be applied to this issue, as well as using more direct means
of defining interface mappings in individual object class definitions.
(It is worth noting that UML provides a capability for modeling objects
with multiple interfaces, although this is perhaps not the best way to
represent the multiple schemas being described here.)
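The attribute-filtering style of view discussed above can be sketched as follows (Python, purely illustrative; the class, attribute names, and data are hypothetical):

```python
# Hypothetical sketch of a "schema object view": an application declares the
# attributes it needs, and the view screens out the rest of a large schema
# object before delivery. All names are illustrative.

class View:
    def __init__(self, attributes):
        self.attributes = set(attributes)

    def project(self, schema_object):
        """Deliver only the attributes this view exposes."""
        return {k: v for k, v in schema_object.items() if k in self.attributes}

full_object = {"id": 7, "name": "Airfield-3", "runway_m": 2400,
               "logistics_codes": ["A1", "B7"], "imagery_refs": ["img-001"]}
planner_view = View(["id", "name", "runway_m"])
assert planner_view.project(full_object) == {
    "id": 7, "name": "Airfield-3", "runway_m": 2400}
```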
3.3.2 more flexible object models
In "programming in the large", when program size grows to cover many
related applications (as in C4ISR), or when a complex collection of related
programs must be distributed across diverse communities in space (across
the Internet) or time (lasting for many years), evolutionary modeling
requirements begin to dominate. Hence evolutionary modeling mechanisms
must augment the object-oriented programming language view of object models
as statically defined once and for all at some fixed definition time.
Object models (type systems) such as IDL and C++ make it relatively
difficult to change aspects of types dynamically, such as defining new
types at run time, adding attributes to individual object instances to
represent special cases, or changing subtype (inheritance)
relationships. There are a number of reasons to give serious consideration
to more flexible object modeling technology within the Common Schema activity:
-
Several ISO programs, such as ALP and Genoa, use such object modeling technologies,
and interoperability between Common Schema representations and these programs
must be defined in any case (e.g., there is explicit recognition that this
issue must be addressed in the case of ALP, but no strategy appears to
have been defined yet).
-
The Internet, including work such as HTTP-NG and XML-related activities,
is also spawning alternative object modeling techniques. It would be a
good idea for the Common Schema to pay attention to these technologies,
as the Internet represents both an extreme example of the need for adaptability,
and an environment with which DoD applications will probably have to interact
(e.g., in accessing open source information, and interoperating with COTS
software).
-
Such object modeling technology can be useful in suggesting higher-level
logical modeling requirements, e.g., dynamic type construction, multiple
classifications, etc., that the architecture might wish to support (e.g.,
in providing degrees of adaptability the architecture needs).
The capabilities provided by these technologies could, in the short term,
be reflected in UML models (UML can represent some of these forms of flexibility
now) as logical requirements to be mapped to more static (e.g.,
IDL) interfaces used in operational software, using DSS capabilities to
deal with changes as currently envisioned. In the longer term, it would be
worthwhile to consider using these technologies more directly.
In some cases, these technologies involve the potential for clients
to assume additional responsibilities in dealing with type and instance
variations (in some cases with mediator assistance, as in DSS). Agent architectures
already employ forms of run-time activity of this sort. In the long run,
the architecture needs to incorporate the ability to deal with controlled
heterogeneity (based on a careful assessment of the tradeoffs involved),
as consistent with the use of agents, open sources, COTS products, and
the Internet.
The following sections provide some further detail about some of these
technologies.
3.3.2.1 ALP LDM and HTTP-NG
The ALP Logical Data Model (LDM) [Mil98, MC98] specifically attempts to
deal with the following problems:
-
millions of types (not instances) of things, e.g., logistics asset
types
-
continuously evolving real-world entities
-
multiple, changing capabilities and roles
The LDM has adopted several techniques to deal with these problems. First,
things are primarily modeled based on their properties, rather than
what they are (as defined by a particular type). As a result, for
example, it doesn't matter whether a Tank is classified as a Vehicle or
as a Weapon, provided that it has the properties of a Vehicle and
the properties of a Weapon. Related properties are collected in property
groups (e.g., there are Vehicle property groups representing the properties
of particular kinds of vehicles, and Weapon property groups representing
the properties of particular kinds of weapons). These property groups are
defined as prototype instances, from which individual objects can derive
behavior by delegation (routing operation invocations from the original
object to the appropriate prototype instance). This both reduces the number
of classes required to model the domain objects, and allows new types of
things to be defined and created dynamically. This construction of objects
by aggregating collections of property groups is similar to the approach
used in several simulation object models developed in connection with DMSO
activities [Cot97, Dud97] (although these simulation object models do not
employ delegation).
In effect, the LDM addresses the need to dynamically construct what
are in effect new types of things. LDM does this by defining a few generic
higher level types, and using prototype instances to define collections
of properties which can be aggregated to define individual variations of
the basic types. Real world objects are formed by selecting one of the
generic types, and adding as many property groups as are required to define
the capabilities of the object. Dynamic type construction is possible because
new property groups, being object instances (not types), can be created
at run time. This effectively creates a "two-tier" type system, where strong
typing of interfaces handles only generic type checking, and other mechanisms
deal with more detailed type variants.
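The property-group mechanism described above can be sketched in a few lines
of Python. All of the names here (Asset, PropertyGroup, the individual
properties) are invented for illustration and are not ALP's actual API;
the point is only the pattern of aggregating prototype instances and
delegating to them at run time.

```python
class PropertyGroup:
    """A prototype instance bundling related properties."""
    def __init__(self, **props):
        self.__dict__.update(props)

class Asset:
    """A generic top-level type; capabilities come from property groups."""
    def __init__(self, name):
        self.name = name
        self._groups = []

    def add_group(self, group):
        # Groups are plain instances, so new ones can be added at run time.
        self._groups.append(group)

    def __getattr__(self, attr):
        # Delegate attribute lookups the Asset itself cannot satisfy
        # to its property groups, in order.
        for g in self._groups:
            if hasattr(g, attr):
                return getattr(g, attr)
        raise AttributeError(attr)

# A "Tank" needs no Tank class: it is a generic Asset aggregating
# Vehicle-like and Weapon-like property groups.
vehicle = PropertyGroup(max_speed_kph=72, fuel_capacity_l=1900)
weapon = PropertyGroup(caliber_mm=120, max_range_m=4000)

tank = Asset("M1A1")
tank.add_group(vehicle)
tank.add_group(weapon)

print(tank.max_speed_kph)  # delegated to the Vehicle group
print(tank.caliber_mm)     # delegated to the Weapon group
```

Note that whether the tank "is a" Vehicle or "is a" Weapon never arises;
only the properties it carries matter, as in the LDM approach described above.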
ALP defines an architecture based on the use of agents, which can engage
in various forms of run-time negotiation, so there is a plausible reason
in this context for doing more type-oriented checking and negotiation at
run-time. The agent, for example, may have its own ideas of what things
should be considered to have "similar" types which the architecture needs
to accommodate.
Different variants of a "two-tier" type system approach are being investigated
in the World Wide Web Consortium (W3C) HTTP-NG project
<http://www.w3.org/Protocols/HTTP-NG/>.
The HTTP-NG project is attempting to develop a generic distributed object
system to support both current Web capabilities, and the increasing use
of the Web for more general distributed applications. As part of this activity,
the project's Protocol Design Group has been investigating the problem
of type system evolution mechanisms. As part of the Internet, HTTP-NG would
face the problem of type system evolution in a particularly acute form,
which is referred to as "anarchic evolution". That is, after a given system
has been deployed, it is subject to concurrent, independent evolutionary
developments, each of which is incrementally rolled out into the deployed
system. Any new piece of the system (e.g., Web browser, Web server) is
faced with the prospect of having to interact with peers that understand
any combination of current and future extensions. It is desirable to minimize
both the application programming nuisance and the network performance costs
of coping with this situation, as well as develop solutions that are not
limited to on-line 2-party interactions. HTTP (among others) addresses
this problem with optional headers. However, the type systems of existing
distributed object systems (CORBA, DCOM, Java RMI) do not facilitate this
type of anarchic evolution as well as HTTP. Since HTTP-NG is proposing
a distributed object system to support capabilities currently provided
by HTTP, it must address these evolution requirements directly.
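The "optional header" style of tolerant evolution mentioned above can be
illustrated with a small sketch (the field names are hypothetical): a receiver
processes the fields it understands and skips extensions contributed by
independently evolved peers, rather than rejecting the message outright.

```python
# Base fields this (hypothetical) receiver was built to understand.
KNOWN_FIELDS = {"content-type", "content-length"}

def process_message(headers):
    """Split header fields into those understood and those ignored."""
    understood, ignored = {}, []
    for name, value in headers.items():
        if name.lower() in KNOWN_FIELDS:
            understood[name.lower()] = value
        else:
            # Unknown extension from an evolved peer: skip it, don't fail.
            ignored.append(name)
    return understood, ignored

msg = {
    "Content-Type": "text/plain",
    "Content-Length": "42",
    "X-Future-Extension": "whatever",  # added by an independently evolved peer
}
understood, ignored = process_message(msg)
```

This is the behavior that, as the text notes, the static type systems of
CORBA, DCOM, and Java RMI do not directly support for interface members.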
While this HTTP-NG type system evolution work is still at a relatively
early stage, and details are currently restricted to W3C members, generally
speaking, the HTTP-NG approach is to define type systems in which differences
due to "evolutionary" changes are not considered in static type checking,
but instead are checked at carefully designed points at run-time. To the
extent that HTTP-NG represents the potential future of the Web, it would
be worthwhile for the Common Schema program to be aware of these developments.
3.3.2.2 Genoa and XML
Genoa represents an application domain where an extremely flexible data/object
representation is needed due, for example, to the difficulty in anticipating
the structure of the collection of material needed to describe a given
situation. As a result, Genoa makes heavy use of property lists (sets of
attribute/value pairs, tagged data) in structuring its information. [CSdss]
notes that "The advantage of fully tagged schemes are their ability to
create data that stands alone, without any need for a centralized authority
for managing the definitions. Given sufficiently thorough metadata, useful
generic browsers, viewers, and editors can be constructed for virtually
any [data] type at all." Tagged data can be considered an incorporation
of metadata at the level of individual attributes or other content groupings
in the object representation. The individual tags can either be considered
as metadata themselves or, more accurately, as indirect references to metadata
located elsewhere which describes the meaning of the tagged information.
For example, in the case of the Web, the tags could be associated with
URLs providing direct access to the metadata describing tag semantics by
either a human user or a program.
The use of tagged data provides for an extremely flexible representation.
New tags can be freely defined and used within exchanged data, with the
definitions of the tag semantics stored on-line for access as needed. Unanticipated
combinations of tags can be assembled and exchanged between programs. Clients
and servers can be designed to ignore tags they don't understand (much as Web
browsers today do for HTML tags they don't understand). At the same time,
discipline can be imposed by requiring certain minimum sets of tags to
be included, by requiring certain combinations of tags to be always used
together, or by requiring tags to be selected from one or more controlled
vocabularies. These vocabularies could, for example, be developed and controlled
by domain-oriented groups who would provide the necessary metadata to define
tag semantics. Such constraints in effect define a type system, for which
the tagged data serves as a representation (and the type system can be
designed to support whatever requirements are needed).
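As a rough illustration of how such constraints amount to a type system, the
following sketch validates tagged records (attribute/value pairs) against a
required minimum tag set and a controlled vocabulary; the tag names and rules
are invented for this example.

```python
REQUIRED_TAGS = {"id", "type"}                     # minimum tag set
VOCABULARY = {"id", "type", "quantity", "speed"}   # controlled vocabulary

def validate(record):
    """Return a list of constraint violations for one tagged record."""
    tags = set(record)
    errors = []
    missing = REQUIRED_TAGS - tags
    if missing:
        errors.append(f"missing required tags: {sorted(missing)}")
    unknown = tags - VOCABULARY
    if unknown:
        errors.append(f"tags outside vocabulary: {sorted(unknown)}")
    return errors

# A conforming record, and one violating both kinds of constraint.
ok = {"id": "A-17", "type": "asset", "quantity": "4"}
bad = {"quantity": "4", "color": "green"}
```

In effect, REQUIRED_TAGS and VOCABULARY play the role of a type definition,
and the tagged records are its instances.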
Genoa is apparently looking into the use of W3C's Extensible Markup
Language (XML) <http://www.w3.org/XML/>
and related technologies as a representation technique for its tagged data
requirements. The Web is increasingly targeting XML as its next-generation
data representation. Unlike HTML, which defines a fixed set of tags, XML
allows the definition of customized markup languages with application-specific
tags, e.g., <QUANTITY> or <SPEED>, for representing information in
particular application domains. XML Document Type Definitions (DTDs) provide
a way to explicitly declare the tag sets, and their structure, to be used
in particular units of data.
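To make this concrete, the following sketch parses a fragment using
application-specific tags of the kind mentioned above; the element structure
is invented for illustration, and the parser is Python's standard-library
ElementTree rather than any tool used by Genoa.

```python
import xml.etree.ElementTree as ET

doc = """
<ASSET>
  <QUANTITY>4</QUANTITY>
  <SPEED units="kph">72</SPEED>
</ASSET>
"""

root = ET.fromstring(doc)
quantity = int(root.find("QUANTITY").text)
speed = root.find("SPEED")
print(quantity, speed.text, speed.get("units"))
```

A DTD, if supplied, would declare which tags may appear and how they nest,
giving the explicit tag-set declaration described above.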
In addition to these basic capabilities, other technologies related
to XML are currently under development within W3C, including
-
the XML Linking and Pointer Languages, which provide much more powerful
linking capabilities than those currently available within HTML (including
bidirectional and multi-way links, and links to internal elements within
units of data).
-
the Resource Description Framework (RDF), which provides a model for representing
metadata (including ontology-like information) in XML based on propositional
logic plus certain modalities.
-
XML namespaces, which provide a means for associating XML tags with specific
controlled vocabularies.
-
the Document Object Model (DOM) <http://www.w3.org/DOM/>,
which defines an object-oriented API for XML structures.
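Because the DOM defines a language-neutral API, the same navigation operations
recur across implementations; Python's xml.dom.minidom, for example, exposes
them directly (the document fragment below is invented for illustration):

```python
from xml.dom import minidom

# Parse a small fragment and navigate it using DOM-defined operations:
# documentElement, getElementsByTagName, firstChild.
dom = minidom.parseString("<ASSET><QUANTITY>4</QUANTITY></ASSET>")
qty_nodes = dom.documentElement.getElementsByTagName("QUANTITY")
value = qty_nodes[0].firstChild.data
print(dom.documentElement.tagName, value)
```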
In addition to representing units of data in a distributed object architecture,
XML can be used in representing other aspects of such architectures. For
example:
-
DataChannel's WebBroker [TL98] represents one of several attempts to build
a complete Web-native distributed object computing model, based on the
use of XML and HTTP. In these approaches, XML is used to represent both
object interfaces and object method invocation messages sent between objects.
-
Microsoft's Scriptlets [deB98] represents one of several approaches which
allow components (COM components in the case of Scriptlets) to be directly
written using a combination of XML and a scripting language such as JavaScript.
These and other technologies for combining Web and object concepts are
thoroughly described in two technical reports [Man98a,b] from Object Services
and Consulting.
The use of tagged data in messages can be easily accommodated within
the use of CORBA DII, is similar to the way structured messages are used
in electronic commerce applications, and can serve as a means of reducing
coupling between clients and servers [ES98]. In addition, a number of OMG
activities are currently contemplating the use of XML for various purposes,
including XMI (an OMG metadata exchange submission), the Tagged Data Facility,
the Common (Data) Warehouse Metadata RFP, and the CORBA Components submission.
These technologies, and related developments, should be carefully tracked,
as a potential means to represent the semantics defined by the Common Schema
definitions in a more flexible form.
3.4 Organizational and Community Issues
Knowledge of the Common Schema activity, and serious interaction with it,
seems to be uneven within the ISO community. Obtaining the cooperation
of the various programs involved in the architecture, and active interaction
among the people involved, will be crucial to the success of the program.
One aspect of the program (and, in fact, of ISO programs in general) which
could be significantly improved is the ease of access to program information.
Specific suggestions include:
-
Reconsider the password protection used on individual project pages. The
need to obtain multiple user-ids and passwords (including, for example,
separate user-ids and passwords to access the OMWG page, and ICS tool,
even for read-only access) inhibits cross-working among projects. This
contrasts unfavorably with the ease of access to most DMSO material.
-
Make more use of standard Internet approaches such as subject- or project-specific
email lists to foster more interaction among potential users and participants
(such lists may exist, but I was not made aware of them). It is desirable
to maximize the number of people aware of, using, and contributing to these
ideas, even when they are not necessarily part of the formal process (this,
ideally, includes members of the general public, but certainly could reasonably
include members of the general DoD community). There are certainly many
people unknown to the program within the DoD community who could probably
contribute useful ideas in this way on a relatively low-cost basis (a few
selected experts could perhaps review suggestions). The W3C, for example,
makes much faster progress than it ordinarily would through the use of
both member-only and public email lists (to which people have to subscribe,
so there would be some control over participation) focusing on specific
technical activities.
-
The OMWG Web site could be made much more of a focal point for updated
information, and interaction among program participants.
-
It would be helpful for the program if it could document its information
sources, and the methodology it is using, more thoroughly. This could forestall
numerous "did you consider this" questions from people not initially familiar
with the program.
At the same time, the Common Schema activity needs to pay more attention
to DoD standards, terms and semantics used in other programs (and in practice),
DoD programs involving large-scale database schema development, such as
the Modernized Integrated Database (MIDB)
<http://www.objs.com/ddb/9703-Dynamic-Database-II-Meeting-1-Notes.htm#MIDB>),
and other DoD programs developing object model standards, such as the DMSO
Object Model Data Dictionary (OMDD) activity. For example, the OMDD has
made a point of basing its contents, where possible, on existing data standards,
such as the Defense Data Dictionary System, and, in turn, providing input
to other standards based on its own requirements. There appears to be nothing
corresponding to this within the Common Schema program, and while there
might be good reasons for this, it appears to be an idea worth looking
into. There is certainly a need for DoD coordination of these schema/data
dictionary development activities, so as to avoid both duplication of effort
and inconsistent specifications. Another reason for looking specifically
at the OMDD activity is the need for interoperability between simulations
and actual C2 systems in some circumstances [NS98].
References
Common Schema <http://ics.les.disa.mil>
(account required)
[CSroadmap] OMWG Roadmap, draft release 0.5, 6/24/97.
[CSmanagement] Common Schema Management Overview, 5/20/97.
[CStutorial] Overview of the Object Model Working Group Common Schema,
3/10/98.
[CSschema] OMWG Command and Control Schema, v.0.5.3, 16 Oct. 1996
[CSdss] Randall Schultz, "Request for Comments on Proposed Dynamic Schema
Service (DSS) for JTF-ATD Application Development", 8/22/97.
OMWG General Overview and Update, 5/4/98.
OMWG Contractor Coordination Meeting, 2/10/98.
Schema Server Status, 2/10/98.
Genoa <http://echoleader.usae.bah.com/genoa/>
Genoa System Design Document
Genoa Products, Critical Information Packages (CIPs), and Thematic Action
Groups (TAGs) Concept and Design White Paper
Critical Information Package Concept of Operations
CrisisBrief: Concept of Operations for the Virtual Situation Book
ALP <http://alp.sra.com/alp/>
[Mil98] Stephen Milligan, "ALP Architectural Considerations", presentation
foils, 1998.
[MC98] Stephen Milligan and Todd Carrico, "Investigating Large-Scale
Agent Architectures", position paper for the OMG-DARPA Workshop on Compositional
Software Architectures, Monterey, CA, Jan. 1998 <http://www.objs.com/workshops/ws9801/index.html>.
DMSO <http://triton.dmso.mil/hla/>,
<http://hla.dmso.mil>
[NS98] J. Nielsen and M. Salisbury, "Challenges in Developing the JTLS-GCCS-NC3A
Federation", Simulation Interoperability Workshop, 1998. <http://triton.dmso.mil/hla/implement/jtls/>
[Scr97] "HLA Object Model Data Dictionary", presentation, <http://www.arlut.utexas.edu/~imewwww/index.html>
[omddhome] OMDDS Homepage, <http://s3.arlut.utexas.edu/omdds/code/index.htm>
Other References
[AB91] S. Abiteboul and A. Bonner, "Objects and Views", Proc. ACM
SIGMOD '91.
[Cot97] A. Cotton, III, "Developing a Standard Unit-Level Object Model",
Naval Postgraduate School, Sept. 1997 (NTIS ADA339220).
[deB98] M. De Bruijn, "Internet Explorer 5.0--for Intranets Only?",
WEBBuilder, 3(9), Sept. 1998 (see also <http://www.microsoft.com/xml/>).
[Dud97] D. Dudgeon, "Developing a Standard Platform-Level Army Object
Model", Naval Postgraduate School, Sept. 1997 (NTIS ADA-341525).
[ES98] P. Eeles and O. Sims, Building Business Objects, John
Wiley & Sons, 1998.
[FS97] M. Fowler (with K. Scott), UML Distilled, Addison-Wesley,
1997.
[FR97] G. Fahl and T. Risch, "Query Processing Over Object Views of
Relational Data", VLDB Journal 6(1997) 4, 261-281.
[KKS92] M. Kifer, W. Kim, and Y. Sagiv, "Querying Object-Oriented Databases",
Proc. ACM SIGMOD '92.
[Lutz] R. Lutz, "HLA Object Model Development: A Process View", <http://hla.dmso.mil>.
[Man98a] F. Manola, "Towards a Web Object Model", Technical Report,
Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom.htm>,
1998.
[Man98b] F. Manola, "Some Web Object Model Construction Technologies",
Technical Report, Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom-II.htm>,
1998.
[McK98] John McKim, "DARPA ISO Architecture Lessons Learned", position
paper for the OMG-DARPA Workshop on Compositional Software Architectures,
Monterey, CA, Jan. 1998 <http://www.objs.com/workshops/ws9801/index.html>.
[MGHH+98] F. Manola, et al., "Supporting Cooperation in Enterprise-Scale
Distributed Object Systems", in M. P. Papazoglou and G. Schlageter (eds.),
Cooperative Information Systems: Trends and Directions, Academic Press, 1998.
[SL90] A. Sheth and J. Larson, "Federated Database Systems for Managing
Distributed, Heterogeneous and Autonomous Databases", ACM Computing
Surveys, 22(3), Sept. 1990.
[TK77] D. Tsichritzis and A. Klug (eds.), The ANSI/X3/SPARC DBMS
Framework: Report of the Study Group on Database Management Systems,
AFIPS Press, Montvale, NJ, 1977.
[TL98] J. Tigue and J. Lavinder, "WebBroker: Distributed Object Communication
on the Web", W3C Note, World Wide Web Consortium, 1998 <http://www.w3.org/TR/1998/NOTE-webbroker>.
This report was prepared by Object Services
and Consulting, Inc. (OBJS) under subcontract to the Institute for Defense
Analyses (IDA) on its Task A-209, Advanced Information Technology Services
Architecture, under contract DASW01 94 C 0054 for the Defense Advanced
Research Projects Agency. Publication of this document does not indicate
endorsement by the Department of Defense, nor should the contents be construed
as reflecting the official position of that agency.
© Copyright 1998 Object Services and Consulting,
Inc. (OBJS)
© Copyright 1998 Institute for Defense Analyses
(IDA)
Permission is granted to copy this document provided this
copyright statement is retained in all copies.
Disclaimer: Neither OBJS nor IDA warrant the accuracy
or completeness of the information in this report.