Flexible Common Schema Study
Frank Manola
Object Services and Consulting, Inc.
fmanola@objs.com
2 December 1998
Executive Summary
This report reviews the DARPA ISO Common Schema effort, and makes recommendations
regarding its direction. The report pays particular attention to
the modeling approach used in the Common Schema effort compared with object
modeling approaches used in other ISO activities (e.g., ALP and Genoa),
and related technologies for supporting interoperability while accommodating
a reasonable degree of modeling flexibility and heterogeneity.
The ISO Common Schema (also sometimes known as the Command and Control
(C2) Schema) supports DARPA ISO's Advanced Information Technology System
(AITS) Architecture, which, in turn, supports
a number of ISO programs. The Common Schema is viewed as a central mechanism
to ensure semantic interoperability among the distributed services and
applications of the architecture.
The Common Schema documentation reflects a general awareness of the
scope, magnitude, and potential difficulties of the task (even though the
program has, so far, not necessarily completely dealt with all of them),
and has identified a number of outstanding issues. These issues are fairly
typical of issues that have arisen in similar efforts within large organizations
to develop reusable class libraries in support of large scale information
systems, while simultaneously attempting to support and integrate ongoing
system development activities. Hence, the existence of these issues does
not necessarily reflect adversely on the Common Schema program, but rather
indicates the essential difficulty of the problem being addressed.
The report makes a number of observations and recommendations:
-
The idea of a Common Schema, in the sense of the definition of common domain
semantics to be shared and mutually understood by the components in a distributed
object architecture, is essential for achieving semantic interoperability.
The program directs attention to the problem of developing these domain
semantics, provides a vehicle for getting members of the ISO community
to think about these issues, and thus provides an important contribution.
The goals of the program could possibly be enhanced by more emphasis
in a number of specific areas, all of which are somewhat interrelated,
and which are described below. These areas have been addressed to a certain
extent already, and some of them are research topics in their own right,
but a renewed focus on these areas, based on the experience with the Common
Schema effort now available, would be rewarding.
-
The Common Schema program needs to distinguish semantics and implementation
more carefully, and clarify its intent at both levels. The Common
Schema program effectively has the task of both capturing common domain
semantics, and also reflecting these common semantics in terms of object
definitions that can be used in real software. However, these things seem
in some cases to be overly inter-mixed. Separating semantics from implementation
allows alternative representations of domain objects (or object types)
to be considered for use within the architecture, provided that common
semantics are maintained. This separation can also improve the stability
of the Common Schema definitions, in the sense that the semantics can remain
stable while more implementation aspects, like the language and object
modeling technologies used, may change independently. However, this separation
can also complicate implementation, and hence the tradeoffs involved need
to be carefully considered.
-
The Common Schema program should provide enhanced support for technologies
that accommodate changed or heterogeneous object definitions. On the surface,
the Common Schema program represents an attempt to support interoperability
by reducing heterogeneity in the data and objects to be handled, by defining
a common set of definitions that all participating components will use.
However, the program should also consider the need for flexibility in the
face of changed or specialized requirements. Hence, technologies that accommodate
changed or heterogeneous object definitions need to be carefully considered
as part of the context of the Common Schema program (even if they are not
directly addressed by that program). These techniques include:
-
schema evolution techniques to deal with changes in the Common Schema definitions
-
schema architectures (use of views or component schemas, as in federated
DBMSs or multi-tier object architectures) to allow components to specify
their own requirements, and provide a way to manage well-defined mappings
between common object definitions and those needed in particular applications
or component federations
-
more flexible object modeling techniques
-
more flexible messaging techniques that allow components to be more resilient
in adapting to changes in object definitions
The Common Schema program has explicitly recognized some of these issues.
However, it is not clear how much emphasis the program is placing on these
areas, and how well coordinated those areas are with the rest of the program.
-
Knowledge of the Common Schema activity, and serious interaction with it,
seems to be uneven within the ISO community. Obtaining the cooperation
of the various programs involved in the architecture, and active interaction
among the people involved, will be crucial to the success of the program.
One aspect of the program (and, in fact, of ISO programs in general) which
could be significantly improved is the ease of access to program information.
At the same time, the Common Schema activity needs to pay more attention
to DoD standards, terms and semantics used in other programs (and in practice),
and other DoD programs developing object model standards (e.g., DMSO simulation
data models), and establish relationships with these programs so as to
avoid both possible duplication of effort and inconsistent specifications.
1. Introduction
The purpose of this report is to review the DARPA ISO Common Schema effort,
as described on the Object Model Working Group's Web page <http://ics.les.disa.mil>,
and make recommendations regarding its direction. The review comments on
overall aspects of the Common Schema effort. However, a particular emphasis
is to comment on the modeling approach used in the Common Schema effort
compared with object modeling approaches used in other ISO activities (e.g.,
ALP and Genoa), and related technologies for supporting interoperability
while accommodating a reasonable degree of modeling flexibility and
heterogeneity.
In performing this review, I reviewed the program documentation and
technical literature cited in the reference section (as well as other program
documentation not explicitly referenced). I also accessed Common Schema
interfaces maintained in the Common Schema activity's ICS tool, as well
as interfaces in the DMSO HLA Object Model Data Dictionary System and Object
Model Library <http://triton.dmso.mil/hla/data_sup/>.
In addition, I had phone conversations and email interactions with the
following individuals, from whom I obtained
a great deal of valuable information:
-
John Anderson, MITRE (Common Schema technical lead) <janderso@mitre.org>
-
Greg Mack, Booz-Allen & Hamilton (Genoa) <gmack@bah.com>
-
Steve Milligan, GTE-BBN (Advanced Logistics Program) <milligan@bbn.com>
I gratefully acknowledge their help and insight.
Of course, the observations and recommendations in this report are mine.
Before proceeding with the meat of the review, it is necessary to state
a few caveats. First, I am neither a C4ISR domain
expert, nor (primarily) an object analysis and design methodologist. My
primary areas of expertise are in the technical details of object models
(in the sense of object definition technology), distributed object architectures,
and Web technology. In addition, this review has been performed within
a one-month time period, and so cannot hope to be comprehensive. The observations
and recommendations in this report must thus be evaluated with these caveats
in mind. I suspect that at least several of the observations and recommendations
made here have been raised before.
2. Overview
The ISO Common Schema (also sometimes known as the Command and Control
(C2) Schema) effort grew out of an effort to build a common schema for
the JTF-ATD Architecture. The Common Schema supports DARPA ISO's Advanced
Information Technology System (AITS) Architecture [McK98], which supports
a number of ISO programs, including:
-
Advanced Logistics Program (ALP)
-
Genoa
-
Dynamic Multi-user Information Fusion (DMIF)
-
Joint Force Air Component Commander (JFACC)
-
Joint Task Force (JTF) Advanced Technology Demonstration (ATD)
The Common Schema serves as a fundamental integrating technology within
the architecture, and is being defined and managed by an Object Model Working
Group (OMWG).
The Common Schema has two primary goals [CStutorial]:
- Interoperability
  - Allow servers and applications to have common data definitions for the
    sharing and exchange of data
  - Standardize naming, structure, and services for data within the ATD architecture
  - Pursue seamless exchange of information between architecture components
- Reuse
  - Support reuse of data methods and embedded services
  - Reduce dependence on static data representations to increase supportability
    and portability
  - Reduce development and testing time
In the short term, these goals apply within ISO programs; in the long term,
the vision is to support these goals more broadly among DoD programs.
The Common Schema is described using a subset of the Unified Modeling
Language (UML) [CSschema, CStutorial], a de facto (and now OMG) standard
language for describing object models. The structure of the Common Schema
is that of a single, deep class (inheritance) hierarchy. However, the Common
Schema is logically divided into separate areas (based roughly on the communities
with primary interest in those parts of the schema). Formal change control
procedures are defined to evaluate and introduce new or changed definitions
into the various Common Schema divisions. The Common Schema effort also
provides for the use of project-specific schemata, at least on a temporary
basis until the needs of the projects can be accommodated within the Common
Schema.
Several tools support the Common Schema effort. TASC's Interface Control
System (ICS) tool acts as a repository for the Common Schema interface
definitions. ICS accepts interface definitions in CORBA IDL, and stores
them in its own Object Definition Format. It can output interface definitions
in IDL, C++, or Java. SAIC has also developed a repository based on UML,
which provides interfaces to code generation tools. (It should be noted
that there are commercial tools that support UML modeling and mapping those
models to various languages; for example, Rational Rose <http://www.rational.com/>
supports mapping to C++, Java, IDL, and Ada.) The ICS and SAIC tools currently
are not integrated. The program also intended to use Ontolingua and Ontosaurus
as ontology development tools, with the resulting ontologies in turn
used to develop the Common Schema. However, the use of these ontology tools
has apparently been postponed.
Run-time support for accessing Common Schema information is also provided.
The Schema Server is a UML-based server which provides run-time access
to schema information and related metadata (such as information to tie
object definitions to databases from which state data to build object instances
can be obtained). In addition, a Dynamic Schema Service [CSdss] is intended
to support schema evolution, as well as mapping of application-specific
object definitions to Common Schema definitions.
Additional background can be obtained at the OMWG Web site <http://ics.les.disa.mil>
(an account is required, which may be obtained by contacting John Anderson
<janderso@mitre.org>).
[CSschema] identifies several hundred object classes, and the ICS tool
contains interface definitions for a substantial proportion of these classes.
Classes for additional aspects of the relevant application domains are
also being defined. In addition, class definitions representing binary
agreements between specific programs that need to interoperate are being
included in the repository. These definitions are not yet part of the Common
Schema, but are being made available for reuse by other programs that might
wish to use them. Given the scale of the complete C4ISR domain, it is safe
to say that the current Common Schema definitions do not represent a "complete"
set of class definitions for the C4ISR domain. However, it is difficult
to evaluate "how complete" the definitions are, since there are multiple
definitions possible, which would be more or less reasonable for different
purposes (in fact, the definitions may never really be "complete", since
the Common Schema represents definitions that must evolve over time). Section
3.4 considers the relationship of the Common Schema effort to other DoD
schema efforts.
The Common Schema documentation reflects a general awareness of the
scope, magnitude, and potential difficulties of the task (even though the
program has, so far, not necessarily completely dealt with all of them).
For example, Common Schema documentation has identified outstanding issues
such as:
Object Modeling
-
Subjectivity of design
-
Representation and management of semantic information
-
Resolution of conflicts between Common Schema and project schemata
-
Schema vs. instances/dynamic relationship representations (e.g., ALP Prototypes)
Object Management
-
Incompatibilities among language constructs
-
Evolving standards for object definition and storage
-
Language/platform-dependent tools and integration of separate tools
-
Runtime transparency to object/class version evolution
Programmatic Issues
-
Multiple interrelated initiatives with separate schedules and deliverables
-
Obtaining acceptance and use of definitions
These issues are fairly typical of issues that have arisen in similar efforts
within large organizations to develop reusable class libraries in support
of large scale information systems, while simultaneously attempting to
support and integrate ongoing system development activities. For example,
such efforts typically encounter problems reconciling the near-term needs
of individual development projects with longer-term reuse requirements:
success measures tend to be based on individual projects rather than on
the extent to which a coherent architecture is being developed; little
funding is specifically directed at developing an overall architecture;
and there are often inadequate concrete incentives for individual programs
to cooperate in achieving overall architectural goals. Such efforts also
typically encounter problems in defining the detailed structure of the
shared class library, such as the need to mediate between alternative object
classification approaches and other distinct requirements of the various
individual participating projects. Hence, the existence of these issues
does not necessarily reflect adversely on the Common Schema program, but
rather indicates the essential difficulty of the problem being addressed.
3. Observations/Recommendations
3.1 A Common Schema or Object Model
is essential.
Overall, the idea of a Common Schema, in the sense of the definition of
common domain semantics to be shared and mutually understood by the components
in a distributed object architecture, is essential for achieving
semantic interoperability, i.e., the ability of the components to mutually
understand the meanings of interactions with each other (either operation
invocations or data exchanges).
As it stands, the program directs attention to the problem, provides
a vehicle for getting members of the ISO community to think about these
issues, and thus provides an important contribution. Even the emergence
of problems associated with the program is useful, provided that the proper
lessons are learned from them. As noted above, the program has actually
identified a number of outstanding issues, which need to be the focus of
further effort. Overall, it must be recognized that defining the Common
Schema will be difficult, and will necessarily evolve through use and continued
interaction with the needs of specific applications.
The goals of the program could possibly be enhanced by more emphasis
in a number of specific areas, all of which are somewhat interrelated,
and which are described in the following sections. These areas have been
addressed to a certain extent in the program already, and some of them
are research topics in their own right, but a renewed focus on these areas,
based on the experience with the Common Schema effort now available, would
be rewarding.
For example, due to the extent of the C4ISR domain, and the need to
accommodate continuing change, techniques for creating and managing federations
and associated schema architectures also need to be provided. A
federation is a collection of components that interoperate with each other
using common definitions agreed on by the federation (a federation schema),
common definitions which are not necessarily shared (or shared completely)
throughout the entire architecture. Mappings can be defined between federation
schemas to govern interoperation between members of one federation and
those of another. A federation of this type may exist temporarily (e.g.,
until its definitions are more widely accepted and integrated into the
Common Schema) or for a more extended period (e.g., if the members of the
federation have less need to interoperate extensively with outside components).
The Common Schema program needs an approach that supports federation, and
also supports managing the evolution of federation-specified definitions
toward wider acceptance within the architecture where this is appropriate.
The Common Schema program recognizes the need for temporary agreements
(e.g., between pairs of programs) and certain forms of schema mappings,
but it is not clear how much emphasis the program is placing on this area.
This subject is discussed further in Section 3.3.1.
3.2 Distinguish Semantics and Implementation,
and Clarify Intent at Both Levels
The Common Schema program in principle incorporates the use of languages
and corresponding tools which represent usefully-different levels of abstraction:
-
Ontolingua, LOOM (Ontosaurus)--assertional semantics
-
UML (e.g., in the SAIC repository)--can be used at analysis, design, or
implementation levels, but generally considered an analysis and design
language
-
IDL, Java, C++ (ICS)--implementation level interfaces
However, in practice, the program does not seem to have taken full advantage
of the potential provided by this separation of levels. For example:
-
The ontology tools do not seem to have been used a great deal
-
The tools are not well integrated, nor do they appear to have been given
distinct roles, corresponding to the different levels of abstraction they
(potentially) represent
-
While UML has been used (e.g., in [CSschema]), only a subset of its modeling
capabilities appear to have actually been used, and it seems to have been
used primarily as just another way to define implementation-level classes.
In particular, features of UML such as multiple inheritance, multiple interfaces,
dynamic type changes, etc. [FS97] have not been used.
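To illustrate the kind of multiple classification that UML can express but a single-inheritance hierarchy cannot, consider the following sketch. Python is used purely for illustration, and all class and operation names here are hypothetical, not Common Schema definitions:

```python
# Hypothetical sketch: one domain object exposing multiple interfaces,
# as UML (but not a single-inheritance IDL hierarchy) can express.
# All names are illustrative, not actual Common Schema classes.

class Sensor:
    """Interface for the sensing role of a platform."""
    def detection_range_km(self):
        raise NotImplementedError

class Transporter:
    """Interface for the transport role of a platform."""
    def cargo_capacity_tons(self):
        raise NotImplementedError

class MaritimePatrolAircraft(Sensor, Transporter):
    """A platform classified in two ways at once (multiple classification)."""
    def detection_range_km(self):
        return 300
    def cargo_capacity_tons(self):
        return 9

p = MaritimePatrolAircraft()
# Clients written against either interface can use the same object:
assert isinstance(p, Sensor) and isinstance(p, Transporter)
```

A client needing only the sensing role can be written entirely against the `Sensor` interface, without knowing the object's other classifications.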
The Common Schema program effectively has the task of both capturing common
domain semantics, and also reflecting these common semantics in terms of
object definitions that can be used in real software. However, these things
should not become overly inter-mixed.
UML, possibly with some semantic extensions, seems to be a reasonable
choice for defining the Common Schema. However, UML can be used at multiple
levels of abstraction, to represent analysis, design, and implementation
levels. Ideally, it should be used mainly at an analysis or conceptual
design level, capturing agreements on vocabularies and element meanings
and groupings. Similarly, IDL (at least currently) is a reasonable choice
for implementation-level interface definition. However, if the use of UML
constructs is overly-constrained based on implementation-level considerations
(e.g., the ease with which the UML can be translated to IDL interfaces),
then:
-
this may limit the extent to which semantics can be captured that are not
readily captured in IDL by itself (e.g., that a given platform may be classified
in alternative ways, or have multiple interfaces)
-
there doesn't seem to be much advantage in using UML over using IDL by
itself, together with direct mappings from IDL to programming language
interface definitions
For example, the use of a deep, single-inheritance class hierarchy in the
Common Schema suggests an emphasis on static, compile-time type checking
as a means of ensuring interoperability among applications. This may or
may not be the actual intent. For example, it may not be the intent to
rule out more dynamic type systems that may be used in particular applications,
such as the ALP's Logical Data Model (discussed further in the next section).
Separating semantics from implementation allows alternative representations
of domain objects (or object types) to be considered for use within the
architecture, provided that common semantics are maintained. There are
several alternative representations within DARPA ISO projects that could
be examined in this connection (see Section 3.3). This separation can also
improve the stability of the Common Schema definitions, in the sense that
the semantics can remain stable while more implementation aspects, like
the language and object modeling technologies used, may change independently.
At the same time, however, using all of UML and generating complex mappings
to language class libraries adds complexity to the implementation layer.
The tradeoff between semantic richness and implementation complexity needs
careful consideration.
A related issue with the current Common Schema definitions is that in
many cases the classes seem to lack attribute definitions (this is certainly
true of the platform classes) and in most cases lack operation definitions.
Without attribute or operation definitions, type checking reduces to determining
that applications have used the proper class names, but this plays no significant
role in providing interoperability, and makes the need for a strict class
hierarchy questionable. In addition, the lack of operation definitions
raises the question of whether the definitions are really intended to be
those of distributed objects which could possibly be invoked remotely,
or whether the definitions are intended to be those of data records to
be exchanged between distributed applications.
3.3 Provide Enhanced Support for Technologies
that Accommodate Changed or Heterogeneous Object Definitions
On the surface, the Common Schema program represents an attempt to support
interoperability by reducing heterogeneity in the data and objects to be
handled, by defining a common set of definitions that all participating
components will use. However, the program should also consider the need
for flexibility in the face of changed or specialized requirements. Hence,
technologies that accommodate changed or heterogeneous object definitions
need to be carefully considered as part of the context of the Common Schema
program (even if they are not directly addressed by that program). These
techniques include:
-
schema evolution techniques to deal with changes in the Common Schema definitions
-
schema architectures (use of views or component schemas, as in federated
DBMSs or multi-tier object architectures) to allow components to specify
their own requirements, and provide a way to manage well-defined mappings
between common object definitions and those needed in particular applications
or component federations (see Section 3.3.1.)
-
more flexible object modeling techniques (see Section 3.3.2.)
-
more flexible messaging techniques that allow components to be more resilient
in adapting to changes in object definitions
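The last of these techniques, more flexible messaging, can be illustrated with a small sketch of tagged-data handling: a receiver extracts the fields it knows and ignores the rest, so adding a field to a message definition does not break existing receivers. Python is used purely for illustration, and all field names are hypothetical:

```python
# Hypothetical sketch of tagged-data messaging: each field is carried as a
# name/value pair, so a receiver built against an older definition can
# ignore fields it does not recognize rather than failing.

def extract(message, known_fields, defaults):
    """Keep known fields, fill missing ones with defaults, drop the rest."""
    result = dict(defaults)
    for name, value in message.items():
        if name in known_fields:
            result[name] = value
    return result

# A "newer" sender includes a field ("fuel_state") the receiver does not know:
msg = {"track_id": "T-17", "latitude": 36.1, "longitude": -5.4, "fuel_state": 0.62}
view = extract(msg, known_fields={"track_id", "latitude", "longitude"},
               defaults={"track_id": None, "latitude": 0.0, "longitude": 0.0})
assert "fuel_state" not in view and view["track_id"] == "T-17"
```

The cost of this resilience is that field agreement is checked at run time rather than compile time, which is one aspect of the tradeoffs discussed below.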
The Common Schema program has explicitly recognized some of these issues.
For example, [CSmanagement] briefly discusses long term interoperability
as a goal, and the Dynamic Schema Server (DSS) effort considers aspects
of some of the techniques identified above. However, it is not clear how
much emphasis the program is placing on these areas, and how well coordinated
those areas are with the rest of the program.
More specifically, the Common Schema effort explicitly recognizes the
need to support some forms of schema adaptability and heterogeneity. Two
general approaches are being taken to deal with this. The first, and more
"static", involves the use of formal change control procedures to assess
proposed changes to the schema, and include them if necessary. The more
"dynamic" approach is embodied in the Dynamic Schema Server (DSS) effort
[CSdss]. The DSS reflects the fact that systems using the Common Schema
definitions may require the ability to adapt dynamically to changes in
the schema interface definitions. For example, changes in Common Schema
definitions should not necessarily require recompilation of existing systems.
[CSdss] contains a good discussion of many of the issues involved. The
DSS provides facilities for recognizing different versions of schema interfaces,
and the concept of a special server that maps between different versions
of the same interface such that, e.g., a client can identify the particular
interface version it wants to use, and be insulated from the fact that
a particular server implements a different version. In supporting these
facilities, the DSS concept actually identifies a number of much broader
issues related to dealing with heterogeneity in general, and what types
of heterogeneity the architecture will be prepared to deal with. In particular:
-
[CSdss] describes an "Interface Association Language" (IAL) for defining
mappings between the various versions of a given interface. This is essentially
a form of support for "views" (in the database sense), and suggests the
need for a general schema architecture (in the federated database sense)
as part of the architecture (see Section 3.3.1).
-
[CSdss] discusses the potential use of tagged data (or possibly a hybrid
representation including such data) as a more flexible object/data representation
approach. This raises the general issue of possibly using more flexible
object models and implementation structures in the architecture (see Section
3.3.2.).
These techniques need to be carefully considered, and their relationships
established, both among themselves, and to the types of heterogeneity to
be dealt with by the architecture the Common Schema supports (since these
techniques are all, in essence, ways of coping with different types of
heterogeneity). The specific subjects of views and more flexible object
models are discussed in more detail below.
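The version-mapping facility described above can be made concrete with the following sketch of a mediator function that maps a newer record layout to the shape an older client expects. Python is used purely for illustration; the record layouts and names are hypothetical, not taken from [CSdss]:

```python
# Hypothetical sketch of the DSS idea: a mediator maps between two versions
# of "the same" interface, so a client bound to version 1 can be insulated
# from a server implementing version 2. Names and layouts are illustrative.

def v2_to_v1(obj_v2):
    """Map a version-2 record to the version-1 shape a legacy client expects."""
    return {
        "name": obj_v2["name"],
        # Suppose v2 split "location" into lat/lon; v1 clients expect one pair.
        "location": (obj_v2["lat"], obj_v2["lon"]),
        # v2-only fields (e.g., "status") are simply not delivered to v1 clients.
    }

server_record = {"name": "Unit-5", "lat": 33.9, "lon": 44.4, "status": "active"}
client_view = v2_to_v1(server_record)
assert client_view["location"] == (33.9, 44.4)
```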
3.3.1 views and schema architectures
Large scale distributed object systems increasingly are being designed
with 3- (or sometimes multi-) tier architectures [MGHH+98]. These architectures
involve the division of the system's components (and object definitions)
into functional tiers based on the different functional concerns they address.
For example, a typical 3-tier architecture has a tier for objects representing
user interface elements, a tier for business or application objects, and
a tier for database servers. The business object tier separates out the
common definitions of enterprise operations and semantics from the more
specialized concerns addressed in the other tiers. Similar ideas
are reflected in federated DBMS architectures [SL90], which typically define
several distinct types of schemas (sometimes called "views" in the DBMS
literature), including:
-
schemas representing the contents of the individual databases forming the
federation
-
schemas representing the requirements of individual applications or users
-
a canonical or global schema representing the complete information contents
of the federation, and to which the other schemas are mapped (this mapping
can also involve data model translation between the data model used for
the global schema and the data models used in the individual databases);
in some cases, there may be multiple federations each having its own canonical
schema
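The mapping from an individual database's schema to the canonical schema can be as simple as renaming and unit translation, as in the following illustrative sketch (Python; the field names, classes, and conversion are hypothetical, not drawn from any actual federation):

```python
# Hypothetical sketch of a federated-DBMS mapping: a component database's
# local schema is translated into the canonical (global) schema, including
# a unit conversion. All names and units are illustrative.

NM_PER_KM = 0.539957  # knots per km/h (nautical miles per kilometre)

def local_to_canonical(row):
    """Map a local 'vessel' row to the canonical 'Platform' shape."""
    return {
        "platform_name": row["vessel_name"],              # rename
        "max_speed_knots": row["speed_kmh"] * NM_PER_KM,  # unit translation
    }

canonical = local_to_canonical({"vessel_name": "HMS Example", "speed_kmh": 50.0})
assert canonical["platform_name"] == "HMS Example"
```

In a real federation such mappings would also handle structural and data model differences, not just field-level translation.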
Somewhat similar ideas are also reflected in the Model-View-Controller
concepts in object-oriented software development, the ANSI/SPARC three-schema
DBMS architecture [TK77], and the Federation Object Model and Simulation
Object Model concepts of DMSO's High Level Architecture [Lutz]. The basic
idea of these approaches is to address separation of concerns, or the need
to approach a complex problem by breaking it into more tractable pieces.
This is done by providing separate definitions of the modeling requirements
of distinct parts of the architecture, and addressing interoperability
by providing well-defined mappings between these parts, based on a central
definition of enterprise semantics. In the case of the ISO architecture,
the Common Schema provides the central definition of enterprise semantics,
but there is a need to address the other schema levels as well. Specialized
schemas represent a way of capturing alternative representations or the
requirements of distinct federations, and mappings between them, governed
by the Common Schema (as defining the common underlying semantics).
The DSS IAL provides what is effectively a view definition language
to deal with, e.g., screening out unnecessary object attributes when delivering
an object to an application. [CSmanagement] mentions as an issue the use
of "Schema Object Views" for "efficient schema use and expansion", and
for filtering large schema objects to provide only the specific attributes
and methods required by an application. [CSroadmap] mentions both multiple
ontological views of the repository (to provide tailored classification
structures) and repository views (for filtering schema objects based on
specific user requirements). These ideas should
be pursued, generalized, and made a more explicit part of the overall architecture
within which the Common Schema functions.
There is research on view creation in object databases (e.g., [AB91,
KKS92]), and view mapping in federated database systems (e.g., [FR97]),
that could be applied to this issue, as well as using more direct means
of defining interface mappings in individual object class definitions.
(It is worth noting that UML provides a capability for modeling objects
with multiple interfaces, although this is perhaps not the best way to
represent the multiple schemas being described here.)
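The attribute-filtering style of view discussed above can be sketched as follows (Python, purely illustrative; the class, attribute names, and data are hypothetical):

```python
# Hypothetical sketch of a "schema object view": an application declares the
# attributes it needs, and the view screens out the rest of a large schema
# object before delivery. All names are illustrative.

class View:
    def __init__(self, attributes):
        self.attributes = set(attributes)

    def project(self, schema_object):
        """Deliver only the attributes this view exposes."""
        return {k: v for k, v in schema_object.items() if k in self.attributes}

full_object = {"id": 7, "name": "Airfield-3", "runway_m": 2400,
               "logistics_codes": ["A1", "B7"], "imagery_refs": ["img-001"]}
planner_view = View(["id", "name", "runway_m"])
assert planner_view.project(full_object) == {
    "id": 7, "name": "Airfield-3", "runway_m": 2400}
```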
3.3.2 more flexible object models
In "programming in the large", when program size grows to cover many
related applications (as in C4ISR), or when a complex collection of related
programs must be distributed across diverse communities in space (across
the Internet) or time (lasting for many years), evolutionary modeling
requirements begin to dominate. Hence evolutionary modeling mechanisms
must augment the object-oriented programming language view of object models
as statically defined once and for all at some fixed definition time.
Object models (type systems) such as IDL and C++ make it relatively
difficult to change aspects of types dynamically, such as defining new
types at run time, adding attributes to individual object instances to
represent special cases, or changing subtype (inheritance)
relationships. There are a number of reasons to give serious consideration
to more flexible object modeling technology within the Common Schema activity:
-
Several ISO programs, such as ALP and Genoa, use such object modeling technologies,
and interoperability between Common Schema representations and these programs
must be defined in any case (e.g., there is explicit recognition that this
issue must be addressed in the case of ALP, but no strategy appears to
have been defined yet).
-
The Internet, including work such as HTTP-NG and XML-related activities,
is also spawning alternative object modeling techniques. It would be a
good idea for the Common Schema to pay attention to these technologies,
as the Internet represents both an extreme example of the need for adaptability,
and an environment with which DoD applications will probably have to interact
(e.g., in accessing open source information, and interoperating with COTS
software).
-
Such object modeling technology can be useful in suggesting higher-level
logical modeling requirements, e.g., dynamic type construction, multiple
classifications, etc., that the architecture might wish to support (e.g.,
in providing degrees of adaptability the architecture needs).
The capabilities provided by these technologies could, in the short term,
be reflected in UML models (UML can represent some of these forms of flexibility
now) as logical requirements to be mapped to more static (e.g.,
IDL) interfaces used in operational software, using DSS capabilities to
deal with changes as currently envisioned. In the longer term, it would be
worthwhile to consider using these technologies more directly.
In some cases, these technologies involve the potential for clients
to assume additional responsibilities in dealing with type and instance
variations (in some cases with mediator assistance, as in DSS). Agent architectures
already employ forms of run-time activity of this sort. In the long run,
the architecture needs to incorporate the ability to deal with controlled
heterogeneity (based on a careful assessment of the tradeoffs involved),
as consistent with the use of agents, open sources, COTS products, and
the Internet.
The following sections provide some further detail about some of these
technologies.
3.3.2.1 ALP LDM and HTTP-NG
The ALP Logical Data Model (LDM) [Mil98, MC98] specifically attempts to
deal with the following problems:
-
millions of types (not instances) of things, e.g., logistics asset
types
-
continuously evolving real-world entities
-
multiple, changing capabilities and roles
The LDM has adopted several techniques to deal with these problems. First,
things are primarily modeled based on their properties, rather than
what they are (as defined by a particular type). As a result, for
example, it doesn't matter whether a Tank is classified as a Vehicle or
as a Weapon, provided that it has the properties of a Vehicle and
the properties of a Weapon. Related properties are collected in property
groups (e.g., there are Vehicle property groups representing the properties
of particular kinds of vehicles, and Weapon property groups representing
the properties of particular kinds of weapons). These property groups are
defined as prototype instances, from which individual objects can derive
behavior by delegation (routing operation invocations from the original
object to the appropriate prototype instance). This both reduces the number
of classes required to model the domain objects, and allows new types of
things to be defined and created dynamically. This construction of objects
by aggregating collections of property groups is similar to the approach
used in several simulation object models developed in connection with DMSO
activities [Cot97, Dud97] (although these simulation object models do not
employ delegation).
In effect, the LDM addresses the need to dynamically construct what
are in effect new types of things. LDM does this by defining a few generic
higher level types, and using prototype instances to define collections
of properties which can be aggregated to define individual variations of
the basic types. Real world objects are formed by selecting one of the
generic types, and adding as many property groups as are required to define
the capabilities of the object. Dynamic type construction is possible because
new property groups, being object instances (not types), can be created
at run time. This effectively creates a "two-tier" type system, where strong
typing of interfaces handles only generic type checking, and other mechanisms
deal with more detailed type variants.
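The property-group mechanism described above can be sketched in a few lines
of Python. All of the names here (Asset, PropertyGroup, the individual
properties) are invented for illustration and are not ALP's actual API;
the point is only the pattern of aggregating prototype instances and
delegating to them at run time.

```python
class PropertyGroup:
    """A prototype instance bundling related properties."""
    def __init__(self, **props):
        self.__dict__.update(props)

class Asset:
    """A generic top-level type; capabilities come from property groups."""
    def __init__(self, name):
        self.name = name
        self._groups = []

    def add_group(self, group):
        # Groups are plain instances, so new ones can be added at run time.
        self._groups.append(group)

    def __getattr__(self, attr):
        # Delegate attribute lookups the Asset itself cannot satisfy
        # to its property groups, in order.
        for g in self._groups:
            if hasattr(g, attr):
                return getattr(g, attr)
        raise AttributeError(attr)

# A "Tank" needs no Tank class: it is a generic Asset aggregating
# Vehicle-like and Weapon-like property groups.
vehicle = PropertyGroup(max_speed_kph=72, fuel_capacity_l=1900)
weapon = PropertyGroup(caliber_mm=120, max_range_m=4000)

tank = Asset("M1A1")
tank.add_group(vehicle)
tank.add_group(weapon)

print(tank.max_speed_kph)  # delegated to the Vehicle group
print(tank.caliber_mm)     # delegated to the Weapon group
```

Note that whether the tank "is a" Vehicle or "is a" Weapon never arises;
only the properties it carries matter, as in the LDM approach described above.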
ALP defines an architecture based on the use of agents, which can engage
in various forms of run-time negotiation, so there is a plausible reason
in this context for doing more type-oriented checking and negotiation at
run-time. The agent, for example, may have its own ideas of what things
should be considered to have "similar" types which the architecture needs
to accommodate.
Different variants of a "two-tier" type system approach are being investigated
in the World Wide Web Consortium (W3C) HTTP-NG project
<http://www.w3.org/Protocols/HTTP-NG/>.
The HTTP-NG project is attempting to develop a generic distributed object
system to support both current Web capabilities, and the increasing use
of the Web for more general distributed applications. As part of this activity,
the project's Protocol Design Group has been investigating the problem
of type system evolution mechanisms. As part of the Internet, HTTP-NG would
face the problem of type system evolution in a particularly acute form,
which is referred to as "anarchic evolution". That is, after a given system
has been deployed, it is subject to concurrent, independent evolutionary
developments, each of which is incrementally rolled out into the deployed
system. Any new piece of the system (e.g., Web browser, Web server) is
faced with the prospect of having to interact with peers that understand
any combination of current and future extensions. It is desirable to minimize
both the application programming nuisance and the network performance costs
of coping with this situation, as well as develop solutions that are not
limited to on-line 2-party interactions. HTTP (among others) addresses
this problem with optional headers. However, the type systems of existing
distributed object systems (CORBA, DCOM, Java RMI) do not facilitate this
type of anarchic evolution as well as HTTP. Since HTTP-NG is proposing
a distributed object system to support capabilities currently provided
by HTTP, it must address these evolution requirements directly.
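The "optional header" style of tolerant evolution mentioned above can be
illustrated with a small sketch (the field names are hypothetical): a receiver
processes the fields it understands and skips extensions contributed by
independently evolved peers, rather than rejecting the message outright.

```python
# Base fields this (hypothetical) receiver was built to understand.
KNOWN_FIELDS = {"content-type", "content-length"}

def process_message(headers):
    """Split header fields into those understood and those ignored."""
    understood, ignored = {}, []
    for name, value in headers.items():
        if name.lower() in KNOWN_FIELDS:
            understood[name.lower()] = value
        else:
            # Unknown extension from an evolved peer: skip it, don't fail.
            ignored.append(name)
    return understood, ignored

msg = {
    "Content-Type": "text/plain",
    "Content-Length": "42",
    "X-Future-Extension": "whatever",  # added by an independently evolved peer
}
understood, ignored = process_message(msg)
```

This is the behavior that, as the text notes, the static type systems of
CORBA, DCOM, and Java RMI do not directly support for interface members.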
While this HTTP-NG type system evolution work is still at a relatively
early stage, and details are currently restricted to W3C members, generally
speaking, the HTTP-NG approach is to define type systems in which differences
due to "evolutionary" changes are not considered in static type checking,
but instead are checked at carefully designed points at run-time. To the
extent that HTTP-NG represents the potential future of the Web, it would
be worthwhile for the Common Schema program to be aware of these developments.
3.3.2.2 Genoa and XML
Genoa represents an application domain where an extremely flexible data/object
representation is needed due, for example, to the difficulty in anticipating
the structure of the collection of material needed to describe a given
situation. As a result, Genoa makes heavy use of property lists (sets of
attribute/value pairs, tagged data) in structuring its information. [CSdss]
notes that "The advantage of fully tagged schemes are their ability to
create data that stands alone, without any need for a centralized authority
for managing the definitions. Given sufficiently thorough metadata, useful
generic browsers, viewers, and editors can be constructed for virtually
any [data] type at all." Tagged data can be considered an incorporation
of metadata at the level of individual attributes or other content groupings
in the object representation. The individual tags can either be considered
as metadata themselves or, more accurately, as indirect references to metadata
located elsewhere which describes the meaning of the tagged information.
For example, in the case of the Web, the tags could be associated with
URLs providing direct access to the metadata describing tag semantics by
either a human user or a program.
The use of tagged data provides for an extremely flexible representation.
New tags can be freely defined and used within exchanged data, with the
definitions of the tag semantics stored on-line for access as needed. Unanticipated
combinations of tags can be assembled and exchanged between programs. Clients
and servers can be designed to ignore tags they don't understand (much as Web
browsers today do for HTML tags they don't understand). At the same time,
discipline can be imposed by requiring certain minimum sets of tags to
be included, by requiring certain combinations of tags to be always used
together, or by requiring tags to be selected from one or more controlled
vocabularies. These vocabularies could, for example, be developed and controlled
by domain-oriented groups who would provide the necessary metadata to define
tag semantics. Such constraints in effect define a type system, for which
the tagged data serves as a representation (and the type system can be
designed to support whatever requirements are needed).
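As a rough illustration of how such constraints amount to a type system, the
following sketch validates tagged records (attribute/value pairs) against a
required minimum tag set and a controlled vocabulary; the tag names and rules
are invented for this example.

```python
REQUIRED_TAGS = {"id", "type"}                     # minimum tag set
VOCABULARY = {"id", "type", "quantity", "speed"}   # controlled vocabulary

def validate(record):
    """Return a list of constraint violations for one tagged record."""
    tags = set(record)
    errors = []
    missing = REQUIRED_TAGS - tags
    if missing:
        errors.append(f"missing required tags: {sorted(missing)}")
    unknown = tags - VOCABULARY
    if unknown:
        errors.append(f"tags outside vocabulary: {sorted(unknown)}")
    return errors

# A conforming record, and one violating both kinds of constraint.
ok = {"id": "A-17", "type": "asset", "quantity": "4"}
bad = {"quantity": "4", "color": "green"}
```

In effect, REQUIRED_TAGS and VOCABULARY play the role of a type definition,
and the tagged records are its instances.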
Genoa is apparently looking into the use of W3C's Extensible Markup
Language (XML) <http://www.w3.org/XML/>
and related technologies as a representation technique for its tagged data
requirements. The Web is increasingly targeting XML as its next-generation
data representation. Unlike HTML, which defines a fixed set of tags, XML
allows the definition of customized markup languages with application-specific
tags, e.g., <QUANTITY> or <SPEED>, for representing information in
particular application domains. XML Document Type Definitions (DTDs) provide
a way to explicitly declare the tag sets, and their structure, to be used
in particular units of data.
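To make this concrete, the following sketch parses a fragment using
application-specific tags of the kind mentioned above; the element structure
is invented for illustration, and the parser is Python's standard-library
ElementTree rather than any tool used by Genoa.

```python
import xml.etree.ElementTree as ET

doc = """
<ASSET>
  <QUANTITY>4</QUANTITY>
  <SPEED units="kph">72</SPEED>
</ASSET>
"""

root = ET.fromstring(doc)
quantity = int(root.find("QUANTITY").text)
speed = root.find("SPEED")
print(quantity, speed.text, speed.get("units"))
```

A DTD, if supplied, would declare which tags may appear and how they nest,
giving the explicit tag-set declaration described above.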
In addition to these basic capabilities, other technologies related
to XML are currently under development within W3C, including
-
the XML Linking and Pointer Languages, which provide much more powerful
linking capabilities than those currently available within HTML (including
bidirectional and multi-way links, and links to internal elements within
units of data).
-
the Resource Description Framework (RDF), which provides a model for representing
metadata (including ontology-like information) in XML based on propositional
logic plus certain modalities.
-
XML namespaces, which provide a means for associating XML tags with specific
controlled vocabularies.
-
the Document Object Model (DOM) <http://www.w3.org/DOM/>,
which defines an object-oriented API for XML structures.
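Because the DOM defines a language-neutral API, the same navigation operations
recur across implementations; Python's xml.dom.minidom, for example, exposes
them directly (the document fragment below is invented for illustration):

```python
from xml.dom import minidom

# Parse a small fragment and navigate it using DOM-defined operations:
# documentElement, getElementsByTagName, firstChild.
dom = minidom.parseString("<ASSET><QUANTITY>4</QUANTITY></ASSET>")
qty_nodes = dom.documentElement.getElementsByTagName("QUANTITY")
value = qty_nodes[0].firstChild.data
print(dom.documentElement.tagName, value)
```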
In addition to representing units of data in a distributed object architecture,
XML can be used in representing other aspects of such architectures. For
example:
-
DataChannel's WebBroker [TL98] represents one of several attempts to build
a complete Web-native distributed object computing model, based on the
use of XML and HTTP. In these approaches, XML is used to represent both
object interfaces and object method invocation messages sent between objects.
-
Microsoft's Scriptlets [deB98] represents one of several approaches which
allow components (COM components in the case of Scriptlets) to be directly
written using a combination of XML and a scripting language such as JavaScript.
These and other technologies for combining Web and object concepts are
thoroughly described in two technical reports [Man98a,b] from Object Services
and Consulting.
The use of tagged data in messages can be easily accommodated within
the use of CORBA DII, is similar to the way structured messages are used
in electronic commerce applications, and can serve as a means of reducing
coupling between clients and servers [ES98]. In addition, a number of OMG
activities are currently contemplating the use of XML for various purposes,
including XMI (an OMG metadata exchange submission), the Tagged Data Facility,
the Common (Data) Warehouse Metadata RFP, and the CORBA Components submission.
These technologies, and related developments, should be carefully tracked,
as a potential means to represent the semantics defined by the Common Schema
definitions in a more flexible form.
3.4 Organizational and Community Issues
Knowledge of the Common Schema activity, and serious interaction with it,
seems to be uneven within the ISO community. Obtaining the cooperation
of the various programs involved in the architecture, and active interaction
among the people involved, will be crucial to the success of the program.
One aspect of the program (and, in fact, of ISO programs in general) which
could be significantly improved is the ease of access to program information.
Specific suggestions include:
-
Reconsider the password protection used on individual project pages. The
need to obtain multiple user-ids and passwords (including, for example,
separate user-ids and passwords to access the OMWG page, and ICS tool,
even for read-only access) inhibits cross-working among projects. This
contrasts unfavorably with the ease of access to most DMSO material.
-
Make more use of standard Internet approaches such as subject- or project-specific
email lists to foster more interaction among potential users and participants
(such lists may exist, but I was not made aware of them). It is desirable
to maximize the number of people aware of, using, and contributing to these
ideas, even when they are not necessarily part of the formal process (this,
ideally, includes members of the general public, but certainly could reasonably
include members of the general DoD community). There are certainly many
people unknown to the program within the DoD community who could probably
contribute useful ideas in this way on a relatively low-cost basis (a few
selected experts could perhaps review suggestions). The W3C, for example,
makes much faster progress than it ordinarily would through the use of
both member-only and public email lists (to which people have to subscribe,
so there would be some control over participation) focusing on specific
technical activities.
-
The OMWG Web site could be made much more of a focal point for updated
information, and interaction among program participants.
-
It would be helpful for the program if it could document its information
sources, and the methodology it is using, more thoroughly. This could forestall
numerous "did you consider this" questions from people not initially familiar
with the program.
At the same time, the Common Schema activity needs to pay more attention
to DoD standards, terms and semantics used in other programs (and in practice),
DoD programs involving large-scale database schema development, such as
the Modernized Integrated Database (MIDB)
<http://www.objs.com/ddb/9703-Dynamic-Database-II-Meeting-1-Notes.htm#MIDB>),
and other DoD programs developing object model standards, such as the DMSO
Object Model Data Dictionary (OMDD) activity. For example, the OMDD has
made a point of basing its contents, where possible, on existing data standards,
such as the Defense Data Dictionary System, and, in turn, providing input
to other standards based on its own requirements. There appears to be nothing
corresponding to this within the Common Schema program, and while there
might be good reasons for this, it appears to be an idea worth looking
into. There is certainly a need for DoD coordination of these schema/data
dictionary development activities, so as to avoid both duplication of effort
and inconsistent specifications. Another reason for looking specifically
at the OMDD activity is the need for interoperability between simulations
and actual C2 systems in some circumstances [NS98].
References
Common Schema <http://ics.les.disa.mil>
(account required)
[CSroadmap] OMWG Roadmap, draft release 0.5, 6/24/97.
[CSmanagement] Common Schema Management Overview, 5/20/97.
[CStutorial] Overview of the Object Model Working Group Common Schema,
3/10/98.
[CSschema] OMWG Command and Control Schema, v.0.5.3, 16 Oct. 1996
[CSdss] Randall Schultz, "Request for Comments on Proposed Dynamic Schema
Service (DSS) for JTF-ATD Application Development", 8/22/97.
OMWG General Overview and Update, 5/4/98.
OMWG Contractor Coordination Meeting, 2/10/98.
Schema Server Status, 2/10/98.
Genoa <http://echoleader.usae.bah.com/genoa/>
Genoa System Design Document
Genoa Products, Critical Information Packages (CIPs), and Thematic Action
Groups (TAGs) Concept and Design White Paper
Critical Information Package Concept of Operations
CrisisBrief: Concept of Operations for the Virtual Situation Book
ALP <http://alp.sra.com/alp/>
[Mil98] Stephen Milligan, "ALP Architectural Considerations", presentation
foils, 1998.
[MC98] Stephen Milligan and Todd Carrico, "Investigating Large-Scale
Agent Architectures", position paper for the OMG-DARPA Workshop on Compositional
Software Architectures, Monterey, CA, Jan. 1998 <http://www.objs.com/workshops/ws9801/index.html>.
DMSO <http://triton.dmso.mil/hla/>,
<http://hla.dmso.mil>
[NS98] J. Nielsen and M. Salisbury, "Challenges in Developing the JTLS-GCCS-NC3A
Federation", Simulation Interoperability Workshop, 1998. <http://triton.dmso.mil/hla/implement/jtls/>
[Scr97] "HLA Object Model Data Dictionary", presentation, <http://www.arlut.utexas.edu/~imewwww/index.html>
[omddhome] OMDDS Homepage, <http://s3.arlut.utexas.edu/omdds/code/index.htm>
Other References
[AB91] S. Abiteboul and A. Bonner, "Objects and Views", Proc. ACM
SIGMOD '91.
[Cot97] A. Cotton, III, "Developing a Standard Unit-Level Object Model",
Naval Postgraduate School, Sept. 1997 (NTIS ADA339220).
[deB98] M. De Bruijn, "Internet Explorer 5.0--for Intranets Only?",
WEBBuilder, 3(9), Sept. 1998 (see also <http://www.microsoft.com/xml/>).
[Dud97] D. Dudgeon, "Developing a Standard Platform-Level Army Object
Model", Naval Postgraduate School, Sept. 1997 (NTIS ADA-341525).
[ES98] P. Eeles and O. Sims, Building Business Objects, John
Wiley & Sons, 1998.
[FS97] M. Fowler (with K. Scott), UML Distilled, Addison-Wesley,
1997.
[FR97] G. Fahl and T. Risch, "Query Processing Over Object Views of
Relational Data", VLDB Journal 6(1997) 4, 261-281.
[KKS92] M. Kifer, W. Kim, and Y. Sagiv, "Querying Object-Oriented Databases",
Proc. ACM SIGMOD '92.
[Lutz] R. Lutz, "HLA Object Model Development: A Process View", <http://hla.dmso.mil>.
[Man98a] F. Manola, "Towards a Web Object Model", Technical Report,
Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom.htm>,
1998.
[Man98b] F. Manola, "Some Web Object Model Construction Technologies",
Technical Report, Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom-II.htm>,
1998.
[McK98] John McKim, "DARPA ISO Architecture Lessons Learned", position
paper for the OMG-DARPA Workshop on Compositional Software Architectures,
Monterey, CA, Jan. 1998 <http://www.objs.com/workshops/ws9801/index.html>.
[MGHH+98] F. Manola, et al., "Supporting Cooperation in Enterprise-Scale
Distributed Object Systems", in M. P. Papazoglou and G. Schlageter (eds.),
Cooperative Information Systems: Trends and Directions, Academic Press, 1998.
[SL90] A. Sheth and J. Larson, "Federated Database Systems for Managing
Distributed, Heterogeneous and Autonomous Databases", ACM Computing
Surveys, 22(3), Sept. 1990.
[TK77] D. Tsichritzis and A. Klug (eds.), The ANSI/X3/SPARC DBMS
Framework: Report of the Study Group on Database Management Systems,
AFIPS Press, Montvale, NJ, 1977.
[TL98] J. Tigue and J. Lavinder, "WebBroker: Distributed Object Communication
on the Web", W3C Note, World Wide Web Consortium, 1998 <http://www.w3.org/TR/1998/NOTE-webbroker>.
This report was prepared by Object Services
and Consulting, Inc. (OBJS) under subcontract to the Institute for Defense
Analyses (IDA) on its Task A-209, Advanced Information Technology Services
Architecture, under contract DASW01 94 C 0054 for the Defense Advanced
Research Projects Agency. Publication of this document does not indicate
endorsement by the Department of Defense, nor should the contents be construed
as reflecting the official position of that agency.
© Copyright 1998 Object Services and Consulting,
Inc. (OBJS)
© Copyright 1998 Institute for Defense Analyses
(IDA)
Permission is granted to copy this document provided this
copyright statement is retained in all copies.
Disclaimer: Neither OBJS nor IDA warrant the accuracy
or completeness of the information in this report.