10 February 1998
A number of ongoing activities are attempting to merge aspects of object models with those of the World Wide Web. This paper describes a number of these activities, with particular emphasis on those which focus on providing enhanced facilities for representing metadata for describing Web (and other) resources. The intent of this paper is to:
The use of objects in such architectures reflects the fact that advanced software development increasingly involves the use of object technology. This includes the use of object-oriented programming languages, class libraries and application development frameworks, application integration technology such as Microsoft's OLE, as well as distributed object middleware such as CORBA. It also involves the use of object analysis and design methodologies and associated tools.
This use of object technology is driven by a number of factors, including:
The third factor reflects a situation faced by many large organizations, in which a key issue is not just the development of new software, but the coordination of existing software that supports key internal processes and human activities. Mechanisms provided by object technology can help encapsulate existing systems, and unify them into higher-level processes.
The fourth factor is particularly important. It reflects the fact that, as commercial software vendors incorporate object concepts in key products, it will become more and more difficult to avoid using object technology. This is illustrated by the rapid pace at which object technology is being included in software such as DBMSs (including relational DBMSs) and other middleware, and client/server development environments. Due to this factor, organizations may be influenced to begin adopting object technology before they would ordinarily consider doing so.
At the same time, the Internet is becoming an increasingly important factor in planning for enterprise distributed computing environments. For example, companies are providing information via World Wide Web pages, as well as customer access via the Internet to such enterprise computing services as on-line ordering or order/service tracking facilities. Companies are also using Internet technology to create private Intranets, providing access to enterprise data (and, potentially, services) from throughout the enterprise in a way that is convenient and avoids proprietary network technology. Following this trend, software vendors are developing software to allow Web browsers to act as user interfaces to enterprise computing systems, e.g., to act as clients in workflow or general client/server systems. Products have also been developed that link mainframes to Web pages (e.g., translating conventional terminal sessions into HTML pages).
Organizations perceive a number of advantages in using the Web in enterprise computing. For example, Web browser software is widely available for most client platforms, and is cheaper than most alternative client applications. Web pages generally work reasonably well with a variety of browsers, and maintenance is simpler since the browser and associated software can reduce the amount of distributed software to be managed. In addition, the Web provides a representation for information which
OMG's Object Management Architecture (OMA) is an example of a distributed object architecture intended to support distributed enterprise computing applications. The OMA includes the following components:
The ORB in the OMA is defined by the CORBA specifications. An ORB does not require that the objects it supports be implemented in an object-oriented programming language. The CORBA architecture defines interfaces for connecting code and data to form object implementations, with interfaces defined by IDL, that are managed by the ORB and its supporting object services. It is this flexibility that enables ORBs to be used in connecting legacy systems and data together as components in enterprise computing architectures.
A distributed enterprise object system must provide functionality beyond that of simply delivering messages between objects. OMG's Object Services have been defined to address some of these requirements. Object Services provide the next level of structure above the basic object messaging support provided by CORBA. The services define specific types of objects (or interfaces) and relationships between them in order to support higher-level capabilities. Object Services currently defined by OMG include, among others:
If the Web is to be used as the basis of complex enterprise applications, it must provide generic capabilities similar to those provided by the OMA, although these may need to be adapted to the more open, flexible nature of the Web. Providing these capabilities involves addressing not only the provision of higher level services and facilities for the Web, but also the suitability of the basic data structuring capabilities provided by the Web (its "object model"). For example, in the case of services, search engines (a form of query service) are becoming indispensable tools, and agent technology can add additional intelligence to the searching process. Similarly, extended facilities to support transactions over the Web are being investigated. However, the ability to define and apply powerful generic services in the Web, and the ability to generally use the Web to support complex applications, depends crucially on the ability of the Web's underlying data structure to support these complex applications and services.
A more fundamental direction of efforts to address HTML limitations has been attempts to integrate aspects of object technology with the basic infrastructure of the Web. There are a number of reasons for the interest in integrating Web and object technologies:
If the Internet is to develop to support advanced application requirements, there is a need for both richer individual data structuring mechanisms, and a unifying overall framework which supports heterogeneous representations and extensibility, and provides metalevel concepts for describing and integrating them.
The intent of this paper is to describe how a number of (in some respects) separate "threads" of Web-related development can be combined to form the basis of a Web object model to address these requirements. This combination is based on the observation that the fundamental components of any object model are:
In the following sections, this paper will:
The Introduction specifically noted that what is needed to progress toward a Web object model is:
Several of the sections describe ongoing activities of the World Wide Web Consortium (W3C), particularly:
These are the same structuring requirements that apply to object state in object models; i.e., an object's state must be structured in such a way that the object methods can find the parts of the state that they need in order to execute properly. As compared with HTML, whose tags are primarily concerned with how the tagged information is to be presented, satisfying this structuring requirement involves some form of semantic markup, i.e., the ability to tag items with names that can be used to identify items based (at least to some extent) on their semantics.
This section describes a number of developments directed at dealing with the problems of providing richer data structuring capabilities for Web data.
Resource Description Messages (RDM) <http://www.w3.org/TR/NOTE-rdm>, 24 July 1996, by Darren Hardy (Netscape), is a technical specification of Resource Description Messages (RDM). RDM is used in Netscape's Catalog Server. RDM is a mechanism to discover and retrieve metadata about network-accessible resources, known as Resource Descriptions (RDs). A Resource Description consists of a list of attribute-value pairs (e.g., Author = Darren Hardy, Title = RDM) and is associated with a resource via a URL. Agents can generate RDs automatically (e.g., a WWW robot), or people can write RDs manually (e.g., a librarian or author). Once a repository of Resource Descriptions is assembled, the server can export it via RDM as a programmatic way for WWW agents to discover and retrieve the RDs.

    @DOCUMENT { http://www.netscape.com:80/
    Title{20}: Welcome to Netscape!
    Last-Modified{29}: Thu, 16 May 1996 11:45:39 GMT
    }
RDM uses Harvest's SOIF format to encode the RDs. The data model that SOIF provides is a flat name space for the attributes, and treats all values as blobs. The RDM schema definition language extends the SOIF data model by providing:
However, while SOIF supports attribute/value pairs, its structuring capabilities are not sufficiently rich to support the full structuring requirements of the Web. For example, it lacks support for nested structures, and cannot support the functionality of HTML, let alone extensions to it. It is also not well integrated with more advanced developments in Web data representation, such as XML, RDF, and DOM, described later.
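The flat attribute-value encoding used by SOIF can be read mechanically. The following is a minimal sketch in Python, not the Harvest or Netscape implementation. It assumes that each attribute has the form name{size}: value, that one space or tab follows the colon, and that the size counts characters (true for the ASCII sample used here). The function name parse_soif is hypothetical.

```python
import re

# Header of a SOIF object: "@TEMPLATE { URL"
HEADER = re.compile(r"@(\w+)\s*\{\s*(\S+)\s*")
# One attribute: "Name{size}:" followed by a single space or tab (assumption)
ATTR = re.compile(r"\s*([\w-]+)\{(\d+)\}:[ \t]")

def parse_soif(text):
    """Parse one SOIF object into (template, url, attributes)."""
    m = HEADER.match(text)
    if not m:
        raise ValueError("not a SOIF object")
    template, url = m.group(1), m.group(2)
    pos = m.end()
    attrs = {}
    while True:
        a = ATTR.match(text, pos)
        if not a:
            break  # reached the closing brace (or malformed input)
        name, size = a.group(1), int(a.group(2))
        start = a.end()
        # The declared size, not a delimiter, says where the value ends,
        # so values are treated as opaque blobs.
        attrs[name] = text[start:start + size]
        pos = start + size
    return template, url, attrs

sample = (
    "@DOCUMENT { http://www.netscape.com:80/\n"
    "Title{20}:\tWelcome to Netscape!\n"
    "Last-Modified{29}:\tThu, 16 May 1996 11:45:39 GMT\n"
    "}\n"
)
template, url, attrs = parse_soif(sample)
```

Because every value is just a sized blob in a single flat name space, there is nowhere in this structure to put a nested description, which is the limitation noted above.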
original (TSIMMIS) OEM:
    +-----+-------+------+-------+
    | oid | label | type | value |     type includes "set"
    +-----+-------+------+-------+

    +-----+-------+------+-----------------+
    | oid | label | set  | {oid, oid, ...} |
    +-----+-------+------+-----------------+

In the newer (Lore) version of OEM, the structures have been modified so that edges are labeled rather than nodes. In this scheme, a complex object consists of a set of (label,oid) pairs. These effectively represent relationships between the containing object and the target object. That is, a given (label,targetoid) pair contained in object sourceobject represents the relationship

    label(sourceobject, targetobject)

This revised structure thus more closely resembles a first order logic (FOL) structuring of data. These structures are shown in the figure below.
new (Lorel) OEM:
    atomic object
    +-----+------+-------+
    | oid | type | value |
    +-----+------+-------+

    complex object
    +-----+---------+-------------------------------------------+
    | oid | complex | value = {(label, oid), (label, oid), ...} |
    +-----+---------+-------------------------------------------+

Since individual objects do not have labels in this scheme, additional labels are introduced so that top-level objects can also have names.
As an example, a simple structure for information on books in a library might have the following structure in the TSIMMIS OEM:
    +----+---------+-----+---------------+
    | &1 | library | set | {&2, &5, ...} |
    +----+---------+-----+---------------+

    +----+------+-----+----------+
    | &2 | book | set | {&3, &4} |
    +----+------+-----+----------+

    +----+--------+--------+-----+
    | &3 | author | string | Aho |
    +----+--------+--------+-----+

    +----+-------+--------+-----------+
    | &4 | title | string | Compilers |
    +----+-------+--------+-----------+

Linearly, this might be represented as:

    <&1, library, set, {&2,&5,...} >
    <&2, book, set, {&3,&4} >
    <&3, author, string, Aho >
    <&4, title, string, Compilers >

In the Lorel OEM, the same structure would be:

             +----+-----+-----------------------------+
    library: | &1 | set | {(book,&2), (book,&5), ...} |
             +----+-----+-----------------------------+

             +----+-----+---------------------------+
             | &2 | set | {(author,&3), (title,&4)} |
             +----+-----+---------------------------+

             +----+--------+-----+
             | &3 | string | Aho |
             +----+--------+-----+

             +----+--------+-----------+
             | &4 | string | Compilers |
             +----+--------+-----------+

OEM can represent complex graph structures, similar to those that exist in the Web. It is a "lightweight" object model in the sense that:
While these models are intended to represent data in (or extracted from) Web and other resources, and hence constitute a form of metadata, the capabilities of these models for representing metadata that might already exist about a resource, and for representing their own metadata, are somewhat undeveloped. They do not explicitly consider capturing type and schema information where it exists, or linking that type information to the structures it describes. For example, when OEM is used to capture a database structure, a schema actually exists for this data, unlike Web resources. It should be possible to capture both the data and the schema in OEM, and link them together. This is not really followed up in existing OEM work (although it could be). Related work has been done on a concept called DataGuides [GW97, NUWC97]. A DataGuide resembles a schema, but is derived dynamically as a summary of the structures that have been encountered, and only approximately describes the structures that may actually be encountered. This is appropriate for unstructured and semistructured data, but does not fully represent the semantics of an actual schema.
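The relationship between the two OEM variants can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the TSIMMIS or Lore systems. It stores the node-labeled library example as a dictionary, moves each label onto the edge that points to the object, and then follows a path of edge labels. The function names to_edge_labeled and follow are hypothetical, and the elided object &5 from the example is left out because its contents are not given.

```python
# Node-labeled (TSIMMIS-style) OEM: oid -> (label, type, value).
# A "set" value holds the oids of the member objects.
tsimmis = {
    "&1": ("library", "set", ["&2"]),   # &5 and later members are elided in the text
    "&2": ("book", "set", ["&3", "&4"]),
    "&3": ("author", "string", "Aho"),
    "&4": ("title", "string", "Compilers"),
}

def to_edge_labeled(nodes, root):
    """Convert to the edge-labeled form: complex objects hold (label, oid) pairs."""
    objects = {}
    for oid, (_label, typ, value) in nodes.items():
        if typ == "set":
            # The label of each child moves onto the edge that reaches it.
            objects[oid] = ("complex", [(nodes[c][0], c) for c in value])
        else:
            objects[oid] = (typ, value)
    # Objects no longer carry labels, so the root gets an explicit name.
    names = {nodes[root][0]: root}
    return names, objects

def follow(objects, oid, path):
    """Return the values reached from oid by following a list of edge labels."""
    frontier = [oid]
    for label in path:
        reached = []
        for o in frontier:
            kind, value = objects[o]
            if kind == "complex":
                reached.extend(t for (l, t) in value if l == label)
        frontier = reached
    return [objects[o][1] for o in frontier]

names, objects = to_edge_labeled(tsimmis, "&1")
titles = follow(objects, names["library"], ["book", "title"])
```

Note that nothing in either structure records a schema: the set of label paths that happen to exist can only be discovered by traversal, which is the gap that summaries such as DataGuides try to fill.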
These models as currently implemented are also not well integrated with emerging Web technologies, such as the XML, DOM, and RDF work described below, that are likely to change the basic nature of the Web's representation. The approach taken in work such as OEM has so far assumed that the Web will continue to be largely unstructured or semistructured, based on HTML, and that data from the Web will need to be extracted into separate OEM structures (or interpreted as if it had been) in order to perform database-like manipulations on it. On the other hand, the new Web technologies provide a higher level, more semantic representational structure, which can start with the assumption that information authors themselves have support to provide more semantic structural information. Our work on a Web object model is based on the idea that, with this additional representation support, it makes sense to investigate building more database-like capabilities within the Web infrastructure itself, rather than assuming that almost all of these database capabilities need to be added externally. Since Web structures are unlikely to become as regular as conventional databases, some of the principles developed by work such as OEM will continue to be important (and, in fact, as a model, OEM has many similarities with work such as RDF described later in this report). However, it seems likely that these principles will need to be applied in the context of representations such as XML and DOM, used directly as the basis of an enhanced Web infrastructure.
    ontology(o_857)
    ontology_name(o_857,'healthcare')
    ontology_frame(o_857,f_123)
    frame(f_123)
    frame_name(f_123,'encounter_drg')
    slot(s_345)
    frame_slot(f_123,s_345)
    slot_name(s_345,'patient_age')
    constraint(c_674)
    slot_constraint(s_345,c_674)
    constraint_expression(c_674,[[gt,'patient_age',43],[lt,'patient_age',75]])

The example illustrates that the KIF representation of data is based on the use of attribute/value pairs; in fact, this is a direct representation of the way this information might be expressed in first-order logic. This also illustrates the fact that a FOL representation necessarily introduces a number of "intermediate" object identifiers (like o_857 and f_123), in order to assert the identity of distinct concepts, and to represent relationships among the various parts of the description. This is similar to the way that OEM introduces identifiers for the individual parts of a resource description. The KIF example particularly illustrates the use of such identifiers in defining namespaces like frames or ontologies, which qualify contained information.
Like OEM, KIF is capable of representing arbitrary graph structures. Moreover, KIF illustrates the importance of identifying parts of a data structure representation with logical assertions in conveying semantics between applications. Section 3 will describe how this principle serves as the basis of a formal Web object model definition. However, while KIF is widely used for knowledge interchange, it, like OEM, is not well integrated with emerging Web infrastructure technologies.
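The role of the intermediate identifiers can be seen by treating the assertions as plain tuples and joining them. The sketch below is a hypothetical illustration in Python, not KIF syntax or any KIF tool: the FACTS list repeats a subset of the assertions from the example, and the helper names values, subjects, and slot_names are assumptions made for this sketch.

```python
# Each assertion is a tuple: (predicate, argument, ...).
FACTS = [
    ("ontology", "o_857"),
    ("ontology_name", "o_857", "healthcare"),
    ("ontology_frame", "o_857", "f_123"),
    ("frame", "f_123"),
    ("frame_name", "f_123", "encounter_drg"),
    ("slot", "s_345"),
    ("frame_slot", "f_123", "s_345"),
    ("slot_name", "s_345", "patient_age"),
]

def values(predicate, subject):
    """Second arguments of all binary facts predicate(subject, x)."""
    return [f[2] for f in FACTS
            if len(f) == 3 and f[0] == predicate and f[1] == subject]

def subjects(predicate, obj):
    """First arguments of all binary facts predicate(x, obj)."""
    return [f[1] for f in FACTS
            if len(f) == 3 and f[0] == predicate and f[2] == obj]

def slot_names(ontology_name):
    """Names of all slots in all frames of the named ontology."""
    names = []
    for ontology in subjects("ontology_name", ontology_name):
        for frame in values("ontology_frame", ontology):
            for slot in values("frame_slot", frame):
                names.extend(values("slot_name", slot))
    return names

healthcare_slots = slot_names("healthcare")
```

The query never mentions o_857, f_123, or s_345 directly; those identifiers exist only to connect one assertion to the next, which is the same job that oids perform in OEM.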
Because authors and providers can design their own document types using XML, browsers can benefit from improved facilities, and applications can use tailored markup to process data. As a result, XML provides direct support for using application-specific tagged data items (attribute/value pairs) in Web resources, as opposed to the current need to use ad hoc encodings of data items in terms of HTML tags. [KR97] provides a useful overview of the potential benefits of using XML in Web-related applications.
Although XML could eventually completely replace HTML, XML and HTML are expected to coexist for some time. In some cases, applications may wish to define entirely separate XML documents for their own processing, and convert the XML to HTML for display purposes. Alternatively, applications may wish to continue using HTML pages as their primary document format, embedding XML within the HTML for application-specific purposes. For example, [Hop97] describes the use of blocks of XML markup enclosed by <XML> and </XML> tags within an HTML document for this purpose.
XML has considerable industry support, e.g., from Netscape, Microsoft, and Sun. For example, Microsoft has built an XML parser into Internet Explorer 4.0 (which uses XML for several applications), has made available XML parsers in Java and C++, together with links to other XML tools (see http://www.microsoft.com/xml/), and has indicated that it will use XML in future versions of Microsoft Office products. Microsoft has also contributed to a number of proposals to W3C on the use of XML as a base for various purposes (some of which will be discussed in later sections). Netscape has said it will support XML via the Meta Content Framework (described in Section 2.2) in a future version of its Communicator product. Work is also underway on tying XML to Java in a number of ways. Other commercial vendors are also developing XML-related software tools. In addition, a number of XML tools are available for free non-commercial use. A list of some of these tools is available at the W3C XML Web page identified above.
A number of industry groups have defined SGML Document Type Definitions (DTDs) for their documents (e.g., the U.S. Defense Department, which requires much of its documentation to be submitted according to defined SGML DTDs); in many cases these could either be used with XML directly, or converted in a straightforward fashion. Work is already underway to define XML-based data exchange formats in both the chemical and healthcare communities. Work has also been done on other applications of XML, e.g., an Ontology Markup Language (OML) <http://wave.eecs.wsu.edu/WAVE/Ontologies/OML/OML-DTD.html> for representing ontologies in XML.
The W3C XML specification has several parts:
An XML document may be either valid or well-formed. A valid XML document is well-formed, and has a DTD. The document begins with a declaration of its DTD. This may include a pointer to an external document (a local file or the URL of a DTD that can be retrieved over the network) that contains a subset of the required markup declarations (called the external subset), and may also include an internal subset of markup declarations contained directly within the document. The external and internal subsets, taken together, constitute the complete DTD of the document. The DTD effectively defines a grammar which defines a class of documents. Normally, the bulk of the markup declarations appear in the external subset, which is referred to by all documents of the same class. If both external and internal subsets are used, the XML processor must read the internal subset first, then the external subset. This allows the entity and attribute declarations in the internal subset to take precedence over those in the external subset (thus allowing local variants in documents of the same class). XML DTDs can also be composed, so that new document types can be created from existing ones.
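The precedence rule for the two subsets amounts to "the first declaration read wins". The following Python sketch illustrates only that rule; it is not an XML processor, and the function name effective_declarations and the sample entity names are hypothetical.

```python
def effective_declarations(internal, external):
    """Merge declarations so that the first one read for a name is binding.

    Each subset is a list of (name, value) pairs in document order.
    The internal subset is read first, so its declarations take
    precedence over external declarations of the same name.
    """
    merged = {}
    for subset in (internal, external):
        for name, value in subset:
            merged.setdefault(name, value)  # keep the earliest binding
    return merged

# Hypothetical entity declarations for one document of a class.
internal = [("company", "Local Branch Office")]
external = [("company", "Head Office"), ("copyright", "(c) 1998")]

declarations = effective_declarations(internal, external)
```

Here the document overrides the class-wide value of company while inheriting copyright unchanged, which is the kind of local variant described above.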
A well-formed XML document can be used without a DTD, but must follow a number of simple rules to ensure that it can be parsed correctly. These rules require, among other things, that:
    <!DOCTYPE bCard "http://www.objs.com/schemas/bCard">
    <bCard>
      <?xml default bCard
            firstname = ""
            lastname = ""
            company = ""
            email = ""
            webpage = ""
      ?>
      <bCard
            firstname = "Frank"
            lastname = "Manola"
            company = "Object Services and Consulting"
            email = "fmanola@objs.com"
            webpage = "http://www.objs.com/manola.htm"
      />
      <bCard
            firstname = "Craig"
            lastname = "Thompson"
            company = "Object Services and Consulting"
            email = "thompson@objs.com"
            webpage = "http://www.objs.com/thompson.htm"
      />
    </bCard>

The default specification ensures that every tag has the same number of attribute-value pairs.
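As a rough illustration of how an application might consume the attribute-based representation, the Python sketch below reads the two business cards with the standard library's xml.etree.ElementTree module. The document type declaration and the default processing instruction are omitted here, on the assumption that a current XML 1.0 parser would reject that draft-era syntax; everything else is taken from the example.

```python
import xml.etree.ElementTree as ET

# The bCard example, without the DOCTYPE line and the "default"
# processing instruction (draft-era syntax, omitted as an assumption).
doc = """<bCard>
  <bCard firstname="Frank" lastname="Manola"
         company="Object Services and Consulting"
         email="fmanola@objs.com"
         webpage="http://www.objs.com/manola.htm"/>
  <bCard firstname="Craig" lastname="Thompson"
         company="Object Services and Consulting"
         email="thompson@objs.com"
         webpage="http://www.objs.com/thompson.htm"/>
</bCard>"""

root = ET.fromstring(doc)

# Each child element becomes one record of attribute/value pairs.
cards = [dict(card.attrib) for card in root]
emails = [card["email"] for card in cards]
```

Each card arrives as a flat dictionary, so the attribute form works well for simple records but cannot by itself express nested structure inside a field.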
An alternative representation uses different tags, rather than XML attributes, to identify the meaning of the content. Using this approach, the same content would be represented as:
    <bCard>
      <FIRSTNAME>Frank</FIRSTNAME>
      <LASTNAME>Manola</LASTNAME>
      <COMPANY>Object Services and Consulting</COMPANY>
      <EMAIL>fmanola@objs.com</EMAIL>
      <WEBPAGE>http://www.objs.com/manola.htm</WEBPAGE>
    </bCard>
    <bCard>
      <FIRSTNAME> Craig </FIRSTNAME>
      <LASTNAME> Thompson </LASTNAME>
      <COMPANY>Object Services and Consulting</COMPANY>
      <EMAIL> thompson@objs.com </EMAIL>
      <WEBPAGE>http://www.objs.com/thompson.htm</WEBPAGE>
    </bCard>

The paper XML representation of a relational database <http://www.w3.org/XML/RDB.html> uses a relational database as a simple example of how to represent more complex structured information in XML. A relational database consists of a set of tables, where each table is a set of records. A record in turn is a set of fields and each field is a pair field-name/field-value. All records in a particular table have the same number of fields with the same field-names. This description suggests that a database could be represented as a hierarchy of depth four: the database consists of a set of tables, which in turn consist of rows, which in turn consist of fields. The following example, taken from the cited paper, describes a possible XML representation of a single database with two tables:
    <!doctype mydata "http://www.w3.org/mydata">
    <mydata>
      <authors>
        <author>
          <name>Robert Roberts</name>
          <address>10 Tenth St, Decapolis</address>
          <editor>Ella Ellis</editor>
          <ms type="blob">ftp://docs/rr-10</ms>
          <born>1960/05/26</born>
        </author>
        <author>
          <name>Tom Thomas</name>
          <address>2 Second Av, Duo-Duo</address>
          <editor>Ella Ellis</editor>
          <ms type="blob">ftp://docs/tt-2</ms>
        </author>
        <author>
          <name>Mark Marks</name>
          <address>1 Premier, Maintown</address>
          <editor>Ella Ellis</editor>
          <ms type="blob">ftp://docs/mm-1</ms>
        </author>
      </authors>
      <editors>
        <editor>
          <name>Ella Ellis</name>
          <telephone>7356</telephone>
        </editor>
      </editors>
    </mydata>

The representation is human-readable, but fairly verbose (since XML in general is verbose). However, it compresses well with standard compression tools. It is also easy to print the database (or a part of it) with standard XML browsers and a simple style sheet.
The database is modeled with an XML document node and its associated element node:

    <!doctype name "url">
    <name>
      table1
      table2
      ...
      tablen
    </name>

The name is arbitrary. The url is optional, but can be used to point to information about the database. The order of the tables is also arbitrary, since a relational database defines no ordering on them. Each table of the database is represented by an XML element node with the records as its children:

    <name>
      record1
      record2
      ...
      recordm
    </name>

The name is the name of the table. The order of the records is arbitrary, since the relational data model defines no ordering on them. A row is also represented by an element node, with its fields as children:

    <name>
      field1
      field2
      ...
      fieldm
    </name>

The name is the name of the row type (this was not required in the original relational model, but the current specification allows definition of row types); the name is required in XML anyway. The order of the fields is arbitrary. A field is represented as an element node with a data node as its only child:

    <name type="t">
      d
    </name>

If d is omitted, it means the value of the fields is the empty string. The value of t indicates the type of the value (such as string, number, boolean, date). If the type attribute is omitted, the type can be assumed to be `string.'
This example illustrates that XML tags can (and will) represent concepts at multiple levels of abstraction. The example defines a specific four-level hierarchy, but does not explicitly define the relational model and indicate the hierarchical relationships among the various relational constructs. In order to do this in a generic way for all relational databases, there would need to be explicit tags such as <SCHEMA>, <TABLE>, <ROW>, etc., and a specification of how they should be nested. This is metalevel information as far as the XML representation is concerned, and could be specified in the DTD. The definition of models, such as the relational model, for organizing data for specific purposes, is independent of XML, and needs to be done separately. The definition of such models (in some cases using XML as their representation) is discussed in the next section.
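Because the nesting convention is only implicit, an application has to know it in advance. The Python sketch below hard-codes the four-level convention (database, table, record, field) and applies the stated defaults: a missing type attribute means string, and a missing data node means the empty string. It is an illustrative reading of the example, not code from the cited paper; the function name load_database is hypothetical and the sample is an abbreviated copy of the example data without its document type declaration.

```python
import xml.etree.ElementTree as ET

def load_database(xml_text):
    """Read database -> tables -> records -> fields by nesting depth alone."""
    root = ET.fromstring(xml_text)
    tables = {}
    for table in root:                      # level 2: tables
        rows = []
        for record in table:                # level 3: records (rows)
            row = {}
            for field in record:            # level 4: fields
                field_type = field.get("type", "string")  # default type
                row[field.tag] = (field_type, field.text or "")
            rows.append(row)
        tables[table.tag] = rows
    return root.tag, tables

sample = """<mydata>
  <authors>
    <author>
      <name>Robert Roberts</name>
      <ms type="blob">ftp://docs/rr-10</ms>
    </author>
  </authors>
  <editors>
    <editor>
      <name>Ella Ellis</name>
      <telephone>7356</telephone>
    </editor>
  </editors>
</mydata>"""

db_name, tables = load_database(sample)
```

Nothing in the document tells the reader that authors is a table rather than a record; the function simply assumes it from the depth, which is exactly the metalevel information that explicit tags or a DTD would supply.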
An XML document consists of text, and is basically a linearization of a tree structure. At every node in the tree there are several character strings. The tree structure and character strings together form the information content of an XML document. Some of the character strings serve to define the tree structure; others are there to define content. In addition to the basic tree structure, there are mechanisms to define connections between arbitrary nodes in the tree. For example, in the following document there is a root node with three children, with one of the children containing a link to one of the other children:
    <p>
      <q id="x7">The first child of type q</q>
      <q id="x8">The second child of type q</q>
      <q href="#x7">The third child of type q</q>
    </p>

In this case, the third child contains an href attribute which points to the first child, using its id attribute as an identifier.
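Resolving such an intra-document reference turns the tree into a graph. A minimal sketch in Python, using the standard library's xml.etree.ElementTree module, builds an index of id attributes and then looks up the target of an href of the form #id. The helper name resolve is hypothetical, and only this simple fragment form is handled.

```python
import xml.etree.ElementTree as ET

doc = """<p>
  <q id="x7">The first child of type q</q>
  <q id="x8">The second child of type q</q>
  <q href="#x7">The third child of type q</q>
</p>"""

root = ET.fromstring(doc)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

def resolve(element):
    """Return the element an '#id' href points to, or None."""
    href = element.get("href", "")
    if href.startswith("#"):
        return by_id.get(href[1:])
    return None

third = root[2]
target = resolve(third)
```

After resolution, the third child has two relationships: its parent in the tree, and the linked first child reached through the identifier.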
The XML linking model is described in the XLL draft <http://www.w3.org/TR/WD-xml-link>. The full hypertext linking capabilities of XML are much more powerful than those of HTML, and are based on more powerful hypertext technology such as described in HyTime [ISO92] <http://www.hytime.org/> and the Text Encoding Initiative (TEI) <http://www.uic.edu/orgs/tei/>. The current specification supports both conventional URLs, and TEI extended pointers. The latter provide support for bidirectional and multi-way links, as well as links to a span of text (i.e., a subset of the document) within the same or other documents.
XSL <http://www.w3.org/TR/NOTE-XSL> is a submission defining stylesheet capabilities for XML documents. XML stylesheets enable formatting information to be associated with elements in a source document to produce formatted output. XML stylesheet capabilities are based on a subset of those defined in the ISO standard Document Style Semantics and Specification Language (DSSSL) [ISO96] used in formatting SGML documents. The formatted output is created by formatting a tree of flow objects. A flow object has a class, which represents a kind of formatting task, together with a set of named characteristics, which further specify the formatting. The association of elements in the source document tree to flow objects is defined using construction rules. A construction rule contains a pattern to identify specific elements in the source tree, and an action to specify a resulting subtree of flow objects. The stylesheet processor recursively processes source elements to produce a complete flow object tree which defines how the document is to be presented.
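The recursive construction of a flow object tree can be sketched as follows. This Python fragment illustrates the idea only and does not use XSL or DSSSL syntax: the RULES table, the element names doc, title, and para, the flow object class names, and the characteristic font-weight are all hypothetical choices made for the sketch, and a pattern here is simply an element name.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class FlowObject:
    cls: str                      # the kind of formatting task
    characteristics: dict         # named characteristics refining the formatting
    children: list = field(default_factory=list)
    text: str = ""

# Construction rules: pattern (an element name) -> action
# (flow object class plus characteristics). All names are illustrative.
RULES = {
    "doc": ("sequence", {}),
    "title": ("paragraph", {"font-weight": "bold"}),
    "para": ("paragraph", {}),
}

def construct(element):
    """Apply the matching rule to an element, then recurse into its children."""
    cls, characteristics = RULES.get(element.tag, ("sequence", {}))
    flow = FlowObject(cls, dict(characteristics),
                      text=(element.text or "").strip())
    flow.children = [construct(child) for child in element]
    return flow

source = ET.fromstring("<doc><title>Report</title><para>Body text.</para></doc>")
tree = construct(source)
```

The result mirrors the source tree but carries formatting classes and characteristics instead of element names, which is what a formatter would then lay out.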
The XML working group is also currently developing a Namespace facility <http://www.w3.org/TR/1998/NOTE-xml-names> that will allow Generic Identifiers (tag names) to have a prefix which will make them unique and will prevent name clashes when developing documents that mix elements from different schemas. This facility allows a document's prolog to contain a set of Processing Instructions (an SGML concept) of the form:
    <?xml:namespace name="some-uri" as="some-abbreviation"?>

for example

    <?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
    <?xml:namespace name="http://www.purl.org/DublinCore/schema" as="DC"?>

Elements in the document may then use generic identifiers of the form <RDF:assertions> or <DC:Title>. Those element names would expand to URIs such as http://www.w3.org/schemas/rdf-schema#assertions. This work is still under development, and the details of the final specification may differ from those described here.
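The expansion of prefixed names can be sketched in a few lines of Python. This follows the draft form described above, in which a declaration pairs a URI with an abbreviation and an expanded name is the URI, a # character, and the local name. The function names read_prefixes and expand are hypothetical, and the regular expression handles only declarations written exactly in the form shown.

```python
import re

# Matches <?xml:namespace name="URI" as="ABBREVIATION"?>
DECLARATION = re.compile(
    r'<\?xml:namespace\s+name="([^"]+)"\s+as="([^"]+)"\s*\?>'
)

def read_prefixes(prolog):
    """Map each declared abbreviation to its schema URI."""
    return {abbreviation: uri for uri, abbreviation in DECLARATION.findall(prolog)}

def expand(name, prefixes):
    """Expand 'PREFIX:local' to 'URI#local'; leave other names unchanged."""
    prefix, separator, local = name.partition(":")
    if not separator or prefix not in prefixes:
        return name
    return prefixes[prefix] + "#" + local

prolog = (
    '<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>\n'
    '<?xml:namespace name="http://www.purl.org/DublinCore/schema" as="DC"?>\n'
)
prefixes = read_prefixes(prolog)
expanded = expand("RDF:assertions", prefixes)
```

Two schemas that both define an element called Title would expand to different URIs, so a document can mix them without a clash.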
XML provides basic tagged value support, as well as support for nesting, and enhanced link capabilities. Because the Web community is increasingly targeting XML as its "next generation Web representation", the Web object model described in Section 3 uses XML as its basic representation of object state. However, additional concepts must also be defined to apply XML to extended data and metadata structuring requirements, and particularly the requirements for a Web object model that go beyond a richer state representation. Some of these requirements are illustrated both by the relational database example above, and by the RDF and related efforts described in the next section. These efforts generally involve defining data model concepts for representing specific kinds of data (as the relational model does for database data), and then using the tagged value structures supported by XML as their representation. These models support various ways of using identifier concepts (URLs plus other identifier concepts) to provide support for graph structured data. An additional general requirement, not generally addressed by Web-related activities, is the definition of structured database capabilities (e.g., an algebra or calculus to serve as the basis for database-like query and view facilities for XML data).
A data model defines one level of "what to represent". For example, the relational data model defines structuring concepts such as rows and tables, and provides one basic organizational framework for representing data. The example from the previous section of how to represent relational data in XML illustrated how using the relational model imposed additional structure on the XML representation. Defining a data model for data represented in XML both suggests specific structuring concepts for using XML to organize data, and may also involve the specification of certain standard tags or attributes (like <TABLE>) to reflect those concepts. Use of particular data models (represented using techniques such as XML) regularizes the structures that may be encountered, and potentially simplifies the task of applications that process those structures.
An additional level of "what to represent" is provided by standardizing the use of domain-specific attribute/value pairs and document structures (e.g., standards for specific kinds of reports or forms). SGML and XML DTDs constitute one way to specify such standards, and there are already numerous SGML DTDs in use for this purpose (these could, in most cases, be easily adapted for use with XML).
An important source of efforts to develop such higher-level model specifications for use on the Web has been work on developing representation techniques for Web metadata, i.e., data available via the Web that describes or helps interpret either other Web resources or non-Web resources. This metadata is used both to facilitate Web searches for relevant data, and to either provide direct access to it (if it is Web-accessible) or at least indicate its existence and possibly describe how to obtain it. The reason why the development of metadata representations has driven the development of higher-level models is that the metadata is intended to support indexing, searching, and other automated processes that require more structure than may be present in the data itself. Metadata requirements have also driven the development of structured representations themselves. For example, the SOIF format described in Section 2.1.1 was developed to represent Web metadata.
Efforts to develop enhanced metadata capabilities have involved several types of activity (a given effort may bundle more than one of them):
The Warwick Framework has two fundamental components: packages, which are typed metadata sets, and containers, which are the units for aggregating packages.
A container may be either transient or persistent. In its transient form, it exists as a transport object between and among repositories, clients, and agents. In its persistent form, it exists as a first-class object in the information infrastructure. That is, it is stored on one or more servers and is accessible from these servers using a globally accessible identifier (URI). A container may also be wrapped within another object (i.e., one that is a wrapper for both data and metadata). In this case the "wrapper" object will have a URI rather than, or in addition to, the metadata container itself.
Independent of the implementation, the only operation defined for a container is one that returns a sequence of packages in the container. There is no provision in this operation for ordering the members of this sequence and thus no way for a client to assume that one package is more significant or "better" than another.
Each package is a typed object; its type may be determined after access by a client or agent. Packages are of three types:
+--------------------+
|     container      |
|                    |
|  +---------------+ |
|  |    package    | |
|  | (Dublin Core) | |
|  +---------------+ |
|  +---------------+ |
|  |    package    | |
|  | (MARC Record) | |
|  +---------------+ |       +------------------------+
|  +---------------+ |  URI  |        package         |
|  |    package    |-+------>| (terms and conditions) |
|  |  (indirect)   | |       +------------------------+
|  +---------------+ |
+--------------------+
Figure 1 illustrates a simple example of a Warwick Framework container. The container in this example contains three logical packages of metadata. The first two, a Dublin Core record and a MARC record, are contained within the container as a pair of packages. The third metadata set, which defines the terms and conditions for access to the content object, is referenced indirectly via a URI in the container (the syntax for terms and conditions metadata and administrative metadata is not yet defined).
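The container in Figure 1 can be approximated in a few lines of Python. This is only an illustrative sketch, not part of any Warwick Framework implementation proposal: the class names, the `content` and `uri` fields, the sample metadata values, and the `http://example.org/terms/42` address are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Package:
    """A typed metadata set (hypothetical model of a Warwick Framework package)."""
    package_type: str               # e.g. "Dublin Core", "MARC Record"
    content: Optional[dict] = None  # metadata held directly in the container
    uri: Optional[str] = None       # reference to metadata stored elsewhere

    @property
    def is_indirect(self) -> bool:
        return self.uri is not None


@dataclass
class Container:
    """Aggregates packages; the only operation is listing them."""
    _packages: List[Package] = field(default_factory=list)

    def packages(self) -> List[Package]:
        # The order of the returned sequence carries no meaning, so clients
        # must not treat an earlier package as more significant.
        return list(self._packages)


# The three logical packages of Figure 1 (sample values are invented).
figure_1 = Container([
    Package("Dublin Core", content={"title": "An Example Document"}),
    Package("MARC Record", content={"245": "An Example Document"}),
    Package("terms and conditions", uri="http://example.org/terms/42"),
])

for p in figure_1.packages():
    where = f"indirect via {p.uri}" if p.is_indirect else "held in container"
    print(f"{p.package_type}: {where}")
```

A client that receives such a container would inspect each package's type after access and process only the types it understands.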
The mechanisms for associating a Warwick Framework container with a content object (i.e., a document) depend on the implementation of the Framework. The proposed implementations discussed in the cited reference illustrate some of the options. For example, a simple Warwick Framework container may be embedded in a document, as illustrated in the HTML implementation proposal; or an HTML document can include a link to a container stored as a separate file. On the other hand, as illustrated in the distributed object proposal, a container may be a logical component of a so-called digital object, which is a data structure for representing networked objects.
The reverse linkage, which ties a container to a piece of intellectual content, is also relevant, since anyone can create descriptive data for a networked resource, without permission or knowledge of the owner or manager of that resource. This metadata is fundamentally different from the metadata that the owner of a resource chooses to link to or embed with the resource. As a result, an informal distinction is made between two categories of metadata containers, which both have the same implementation:
Another motivation was the recognition that there are many other kinds of metadata besides that used for descriptive cataloging that may need to be recorded and organized. These kinds of metadata include, among others:
Each rating service picks a URL as its unique identifier, and includes this unique identifier in all content labels the service produces. It is intended that the URL, in addition to simply being a unique identifier, also refer to an HTML document which describes both the rating service and the rating system used by the service (possibly via a link to a separate document).
A rating system specifies the dimensions used for labeling, the scale of allowable values for each dimension, and a description of the criteria used in assigning values. For example, the MPAA rates movies in the U.S. based on a single dimension with allowable values G, PG, PG-13, R, and NC-17. The current PICS specification allows only floating point values.
Each rating system is identified by a URL. This allows multiple services to use the same rating system, and refer to it by its identifier. The URL identifying a rating system can be accessed to obtain a human-readable description of the rating system.
A content label, or rating, contains information about a document. The format of a content label is defined in the Label Format document referenced above, and has three parts:
When an end-user attempts to access a particular URL, a software filter built into the Web client (browser) fetches the document. The client also accesses the document's content label(s) based on rating systems that the client has been told to pay attention to. The client then compares the content label to the rating-system-specified values that the client has been told to base access decisions on, and either allows or denies access to the document.
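The access decision described above can be sketched as a simple comparison. The data shapes here are assumptions made for illustration (PICS does not define a client-side policy format): a policy maps each rating system URL the client pays attention to onto the highest acceptable value per category, and the labels for a document are assumed to have been fetched and parsed already. The URL `http://ratings.example.org/v1` is hypothetical.

```python
from typing import Dict

# Hypothetical shapes: rating system URL -> {category transmit name: value}.
Policy = Dict[str, Dict[str, float]]   # maximum acceptable value per category
Labels = Dict[str, Dict[str, float]]   # values assigned to one document


def allow_access(labels: Labels, policy: Policy,
                 allow_unlabeled: bool = False) -> bool:
    """Return True if every rating the policy cares about is within its limit."""
    for system_url, limits in policy.items():
        ratings = labels.get(system_url, {})
        for category, maximum in limits.items():
            if category not in ratings:
                # No rating available: fall back to the client's default.
                if not allow_unlabeled:
                    return False
                continue
            if ratings[category] > maximum:
                return False
    return True


policy = {"http://ratings.example.org/v1": {"violence": 2}}
document_labels = {"http://ratings.example.org/v1": {"violence": 1, "language": 3}}
print(allow_access(document_labels, policy))
```

Whether unlabeled documents are blocked or admitted is a policy choice left to the client; the `allow_unlabeled` flag simply makes that choice explicit.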
Content labels may be:
((PICS-version 1.1)
 (rating-system "http://www.gcf.org/ratings")
 (rating-service "http://www.gcf.org/v1.0/")
 (icon "icons/gcf.gif")
 (name "The Good Clean Fun Rating System")
 (description "Everything you ever wanted to know about soap, cleaners, and related products")
 (category
  (transmit-as "suds")
  (name "Soapsuds Index")
  (min 0.0)
  (max 1.0))
 (category
  (transmit-as "density")
  (name "suds density")
  (label (name "none") (value 0) (icon "icons/none.gif"))
  (label (name "lots") (value 1) (icon "icons/lots.gif")))
 (category
  (transmit-as "subject")
  (name "document subject")
  (multivalue true)
  (unordered true)
  (label (name "soap") (value 0))
  (label (name "water") (value 1))
  (label (name "soapdish") (value 2))
  (label-only))
 (category
  (transmit-as "color")
  (name "picture color")
  (integer)
  (category
   (transmit-as "hue")
   (label (name "blue") (value 0))
   (label (name "red") (value 1))
   (label (name "green") (value 2)))
  (category
   (transmit-as "intensity")
   (min 0)
   (max 255))))

There are four top-level categories in this rating system. Each category has a short transmission name to be used in labels (e.g., "suds"); some also have longer names that are more easily understood (e.g., "Soapsuds Index"). The "Soapsuds Index" category rates soapsuds on a scale between 0.0 and 1.0 inclusive. The "suds density" category can have ratings from negative to positive infinity, but there are two values that have names and icons associated with them. The name "none" is the same as 0, and "lots" is the same as 1. The "document subject" category only allows the values 0, 1, and 2, but a single document can have any combination of these values. The "picture color" category has two sub-categories.
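To make the constraints in this rating system concrete, the following sketch checks proposed rating values against a hand-transcribed subset of the categories. It is not a PICS parser: the `Category` class and `check` function are hypothetical, the category table is typed in by hand rather than read from the rating system file, and the `(integer)` and `(unordered)` options are deliberately ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Category:
    """Hypothetical in-memory form of one PICS category."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    multivalue: bool = False
    label_only: bool = False
    labels: Dict[str, float] = field(default_factory=dict)  # name -> value


# Hand transcription of part of the rating system, keyed by transmit name.
GCF = {
    "suds": Category(minimum=0.0, maximum=1.0),
    "density": Category(labels={"none": 0, "lots": 1}),
    "subject": Category(multivalue=True, label_only=True,
                        labels={"soap": 0, "water": 1, "soapdish": 2}),
    "color/intensity": Category(minimum=0, maximum=255),
}


def check(category: Category, values: List[float]) -> bool:
    """Return True if the values are permitted for the category."""
    if len(values) != 1 and not category.multivalue:
        return False                       # single-valued category
    for v in values:
        if category.minimum is not None and v < category.minimum:
            return False
        if category.maximum is not None and v > category.maximum:
            return False
        if category.label_only and v not in category.labels.values():
            return False                   # only the named values are legal
    return True


print(check(GCF["suds"], [0.5]))       # within the 0.0 to 1.0 scale
print(check(GCF["density"], [-40.0]))  # no bounds declared for this category
print(check(GCF["subject"], [0, 2]))   # multivalue: any combination of 0, 1, 2
print(check(GCF["subject"], [3]))      # 3 is not a declared label value
```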
A label list is used to transmit a set of PICS labels. The following is a label list for two documents rated using the above rating system.
(PICS-1.1 "http://www.gcf.org/v2.5"
  by "John Doe"
  labels on "1994.11.05T08:15-0500"
         until "1995.12.31T23:59-0000"
         for "http://www.w3.org/PICS/Overview.html"
         ratings (suds 0.5 density 0 color/hue 1)
         for "http://www.w3.org/PICS/Underview.html"
         by "Jane Doe"
         ratings (subject 2 density 1 color/hue 1))

PICS-NG (Next Generation) was a W3C effort based on the observation that the PICS infrastructure could be generalized to support arbitrary Web metadata, with PICS categories serving as metadata attributes, having meanings defined by the rating system. The W3C paper Catalogs: Resource Description and Discovery <http://www.w3.org/pub/WWW/Search/catalogs.html> also observes that the structure of a PICS label is similar to:
PICS illustrates a number of important ideas in data modeling and metadata representation. One such idea is the definition of specific required data items (e.g., category, label) having predefined meanings in the model. Such specifications are important in supporting interoperability among applications that use PICS ratings. PICS also illustrates the use of metalevel pointers. The URLs that identify rating services and rating systems in PICS point to information that describes PICS metadata (i.e., to metametadata). These illustrate the idea that a given piece of data on the Web, no matter what its intended purpose (e.g., whether it is intended to represent data or metadata), can itself point to (or be related in some other way to) data that can be used to help interpret it. Finally, PICS illustrates the use of a metalevel (or reflective) architecture. PICS requires that ordinary requests for data on the Web be interrupted or intercepted, so that rating information about the requested resource can be retrieved, and a decision made about whether to return the requested data or not. This same basic idea can be used to enhance individual requests with other types of additional processing, often transparently to users. For example, such processing could be used to bracket a collection of individual requests to form a database-like transaction, by adding interactions with a transaction processor to these requests. Examples of such processing are described in [CM93, Man93, SW96]. These same ideas are the basis for current OBJS work on an Intermediary Architecture <http://www.objs.com/workshops/ws9801/papers/paper103.html> for the Web.
As illustrated by the existence of a PICS-NG effort, PICS itself requires extensions to deal with more general metadata requirements. Some of these are described further in the discussion of the Resource Description Framework (Section 2.2.6). In addition, in order to provide a complete Web object model, PICS and similar ideas must be augmented with an API providing applications with easy access to the state, and with mechanisms to link code to the state represented using models such as PICS. These aspects will be discussed in subsequent sections.
For example, an XML document might contain a "book" element which lexically contains an "author" element and a "title" element. An XML-Data schema can describe such syntax. However, in another context, it may simply be necessary to represent more abstractly that books have titles and authors, irrespective of any syntax. XML-Data schemas can also describe such conceptual relationships. Further, the information about books, titles and authors might be stored in a relational database, in which case XML-Data schemas can describe the database row types and key relationships.
One immediate implication of the ideas in XML-Data is that, using XML-Data, XML document types can be described using XML itself, rather than DTD syntax. Another is that XML-Data schemas provide a common vocabulary for ideas which overlap between syntactic, database and conceptual schemas. All features can be used together as appropriate.
Schemas in XML-Data are composed principally of declarations for:
Some data:
<?xml:namespace name="http://company.com/schemas/books/" as="bk"?>
<?xml:namespace name="http://www.ecom.org/schemas/dc/" as="ecom" ?>
<bk:booksAndAuthors>
  <Person>
    <name>Henry Ford</name>
    <birthday>1863</birthday>
  </Person>
  <Person>
    <name>Harvey S. Firestone</name>
  </Person>
  <Person>
    <name>Samuel Crowther</name>
  </Person>
  <Book>
    <author>Henry Ford</author>
    <author>Samuel Crowther</author>
    <title>My Life and Work</title>
  </Book>
  <Book>
    <author>Harvey S. Firestone</author>
    <author>Samuel Crowther</author>
    <title>Men and Rubber</title>
    <ecom:price>23.95</ecom:price>
  </Book>
</bk:booksAndAuthors>

The schema for http://company.com/schemas/books:
<?xml:namespace name="urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882/" as="s"?>
<?xml:namespace href="http://www.ecom.org/schemas/ecom/" as="ecom" ?>
<s:schema>
  <elementType id="name">
    <string/>
  </elementType>
  <elementType id="birthday">
    <string/>
    <dataType dt="date.ISO8601"/>
  </elementType>
  <elementType id="Person">
    <element type="#name" id="p1"/>
    <element type="#birthday" occurs="OPTIONAL">
      <min>1700-01-01</min><max>2100-01-01</max>
    </element>
    <key id="k1"><keyPart href="#p1" /></key>
  </elementType>
  <elementType id="author">
    <string/>
    <domain type="#Book"/>
    <foreignKey range="#Person" key="#k1"/>
  </elementType>
  <elementType id="writtenWork">
    <element type="#author" occurs="ONEORMORE"/>
  </elementType>
  <elementType id="Book" >
    <genus type="#writtenWork"/>
    <superType href="http://www.ecom.org/schemas/ecom/commercialItem"/>
    <superType href="http://www.ecom.org/schemas/ecom/inventoryItem"/>
    <group groupOrder="SEQ" occurs="OPTIONAL">
      <element type="#preface"/>
      <element type="#introduction"/>
    </group>
    <element href="http://www.ecom.org/schemas/ecom/price"/>
    <element href="ecom:quantityOnHand"/>
  </elementType>
  <elementTypeEquivalent id="livre" type="#Book"/>
  <elementTypeEquivalent id="auteur" type="#author"/>
</s:schema>

While this example does not illustrate all of the capabilities of XML-Data, it does illustrate the capabilities of declaring such things as:
XML-Data is another example of a higher-level model built using XML as its representation. It is not yet clear how the overlap in metadata capabilities between such representations as DTDs, RDF, and XML-Data will work out. The XML-Data approach may prove to be better than DTDs in supporting some types of processing, such as database-like operations, since it makes no distinctions between data and metadata representations. Like the other data models described in this section, XML-Data is not sufficient to form a complete Web object model. In particular, it requires integration with an API facility and a mechanism to access associated code.
MCF is essentially a structure description language. The basic information structure used is the Directed Labeled Graph (DLG). An MCF database is a set of DLGs, consisting of:
Each label/property type, such as pageSize, is a node (but not all nodes are property types). Since labels are nodes, they can participate in relationships that, e.g., define their semantics. For example, a pageSize node could have properties that specify its domain (e.g., Document), its range (sizeInBytes), that a Document has only one pageSize, and that provide human-readable documentation of the intended semantics.
An MCF node can be either a primitive data type or a "Unit". The primitive data types are the same as the Java primitive types. In addition, a DATE type should be supported by the low-level MCF machinery. The concept of a "Unit" corresponds loosely to the Java concept of "Object".
MCF defines a small set of units with predefined semantics in order to "bootstrap" the type system. These include, among others:
Like PICS, MCF illustrates a number of important ideas in data modeling and metadata representation. For example, MCF illustrates both the use of specific required data items having predefined meanings in the model, and metalevel pointers. Unlike PICS, MCF represents a data model that can be used for more general purposes than content labeling. For example, it includes a type hierarchy, a richer set of base types, and other aspects of a full data model. In addition to required data items representing aspects of the model structure, the MCF reference identifies a list of suggestions for standard application-specific item names borrowed from the Dublin Core and elsewhere. MCF "units" are similar to the individual elements of the OEM model. Many MCF concepts have been incorporated into W3C's RDF (described in the next section). However, as noted in connection with other models in this section, these concepts must be combined with an API and a mechanism for integrating behavior to provide full object model support.
The basis of RDF is a model for representing named properties and their values. These properties serve both to represent attributes of resources (and in this sense correspond to attribute/value pairs) and to represent relationships between resources. The RDF data model is a syntax-independent way of representing RDF statements.
The core RDF data model is defined in terms of:
In this data model both the resources being described and the values describing them are nodes in a directed labeled graph (values may themselves be resources). The arcs connecting pairs of nodes correspond to the names of the property types. This is represented pictorially as:
[resource R] ---propertyType P---> [value V]

and can be read "V is the value of the property P for resource R" or, left-to-right, "R has property P with value V". For example, the statement "John Smith is the Author of the Web page http://www.bar.com/some.doc" would be represented as:
[http://www.bar.com/some.doc] ---author---> "John Smith"

where the notation [URI] denotes the instance of the resource identified by URI and "..." denotes a simple Unicode string.
According to the above definition, the property "author", i.e., the arc labeled "author" plus its source and target nodes, is the triple (3-tuple):
{author, [http://www.bar.com/some.doc], "John Smith"}

where "author" denotes a node used for labeling this arc. The triple composed of a resource, a property type, and a value is an RDF statement.
A collection of these triples with the same second item is called an "assertions" (the plural form is used here as a singular noun, naming a block of statements about one resource). Assertions are particularly useful when describing a number of properties of the same resource. Assertions are diagramed as follows:
[resource R]-+---property P1----> [value Vp1]
             |
             +---property P2----> [value Vp2]

An RDF assertions can be a resource itself and can therefore be described by properties; that is, an assertions can itself be used as the source node of an arc. The name assertions is suggestive of the fact that the properties specified in it are effectively (logical) assertions about the resource being described. This establishes a relationship between RDF and a logic-based interpretation of the data structure which will be further developed in Section 3.
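The triple and assertions structures described above map directly onto ordinary data structures. The sketch below is an illustration only: the `Triple` type and `group_assertions` function are hypothetical names, and the `title` property and its value are invented sample data added alongside the "author" example from the text.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    property_type: str   # first item: the arc label
    resource: str        # second item: the node being described
    value: str           # third item: a string or another resource


def group_assertions(triples: List[Triple]) -> Dict[str, Dict[str, List[str]]]:
    """Collect triples that share the same second item (the resource)."""
    grouped: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for t in triples:
        grouped[t.resource][t.property_type].append(t.value)
    return {resource: dict(props) for resource, props in grouped.items()}


statements = [
    Triple("author", "http://www.bar.com/some.doc", "John Smith"),
    Triple("title", "http://www.bar.com/some.doc", "Some Document"),  # invented
]

for resource, properties in group_assertions(statements).items():
    print(resource, properties)
```

Storing each property's values in a list leaves room for a property that occurs more than once on the same resource.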
Assertions may be associated with the resource they describe in one of four ways:
The set of properties in a given assertions, as well as any characteristics or restrictions of the property values themselves, are defined by one or more schemas. Schemas are identified by a URL. An assertions may contain properties from more than one schema. RDF uses the XML namespace mechanism to associate the schema with the properties in the assertions. The schema URL may be treated merely as an identifier, or it may refer to a machine-readable description of the schema. By definition, an application that understands a particular schema used by an assertions understands the semantics of each of the contained properties. An application that has no knowledge of a particular schema will minimally be able to parse the assertions into the property and property value components, and will be able to transport the assertions intact (e.g., to a cache or to another application).
A human- or machine-readable description of an RDF schema may be accessed through content negotiation by dereferencing the schema URL. If the schema is machine-readable, it may be possible for an application to dynamically learn some of the semantics of the properties named in the schema.
An RDF statement can itself be the target node of an arc (i.e. the value of some other property) or the source node of an arc (i.e. it can have properties). In these cases, the original property (i.e., the statement) must be reified; that is, converted into nodes and arcs. RDF defines a reification mechanism for doing this. Reified properties are drawn as a single node with several arcs emanating from it representing the resource, property name, and value:
[property P1]-+---PropName---> ["name"]
              |
              +---PropObj----> [resource R]
              |
              +---PropValue--> [value Vp1]

This allows RDF to be used to make statements about other statements; for example, the statement "Joe believes that the document 'The Origin of Species' was authored by Charles Darwin" would be diagramed as:
[Joe]--believes-->[stmnt1]+--InstanceOf-> RDF:Property
                          |
                          +--PropName->"author"
                          |
                          +--PropObj->[http://loc.gov/Books/Species]
                          |
                          +--PropValue->"Charles Darwin"

To help in reifying properties, RDF defines the InstanceOf relation (property) to provide primitive typing, as shown in the example.
To reify a property, an additional node (with a generated label) is added to the data model, together with three triples whose first items (arc labels) are the predefined names RDF:PropName, RDF:PropObj, and RDF:PropValue, whose second item is the generated node label, and whose third items are the corresponding property type, resource node, and value node, respectively. In the above example, the three added triples would be:
{PropName, stmnt1, "author"}
{PropObj, stmnt1, [http://loc.gov/Books/Species]}
{PropValue, stmnt1, "Charles Darwin"}

(The use of the "RDF:" prefix in names illustrates the use of the XML namespace mechanism to qualify names to indicate the schema in which they are defined.)
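The reification step is mechanical enough to sketch in code. The function below is a hypothetical helper, not an RDF API: it takes a statement as a (property type, resource, value) tuple and a caller-supplied node label, and returns the triples that describe the statement, including the InstanceOf typing arc shown in the diagram.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]   # (property type, resource, value)


def reify(statement: Triple, node: str) -> List[Triple]:
    """Return the triples that represent `statement` as the node `node`."""
    property_type, resource, value = statement
    return [
        ("RDF:InstanceOf", node, "RDF:Property"),
        ("RDF:PropName", node, property_type),
        ("RDF:PropObj", node, resource),
        ("RDF:PropValue", node, value),
    ]


# "Joe believes that 'The Origin of Species' was authored by Charles Darwin"
darwin = ("author", "http://loc.gov/Books/Species", "Charles Darwin")
model = reify(darwin, "stmnt1")
model.append(("believes", "Joe", "stmnt1"))   # a statement about the statement

for triple in model:
    print(triple)
```

Because the reified statement is just another node, the "believes" arc can point at it exactly as it would at any other resource.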
Frequently it is necessary to create a collection of nodes; e.g. to state that a property has multiple values. RDF defines three kinds of collections: ordered lists of nodes, called sequences, unordered lists of nodes, called bags, and lists that represent alternatives for the (single) value of a property, called alternatives. To create collections of nodes, a new node is created that is an RDF:InstanceOf one of the three node types RDF:Seq, RDF:Bag, or RDF:Alternatives. The remaining arcs from that new node point to each of the members of the collection and are uniquely labeled using the elements from Ord. For the RDF:Alternatives, there must be at least one member whose arc label is RDF:1, and that is the default value for the Alternatives node.
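A collection node can be built the same way as any other set of triples. The sketch below assumes that the ordinal arc labels are written RDF:1, RDF:2, and so on, which is consistent with the RDF:1 label mentioned above but is otherwise an assumption; `make_collection` and the node label `authors_001` are hypothetical names.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]   # (property type, resource, value)

COLLECTION_TYPES = ("RDF:Seq", "RDF:Bag", "RDF:Alternatives")


def make_collection(node: str, kind: str, members: List[str]) -> List[Triple]:
    """Return the triples for a collection node holding `members`."""
    if kind not in COLLECTION_TYPES:
        raise ValueError(f"unknown collection type: {kind}")
    if kind == "RDF:Alternatives" and not members:
        # An Alternatives node needs an RDF:1 member to act as its default.
        raise ValueError("RDF:Alternatives requires at least one member")
    triples = [("RDF:InstanceOf", node, kind)]
    # Each member hangs off the collection node on a uniquely numbered arc.
    triples += [(f"RDF:{i}", node, member)
                for i, member in enumerate(members, start=1)]
    return triples


authors = make_collection("authors_001", "RDF:Seq",
                          ["Henry Ford", "Samuel Crowther"])
for triple in authors:
    print(triple)
```

The property whose value has several members then points at the collection node (here `authors_001`) rather than at each member individually.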
The RDF data model provides an abstract, conceptual framework for defining and using metadata. A concrete syntax is also needed for the purpose of authoring and exchanging this metadata. The syntax does not add to the model, and APIs could be provided to manipulate RDF metadata without reference to a concrete syntax. RDF uses XML encoding as its syntax. However, RDF does not require an XML DTD for the contents of assertion blocks (and RDF schemas are not required to be XML DTDs). In this respect, RDF requires at most that its XML representations be well-formed.
RDF defines several XML elements for its XML encoding. The RDF:serialization element is a simple wrapper that marks the boundaries in an XML document, where the content is explicitly intended to be mappable into an RDF data model instance. RDF:assertions and RDF:resource contain the remaining elements that instantiate properties in the model instance. Each XML element E contained by an RDF:assertions or an RDF:resource results in the creation of a property (a triple that is an element of the formal set T defined earlier).
With these basic principles defined, directed graph models of arbitrary complexity can be constructed and exchanged. A simple example would be "John Smith is the Author of the document whose URL is http://www.bar.com/some.doc" (all these examples are taken from the RDF paper cited above, but updated to use more recent XML namespace syntax). This assertion can be modeled with the directed graph:
[http://www.bar.com/some.doc] ---bib:author---> "John Smith"

(This report uses a notation where nodes are represented by items in square brackets, arcs are represented as arrows, and strings are represented by quoted items.) This small graph can be exchanged in the serialization syntax as:
<?xml:namespace name="http://docs.r.us.com/bibliography-info" as="bib"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
  <RDF:assertions href="http://www.bar.com/some.doc">
    <bib:author>John Smith</bib:author>
  </RDF:assertions>
</RDF:serialization>

This example illustrates how the resource, property name, and value are translated into XML.
A more elaborate model could be created in order to say additional things about John Smith, such as his contact information, as in the model:
[http://www.bar.com/some.doc]
              |
          bib:author
              |
              V
        [John Smith]-+---bib:name----> "John Smith"
                     |
                     +---bib:email----> "john@smith.com"
                     |
                     +---bib:phone----> "+1 (555) 123-4567"

which could be exchanged using the XML serialization representation:
<?xml:namespace name="http://docs.r.us.com/bibliography-info" as="bib"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
  <RDF:assertions href="http://www.bar.com/some.doc">
    <bib:author>
      <RDF:resource>
        <bib:name>John Smith</bib:name>
        <bib:email>john@smith.com</bib:email>
        <bib:phone>+1 (555) 123-4567</bib:phone>
      </RDF:resource>
    </bib:author>
  </RDF:assertions>
</RDF:serialization>

The serialization above is equivalent to this second serialization:
<?xml:namespace name="http://docs.r.us.com/bibliography-info" as="bib"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
  <RDF:assertions href="http://www.bar.com/some.doc">
    <bib:author href="#John_Smith"/>
  </RDF:assertions>
</RDF:serialization>
<RDF:resource id="John_Smith">
  <bib:name>John Smith</bib:name>
  <bib:email>john@smith.com</bib:email>
  <bib:phone>+1 (555) 123-4567</bib:phone>
</RDF:resource>

In these representations, the RDF:resource element creates an in-line resource. Typically such a resource will be a surrogate, or proxy, for some other real resource that does not have a recognizable URI. The id= attribute in the second representation provides a name for the resource element so that the resource may be referred to elsewhere.
As an example of making a statement about a statement, consider the case of computing a digital signature on an RDF assertion. (It is assumed that the signature is computed over a concrete XML representation of the assertion rather than over an internal representation. The figure below shows a box containing a small graph. This is a convention to indicate that the XML content whose ID is foo is a concrete representation of the graph it contains.) What is to be specified in the model is expressed by the pair of graphs below - that there is an XML encoding of some assertion, and that there is some other XML content that is a digital signature over that encoding.
+---------------------------------------------------------------+
| ID=foo                                                        |
|                                                               |
| [http://www.bar.com/some.doc] ---DC:creator---> "John Smith"  |
|                                                               |
+---------------------------------------------------------------+

[foo]------DSIG:Signature------>"AKGJOERGHJWEJ348GH4GHEIGH4ROI4"
The details could be expressed in the model below:

         "AKGJOERGHJWEJ348GH4GHEIGH4ROI4"<--RDF:PropValue----+
                                                             |
                    [DSIG:Signature]<----RDF:PropName-----+  |
                                                          |  |
  +--RDF:InstanceOf-->[RDF:Property]<--RDF:InstanceOf--+  |  |
  |                                                    |  |  |
[foo]<------------------RDF:PropObj-------------------[prop-001]
  |
  +---------------------------------+----------------+
  |                                 |                |
RDF:PropObj                    RDF:PropName    RDF:PropValue
  |                                 |                |
  V                                 V                V
[http://www.bar.com/some.doc] ---DC:creator---> "John Smith"

These models could also be expressed as:
<?xml:namespace name="http://purl.org/DublinCore/RDFschema" as="DC"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<?xml:namespace name="http://www.w3.org/schemas/DSig-schema" as="DSIG"?>
<RDF:serialization>
  <RDF:assertions href="http://www.bar.com/some.doc" id="foo">
    <DC:Creator>John Smith</DC:Creator>
  </RDF:assertions>
  <RDF:assertions href="#foo">
    <DSIG:Signature>AKGJOERGHJWEJ348GH4GHEIGH4ROI4</DSIG:Signature>
  </RDF:assertions>
</RDF:serialization>

(Note that node labels such as "RDF:Property" are shorthand for a full URI such as "http://www.w3.org/schemas/rdf-schema#Property").
The RDF data model intrinsically only supports binary relations. However, higher arity relations can also be represented, using just binary relations. As an example, consider the subject of one of John Smith's recent articles - library science. The Dewey Decimal Code for library science could be used to categorize that article. While the numeric code is the true Dewey value, few people can understand those codes. Therefore, the description of the Dewey categories has been translated into several different languages. In fact, Dewey Decimal codes are far from the only subject categorization scheme. So, it might be desirable to define a "Subject" node that not only specified the subject of a paper, but also indicated the language and categorization scheme it came from. That might look like:
[http://www.webnuts.net/Jan97.html]
                |
            DC:subject
                |
                V
          [subject_001]-+---DC:scheme----> "Dewey Decimal Code"
                        |
                        +---DC:lang----> "English"
                        |
                        +---RDF:PropValue----> "020 - Library Science"

which could be exchanged as:
<?xml:namespace name="http://purl.org/DublinCore/RDFschema" as="DC"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
  <RDF:assertions href="http://www.webnuts.net/Jan97.html">
    <DC:subject>
      <RDF:resource id="subject_001">
        <DC:scheme>Dewey Decimal Code</DC:scheme>
        <DC:lang>English</DC:lang>
        <RDF:PropValue>020 - Library Science</RDF:PropValue>
      </RDF:resource>
    </DC:subject>
  </RDF:assertions>
</RDF:serialization>

A common use of this higher-arity capability is when dealing with units of measure. A person's weight is not just a number like 94, it also requires specification of the units on that number. In this case either pounds or kilograms might be used. A relationship with an additional arc might be used to record the fact that John Smith is a rather strapping gentleman:
                                         +--NIST:units--> "pounds"
                                         |
[John Smith]--NIST:weight-->[weight_001]-+
                                         |
                                         +--RDF:PropValue--> "200"

which can be exchanged as:
<?xml:namespace name="http://www.nist.gov/RDFschema" as="NIST"?>
<?xml:namespace name="http://www.w3.org/schemas/rdf-schema" as="RDF"?>
<RDF:serialization>
  <RDF:assertions href="John_Smith">
    <NIST:weight>
      <RDF:resource id="weight_001">
        <NIST:units href="#pounds"/>
        <RDF:PropValue>200</RDF:PropValue>
      </RDF:resource>
    </NIST:weight>
  </RDF:assertions>
</RDF:serialization>

assuming the node "pounds" was defined elsewhere.
The RDF effort is attempting to define a very general abstract metadata architecture and associated support facilities. RDF, like MCF, illustrates how a higher level model can be used together with XML to support specific types of application requirements, and illustrates a number of the same metadata modeling ideas as MCF. The RDF examples above specifically illustrate a requirement for metalevel pointers to explicitly link tags to attribute definitions (by an explicit pointer, not by looking up the name in a dictionary). The more powerful facilities of XML for defining hyperlinks will improve the ability to define very general relationships between data and metadata that describes (and can help interpret) it. For example, the advanced XML linking facilities defined in XLL would allow assertions to refer to parts of referenced documents. It seems likely that RDF will also investigate mechanisms to automatically provide access to RDF metadata at runtime (implementing the various association modes such as along-with), similar to the mechanisms provided by PICS for content labels. In implementing a Web object model, these techniques will be required to gain access to the object methods (which may be either embedded in the Web page, or located as separate resources).
Because of its generality in representing metadata, and the likelihood that it will be the basis of future Web developments in representing metadata, the Web object model described in Section 3 uses RDF (and its XML representation) as part of its structural base (although RDF is currently incomplete, and will be developed further). Additional aspects of MCF may be used as well, depending on more detailed analysis to be performed later. Section 3 will describe further decisions about the nature of the object model, based on RDF as a starting point.
However, RDF and MCF themselves are not sufficient to support all requirements of a Web object model. For example, the object model requires an API to its state representation, and thus RDF and MCF must be integrated with parallel work on a Document Object Model (see below), which is not currently the case. Also, mechanisms for linking code to RDF and MCF structures must be further developed. Finally, structured database capabilities do not exist for these structures, and must be worked out.
This section describes several mechanisms developed within the Web community for defining relationships between state and code, and for providing an API to state (the second and third bullets above). Specifically, the techniques developed for embedding objects and scripts in Web documents represent one way of associating behavior with the state represented by a Web document. The W3C's Document Object Model (DOM) effort represents another way of addressing this issue, as well as the issue of providing an API to this state. These two issues are closely related.
A program must gain access to data in order to process it, and so an object method must have access to the object's state. It is always possible to pass data as a value to a program. However, the program must understand the structure of this data in order to access it efficiently. Conventional object models provide what is in effect a special API for object methods to use when accessing state for this purpose. This is also necessary in a Web object model. The need for such an API becomes especially important when the state has a rich, complex structure, such as an XML document. Without an API to this state (and its implementation), each program would have to implement a considerable amount of code simply to parse the structure, in order to locate the parts of the document required for specific purposes. An API providing access to the various parts of a document, together with an implementation of this API as part of the general representation of this state's "data type", provides this code as a pre-existing component, allowing the program to concentrate on application-related processing. The DOM provides such an API. At the same time, it provides part of a general mechanism (albeit a very unconstrained one) for linking code and state, since it provides a straightforward mechanism for code (currently, programs such as plug-ins or external applications) to access the state it needs.
Finally, the Web Interface Definition Language (described in Section 2.3.3) is commercial technology that represents another mechanism for providing an API to state (as well as to Web-based services).
DOM is a generalization of Dynamic HTML facilities defined by Microsoft and Netscape. Functionality equivalent to the Dynamic HTML support provided by Netscape Navigator 3.0 and Microsoft Internet Explorer 3.0 is referred to as "DOM level 0". DOM level 1 extends these capabilities to, for example, allow creation "from scratch" of entire Web documents in memory by creating the appropriate objects. The DOM Working Draft specification <http://www.w3.org/TR/WD-DOM> includes level 1 Core specifications which apply to both HTML and XML documents, and level 1 specializations for HTML and XML documents. The DOM object class definitions in these specifications have their interfaces defined using OMG IDL. Java interface specifications are also defined (see the specifications for details).
DOM represents a document as a hierarchy of objects, called nodes, which are derived (by parsing) from a source representation of the document (HTML or XML). The DOM object classes represent generic components of a document, and hence define a document object metamodel. The DOM Level 1 working draft defines a set of object classes (and their inheritance relationships) for representing documents. The major classes are:
    Node
     |
     +--Document
     |    |
     |    +--HTMLDocument
     |
     +--Element
     |    |
     |    +--HTMLElement
     |         |
     |         +--specific HTML elements
     |
     +--Attribute
     |
     +--Text
     |
     +--PI [Processing Instruction, an XML concept from SGML]
     |
     +--Comment

The Node object is the base type for all objects in the DOM. It may have an arbitrary number (including zero) of sequentially-ordered child nodes. It usually has a parent Node, the exception being that the root Node in a document tree has no parent.
Element objects represent the elements in HTML and XML documents. Elements contain, as child nodes, all of the content between the start tag and the corresponding end tag of an element. Aside from Text nodes, the vast majority of node types that applications will encounter when traversing a document structure will be Element nodes. Element objects also have a list of Attribute objects which represent the set of attributes explicitly defined as part of the element, and those defined in the DTD that have default values.
Text objects are used to represent any non-markup values, whether the values are intended to represent an integer, date, or some other type of value. For XML documents, all whitespace between markup results in Text objects being created.
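The effect of whitespace on the node tree can be seen in a small sketch. It uses Python's standard xml.dom.minidom module, which follows the later W3C DOM Recommendation rather than the working draft described here, so names such as childNodes and data belong to that module and not to the draft. The XML fragment is invented for illustration.

```python
from xml.dom import minidom

# Invented fragment; the line breaks and indentation are deliberate.
SOURCE = "<author>\n  <name>Robert Roberts</name>\n</author>"

def child_summary(xml_text):
    """Return (node kind, content) pairs for the children of the root element."""
    root = minidom.parseString(xml_text).documentElement
    summary = []
    for child in root.childNodes:
        if child.nodeType == child.TEXT_NODE:
            summary.append(("Text", child.data))
        elif child.nodeType == child.ELEMENT_NODE:
            summary.append(("Element", child.tagName))
    return summary

print(child_summary(SOURCE))
```

The whitespace before and after the name element shows up as two separate Text nodes, so an application that only wants Element children has to filter on the node type.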
The Document object is the root node of a document object tree, and represents the entire HTML or XML document. The HTMLDocument subtype represents a specialization of the generic Document type for the specific requirements of HTML documents.
Additional object classes are defined in the working draft for representing XML Document Type Definitions, and auxiliary data structures (e.g., lists of nodes).
Normally, a DOM-compliant implementation will make the main Document instance available to the application through some implementation-defined mechanism. For example, a typical implementation would give the application a reference to a DocumentContext object. This object describes the source of the document, as well as related information such as the date and time the document was last changed. From the DocumentContext, the application may access the Document object, which is the root of the document object hierarchy. From the Document object, the application can use the methods provided for accessing individual nodes, selection of specific node types (such as all images), and so on. For XML documents, the DTD is available through the documentType method (which returns null for HTML documents and XML documents without DTDs). Document also defines a getElementsByTagName method. This produces an enumerator that iterates over all Element nodes within the document whose tagName matches the input name provided. (The DOM working draft indicates that a future version of the DOM will provide a more generalized querying mechanism for nodes).
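As a rough illustration of this style of access, the sketch below uses Python's xml.dom.minidom, an implementation of the later DOM Recommendation. In that module getElementsByTagName returns a list-like object rather than an enumerator, and the document type is exposed as a doctype attribute rather than a documentType method; these differences from the working draft are properties of the module, not of the draft. The sample data is invented apart from the name "Robert Roberts".

```python
from xml.dom import minidom

# Invented sample data, written without whitespace between elements.
SAMPLE = (
    "<mydata><authors>"
    "<author><name>Robert Roberts</name></author>"
    "<author><name>Tom Thomas</name></author>"
    "</authors></mydata>"
)

def text_of_elements(xml_text, tag_name):
    """Return the text content of every element whose tag name matches."""
    document = minidom.parseString(xml_text)
    return [element.firstChild.data
            for element in document.getElementsByTagName(tag_name)]

def has_dtd(xml_text):
    """True when the parsed document carries a document type declaration."""
    return minidom.parseString(xml_text).doctype is not None

print(text_of_elements(SAMPLE, "name"))
print(has_dtd(SAMPLE))
```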
As an example generally illustrating how an XML document might be presented to an application in the DOM, consider the example described in Section 2.1.4 of a simple relational database represented in XML. The DOM for XML would present the XML document to an application as a collection (actually, a tree) of objects. Most of these objects would be of type Node, and specifically of its subtypes Element (representing the individual elements) and Text (representing the content). More precisely:
    <!doctype mydata "http://www.w3.org/mydata">
    <mydata> ... </mydata>

(the outer markup) would be presented as an object of type Document (a subtype of Node). The children of this node would be objects representing the Table elements (and, indirectly, their contained rows and fields). Type Node provides a method getChildren() to access the children. The table delimited by

    <authors> ... </authors>

would be presented as an object of type Element (another subtype of Node) representing the Authors table. Type Element provides a method getTagName() to provide access to the actual tag name (authors in this case). The children of this node would be objects representing Row elements of type Author (and, indirectly, the contained fields). Similarly,

    <editors> ... </editors>

would be presented as another object of type Element representing the Editors table.

Each element delimited by

    <author> ... </author>

would be presented as an object of type Element representing a particular Author row. The children of this node would be objects representing the fields contained in the row. Elements delimited by

    <editor> ... </editor>

would similarly be presented as objects of type Element representing Editor rows.

Fields would similarly be presented as Element objects. For example, each element delimited by

    <name> ... </name>

would be presented as an object of type Element representing that particular field. Each of these elements would have a child node of type Text (Text is not a subtype of Element) representing the text value of the field (e.g., "Robert Roberts"). The data() method of the Text object type returns the actual string representation. In this case, this would end the nesting.
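The traversal just described can be sketched with Python's xml.dom.minidom. This is only an approximation of the working draft: the module spells the accessors childNodes, tagName and data rather than getChildren(), getTagName() and data(). The sample document is an invented stand-in for the Section 2.1.4 example, written without whitespace between elements so that every child of a table or row is an Element.

```python
from xml.dom import minidom

# Invented stand-in for the relational example; the editor name is hypothetical.
SOURCE = (
    "<mydata>"
    "<authors><author><name>Robert Roberts</name></author></authors>"
    "<editors><editor><name>Mitch Mitchell</name></editor></editors>"
    "</mydata>"
)

def tables_to_rows(xml_text):
    """Map each table tag name to a list of rows (dicts of field name -> text)."""
    root = minidom.parseString(xml_text).documentElement
    tables = {}
    for table in root.childNodes:            # Element: authors, editors
        rows = []
        for row in table.childNodes:         # Element: author, editor
            fields = {}
            for field in row.childNodes:     # Element: name
                fields[field.tagName] = field.firstChild.data  # Text child
            rows.append(fields)
        tables[table.tagName] = rows
    return tables

print(tables_to_rows(SOURCE))
```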
The representation of a Web page in terms of objects makes it easy to associate code with the various subcomponents of the page. The DOM requirements also identify the need for an event model, to provide a way to schedule the execution of the code associated with particular parts of a Web page at appropriate times. This event model (not yet specified) would extend the current event capabilities provided by most Web clients. The requirements specify that:
The DOM remains under development, and further work is required to integrate it both with other Web technology developments, and with capabilities required to provide full Web object model support. For example, SGML's DSSSL (described briefly in the XML section) defines a very general object model for SGML documents, called groves, which resembles the DOM to some extent. Groves are intended to provide a runtime object model for use while processing SGML documents. However, it is not clear to what extent DOM and grove capabilities will be integrated. Groves are extremely general (e.g., using groves it is possible to define each character in a document as a separate element), and it is not clear that the same level of generality is required for DOM. Moreover, groves define an object model for static documents. DOM, on the other hand, is designed to deal with dynamic documents, which can be modified by processing applications (via the DOM interface) at runtime. However, the XML stylesheet proposals are based to some extent on DSSSL (and hence presumably on the use of some aspects of groves). Another interesting aspect of this integration is that DSSSL defines a query language called SDQL for accessing parts of SGML documents for use in stylesheet processing. The provision of a query language (or aspects of one) for XML would provide an important base for the development of full-fledged database-like processing capabilities for Web documents represented in XML. This issue is being explored further in a companion OBJS technical report in progress.
The DOM defines its API at a generic level, i.e., at the level of components of a document metamodel. Additional work would be required to define "application level" object interfaces. For example, in the relational database example defined above, DOM provides objects of types node, element, and so on, rather than objects of type author or editor (or even objects of type table or row). Using DOM, an application could effectively create such types from the information given, but it would have to "know what to look for", and would have to traverse the various element objects to find that information. It would be desirable to have a capability for creating DOM-like, but application-oriented, APIs. This could involve using additional metadata (e.g., the DTD, or an XML-Data-like schema) to generate a default API automatically (which the document's author could then customize). It might then be possible to attach specific methods to this API to define application-specific object behavior. An integration of DOM and the embedded OBJECT elements described below would be one way to support this. This would effectively permit the creation of objects in the classic object-oriented programming sense.
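One way to picture such an application-oriented API is a thin wrapper over the generic element objects. The sketch below is hypothetical: the Row class and author_rows helper are our own names, not part of DOM or of any proposal, and the wrapper is written by hand rather than generated from a DTD or schema. It uses Python's xml.dom.minidom for the underlying generic objects.

```python
from xml.dom import minidom

class Row:
    """Hypothetical application-level view of one row Element."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, field_name):
        # Invoked only when normal attribute lookup fails: treat the
        # attribute name as the tag name of a field inside this row.
        matches = self._element.getElementsByTagName(field_name)
        if not matches:
            raise AttributeError(field_name)
        return matches[0].firstChild.data

def author_rows(document, row_tag):
    """Wrap every element with the given tag name as a Row."""
    return [Row(element) for element in document.getElementsByTagName(row_tag)]

# Invented sample data.
DOCUMENT = minidom.parseString(
    "<authors><author><name>Robert Roberts</name></author></authors>"
)
print(author_rows(DOCUMENT, "author")[0].name)
```

The application now writes row.name instead of walking child nodes, but the wrapper still has to "know what to look for"; generating it from metadata would remove that hand-written knowledge.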
The DOM work also needs to be integrated with the work on higher-level models described in Section 2.2. One effect of this would be to provide a way to add object behavior to documents without the need for references to the associated programs to be embedded in the page, as with OBJECT elements. These models might also provide additional support for generating application-specific object APIs.
In the most general case, an inserted rendering mechanism specifies three types of information (although in specific cases not all this information may need to be explicitly specified):
In HTML 4.0, the OBJECT element specifies the location of a rendering mechanism and the location of data required by the rendering mechanism. This information is specified by the attributes of the OBJECT element. The PARAM element specifies a set of run-time values.
A client interprets an OBJECT element by first trying to render the mechanism specified by the element's attribute. If this cannot be done for some reason (e.g., the client is configured not to, or the client platform cannot support that mechanism), the client must try to render the element's contents. This provides a way to specify alternate object renderings, since the contents of an OBJECT element can be another OBJECT element specifying an alternative mechanism. The contents of the most deeply embedded element should be text. Data to be rendered can be supplied either inline, or from an external resource. An HTML document can be included in another document by using an OBJECT element with the data attribute specifying the file to be included.
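The fallback rule can be sketched as a short recursive function. This is a simplification under stated assumptions: the decision to render is reduced to checking the OBJECT element's type attribute against a set of supported content types, the markup is parsed as well-formed XML with Python's xml.dom.minidom, and the file names are invented. A real client would also consider its configuration, the platform, and the other OBJECT attributes.

```python
from xml.dom import minidom

# Invented nested alternatives: a movie, then a still image, then text.
MARKUP = (
    '<OBJECT data="movie.mpeg" type="video/mpeg">'
    '<OBJECT data="still.gif" type="image/gif">'
    'A text description of the movie.'
    '</OBJECT>'
    '</OBJECT>'
)

def choose_rendering(element, supported_types):
    """Return ("object", data) for the outermost renderable OBJECT,
    or ("text", contents) when no mechanism can be used."""
    if element.getAttribute("type") in supported_types:
        return ("object", element.getAttribute("data"))
    # The mechanism cannot be rendered, so fall back to the element's contents.
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE and child.tagName == "OBJECT":
            return choose_rendering(child, supported_types)
    text = "".join(child.data for child in element.childNodes
                   if child.nodeType == child.TEXT_NODE)
    return ("text", text.strip())

ROOT = minidom.parseString(MARKUP).documentElement
print(choose_rendering(ROOT, {"image/gif"}))
```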
The following simple Java applet:
    <APPLET code="AudioItem" width="15" height="15">
    <PARAM name="snd" value="Hello.au|Welcome.au">
    Java applet that plays a welcoming sound.
    </APPLET>

may be rewritten as follows using OBJECT:

    <OBJECT codetype="application/octet-stream" code="AudioItem" width="15" height="15">
    <PARAM name="snd" value="Hello.au|Welcome.au">
    Java applet that plays a welcoming sound.
    </OBJECT>

The OBJECT element includes, among others, the following attributes:
The OBJECT element is also the basis of current capabilities that link Web pages into CORBA distributed object architectures. This is done by using Java applets (referenced from OBJECT elements on Web pages) which define CORBA objects, and can interact with other CORBA objects (not necessarily written in Java) via CORBA's Internet Inter-ORB Protocol (IIOP), using an ORB contained in the Web client (Netscape Communicator supports such an ORB). This is an important capability in merging Web and object technologies, particularly the object service capabilities provided by CORBA architectures. Combining this capability with the facilities of our Web object model would provide a deeper integration of Web and object technology, and an improved ability to apply object services to Web resources. This is discussed further in Section 3.
A central feature of WIDL is that programmatic interfaces can be defined and managed for Web resources such as:
WIDL definitions provide a mapping between such Web resources and applications written in conventional programming languages such as C/C++, COBOL, Visual Basic, Java, JavaScript, etc., enabling automatic and structured Web access by compatible client programs, including mainstream business applications, desktop applications, applets, Web agents, and server-side Web programs (CGI, etc.). Using WIDL, programs can request Web data and services by making local calls to functions which encapsulate standard Web access protocols and utilize WIDL definitions to provide naming services, change management, error handling, condition processing and intelligent data binding. A browser is not required to drive Web applications. WIDL requires only that target systems be Web-enabled (there are numerous commercial products which allow existing systems to be Web-enabled).
A service defined by WIDL is equivalent to a function call in standard programming languages. At the highest level, WIDL files describe the locations (URLs) of services, input parameters to be submitted (via Get or Post methods) to each service, conditions for successful processing, and output parameters to be returned by each service. In much the same way that DCE or CORBA IDL is used to generate code fragments, or 'stubs', to be included in application development projects, WIDL provides the structure necessary for generating client code in languages such as C/C++, Java, COBOL, and Visual Basic.
Many of the features of WIDL require a capability to reliably identify and extract specific data elements from Web documents. Various mechanisms for accessing elements of HTML and/or XML documents have been defined, such as the JavaScript Page Object Model, the Document Object Model, and XML-Link. The following capabilities are desirable for accessing elements of Web documents:
The following example (from the cited reference) illustrates the use of WIDL to define a package tracking service for generic Shipping. By allowing a WIDL definition to reference a 'Template' WIDL definition, a general class of shipping services can be defined. 'FoobarShipping' is one implementation of the 'Shipping' interface.
    <WIDL NAME="genericShipping" TEMPLATE="Shipping"
          BASEURL="http://www.shipping.com" VERSION="2.0">

      <SERVICE NAME="TrackPackage" METHOD="Get"
               URL="/cgi-bin/track_package"
               INPUT="TrackInput" OUTPUT="TrackOutput" />

      <BINDING NAME="TrackInput" TYPE="INPUT">
        <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
        <VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
        <VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
      </BINDING>

      <BINDING NAME="TrackOutput" TYPE="OUTPUT">
        <CONDITION TYPE="Failure" REFERENCE="doc.title[0].text"
                   MATCH="Warning Form" REASONREF="doc.p[0].text" />
        <CONDITION TYPE="Success" REFERENCE="doc.title[0].text"
                   MATCH="Foobar Airbill:*" REASONREF="doc.p[1].value" />
        <VARIABLE NAME="disposition" TYPE="String" REFERENCE="doc.h[3].value" />
        <VARIABLE NAME="deliveredOn" TYPE="String" REFERENCE="doc.h[5].value" />
        <VARIABLE NAME="deliveredTo" TYPE="String" REFERENCE="doc.h[7].value" />
      </BINDING>

    </WIDL>

In this example, the values defined in the 'TrackInput' binding get passed via HTTP Get as name-value pairs to a service residing at 'http://www.shipping.com/cgi-bin/track_package'. Object References are used in the 'TrackOutput' binding to a) check for successful completion of the service, and b) extract data elements from the document returned by the HTTP request.
'Input' and 'Output' bindings specify the input and output variables of a particular service. Input bindings define the name-value pairs to be passed via Get or Post methods to a Web-based application. Output bindings use object references to identify and extract data elements from documents returned by HTTP requests.
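To make the role of an input binding concrete, the following sketch reads the service and 'TrackInput' parts of the definition above and assembles the Get request a generated client stub might issue. It is an assumption-laden illustration, not WIDL tooling: the function name build_request_url and the argument values are hypothetical, and the definition is parsed as ordinary XML with Python's standard library.

```python
from urllib.parse import urlencode
from xml.dom import minidom

# The service and input binding from the example; the output binding is omitted.
WIDL = """<WIDL NAME="genericShipping" TEMPLATE="Shipping"
      BASEURL="http://www.shipping.com" VERSION="2.0">
  <SERVICE NAME="TrackPackage" METHOD="Get" URL="/cgi-bin/track_package"
           INPUT="TrackInput" OUTPUT="TrackOutput" />
  <BINDING NAME="TrackInput" TYPE="INPUT">
    <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
    <VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
    <VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
  </BINDING>
</WIDL>"""

def build_request_url(widl_text, service_name, arguments):
    """Map program-level argument names to form names and build the Get URL."""
    root = minidom.parseString(widl_text).documentElement
    service = next(s for s in root.getElementsByTagName("SERVICE")
                   if s.getAttribute("NAME") == service_name)
    binding = next(b for b in root.getElementsByTagName("BINDING")
                   if b.getAttribute("NAME") == service.getAttribute("INPUT"))
    pairs = [(v.getAttribute("FORMNAME"), arguments[v.getAttribute("NAME")])
             for v in binding.getElementsByTagName("VARIABLE")]
    return (root.getAttribute("BASEURL") + service.getAttribute("URL")
            + "?" + urlencode(pairs))

# Hypothetical argument values.
ARGS = {"TrackingNum": "12345", "DestCountry": "US", "ShipDate": "1998-02-10"}
print(build_request_url(WIDL, "TrackPackage", ARGS))
```

The calling program sees only the names TrackingNum, DestCountry and ShipDate; the form names expected by the Web application stay inside the definition, which is what allows the definition to absorb changes to the underlying page.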
Conditions define 'success' and 'failure' states for output bindings, and determine whether a binding attempt should be retried in the case of a 'server busy' error. Conditions can apply to a binding as a whole, or to a specific object reference. Conditions can define error messages to be returned as the value of the service; error messages can be a literal, or can be extracted from the returned document.
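A minimal sketch of condition checking follows. The semantics are our assumption, since the text does not spell them out: a MATCH value is treated as a shell-style wildcard pattern, any matching Failure condition fails the binding, and otherwise every Success condition must match. Resolving object references against the returned page is not shown; the function receives the already-extracted values.

```python
from fnmatch import fnmatchcase

# Conditions from the TrackOutput binding, reduced to the fields used here.
CONDITIONS = [
    {"type": "Failure", "reference": "doc.title[0].text", "match": "Warning Form"},
    {"type": "Success", "reference": "doc.title[0].text", "match": "Foobar Airbill:*"},
]

def binding_succeeded(conditions, extracted):
    """Apply conditions to values already extracted from the returned page.

    `extracted` maps an object reference string to the text found there.
    Assumed semantics: a matching Failure condition fails the binding;
    otherwise every Success condition must match.
    """
    def matches(condition):
        value = extracted.get(condition["reference"], "")
        return fnmatchcase(value, condition["match"])

    if any(matches(c) for c in conditions if c["type"] == "Failure"):
        return False
    return all(matches(c) for c in conditions if c["type"] == "Success")

# Hypothetical page title.
print(binding_succeeded(CONDITIONS, {"doc.title[0].text": "Foobar Airbill: 12345"}))
```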
WIDL is another example of technology that provides an API (an object interface) to state. In addition, it supports the definition of similar interfaces to Web-based services. Facilities for defining such interfaces are helpful tools in integrating Web-based state and behavior.
While a complete description of OMG activities is outside the scope of this report, several OMG technologies address structured data representation capabilities similar to others described in Section 2, and hence are of direct interest here. Specifically, the OMG has been considering a Tagged Data Facility, and a Mediated Exchange Facility based on it, as part of its Common Facilities Architecture. The Tagged Data Facility involves the use of tagged data items to support semantics-based information exchange between applications, and also supports nesting and the ability to locate objects via tags through layers of nesting. The Mediated Exchange Facility is built on the Tagged Data Facility by adding mediator components and related services. Several submissions to OMG's Business Object Facility RFP describe such capabilities. In addition, the already-approved OMG Property Service provides similar capabilities. These OMG technologies are of interest in showing that there is a recognized need for tagged "data" representations to pass semantically-rich data structures between clients and servers within OMG's distributed object architecture, just as the representations described in Section 2.1 illustrated the need to do the same thing in the Web. However, there is not yet any coordination between these two communities in developing these facilities.
The OMG Property Service essentially provides a simple, dynamic, object-oriented interface to relatively unstructured property/value pairs. Object models (including OMG's) are generally static, in that they require an object class to have a fixed number of attributes and methods. The OMG Property Service addresses this restriction, and thus adds value to the object model. It does not specify an actual representation (this would presumably be specified using object externalization capabilities currently being developed by OMG), is not as rich as XML, and does not provide higher-level modeling capabilities such as those described in Section 2.2. However, in some respects it resembles a very simple DOM, in that it does provide an object interface to an (unspecified) representation.
The TDF requirements seem to fit the basic structural capabilities of OEM and MCF to some extent (the draft TDF RFP explicitly references OEM), in the sense that they seem to call for the ability to construct complex graph structures of relatively simple labeled nodes. However, MCF in particular goes much further than TDF in defining the basis of a rather complete object model (which is unnecessary in TDF since TDF objects are already CORBA objects). TDF also specifies some metadata-related requirements, such as dealing with namespace issues and synonyms. However, like the Property Service, TDF is not well-integrated with related Web developments. Of course, as an RFP, the TDF leaves a great deal of detail, both of technology and usage scenarios, to be supplied by specific technology proposals submitted in response. As a result, it may be possible that some technology integrating OMG and Web technology, e.g., combining XML and DOM, could be adopted in response to the TDF RFP, once it is issued.
In supporting an object model, XML pages (like HTML pages) can also be used as containers for embedded objects and object methods (e.g., Java applets).
    Object (state)                 Class object
    +---------------+              +-------------+
    | class pointer |------------->| Class data  |
    +---------------+              +-------------+
    | variable 1    |              | method 1    |
    | variable 2    |              | method 2    |
    | ...           |              | ...         |
    | variable n    |              | method m    |
    +---------------+              +-------------+

C++ implementations use similar structures. The state is a collection of programming language variables, which (usually) are not visible to anything but the methods (this is referred to as encapsulation). A typical object model has a tight coupling between the methods and state. All the structures (class objects, internal representation of methods and state, etc.) are determined by the programming language implementation, and are created together as necessary. The class (in particular, the methods it defines) defines the way the state should (and will) be interpreted within the system, and hence is a form of metadata for the state. As a result, the link between an object and its class is essentially a metadata link.
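The structure in the figure can be modeled explicitly in a few lines. The sketch below is illustrative only: the names ClassObject, Instance and send, and the account example, are hypothetical, and the model ignores inheritance and access control. It makes the class pointer visible as an ordinary reference from the state to the object that holds the methods.

```python
class ClassObject:
    """Holds class data and the methods shared by all instances."""

    def __init__(self, name, methods):
        self.name = name          # class data
        self.methods = methods    # method name -> function taking the state

class Instance:
    """Holds only state, plus a pointer to the class object."""

    def __init__(self, class_object, **variables):
        self.class_pointer = class_object
        self.variables = variables

def send(obj, method_name, *args):
    """Follow the class pointer to find the method, then apply it to the state."""
    method = obj.class_pointer.methods[method_name]
    return method(obj.variables, *args)

def deposit(state, amount):
    state["balance"] += amount
    return state["balance"]

ACCOUNT = ClassObject("Account", {"deposit": deposit})
account = Instance(ACCOUNT, balance=10)
print(send(account, "deposit", 5))
```

Because the methods are reached only through the class pointer, replacing that one reference changes how the same state is interpreted, which is the sense in which the link is a metadata link.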
Extending this idea to the Web environment, the idea is that Web pages can be considered as state, and objects can be constructed by enhancing those pages with additional metadata that allows the pages to be considered as objects in some object model. In particular, we want to enhance Web pages with metadata consisting of programs that act as object methods with respect to the "state" represented by the Web page. The resulting structure would, at a minimum, conceptually be something like:
                           +----------+
               +---------->| method 1 |
    +-------+  |           +----------+
    |  Web  |--+               ...
    |  page |--+
    +-------+  |           +----------+
               +---------->| method n |
                           +----------+

The NCITS Object Model Features Matrix [Man97] identifies many different object models, with widely differing characteristics. Different object models could also be defined for the Web. The details of the structures to be supported in a Web object model depend on the details of the object model we choose to define. For example, many object models are class-based, such as the Smalltalk and C++ models mentioned above. Choosing a class-based model for the Web would require defining separate class objects to define the various classes. Other object models are prototype-based, and do not require a class object (each object essentially defines itself). Either of these forms (plus others) could be supported by the basic mechanism we propose.
In a Web object model, some of the tight coupling that exists in programming language object models would probably be relaxed, and the connection between the state and code would be somewhat "looser". This would allow more flexibility in defining associations between programs and Web pages in the model. For example, unless special constraints prohibited such access, a user would probably be able to directly access the state (and manipulate it as well) using standard Web document viewing and creation tools, without necessarily using any associated methods (just as users today can often usefully access pages containing Java applets even when Java is inactive or unsupported on their browsers). In these cases, encapsulation would be relaxed and access to any methods related to the state would be optional.
Constructing these object model structures requires a number of "pieces" of technology, as we have already observed several times. These pieces are:
In the approach we propose, relationships between the state and the methods will be defined in either of two ways:
The two mechanisms identified above (embedded OBJECT elements and RDF resources associated with the page) potentially provide a way to access the methods when the state is accessed. In addition, a mechanism is required to invoke the code as it is needed. The OBJECT element already provides such a mechanism which can be used in some cases (for example, this is used to invoke Java applets embedded in pages). A more general mechanism would be necessary for methods defined in RDF resources. There may be a way to do this provided within a general RDF-supported metadata access mechanism (this is currently not clear, since RDF is still under development). Alternatively, it may be necessary to define this as an extension. Again, this would probably be relatively straightforward.
Many details of this technology integration must still be worked out (partially because some of the key technologies we have identified are still under development). Nevertheless, we feel that the capabilities inherent in these technologies provide the necessary support for the object model integration we propose.
The Harvest Object System (HOS) [CHHM+94] modified the Mosaic browser to include a Harvest Object Broker, allowing users to interact with remote objects via a special Harvest Object Protocol (HOP). HOS defines objects from existing files and programs by recording metadata roughly of the form:
    user-defined type name
        URL --> file data
        URL --> method (program)
        URL --> method
        URL --> method
        ...
        URL --> method

using SOIF to hold that metadata. The HOP is used for retrieving IDL information, moving object code and data, and invoking objects. A command such as GETOBJS hop://URL/some.obj (where URL/some.obj designates a file) returns the object data for some.obj along with its metadata, including a set of methods.
ANSAWeb <http://www.ansa.co.uk/ANSA/ISF/overview.html> provides a strategy for interoperability between the Web and CORBA using HTTP-IIOP gateways -- the I2H gateway converts IIOP requests to HTTP, and H2I converts HTTP requests to IIOP. The H2I gateway allows WWW clients to access CORBA services; the I2H gateway allows CORBA clients to access Web resources. The pair of gateways together behave like an HTTP proxy to the client and server. A CORBA IDL mapping of HTTP represents HTTP operations as methods and headers as parameters. An IDL compiler generates client stubs and server skeletons for the gateways. H2I is both a gateway to IIOP and a full HTTP proxy, so a client can access resources from a server that does not have an I2H gateway. A locator service decides when to use IIOP or HTTP. If the locator can find an interface reference to an I2H server-side gateway, IIOP is used; otherwise, the H2I gateway passes the request via HTTP.
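The locator's decision can be sketched as a simple lookup. Everything concrete here is assumed for illustration: the text does not say how interface references are indexed, so the sketch keys them by host name, and the host names and reference string are placeholders rather than real ANSAWeb values.

```python
from urllib.parse import urlsplit

def choose_transport(url, known_gateways):
    """Return ("IIOP", reference) when a server-side I2H gateway is known
    for the URL's host, otherwise ("HTTP", url) for plain proxying."""
    host = urlsplit(url).netloc
    reference = known_gateways.get(host)
    if reference is not None:
        return ("IIOP", reference)
    return ("HTTP", url)

# Hypothetical locator table: host name -> interface reference.
KNOWN = {"objects.example.org": "hypothetical-interface-reference"}

print(choose_transport("http://objects.example.org/page.html", KNOWN))
print(choose_transport("http://plain.example.org/page.html", KNOWN))
```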
The W3Objects <http://arjuna.ncl.ac.uk/w3objects/> project at the University of Newcastle upon Tyne provides facilities for transforming standard Web resources (HTML documents, GIF images, PostScript files, audio files, and the like) from file-based resources into objects called W3Objects, i.e., encapsulated resources possessing internal state and well-defined behaviors. The motivating notion is that the current Web can be viewed as an object-based system with a single class of object -- all objects are accessed via an HTTP daemon. W3Objects are responsible for managing their own security, persistence, and concurrency control. These common capabilities are made available to derived application classes from system base classes. A W3Objects server supports multiple protocols by which client objects can access server objects. When using HTTP, the URL binds to the server object and the permitted object operations are defined by the HTTP protocol. Alternatively, the RPC protocol can be used to pass operation invocations to a client-stub generated from a description of the server object interface. W3Objects uses C++ as the interface definition language, although CORBA IDL and ILU ISL can be used. W3Objects can also be accessed through a gateway, implemented as a plug-in module for an extensible Web server, such as Apache <http://www.apache.org/>. URLs beginning with /w3o/ are passed by the server to the gateway; the remainder of the URL identifies the requested service and its parameters. Using a Name Server, the appropriate HTTP method is invoked on the requested service.
These projects have identified a number of important ideas in supporting objects on the Web (in particular, objects constructed in the HOS resemble in many respects those that would be constructed using the approach described in Section 3.1). However, they based their attempts to develop object capabilities for the Web on the existing Web infrastructure. As a result, they had to use a number of non-standard Web extensions (e.g., special protocols referenced in URLs to trigger the loading of object methods), which limit their widespread usability. Dependence on the existing Web infrastructure also limits the ability of the resulting objects to support more complex Web applications. Our work, on the other hand, is based on what will likely be the next-generation Web infrastructure. This infrastructure is still evolving, and hence some extensions to it may yet be necessary. However, based on our analysis, these new Web technologies seem likely to provide a much better basis for providing powerful Web object facilities, that are at the same time based on standard (hence, widely accessible) Web protocols and components.
An approach similar to that provided by ANSAWeb is becoming increasingly popular, and is potentially very powerful. This involves placing Java applets on Web pages (using the APPLET or OBJECT elements in HTML). Once on the Web client, these objects then communicate with other objects on remote servers using various protocols. A particularly important variant of this approach is to use it to combine Java and CORBA. In this variant, Java applets downloaded to the client communicate with other CORBA objects over the Internet via CORBA's IIOP (Internet Inter-ORB Protocol), which is supported by all CORBA Object Request Brokers. This approach is, for example, supported by Netscape Communicator, which includes Visigenic's Java ORB. Using this approach, the advantages of CORBA's object services are potentially available to Internet objects. This also allows non-Java objects to be integrated into the Internet, since CORBA objects can be written in many languages. Java has also been the basis of proposals to improve Web capabilities by representing more and more Web content directly as Java objects, using the existing Web largely as a transport mechanism for these objects.
Such approaches provide important new mechanisms for supporting more powerful Web capabilities, and integrating enterprise distributed object systems (which are likely to be CORBA-based) with the Internet. However, these approaches suffer from a number of disadvantages when used by themselves, e.g.:
There are a number of potential ways to use the "objects" constructed using the mechanism we are proposing. One approach would be to use the methods associated with a document in the same way that Java applets are used now. The difference would be that the code would not need to be embedded in the document. (In fact, depending on the exact details of the DOM, if the methods were separately-located OBJECT elements, they could presumably be embedded dynamically in the document at the client using the DOM interface, and act just the way embedded OBJECTs would act). A more conventional "object-like" use would be to allow the associated methods to be invoked via an enhanced DOM interface by programs acting through the client. That is, the DOM effectively implements a generic interface of a type something like XML-document (for XML documents). Application-specific subtypes of this generic type could be created which included the application-specific methods associated with the document as parts of the interfaces defined for those subtypes. Programs acting through the client could then invoke these methods through the new interfaces just as they invoke the methods of other objects.
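The idea of application-specific subtypes of a generic XML-document interface can be sketched roughly as follows. This is only an illustration under stated assumptions: the names XMLDocument, make_subtype, Invoice, and discounted_cost are hypothetical, the sample document is invented, and Python's xml.dom.minidom stands in for whatever DOM implementation a client would actually provide.

```python
from typing import Callable, Dict
from xml.dom import minidom


class XMLDocument:
    """Generic interface: what every XML document offers, whatever its content."""

    def __init__(self, text: str) -> None:
        self.dom = minidom.parseString(text)

    def text_of(self, tag: str) -> str:
        # Return the text content of the first element with the given tag name.
        node = self.dom.getElementsByTagName(tag)[0]
        return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


def make_subtype(name: str, methods: Dict[str, Callable]) -> type:
    """Build an application-specific subtype whose interface includes `methods`."""
    return type(name, (XMLDocument,), dict(methods))


# Hypothetical method; in the approach described above it would be located
# separately from the document and associated with it through metadata.
def discounted_cost(self: XMLDocument, percent: float) -> float:
    return float(self.text_of("cost")) * (1 - percent / 100)


Invoice = make_subtype("Invoice", {"discounted_cost": discounted_cost})
invoice = Invoice("<invoice><cost>20.00</cost></invoice>")
```

A program acting through the client can then call invoice.discounted_cost(10) in the same way it calls the generic invoice.text_of("cost").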
The mechanism defined here provides a form of "component-oriented" development, in that it allows the arbitrary composition of objects from data and code resources found on the Internet. Using this approach, a client could have multiple "object views" of the same base data (e.g., access the same data resources using different classes), by simply changing the collection of methods it uses when accessing the data (this would be like using different annotation sets or PICS-like labels in accessing a document).
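A minimal sketch of the "multiple object views" idea follows, assuming that a data resource can be modeled as a plain record and a class as a named collection of functions. ObjectView, the two method collections, and the sample record are all hypothetical and serve only to show the composition.

```python
from typing import Any, Callable, Dict

# Base data: a plain record standing in for a data resource fetched from the Web.
Record = Dict[str, Any]
Method = Callable[[Record], Any]


class ObjectView:
    """Pairs one data resource with one (replaceable) collection of methods."""

    def __init__(self, data: Record, methods: Dict[str, Method]) -> None:
        self._data = data
        self._methods = methods

    def invoke(self, name: str) -> Any:
        # Look the method up in this view's collection and apply it to the shared data.
        return self._methods[name](self._data)


# Two hypothetical method collections ("classes") for the same data.
citation_methods = {
    "summary": lambda d: f'{d["author"]}, "{d["title"]}"',
}
catalog_methods = {
    "summary": lambda d: f'{d["title"]} (${d["cost"]:.2f})',
}

book = {"author": "Fred", "title": "Shoes", "cost": 26.95}
as_citation = ObjectView(book, citation_methods)
as_catalog = ObjectView(book, catalog_methods)
```

Both views share the same underlying record; only the collection of methods differs, which is the sense in which the client changes "classes" without copying or converting the data.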
The approach may appear somewhat "heavyweight", in the sense that it involves additional mechanism, and may involve delays in accessing the code that implements object methods. However:
So far, our work has focused on identifying new Web technologies to serve as a base, analyzing their capabilities, and developing the basic principles for integrating them. Further work needs to be done to work out the additional details required to build a prototype implementation. For example, we have already noted that there are many object models that could be supported using the principles we have identified. It will be necessary to choose a particular object model (or possibly more than one) to use for our Web object model. This, in turn, will affect the structure of the metadata that must be supported. For example, if a class-based model is chosen, additional metadata will need to be defined to support the class objects (these could be recorded as Web objects too, using RDF, possibly together with techniques from MCF or XML-Data). Further work will be necessary to determine an appropriate type of object model for use on the Web.
Additional work is also required to define the mechanism that invokes the object methods once they are returned to the client. This will depend on the details of how the RDF standard evolves. As noted at the end of Section 3.1, the general RDF-supported metadata access mechanism may provide a way to insert this method invocation mechanism. Alternatively, it may be necessary to define this as an extension to the RDF mechanism.
Finally, as noted already, the DOM currently defines its API at a generic level, i.e., at the level of components of a document metamodel. Additional work is required to define "application level" object interfaces which include interfaces to the methods associated with the objects. For example, in the relational database example described in Section 2.3.1, DOM provides objects of types node, element, and so on, rather than objects of type author or editor (or even objects of type table or row). Using DOM, an application could effectively create such interfaces from the information given, but it would have to "know what to look for", and would have to traverse the various element objects to find that information. It would be desirable to have a capability for creating DOM-like, but application-oriented, APIs. This could involve using additional metadata (e.g., the DTD, or an XML-Data-like schema) to generate a default API automatically (it might then be possible for the document's author to customize this API or, alternatively, define the API explicitly). It might then be possible to attach specific methods to this API to define application-specific object behavior. An integration of DOM and embedded OBJECT elements would be one way to support this. This would effectively permit the creation of objects in the classic object-oriented programming sense.
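To make the contrast between the generic DOM level and an application-oriented API concrete, the following sketch derives a default application-level interface directly from the element names in a document. It is an assumption-laden illustration, not a proposed design: AppObject is a hypothetical wrapper, the sample document is invented, and no DTD or schema is consulted (a real generator would presumably use one).

```python
from xml.dom import minidom
from xml.dom.minidom import Element


class AppObject:
    """Wraps a DOM element so child elements are reachable by tag name."""

    def __init__(self, element: Element) -> None:
        self._element = element

    def __getattr__(self, name: str):
        # Only called for names not found normally; search direct children only.
        matches = [
            child for child in self._element.childNodes
            if child.nodeType == child.ELEMENT_NODE and child.tagName == name
        ]
        if not matches:
            raise AttributeError(name)
        first = matches[0]
        # Leaf elements yield their text; structured elements yield another wrapper.
        has_child_elements = any(
            n.nodeType == n.ELEMENT_NODE for n in first.childNodes
        )
        if has_child_elements:
            return AppObject(first)
        return "".join(
            n.data for n in first.childNodes if n.nodeType == n.TEXT_NODE
        ).strip()


doc = minidom.parseString(
    "<book><title>Shoes</title><author><name>Fred</name></author></book>"
)
book = AppObject(doc.documentElement)
```

With this wrapper an application can write book.author.name instead of traversing node and element objects itself. Attaching application-specific methods would be a further step beyond what is shown here.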
In this section, we describe some basic ideas behind work on a formal definition for our Web object model. The ideas are derived from work on the foundations of Web metadata concepts, work on object-oriented logics, and our own prior work on object model formalization. Many of these same ideas are currently being reflected in W3C's ongoing RDF activity.
Common features of these representational models are:
As an example of work within the W3C addressing the relationship of logic and metadata, Describing and Linking Web Resources is an early W3C note which discusses general ideas and issues for describing and linking Web resources. It references work such as PICS, SOIF, and MCF, and notes that, though these different formats exhibit a range of syntactic variations, semantically they attempt to convey similar information. The architectural model that is common to them is the basic structure of the web: a directed graph with labeled arcs. The nodes (or points, or vertices) of the graph are URLs--anchor or resource addresses. The arcs are links. The labels are link relationships. Associated with each node is a set of attributes, or slots, or fields. Each attribute has a name and a value. Values are defined in a media-type specific manner.
The note also identifies the relationship of these attribute/value-based schemes to basic concepts in propositional logic. This allows the identification of the basic principles of the model independently of particular representations. R(S, T) can be used to denote a link from S to T with relationship R. The same notation can be used for attributes, writing N(S, V) for an attribute named N on an anchor at S with value V. For example, both the SOIF description
    @FILE {"http://www.shoes.com"
    Author{4}: Fred
    Supersedes{30}: http://www.provider.com/shoes
    }
and the HTML
    <about href="http://www.shoes.com">
      <meta name=author content="Fred">
      <link rel=Supersedes href="http://www.provider.com/shoes">
    </about>
can be interpreted as:
    Author(http://www.shoes.com, "Fred")
    Supersedes(http://www.shoes.com, http://www.provider.com/shoes)
Link semantics can be modeled by observing that anything can be considered a point in the web--including people, organizations, dates, and subject categories--by giving it a URL. A link or attribute in the web can be interpreted as an assertion, given an understanding of the semantics of the link relationship or attribute name. For example, given the definitions:
In addition to the description of simple, flat sets of attribute/value pairs describing individual entities, it is necessary for these structural models to be able to handle more complex structures, such as trees (e.g., repeating groups) and networks (directed graphs). In defining these more complex structures, the ability to assign identifiers both to resources and to individual (or groups of) attribute/value pairs is important. This allows a given (sub)structure to be assigned an identity, and then referenced from multiple places within a data structure. In actual representations, such substructures are indicated not by assigning them separate identifiers, but by some distinct representation technique (e.g., by nesting them within a larger tag). Such substructures need to be understood as being "flattened", with separate identifiers defined, in interpreting them within a logic-based framework (just as, in the relational data model, data must at least be represented in unnested "first normal form"). Techniques for factoring nested parts of a hierarchical structure into a "flat" logical form, and the need for both AND and OR logical operators, are illustrated and discussed in On Information Factoring in Dublin Metadata Records <http://www.uic.edu/~cmsmcq/tech/metadata.factoring.html>.
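The flattening step described above can be sketched as follows. This is a minimal illustration, assuming nested metadata arrives as dictionaries and lists. The function name flatten, the generated "#n1"-style identifiers, and the nested sample data are all invented for the example and are not drawn from any of the cited formats.

```python
from itertools import count
from typing import Any, Dict, List, Tuple

Assertion = Tuple[str, str, Any]  # (name, subject, value), read as name(subject, value)


def flatten(subject: str, record: Dict[str, Any]) -> List[Assertion]:
    """Factor a nested record into flat assertions, naming each substructure."""
    ids = count(1)
    out: List[Assertion] = []

    def walk(subj: str, rec: Dict[str, Any]) -> None:
        for name, value in rec.items():
            # A repeating group is treated as several values of the same attribute.
            values = value if isinstance(value, list) else [value]
            for v in values:
                if isinstance(v, dict):
                    # Give the nested group its own identifier, then describe it.
                    node = f"{subject}#n{next(ids)}"
                    out.append((name, subj, node))
                    walk(node, v)
                else:
                    out.append((name, subj, v))

    walk(subject, record)
    return out


# Invented nested data: two authors, one of them with an extra attribute.
nested = {
    "Author": [{"Name": "Fred", "Affiliation": "Shoes Inc."}, {"Name": "Ann"}],
    "Supersedes": "http://www.provider.com/shoes",
}
facts = flatten("http://www.shoes.com", nested)
```

Each nested group receives its own identifier, so an assertion such as Name(http://www.shoes.com#n1, "Fred") can be stated, and referenced from elsewhere, independently of the structure it was nested in.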
Various specific representation techniques for metadata, such as RDF, MCF, SOIF, OEM, etc., can be understood in the context of these observations as simply involving different encodings of the basic logic-based structures. Each encoding selects specific attributes, identifiers, etc. to cluster together in specific data representations, and selects others to represent as separate entities. Also, they select some relationships to represent explicitly by using identifiers as pointers, and some to represent implicitly by grouping related constructs in the same data structure. This interpretation of attribute/value pairs (and associated structures) as logical assertions is a key element in the development of a formal basis for our Web object model, and is explicitly reflected in RDF as well.
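As a rough illustration of "different encodings of the same logical structure", the sketch below reads the SOIF and HTML descriptions used earlier into one common set of name(subject, value) assertions. The two readers are deliberately simplistic assumptions, not real SOIF or HTML metadata processors: from_soif expects one attribute per line and ignores the size field in braces, attribute names are lowercased so that the two encodings compare equal, and AboutParser is a hypothetical helper built on Python's standard html.parser module.

```python
import re
from html.parser import HTMLParser
from typing import Set, Tuple

Assertion = Tuple[str, str, str]  # name(subject, value)


def from_soif(text: str) -> Set[Assertion]:
    """Read a single SOIF-style object laid out one attribute per line."""
    subject = re.search(r'@FILE\s*\{\s*"?([^"\s]+)"?', text).group(1)
    pairs = re.findall(r'^\s*([\w-]+)\{\d+\}:\s*(.+?)\s*$', text, re.MULTILINE)
    return {(name.lower(), subject, value) for name, value in pairs}


class AboutParser(HTMLParser):
    """Collect META and LINK elements that follow an ABOUT element."""

    def __init__(self) -> None:
        super().__init__()
        self.subject = ""
        self.facts: Set[Assertion] = set()

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "about":
            self.subject = a["href"]
        elif tag == "meta":
            self.facts.add((a["name"].lower(), self.subject, a["content"]))
        elif tag == "link":
            self.facts.add((a["rel"].lower(), self.subject, a["href"]))


soif_text = '''@FILE {"http://www.shoes.com"
Author{4}: Fred
Supersedes{30}: http://www.provider.com/shoes
}'''
html_text = ('<about href="http://www.shoes.com"> <meta name=author content="Fred"> '
             '<link rel=Supersedes href="http://www.provider.com/shoes"> </about>')

parser = AboutParser()
parser.feed(html_text)
soif_facts = from_soif(soif_text)
```

Under these assumptions both encodings yield the same two assertions, which is the sense in which they differ in syntax but not in the information conveyed.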
A number of abstract models for Web metadata describe the ability to link metadata individually to tagged items (attributes). For example, the Dublin Core describes the ability to access the definition of an individual attribute. This, for example, allows the attributes used in a particular description to be linked to an ontology that defines the attributes, and the set of concepts used in the context that the attributes are intended to describe. (A resource pointing to its ontology is similar to an object pointing to its methods, in a sense: it provides an interpretation (the methods are a "procedural specification" of the meaning/behavior appropriate to the data, while an ontology is human-readable). Work by groups such as the Stanford knowledge group is intended to merge these ideas and make the ontology readable/usable by knowledge-based software, the idea being that one could have a logic-based or other semantic specification which is declarative, and machine-interpretable.) The relationship between attribute/value pairs and formal logic described above also provides a basis for representing these additional kinds of links.
Describing and Linking Web Resources discusses how higher level information (such as beliefs), and information about the attributes or relationships themselves, can also be encoded using predicate logic. The basic approach is to assign each relationship (or attribute) its own URL (object identity), thus reifying the relationship (or attribute). Once a relationship has a URL (or other unique identifier), it can have its own metadata, by recording additional assertions about that identifier. If the relationship is identified with a URL, dereferencing the URL should access a definition of the link relationship, in either human-readable or machine-readable form. In addition, information about the association between a given attribute or assertion and a given resource can also be recorded. For example, in addition to recording an assertion like cost(o1, $26.95), information as to who made that assertion, and when, can also be recorded, e.g.:
    who( (o1,cost), "fred")
    when( (o1,cost), "04/07/97")
In this case, (o1,cost) acts as a new unique identifier which is the identity of the use within (or for) o1 of the attribute "cost" (this is a form of identifier construction mechanism supported by object logics, such as F-logic, described below).
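The same example can be written as a small assertion store. This is a sketch only: the Assertion type and the use_of and about helpers are hypothetical names, and a Python tuple stands in for the constructed identifier (o1,cost).

```python
from typing import Any, List, NamedTuple, Tuple


class Assertion(NamedTuple):
    name: str      # relationship or attribute name
    subject: Any   # a resource identifier, or a constructed identifier
    value: Any


def use_of(resource: str, attribute: str) -> Tuple[str, str]:
    """Constructed identifier for the use of `attribute` within `resource`."""
    return (resource, attribute)


store: List[Assertion] = [
    Assertion("cost", "o1", "$26.95"),
    # Metadata about the assertion itself, keyed by the constructed identifier.
    Assertion("who", use_of("o1", "cost"), "fred"),
    Assertion("when", use_of("o1", "cost"), "04/07/97"),
]


def about(subject: Any) -> dict:
    """Collect every assertion whose subject is `subject`."""
    return {a.name: a.value for a in store if a.subject == subject}
```

Here about("o1") returns the base-level description of the resource, while about(use_of("o1", "cost")) returns the metadata recorded about that particular assertion. The two levels are stored uniformly and are distinguished only by their subject identifiers.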
Metadata Architecture [Ber97] observes that the URL space is an appropriate space for the definition of attribute names in the Web because it effectively provides for a federated name space, within which users can freely define attribute names without necessarily "registering" them with a central authority. However, the URLs that identify relationships or attributes need not necessarily be used locally (within a given resource). Instead, local names from a namespace defined by the resource can be used as abbreviations. However, it should always be possible to translate from a local name to the global URL that represents the actual definition of the relationship or attribute. Relationships such as the following could be defined to represent these concepts:
A full exposition of F-logic is outside the scope of this paper (and in any case can be obtained from the cited references). However, F-logic includes a number of capabilities that are relevant to this discussion. For example, F-logic supports operations on both flat data structures (along the lines of the conventional relational model) and nested data structures (path traversal). F-logic also supports id-terms representing object identities. These are logical terms which use object constructor functions that can be interpreted as constructing object identities that are functionally dependent on their arguments. These terms are used to represent derived objects (e.g., objects to be constructed on the left-hand sides of rules), with the arguments of the function indicating the base objects from which the new objects were derived (effectively, the derived identity can be considered as the labeled tuple of the base identities). The ability to construct derived objects is crucial in describing the semantics of queries which produce new objects from existing ones (as a relational join operation does) and of views.
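The role of id-terms in identifying derived objects can be illustrated outside F-logic itself. The sketch below is written in Python, not F-logic syntax, and makes several assumptions for the sake of the example: IdTerm and join_id are hypothetical names, and the employee and department data are invented. It shows only that a derived identity built from base identities is reproducible, so the objects produced by a join-like rule can be referred to again later.

```python
from typing import Dict, NamedTuple, Tuple


class IdTerm(NamedTuple):
    """An object identity built by a constructor function from base identities."""
    constructor: str
    args: Tuple[str, ...]


def join_id(*base_ids: str) -> IdTerm:
    # Same arguments always yield an equal identity (functional dependence).
    return IdTerm("join", tuple(base_ids))


# Hypothetical base objects, keyed by identity.
employees = {"e1": {"name": "Fred", "dept": "d1"}, "e2": {"name": "Ann", "dept": "d2"}}
departments = {"d1": {"title": "Shoes"}, "d2": {"title": "Hats"}}

# A join-like rule: each derived object is identified by the pair it came from.
derived: Dict[IdTerm, dict] = {
    join_id(e_id, e["dept"]): {"name": e["name"], "title": departments[e["dept"]]["title"]}
    for e_id, e in employees.items()
    if e["dept"] in departments
}
```

Because join_id("e1", "d1") always denotes the same identity, re-evaluating the rule or defining a view over it addresses the same derived object and does not mint a fresh one each time.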
Finally, F-logic introduces higher-order capabilities, in order to effectively describe inheritance, and operations on metadata (e.g., database schemas), while retaining first-order semantics. This is done, as suggested in the previous section, by reifying concepts such as predicates, functions, and atomic formulas, allowing them to be manipulated as first-class objects. This reification allows the use of higher-order syntax, while retaining first-order semantics. Under first-order semantics, predicates and functions have associated objects, called intensions, which can be manipulated directly. Depending on the context in which they appear, these intensions may assume different roles, acting as relations, functions, or propositions. For example, in F-logic, id-terms are handled as individuals when they occur as object identities, viewed as functions when they appear as object labels (attributes), and as sets when representing classes of objects. When functions or predicates are treated as objects, they are manipulated as terms through their intensions; when being applied to arguments, they are evaluated as functions or relations through their extensions.
The use of F-logic concepts in helping define query language concepts for object-oriented databases is described in [KKS92], including query language support for:
We feel that a particularly important aspect of this work is the attempt to rely to the greatest possible extent on standards (commonly-accepted or likely-to-be-accepted Web technology) in developing our integration approach, and on working within standards-developing organizations such as W3C and OMG in further refining it and developing additional capabilities. This both takes maximum advantage of existing work, and improves the chances that the technology that is developed will become widely available (albeit possibly in some modified form) in commercial software products.
Further work on this project will include:
[BBBC+97] R. Bayardo, Jr., W. Bohrer, R. Brice, A. Cichocki, J. Fowler, A. Helal, V. Kashyap, T. Ksiezyk, G. Martin, M. Nodine, M. Rashid, M. Rusinkiewicz, R. Shea, C. Unnikrishnan, A. Unruh, and D. Woelk, "InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments", Proc. 1997 ACM SIGMOD Conf., SIGMOD Record, 26(2), June 1997.
[BDHS96] P. Buneman, S. Davidson, G. Hillebrand, and D. Suciu, "A Query Language and Optimization Technique for Unstructured Data", Proc. SIGMOD'96, 505-516.
[BDFS97] P. Buneman, S. Davidson, M. Fernandez, and D. Suciu, "Adding Structure to Unstructured Data", Proc. ICDT, 1997.
[Ber97] T. Berners-Lee, Metadata Architecture, <http://www.w3.org/DesignIssues/Metadata>.
[Bor95] A. Borgida, "Description Logics in Data Management", IEEE Trans. on Knowledge and Data Engineering, 7(5), October 1995, 671-682.
[Bos97] J. Bosak, XML, Java, and the Future of the Web, <http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm>, 1997.
[CHHM+94] B. Chhabra, D. Hardy, A. Hundhausen, D. Merkel, J. Noble, M. Schwartz, "Integrating Complex Data Access Methods into the Mosaic/WWW Environment", Proc. Second Intl. World Wide Web Conf., Oct. 1994, 909-919.
[CM93] S. Chiba and T. Masuda, "Designing an Extensible Distributed Language with a Meta-Level Architecture", Proc. ECOOP '93, LNCS 707, Springer-Verlag, July 1993, 482-501.
[DeR97] S. DeRose, The SGML FAQ Book, Kluwer, 1997.
[FR97] G. Fahl and T. Risch, "Query Processing over Object Views of Relational Data", VLDB Journal, 6(4), 1997, 261-281.
[GB97] R. Guha and T. Bray, Meta Content Framework Using XML, <http://www.w3.org/TR/NOTE-MCF-XML/>, June 6, 1997.
[GW97] R. Goldman and J. Widom, "DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases", Technical Report, Stanford University, 1997 <http://www-db.stanford.edu/pub/papers/dataguide.ps>.
[Hop97] A. Hopmann, et al., Web Collections using XML, 1997 <http://www.w3.org/TR/NOTE-XMLsubmit.html>.
[IK96] T. Isakowitz and R. J. Kauffman, "Supporting Search for Reusable Software Objects", IEEE Trans. Software Engrg. 22(6), June 1996, 407-423.
[ILCS95] D. Ingham, M. Little, S. Caughey, S. Shrivastava, "W3Objects: Bringing Object-Oriented Technology to the Web", Proc. Fourth Intl. World Wide Web Conf., World Wide Web Journal, December, 1995, 89-105.
[ISO86] International Standard ISO 8879:1986(E), Information Processing - Text and Office Systems - Standard Generalized Markup Language (SGML), International Organization for Standardization, 1986.
[ISO92] International Standard ISO/IEC 10744:1992, Information Technology - Hypermedia/Time-based Structuring Language (HyTime), International Organization for Standardization, 1992.
[ISO96] International Standard ISO/IEC 10179:1996(E), Information Technology - Processing languages - Document Style Semantics and Specification Language (DSSSL), International Organization for Standardization, 1996.
[KKS92] M. Kifer, W. Kim, and Y. Sagiv, "Querying Object-Oriented Databases", Proc. ACM SIGMOD Conf., 1992, 393-402.
[KL89] M. Kifer and G. Lausen, "F-Logic: A Higher-Order Language for Reasoning about Objects, Inheritance, and Scheme", Proc. 1989 ACM-SIGMOD Intl. Conf. on Management of Data, 1989. See also other papers on F-logic and related formalisms <http://www.cs.sunysb.edu/~kifer/dood/>.
[KLW95] M. Kifer, G. Lausen, and J. Wu, "Logical Foundations of Object-Oriented and Frame-Based Languages", Journal of the ACM, July 1995, 741-843.
[KR97] R. Khare and A. Rifkin, "XML: A Door to Automated Web Applications", IEEE Internet Computing, 1(4), July-August 1997, 78-87.
[Man93] F. Manola, "MetaObject Protocol Concepts for a 'RISC' Object Model", TR-0244-12-93-165, GTE Laboratories Incorporated, 1993 <ftp.gte.com, directory pub/dom>.
[Man97] F. Manola (ed.), "NCITS Technical Committee H7 Object Model Features Matrix", X3H7-93-007v12b, May 25, 1997 <http://www.objs.com/x3h7/h7home.htm>.
[MGHH+97] F. Manola, D. Georgakopoulos, S. Heiler, B. Hurwitz, G. Mitchell, F. Nayeri, "Supporting Cooperation in Enterprise-Scale Distributed Object Systems", in M. Papazoglou and G. Schlageter, eds., Cooperative Information Systems, Academic Press, 1997.
[NUWC97] S. Nestorov, J. Ullman, J. Wiener, and S. Chawathe, "Representative Objects: Concise Representations of Semistructured Hierarchical Data", in Proc. Thirteenth Intl. Conf. on Data Engineering, Birmingham, U.K., April 1997.
[OMG95] Object Management Group, The Common Object Request Broker: Architecture and Specification, Revision 2, July, 1995.
[OMG97] Object Management Group, A Discussion of the Object Management Architecture, June, 1997 <http://www.omg.org/library/omaindx.htm>.
[PGW95] Y. Papakonstantinou, H. Garcia-Molina, and J. Widom, "Object Exchange Across Heterogeneous Information Sources", IEEE Intl. Conf. on Data Engineering, 251-260, Taipei, March 1995. See also the other papers available at the TSIMMIS Publications page <http://www-db.stanford.edu/tsimmis/publications.html>.
[REMB+95] O. Rees, N. Edwards, M. Madsen, M. Beasley, A. McClenaghan, "A Web of Distributed Objects", Proc. Fourth Intl. World Wide Web Conf., World Wide Web Journal, December, 1995, 75-87.
[SG95] N. Singh and M. Gisi, "Coordinating Distributed Objects with Declarative Interfaces" <http://logic.stanford.edu/sharing/papers/oopsla.ps>.
[SW96] R. Stroud and Z. Wu, "Using Metaobject Protocols to Satisfy Non-Functional Requirements", in C. Zimmermann (ed.), Advances in Object-Oriented Metalevel Architectures and Reflection, CRC Press, Boca Raton, 1996, 31-52.
This research is sponsored by the Defense Advanced Research Projects Agency and managed by the U.S. Army Research Laboratory under contract DAAL01-95-C-0112. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied of the Defense Advanced Research Projects Agency, U.S. Army Research Laboratory, or the United States Government.
© Copyright 1997, 1998 Object Services and Consulting, Inc. Permission is granted to copy this document provided this copyright statement is retained in all copies. Disclaimer: OBJS does not warrant the accuracy or completeness of the information in this survey.
This page was written by Frank Manola. Send questions and comments about it to fmanola@objs.com.
Last updated: 2/10/98 fam