If the Web is to be used as the basis of complex enterprise applications, it must provide generic capabilities similar to those provided by the OMA (although these may need to be adapted to the more open, flexible nature of the Web, and to the specific requirements of Web applications). This involves such things as providing higher-level services (such as enhanced query and transaction support) and their composition in the Web. However, the basic data structuring capabilities provided by the Web must also be addressed. Both the ability to define and apply powerful generic services in the Web, and the ability to use the Web to support complex applications in general, depend crucially on the ability of the Web's underlying data structuring facilities to support these applications and services.
A fundamental direction of efforts to address the limitations of current Web data structuring technology has been attempts to integrate aspects of object technology with the basic infrastructure of the Web. This paper describes a number of new Web technologies currently being developed that further this integration. These technologies provide the basis for integrating data and behavior in the Web, effectively providing the basis of a "Web object model", and allowing the construction of object-like facilities to address the requirements of more powerful Web applications.
A fundamental direction of efforts to address HTML limitations has been attempts to integrate aspects of object technology with the basic infrastructure of the Web. There are a number of reasons for the interest in integrating Web and object technologies. For example:
A number of new Web technologies are being developed to address the limitations of current Web data structuring technology. In addition to their individual contributions towards improving the functionality of the Web, these Web technologies can be understood as enhancing the capabilities of the Web in supporting an object model [Man98a,b]. This is based on the observation that key components of any object model are:
Thinking in this way, the "Web object model" can be improved by providing:
The XML definition [xmllang] provides specifications for both XML documents, and XML Document Type Definitions (DTDs). An XML document need not have a DTD, but it must be at least well-formed. This means that it must follow a number of simple rules to ensure that it can be parsed properly without a DTD, e.g., an element must generally have both start and end tags, and elements must properly nest inside each other. Applications can readily use well-formed XML documents for data interchange. The following is an example of a well-formed XML document describing a publication:
<?xml version="1.0"?>
<PUBLICATION>
  <TITLE>Why I am Overworked</TITLE>
  <AUTHOR role="author">
    <FIRSTNAME>Fred</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <COMPANY>Jones and Associates</COMPANY>
  </AUTHOR>
  <ABSTRACT>This is the text of the abstract</ABSTRACT>
</PUBLICATION>

A valid document is well-formed, has a DTD, and has a structure that conforms to the DTD. A DTD is a formal definition of a particular type of document. The DTD specifies which element tags can appear in the document, which attributes are associated with each element, and the permitted structure of the elements (e.g., <COMPANY> may only appear inside <AUTHOR>). The following is a DTD for the above document:
<?xml version="1.0"?>
<!DOCTYPE PUBLICATION [
  <!ELEMENT PUBLICATION (TITLE, AUTHOR+, ABSTRACT*)>
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT AUTHOR (FIRSTNAME, LASTNAME, (UNIVERSITY | COMPANY)?)>
  <!ATTLIST AUTHOR role (author|techwriter) "author">
  <!ELEMENT FIRSTNAME (#PCDATA)>
  <!ELEMENT LASTNAME (#PCDATA)>
  <!ELEMENT UNIVERSITY (#PCDATA)>
  <!ELEMENT COMPANY (#PCDATA)>
  <!ELEMENT ABSTRACT (#PCDATA)>
]>

The linking of resources with their DTDs is similar to the association of a database record with its schema type, and to the association of an object with its type or class definition. DTDs are particularly useful in supporting document creation and editing.
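The well-formedness rules can be illustrated with a short Python sketch. It uses the standard library's xml.etree.ElementTree parser, which checks well-formedness but does not validate a document against a DTD. The helper name is_well_formed and the mis-nested sample document are assumptions introduced here for illustration.

```python
import xml.etree.ElementTree as ET

PUBLICATION = """<?xml version="1.0"?>
<PUBLICATION>
  <TITLE>Why I am Overworked</TITLE>
  <AUTHOR role="author">
    <FIRSTNAME>Fred</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
    <COMPANY>Jones and Associates</COMPANY>
  </AUTHOR>
  <ABSTRACT>This is the text of the abstract</ABSTRACT>
</PUBLICATION>"""

# Hypothetical broken variant: TITLE and AUTHOR do not nest properly.
MIS_NESTED = "<PUBLICATION><TITLE>Oops<AUTHOR></TITLE></AUTHOR></PUBLICATION>"


def is_well_formed(text):
    """Return True if the text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A well-formed document can be used for data interchange without a DTD.
root = ET.fromstring(PUBLICATION)
title = root.findtext("TITLE")
last_name = root.findtext("AUTHOR/LASTNAME")
role = root.find("AUTHOR").get("role")
```

Checking validity against the DTD would require a validating parser, which is outside the scope of this sketch.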
An XML document need not be structured as a single piece of text, stored in a single file. Instead, the document can be composed of separate pieces, called entities. The entities may be in separate files identified by URLs (external entities). Internal entities may also be used within a document to define units of text that are reused in several places within a document. XML also supports unparsed entities to represent non-XML resources (e.g., images) that are to form part of an XML document. Each unparsed entity has a NOTATION declaration that identifies the entity's representation, or a processor that can be called by the XML application to process the entity.
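The reuse of text through internal entities can be sketched as follows. The entity name employer and the document content are assumptions chosen for illustration; the sketch relies on the Python standard library parser expanding internal entities declared in the document's internal DTD subset.

```python
import xml.etree.ElementTree as ET

# The internal entity "employer" (a hypothetical name) is declared once
# and referenced in two places within the document.
DOC = """<?xml version="1.0"?>
<!DOCTYPE PUBLICATION [
  <!ENTITY employer "Jones and Associates">
]>
<PUBLICATION>
  <AUTHOR><COMPANY>&employer;</COMPANY></AUTHOR>
  <ABSTRACT>Written at &employer;.</ABSTRACT>
</PUBLICATION>"""

root = ET.fromstring(DOC)

# After parsing, each entity reference has been replaced by its text.
company = root.findtext("AUTHOR/COMPANY")
abstract = root.findtext("ABSTRACT")
```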
XML is an approved W3C recommendation. Additional capabilities for XML are also under development (but not yet approved by W3C). Unlike HTML, XML per se provides no facilities for defining the presentation aspects of documents (e.g., whether certain text should be in a specific size or color). Instead, the presentation aspects of XML documents are intended to be described using separate stylesheets. As a result, stylesheets play a much more important role in XML than they do in HTML. The Extensible Stylesheet Language (XSL) [xsl] defines stylesheet capabilities for XML documents. XSL capabilities resemble in some respects those of the ISO Document Style Semantics and Specification Language (DSSSL) [iso96] used in formatting SGML documents. XML linking capabilities are described in the XML Linking Language (XLink) [xll] and XML Pointer Language (XPointer) [xptr] specifications. These linking capabilities are much more powerful than those of HTML, providing support for both bidirectional and multi-way links, as well as links to a span of text within the same or other documents. An XML namespace facility [namespace] is also under development. This allows the definition of prefixes associated with URI-identified collections of names (namespaces). These prefixes can be used with element (tag) names to prevent name clashes when developing documents that mix elements from different namespaces.
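The way namespace prefixes prevent name clashes can be sketched in Python. The sketch assumes the xmlns:prefix declaration syntax of the namespace facility as it was eventually standardized, which may differ from the working draft cited here. Both namespace URIs and the hr vocabulary are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define a TITLE element; the prefixes keep them apart.
# Both namespace URIs are hypothetical.
DOC = """<doc xmlns:pub="http://example.org/ns/publication"
     xmlns:hr="http://example.org/ns/personnel">
  <pub:TITLE>Why I am Overworked</pub:TITLE>
  <hr:TITLE>Senior Engineer</hr:TITLE>
</doc>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace URI}local-name",
# so the two TITLE elements end up with distinct names.
tags = [child.tag for child in root]

# Lookups are made against the URI, not the prefix used in the document.
ns = {"pub": "http://example.org/ns/publication"}
pub_title = root.findtext("pub:TITLE", namespaces=ns)
```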
XML has considerable industry support. For example, Microsoft has built some XML support into Internet Explorer 4 (and further support into IE 5), made available several XML-based tools [msxml], and contributed a number of proposals to W3C on XML extensions and applications; Netscape has made similar contributions. A number of industry groups have defined SGML DTDs for their documents (e.g., the U.S. Defense Department, which requires much of its documentation to be submitted according to SGML DTDs). In many cases these could be either used with XML directly or converted in a straightforward fashion. Work is already underway to define XML-based data exchange formats in both the chemical and healthcare communities, as well as on other applications of XML.
XML does not itself address all of the Web's data structuring requirements, but it does provide a solid data representation in terms of which higher-level capabilities can be defined. In particular, it provides the basis for associating application-specific processing with particular elements, and hence can form the basis of an "object state" representation.
Some of this work has involved attempts to define abstract metadata models, or the basic principles of such models. For example, the Dublin Core defines a minimum standard vocabulary (e.g., TITLE, CREATOR, SUBJECT) for describing documents to support applications such as search and automatic indexing. The work also describes a number of general requirements that need to be supported in any general metadata model, such as the need to support multiple levels of metadata (metadata about metadata), and multiple levels of granularity. The Warwick Framework defines a metadata container architecture that describes how multiple, separately-managed metadata sets should be defined, managed, and associated with the resources they describe. The W3C has done considerable work on metadata mechanisms for describing various characteristics of Web resources. Relevant W3C metadata technologies include the Platform for Internet Content Selection (PICS) and its generalization, the Resource Description Framework (RDF) Web metadata models. PICS defines a set of technologies for defining machine-readable descriptions of rating systems for labeling Internet content, generating content labels according to those rating systems, associating labels with specific Internet resources, and distributing the labels to Web clients as the resources are accessed. All these efforts have been discussed in a recent paper by Ora Lassila [las98].
The RDF combines extensions of the PICS technology with other work on metadata models to support such applications as resource discovery by search engines, cataloging, knowledge sharing and exchange by intelligent software agents, and electronic commerce. RDF [rdflang] defines both a data model for representing RDF metadata and an XML-based syntax for expressing and transporting the metadata. The basis of RDF is a model for representing named properties and their values which form logical assertions about Web resources identified by URLs. The model is based on propositional logic, plus certain modalities. RDF properties can represent both attributes of resources and relationships between resources. RDF also allows the reification of properties, so that individual properties and assertions can themselves be described by properties. Assertions may be associated with the resource they describe in several ways, e.g., embedded in the resource, external to the resource but supplied with the resource in the same retrieval transaction, or retrieved from a separate source. RDF also supports the specification of standard vocabularies (but does not impose one).
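The idea of named properties forming assertions about resources, and of assertions that describe other assertions, can be sketched as a simple in-memory structure. This is not the RDF syntax or data model itself, only an informal approximation: the Assertion class, the identifiers "#a1" to "#a3", the assertedBy property, and all URLs are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    resource: str  # URL of the described resource, or the id of another assertion
    prop: str      # property name
    value: str     # property value (a literal or a URL)


PAPER = "http://example.org/papers/overworked"  # hypothetical resource URL

# Each assertion gets an identifier so that it can itself be described.
store = {
    "#a1": Assertion(PAPER, "CREATOR", "Fred Smith"),
    "#a2": Assertion(PAPER, "TITLE", "Why I am Overworked"),
    # An assertion about assertion #a1 (metadata about metadata).
    "#a3": Assertion("#a1", "assertedBy", "http://example.org/catalog"),
}


def properties_of(subject):
    """Collect the property/value pairs asserted about a resource or assertion."""
    return {a.prop: a.value for a in store.values() if a.resource == subject}


paper_properties = properties_of(PAPER)
provenance = properties_of("#a1")
```

Because the store is kept separately from the resource it describes, the same structure could hold assertions that were embedded in the resource, delivered with it, or retrieved from another source.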
XML can also be interpreted as representing Web information in the form of properties (identified by tags) and their values (the contents of the tagged elements), and hence an XML document can also be interpreted as a set of logical assertions. This assertion-based interpretation of Web-represented information, the ability to make assertions about those assertions, and the ability to use URLs (or URIs) as a universal identifier mechanism, creates the basis for a formal model of the Web [ber97], and facilitates the use of more "intelligence" in processing Web information.
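One possible reading of an XML document as a set of assertions is sketched below. The mapping rules (leaf elements become property/value pairs, nested elements become new subjects named by a path) and the base URL are assumptions made for this sketch, not a mapping defined by XML or RDF.

```python
import xml.etree.ElementTree as ET

DOC = """<PUBLICATION>
  <TITLE>Why I am Overworked</TITLE>
  <AUTHOR>
    <FIRSTNAME>Fred</FIRSTNAME>
    <LASTNAME>Smith</LASTNAME>
  </AUTHOR>
</PUBLICATION>"""


def assertions(element, subject):
    """Yield (subject, property, value) triples for the children of an element."""
    for child in element:
        if len(child) == 0:
            # Leaf element: the tag is the property, the content is the value.
            yield (subject, child.tag, (child.text or "").strip())
        else:
            # Nested element: the value is a new subject with its own properties.
            nested = f"{subject}/{child.tag}"
            yield (subject, child.tag, nested)
            yield from assertions(child, nested)


BASE = "http://example.org/pub1"  # hypothetical URL identifying the document
triples = list(assertions(ET.fromstring(DOC), BASE))
```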
Another important area of activity in supporting expanded applications for XML is work on improved facilities for defining data type information for XML. DTDs currently provide only limited support for defining what would be recognized as data types in an object model, or schemas in a database. For example, it is not currently possible to directly specify that a particular XML element is to contain an integer value. A number of proposals have been made for alternative schema and data type facilities for XML, including:
DOM is based on Dynamic HTML facilities defined by Microsoft and Netscape. DOM Level 1 extends these capabilities to, for example, allow creation "from scratch" of entire Web documents in memory by creating the appropriate objects. However, DOM does not yet implement all the Dynamic HTML facilities currently available (for example, an event mechanism is not yet defined). These and other capabilities will be defined in later DOM levels. However, by providing an API to document contents, DOM provides the foundation for integrating a document's data with processing code.
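The creation of a document "from scratch" in memory can be sketched with Python's xml.dom.minidom module, which follows the W3C DOM interfaces. The element names are taken from the publication example used earlier; the choice of this particular DOM implementation is an assumption of the sketch.

```python
from xml.dom.minidom import Document

# Build the document entirely in memory by creating DOM objects.
doc = Document()
publication = doc.createElement("PUBLICATION")
doc.appendChild(publication)

title = doc.createElement("TITLE")
title.appendChild(doc.createTextNode("Why I am Overworked"))
publication.appendChild(title)

author = doc.createElement("AUTHOR")
author.setAttribute("role", "author")
publication.appendChild(author)

# The same API gives processing code access to the document contents.
title_text = doc.getElementsByTagName("TITLE")[0].firstChild.data
serialized = publication.toxml()
```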
These mechanisms for associating behavior with HTML pages involve the use of special HTML elements, e.g., OBJECT and SCRIPT, with pre-defined content types and processing behaviors. XML currently does not mandate any particular way of incorporating behavior; e.g., it does not specify such pre-defined elements or behaviors. However, the ability to associate behavior with particular parts of XML pages is particularly important, since each XML element type potentially represents distinct semantics, and hence may need to be associated with behavior specific to those semantics.
XML does provide the ability to define elements that could contain scripts or other behavior representations, or pointers to them. These elements could then be processed (interpreted) by separate interpreters (much as Web browsers refer HTML SCRIPT and OBJECT elements to specific interpreters). For example, one approach to representing behavior in XML and associating it with an XML page would be to simply define an unparsed entity to hold the behavior representation (e.g., the script or applet) as part of the page's content. A NOTATION definition could be specified to associate the notation name JavaScript with a processing engine (interpreter) for JavaScript, identified by a URL. One or more entities could then be defined as having the JavaScript notation to hold whatever scripts are needed. The XML application (e.g., the Web client) could then call the JavaScript interpreter in order to deal with those entities (assuming some additional infrastructure to define when the scripts were to be called, etc.). However, this approach defines a rather tight coupling between the XML and the associated behavior.
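The NOTATION-based approach can be sketched using the expat parser in the Python standard library, which reports notation and entity declarations to application-supplied handlers. The PAGE element, the entity name greet, and both URLs are hypothetical. The sketch only resolves which interpreter would be responsible for the script; it does not fetch or run anything.

```python
import xml.parsers.expat

PAGE = """<?xml version="1.0"?>
<!DOCTYPE PAGE [
  <!NOTATION JavaScript SYSTEM "http://example.org/interpreters/javascript">
  <!ENTITY greet SYSTEM "http://example.org/scripts/greet.js" NDATA JavaScript>
  <!ELEMENT PAGE (#PCDATA)>
  <!ATTLIST PAGE script ENTITY #IMPLIED>
]>
<PAGE script="greet">Hello</PAGE>"""

notations = {}  # notation name -> URL identifying the interpreter
entities = {}   # unparsed entity name -> (script URL, notation name)


def on_notation(name, base, system_id, public_id):
    notations[name] = system_id


def on_entity(name, is_parameter_entity, value, base,
              system_id, public_id, notation_name):
    # Only unparsed (NDATA) entities carry a notation name.
    if notation_name is not None:
        entities[name] = (system_id, notation_name)


parser = xml.parsers.expat.ParserCreate()
parser.NotationDeclHandler = on_notation
parser.EntityDeclHandler = on_entity
parser.Parse(PAGE, True)

# The application can now find which interpreter handles the "greet" script.
script_url, notation = entities["greet"]
interpreter_url = notations[notation]
```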
XML-related technology is also being developed to support more flexible ways of associating behavior with XML pages or parts of pages. A particularly popular approach is based on the use of stylesheets and related techniques. Stylesheets already define some forms of behavior (e.g., what happens during a mouseover), and they allow for such behavior to be associated with specific elements. In addition, they provide flexibility (since different stylesheets can be applied to the same document) and modularity (since the behavior is defined separately from the page itself, and hence can be applied to multiple pages) in associating behavior with Web pages.
One approach to using stylesheets in providing additional behavior is to generalize the types of results that stylesheet formatting can produce, to allow the inclusion of behavior in the resulting page. A step in this direction is a W3C submission from Hewlett-Packard called Spice [spice]. Spice is a combination of ideas from DSSSL, Cascading Style Sheets (CSS), and JavaScript, designed to make it simple to apply style and behavior to XML documents (it can also be applied to HTML). Spice uses CSS-like style rules to associate flow object classes that define formatting tasks with specific elements to be formatted. However, Spice supports not only the predefined set of CSS flow objects, but also downloadable sets of extended flow objects which can be written in Spice, Java, or ActiveX. These flow objects can exploit the full capabilities of the Document Object Model in processing the document contents. To further control behavior, event handlers can be written to script flow objects. This allows the document to be dynamically altered after it has been loaded. (Spice differs from XSL primarily in using CSS syntax for style rules and properties.)
A Netscape submission to W3C describes another stylesheet-related approach called action sheets [actionsheets]. Action sheets provide a mechanism for defining the script-encoded behavior of document elements in a reusable package, separate from the structural definition of a document. In the same way that external stylesheet rules can associate presentation properties with specific XML elements, external action sheet rules can associate arbitrary event handlers with specific XML elements or classes of elements. An action sheet contains a set of productions (rules) somewhat similar in form to XSL template (formatting) rules. Simplifying somewhat, a rule contains a selector (pattern) which defines the document elements to which the rule applies, and an action, which specifies a script to be run for a given action (e.g., an event such as onClick). Action sheets would be associated with XML documents in the same way as stylesheets. Microsoft has defined a somewhat related technology in Internet Explorer 5.0, called DHTML Behaviors [deB98], which allows a scriptlet (see below) to be associated with a particular element using a CSS stylesheet.
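The selector-plus-action structure of such rules can be sketched as follows. This does not use the action sheet syntax from the Netscape submission. The Rule class, the dispatch function, and the use of plain element names as selectors are simplifying assumptions, and the actions are Python functions standing in for scripts.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    selector: str                        # element name the rule applies to
    event: str                           # event name, e.g. "onClick"
    action: Callable[[ET.Element], str]  # stand-in for the script to run


# The rules are kept separately from the document, so the same sheet
# could be applied to many documents.
ACTION_SHEET = [
    Rule("TITLE", "onClick", lambda el: f"show abstract for {el.text}"),
    Rule("AUTHOR", "onClick", lambda el: f"look up {el.findtext('LASTNAME')}"),
]


def dispatch(element, event, sheet=ACTION_SHEET):
    """Run every rule whose selector and event match the element and event."""
    return [rule.action(element) for rule in sheet
            if rule.selector == element.tag and rule.event == event]


doc = ET.fromstring(
    "<PUBLICATION><TITLE>Why I am Overworked</TITLE>"
    "<AUTHOR><LASTNAME>Smith</LASTNAME></AUTHOR></PUBLICATION>"
)
title_result = dispatch(doc.find("TITLE"), "onClick")
author_result = dispatch(doc.find("AUTHOR"), "onClick")
unhandled = dispatch(doc.find("TITLE"), "onMouseOver")
```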
Another interesting development in the area of associating behavior with XML is Microsoft's Scriptlets technology [deB98]. Scriptlets allow COM components to be written in a combination of XML and a scripting language. These components can be used in the same way as any other COM component, e.g., they can be used by COM clients such as Microsoft Office or embedded in Web pages like ActiveX components. A scriptlet is defined by a file with a .sct extension, which contains script code and XML markup (using a specialized tag set) that defines the methods, their parameters, and the properties to be exposed by the component. A special DLL acts as a runtime engine to interpret and execute the XML definition of the scriptlet. It also acts as a broker between clients and the scriptlets. Netscape has defined a similar technology, called JavaScript Beans [Nic98], for constructing JavaBeans using XML markup and scripts. As noted above, Microsoft's DHTML Behaviors mechanism allows scriptlets to be associated as behaviors with specific document elements using CSS stylesheets. Such scriptlets can also expose custom events to the page, access the containing page's DHTML (or DOM) objects, and receive event notifications.
These and other technologies for associating behavior with XML provide a further integration of Web and object technology, by allowing the construction of object-like aggregates of data and behavior. In particular, Scriptlets and JavaScript Beans show that Web technologies may be used to construct objects not only in an extended notion of "object model", but in conventional programming language object models (COM and Java, respectively) using specialized XML markup and scripting as the representation for the object state, methods, and interfaces. With the appropriate interpreters, this same general approach could also be used to directly construct objects in other object models, such as CORBA IDL. While much work needs to be done on mechanisms for associating behavior with XML, numerous options exist, and the existing technologies show a great deal of promise.
The use of XML to represent object interfaces (and, in fact, complete objects) in Microsoft's Scriptlets has already been mentioned. Technologies such as DataChannel's WebBroker [webbroker] represent attempts to build a complete Web-native distributed object computing model, based on the use of XML and HTTP. WebBroker defines DTDs for XML documents that represent method call and return messages between software component objects. A calling component sends an objectMethodRequest to another component, and receives an objectMethodResponse in return. WebBroker also uses XML to represent interface definitions for these objects. In WebBroker, software components become URL-addressable HTTP resources. The Web client contains a Java applet which acts as a client-side broker for remote requests generated by local Java applets. This applet generates XML request messages from these requests and sends them to a server using the HTTP POST method. Request messages include a callback URL to identify the client. A Java servlet on the server translates the XML request into a call to the appropriate server resource. When the response is ready, it is formatted into an XML response message and sent back to the client using an HTTP POST method to the callback URL. The client also contains an httpd server (a local HTTP server). This client-side server accepts the response and passes it back to the requesting Java applet. WebBroker handles both COM+ and CORBA objects, and has been submitted to the W3C.
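The request and response flow described above can be sketched without any network transport. Only the message names objectMethodRequest and objectMethodResponse come from the description above. The child elements (target, callback, method, arg, result), the COMPONENTS table, and all URLs are assumptions invented for this sketch and are not taken from the WebBroker DTDs.

```python
import xml.etree.ElementTree as ET


def build_request(target_url, method, args, callback_url):
    """Client-side broker: encode a method call as an XML request message."""
    req = ET.Element("objectMethodRequest")
    ET.SubElement(req, "target").text = target_url
    ET.SubElement(req, "callback").text = callback_url
    call = ET.SubElement(req, "method", name=method)
    for arg in args:
        ET.SubElement(call, "arg").text = str(arg)
    return ET.tostring(req, encoding="unicode")


# Hypothetical URL-addressable component exposing one method.
COMPONENTS = {
    "http://example.org/components/adder": {"add": lambda a, b: a + b},
}


def handle_request(xml_text):
    """Server side: turn the XML request into a call and build the response.

    Returns the callback URL and the XML response that would be POSTed to it.
    """
    req = ET.fromstring(xml_text)
    component = COMPONENTS[req.findtext("target")]
    call = req.find("method")
    args = [int(a.text) for a in call.findall("arg")]
    result = component[call.get("name")](*args)
    resp = ET.Element("objectMethodResponse")
    ET.SubElement(resp, "result").text = str(result)
    return req.findtext("callback"), ET.tostring(resp, encoding="unicode")


request_xml = build_request(
    "http://example.org/components/adder", "add", [2, 3],
    "http://client.example.org/callback",
)
callback_url, response_xml = handle_request(request_xml)
result_text = ET.fromstring(response_xml).findtext("result")
```

In a real deployment both messages would travel as the bodies of HTTP POST requests, one to the component's URL and one to the callback URL.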
DataChannel notes several potential advantages to using XML in a distributed object architecture. For one thing, by using XML, metadata describing object interfaces can be defined as a collection of interlinked XML documents available on a Web repository server, eliminating an unnecessary distinction between this metadata and other information. This would also allow the repository server to provide this information in a single round trip, as opposed to the multiple calls needed to access it using current interfaces. Using XML could also reduce the amount of code needed in lightweight Web clients to handle object messaging, since they will probably be able to process XML already, eliminating the need for extra code to support DCOM or CORBA syntax.
UserLand Software has developed a similar technology called XML-RPC for using XML messages and the HTTP POST method as the basis of remote procedure calls, as part of its Frontier 5 Web content development and management environment. In addition, Microsoft is developing a related protocol, called the Simple Object Access Protocol (SOAP), together with UserLand Software and DevelopMentor. Development of a single XML-based RPC protocol could create the basis of a widely-available "universal ORB" capable of interacting with objects in a wide range of different object models. WebMethods, Inc.'s Web Interface Definition Language (WIDL) [widl, kr97] defines a somewhat similar approach, using XML to define object-like interfaces to Web servers. These interfaces can then be accessed by remote systems using HTTP messages, and provide the structure necessary for generating client code in languages such as Java, C/C++, COBOL, and Visual Basic.
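The XML messages used by XML-RPC can be examined with the xmlrpc.client module in the Python standard library, which implements the protocol as it is specified today (possibly differing in detail from the version shipped with Frontier 5). The method name example.add is an assumption. The sketch only encodes and decodes messages, and omits the HTTP POST that would carry them.

```python
import xmlrpc.client

# Client side: encode a call to a hypothetical method as an XML message.
request_xml = xmlrpc.client.dumps((2, 3), methodname="example.add")

# Server side: decode the message back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)

# Server side: encode the return value as an XML response message.
response_xml = xmlrpc.client.dumps((sum(params),), methodresponse=True)

# Client side: decode the response; responses carry no method name.
(result,), response_method = xmlrpc.client.loads(response_xml)
```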
W3C's HTTP-NG project [httpng] is pursuing an approach that is in some sense the opposite of the above technologies. Instead of building a distributed object system on top of the Web, HTTP-NG involves building a distributed object system under the Web, and then converting the current Web to an application of that distributed object system. HTTP-NG represents a longer-term solution to the Web's expansion to include more general distributed applications, based on the idea that layering these applications on top of HTTP will result in problems due to unnecessary performance costs, and lack of functionality and generality.
By basing the Web on a generic distributed object system, the HTTP-NG project hopes to enable distributed applications to use this distributed object system directly. The goal is for the generic distributed object system to be simple, yet rich enough to meet the semantic and performance requirements of CORBA, DCOM, and Java RMI (without, however, unifying their object models). The project has defined:
[ber97] T. Berners-Lee, Metadata Architecture, 1997. <http://www.w3.org/DesignIssues/Metadata>.
[bos97] J. Bosak, "XML, Java, and the Future of the Web" <http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm>, 1997.
[dcd] T. Bray, C. Frankston, and A. Malhotra, "Document Content Description for XML", W3C Note, World Wide Web Consortium, 1998; <http://www.w3.org/TR/NOTE-dcd>
[deB98] M. De Bruijn, "Internet Explorer 5.0--for Intranets Only?", WEBBuilder, 3(9), Sept. 1998, 25-28.
[dom] L. Wood, et al., "Document Object Model (DOM) Level 1 Specification", W3C Proposed Recommendation, World Wide Web Consortium, 1998; http://www.w3.org/TR/WD-DOM/.
[httpng] H. J. Nielsen, "Hypertext Transfer Protocol - Next Generation", Overview page, August, 1998. <http://www.w3.org/Protocols/HTTP-NG/>.
[iso96] International Standard ISO/IEC 10179:1996(E), Information Technology-Processing Languages-Document Style Semantics and Specification Language (DSSSL), 1996.
[kr97] R. Khare and A. Rifkin, "XML: A Door to Automated Web Applications", IEEE Internet Computing, 1(4), July-August 1997, 78-87.
[las98] O. Lassila, "Web Metadata: A Matter of Semantics", IEEE Internet Computing, 2(4), July-August 1998, 30-37.
[Man98a] F. Manola, "Towards a Web Object Model", Technical Report, Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom.htm>, 1998.
[Man98b] F. Manola, "Some Web Object Model Construction Technologies", Technical Report, Object Services and Consulting, Inc., <http://www.objs.com/OSA/wom-II.htm>, 1998.
[msxml] <http://www.microsoft.com/xml/>
[namespace] T. Bray, D. Hollander, and A. Layman, "Namespaces in XML", W3C Working Draft, World Wide Web Consortium, 1998; <http://www.w3.org/TR/WD-xml-names>.
[Nic98] D. Nickerson, Official Netscape JavaBeans Developer's Guide, Ventana Communications Group, 1998.
[omg97] Object Management Group, A Discussion of the Object Management Architecture, June, 1997; http://www.omg.org/library/omaindx.htm.
[rdflang] O. Lassila and R. R. Swick, "Resource Description Framework (RDF) Model and Syntax", W3C Working Draft, World Wide Web Consortium, 1998; http://www.w3.org/TR/WD-rdf-syntax/.
[rdfschemas] D. Brickley, R. V. Guha, and A. Layman, "Resource Description Framework (RDF) Schemas", W3C Working Draft, World Wide Web Consortium, 1998; http://www.w3.org/TR/WD-rdf-schema/.
[spice] R. Stevahn, "Adding Style and Behavior to XML with a dash of Spice", W3C Note, World Wide Web Consortium, 1998; http://www.w3.org/pub/WWW/TR/NOTE-spice.
[webbroker] J. Tigue and J. Lavinder, "WebBroker: Distributed Object Communication on the Web", W3C Note, World Wide Web Consortium, 1998; http://www.w3.org/TR/1998/NOTE-webbroker.
[widl] P. Merrick and C. Allen, "Web Interface Definition Language (WIDL)", W3C Note, World Wide Web Consortium, 1997; http://www.w3.org/TR/NOTE-widl.
[xmllang] T. Bray, J. Paoli, and C. M. Sperberg-McQueen, "Extensible Markup Language (XML) 1.0", W3C Recommendation, World Wide Web Consortium, 1998; http://www.w3.org/TR/REC-xml.
[xll] E. Maler and S. DeRose, "XML Linking Language (XLink)", W3C Working Draft, World Wide Web Consortium, 1998; http://www.w3.org/TR/WD-xlink.
[xptr] E. Maler and S. DeRose, "XML Pointer Language (XPointer)", W3C Working Draft, World Wide Web Consortium, 1998; http://www.w3.org/TR/WD-xptr.
[xsl] J. Clark and S. Deach, "Extensible Stylesheet Language (XSL)", W3C Working Draft, World Wide Web Consortium, 1998; http://www.w3.org/TR/WD-xsl.
[xmldata] A. Layman, et al., "XML-Data", W3C Note, World Wide Web Consortium, 1998; http://www.w3.org/TR/1998/NOTE-XML-data.
[xmltext1] D. Megginson, Structuring XML Documents, Prentice Hall, 1998.
[xmltext2] E. R. Harold, XML: Extensible Markup Language, IDG Books, 1998.
Acknowledgements: The author would like to acknowledge the contributions of the OBJS team and the participants in the xml-dev email list to the ideas contained in this paper.
* This research is sponsored by the Defense Advanced Research Projects Agency and managed by the U.S. Army Research Laboratory under contract DAAL01-95-C-0112. The views and conclusions contained in this document are those of the author and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, U.S. Army Research Laboratory, or the United States Government.
Object (state)                 Class object
+---------------+              +-------------+
| class pointer |------------->| Class data  |
+---------------+              +-------------+
| variable 1    |              | method 1    |
| variable 2    |              | method 2    |
| ...           |              | ...         |
| variable n    |              | method m    |
+---------------+              +-------------+

C++ implementations use similar structures. These structures are determined by the programming language implementation, and are created as necessary to represent the program and its data. The state is a collection of programming language variables, which are operated on by the methods. The class methods define the way the state is interpreted, and hence are a form of metadata for the state, making the link between the object and its class a metadata link.
Extending this idea to the Web, Web pages (or smaller units of Web data) can be considered as state, and objects constructed by enhancing those pages with additional "metadata" that allows the pages to be considered as objects in some object model. In particular, we want to enhance Web pages with programs that act as methods with respect to the "state" represented by the Web page. The resulting structure would, at a minimum, conceptually be something like:
                       +----------+
           +---------->| method 1 |
           |           +----------+
+-------+  +
| Web   |--+               ...
| page  |--+
+-------+  |           +----------+
           +---------->| method n |
                       +----------+

The methods could be physically embedded in the page, referenced by embedded or separate pointers (URLs), or associated with the page in some other way, e.g., using stylesheets or some similar technology (there are already a number of mechanisms used in the Web to integrate code (behavior) with Web pages, as described in the text). Unlike a programming language object model, in which the methods are tightly coupled to the state, a Web object model would ideally support a looser coupling of methods and state, so that the information represented by a Web page could be reused for different processing requirements.
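The looser coupling of methods and state can be sketched as follows. The WebObject class, the two method functions, and the page contents are assumptions invented for illustration. The point of the sketch is only that the same page state can be paired with different method sets for different processing requirements.

```python
# The "state": data taken from a Web page, here reduced to a dictionary.
PAGE_STATE = {
    "url": "http://example.org/pub1",  # hypothetical page URL
    "TITLE": "Why I am Overworked",
    "AUTHOR": "Fred Smith",
}


def cite(page):
    """One possible method: format the page as a citation."""
    return f'{page["AUTHOR"]}, "{page["TITLE"]}"'


def index_terms(page):
    """Another possible method: extract index terms from the title."""
    return sorted(set(page["TITLE"].lower().split()))


class WebObject:
    """Pairs page state with a separately supplied set of methods."""

    def __init__(self, state, methods):
        self.state = state
        self.methods = dict(methods)

    def invoke(self, name):
        return self.methods[name](self.state)


# The same state is reused with two different method sets.
citation_view = WebObject(PAGE_STATE, {"render": cite})
index_view = WebObject(PAGE_STATE, {"render": index_terms})

citation = citation_view.invoke("render")
terms = index_view.invoke("render")
```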
In this framework, Web technologies can play the following roles: