HTML is the standard markup language for specifying the format of documents on the WWW. HTML commands are embedded within a document, much like a very simple version of LaTeX, and are interpreted by Web browsers for display and link traversal. There are three levels of HTML compliance. The third (highest) level has two major variants: the "official" version and the Netscape enhancements. As more features are added to HTML, it becomes possible to create increasingly sophisticated documents, but at the expense of an increasingly complex markup language. It is reasonable to foresee HTML becoming as complex as LaTeX, with families of macros supporting different "document" styles, as opposed to the current level, where styles apply only to individual elements.
The increasing complexity of HTML, the preference many "content
creators" have for WYSIWYG editors, and the existence of
many documents without HTML embeddings lead to a desire to simplify
the HTML authoring process. The tools in this category either
automate or facilitate the process of inserting HTML markup.
We believe the three most salient issues in authoring HTML documents are how the HTML is created, how the tools that create the HTML are structured, and how well the HTML-ness of a tool is integrated with whatever other functionality it provides. For a good summary of the issues regarding HTML authoring and maintenance, see also the Interleaf White Paper.
Existing text can be automatically translated to HTML. Alternatively, HTML can be created or converted manually by a user.
Authoring tools are structured as either editors or filters.
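As a rough illustration of the filter style, the following Python sketch translates existing plain text into a minimal HTML document. The function name `text_to_html` and the rule that blank lines separate paragraphs are assumptions made for this example; they do not describe any particular tool covered by this survey.

```python
import html
import re


def text_to_html(text: str, title: str = "Untitled") -> str:
    """Convert plain text to a minimal HTML document (hypothetical filter).

    Assumption: blocks separated by one or more blank lines are paragraphs.
    """
    # Split on blank lines and drop empty blocks.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]

    paragraphs = []
    for block in blocks:
        # Join hard-wrapped lines, then escape &, < and > so they display literally.
        body = html.escape(" ".join(block.split()), quote=False)
        paragraphs.append(f"<p>{body}</p>")

    return (
        "<html>\n"
        "<head><title>" + html.escape(title, quote=False) + "</title></head>\n"
        "<body>\n" + "\n".join(paragraphs) + "\n</body>\n"
        "</html>"
    )


if __name__ == "__main__":
    print(text_to_html("First paragraph,\nwrapped.\n\nSecond: A < B & C", "Demo"))
```

A filter of this kind has no editing capabilities of its own: it reads a document in one format and writes it out in another, which is why integration questions arise mainly for editors.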
Integration of HTML capabilities with the other capabilities of the tool is an issue only with HTML editors, since HTML filters in general have no other capabilities.
Key integration issues for HTML editors are how robust the general word processing capabilities are (e.g., is there spell checking), and how well the HTML facilities are integrated with the rest of the word processing capabilities (e.g., can you spell check something in HTML format without the spell checker flagging all the HTML mark-up as spelling errors).
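The spell-checking example can be sketched in Python as follows. The idea is to hand the spell checker only the character data of the document, so that tags and attributes are never treated as words. The class `TextExtractor`, the function `misspelled`, and the tiny word set used as a dictionary are hypothetical names introduced for illustration; a real editor would use its own dictionary and tokenizer.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only character data, ignoring tags and attributes."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        # Called for text between tags; markup itself never reaches here.
        self.words.extend(data.split())


def misspelled(html_text, dictionary):
    """Return words from the text content that are not in `dictionary`."""
    extractor = TextExtractor()
    extractor.feed(html_text)
    extractor.close()

    unknown = []
    for word in extractor.words:
        # Strip surrounding punctuation and compare case-insensitively.
        cleaned = word.strip(".,;:!?\"'()").lower()
        if cleaned and cleaned not in dictionary:
            unknown.append(cleaned)
    return unknown


if __name__ == "__main__":
    page = "<h1>Helo world</h1><p>A <b>bold</b> claim.</p>"
    print(misspelled(page, {"world", "a", "bold", "claim"}))
```

Note that tag names such as `h1` and `b` are not reported, because only text delivered through `handle_data` is checked.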
High-end word processors like FrameMaker are tightly integrated and support direct creation of HTML as part of normal editing. They also tend to be able to import and convert more kinds of document formats and styles than lower-priced editors. Lower-priced HTML editors typically provide less of what is considered normal word processing functionality and thus cannot also serve as general-purpose editors. The cheapest way to obtain an HTML editor is to use "HTML add-ins" for popular word processors like MS Word. This is a popular way to add HTML functionality because it preserves the users' existing software and experience base; add-ins are often provided free by the maker of the editor, at least in reasonable beta versions. Add-ins can usually convert styles known to the base editor (see filtering). The biggest drawback to add-ins is that not all of them are tightly integrated with the original editor, and the seams between the base and the add-in can show in odd ways.
The many levels and variants of HTML, together with the number of HTML clients (viewers), have created a situation in which many browsers support only parts of "full" HTML. Further, because browsers have some degree of freedom in how they display a given piece of HTML-marked material, even a supported markup element will not always display the same way under different browsers. This becomes more of a problem as the design of pages becomes more sophisticated, since a great deal of work can go into a "look and feel" that turns out to display oddly in some browsers. When using an integrated system (see Hypermedia Systems and WWW Browsers), the problem is eliminated. However, in the open world, HTML compliance is a real problem. To combat this, a number of HTML Test Patterns are now available; e.g., WWW Test Pattern. These Test Patterns are WWW pages that exercise browsers with suites of HTML markup at various levels (e.g., a Level 2 pattern, a Netscape extensions pattern).
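An author can approach the same compliance problem from the document side by listing the tags a page uses that fall outside a chosen compliance level. The Python sketch below does this under stated assumptions: the set `ASSUMED_LEVEL2_TAGS` is a small illustrative subset chosen for the example, not a complete or authoritative element list for any HTML level, and `unsupported_tags` is a hypothetical helper name.

```python
from html.parser import HTMLParser

# Illustrative subset only (an assumption for this sketch),
# NOT a complete list of the elements defined at any HTML level.
ASSUMED_LEVEL2_TAGS = {
    "html", "head", "title", "body", "h1", "h2", "h3", "p", "a",
    "img", "ul", "ol", "li", "em", "strong", "pre", "br", "hr",
}


class TagCollector(HTMLParser):
    """Record the name of every start tag seen in a page."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        self.tags.add(tag)


def unsupported_tags(page, supported=ASSUMED_LEVEL2_TAGS):
    """Return, in sorted order, the tags in `page` not found in `supported`."""
    collector = TagCollector()
    collector.feed(page)
    collector.close()
    return sorted(collector.tags - supported)


if __name__ == "__main__":
    page = "<HTML><BODY><CENTER><BLINK>Hi</BLINK></CENTER><P>x</BODY></HTML>"
    print(unsupported_tags(page))
```

A check like this only reports which tags are present; it says nothing about how a given browser will render the ones it does support, which is what the Test Patterns are meant to reveal.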
A final authoring item is templates (stylesheets) for "good
looking" pages of various kinds. Some products,
such as Interleaf, provide a template library; I have not run across
a stand-alone library.
There are a tremendous number of systems that support HTML authoring; in fact, building one now seems to be a fairly standard class project for graduate students. Given this surfeit of systems, we quickly decided that finding a system compatible with any given environment and feature requirements would not be hard. Many are available free over the Internet from major vendors as add-ins to existing commercial products such as MS Word. We also determined that, architecturally, authoring tools are either stand-alone, and thus pose no particular architectural puzzles, or designed to integrate (as add-ins) into other existing systems such as general text editors. In consequence, we cut short our survey of such tools in order to concentrate on other areas. Below, we list pointers to other reviews and to tools.
This research is sponsored by the Defense Advanced Research Projects Agency and managed by the U.S. Army Research Laboratory under contract DAAL01-95-C-0112. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, U.S. Army Research Laboratory, or the United States Government.
© Copyright 1996 Object Services and Consulting, Inc. Permission is granted to copy this document provided this copyright statement is retained in all copies. Disclaimer: OBJS does not warrant the accuracy or completeness of the information on this page.
This page was written by David Wells. Send questions and comments about it to wells@objs.com.
Last updated: 04/22/96 7:40 PM