AgentGram: Natural Language Interface for Agents

Project Summary

Paul Pazandak and Craig Thompson
Object Services and Consulting, Inc.

17 June 2002

Executive Summary
Objective
Background
Technical Accomplishments

Initial AgentGram Prototype
Web-ready MBNLI
MBNLI Interface Generator for DBMSs
Speech Interface
Gridifying MBNLI

Technology Transition

NEO TIE
CoAX TIE

Executive Summary

Objective

The objective of the AgentGram project was to develop a modular menu-based natural language interface (MBNLI) component that can be used in client-server web environments as a front-end to agents and other Internet resources (e.g., data sources). From the point of view of Agility's goal to demonstrate agent capabilities that scale to mass markets, the impact we were aiming at was to develop a technology that enables humans, anywhere on the semantic web, to task and query remote agents and Internet resources using complex but understandable commands in constrained natural language.

Specific scientific and engineering subgoals were:

enable MBNLI on any webpage as a way to communicate with remote web resources, e.g., agents, databases, ...
semi-automate generation of MBNLI interfaces
prototype a companion speech interface to MBNLI
develop MBNLI as a component that can interface to other components and can connect to the CoABS grid
demonstrate MBNLI in scenarios of interest to DoD

Our primary thesis is that MBNLI can operate on the desktop or web to provide naive users with natural language interfaces they can actually use. Thus, MBNLI is a significant potential step towards the development of a more semantic web and to "scaling agent systems to the masses."

Background

Menu-based natural language interface technology (MBNLI) combines constrained grammars, a predictive parser, and interface technology to provide users with a guided natural language query and command capability. As explained in MBNLI Overview, this approach bypasses frustrating habitability problems that other NL interface technologies suffer from where users undershoot or overshoot the NL systems' capabilities.
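
The guided-completion idea can be illustrated with a minimal sketch. The toy grammar, its phrases, and the function name next_choices are assumptions made for illustration and are not taken from the MBNLI toolkit. Given the phrases a user has chosen so far, the function returns the phrases that may legally come next, which is what a menu would display, and reports whether the sentence is already complete.

```python
from typing import Dict, List, Sequence, Set, Tuple

Grammar = Dict[str, List[List[str]]]

# Hypothetical toy grammar: dictionary keys are nonterminals,
# every other symbol is a menu phrase shown to the user.
GRAMMAR: Grammar = {
    "S": [["find", "NP"], ["find", "NP", "COND"]],
    "NP": [["elephants"], ["herds"]],
    "COND": [["whose name is", "<value>"], ["whose location is", "<value>"]],
}


def next_choices(
    grammar: Grammar, start: str, prefix: Sequence[str]
) -> Tuple[Set[str], bool]:
    """Return (phrases that may follow `prefix`, whether `prefix` is a full sentence).

    Assumes the grammar is not left-recursive, otherwise the expansion
    below would not terminate.
    """
    choices: Set[str] = set()
    complete = False
    # Each work item: (symbols still to be matched, number of prefix phrases consumed).
    stack: List[Tuple[Tuple[str, ...], int]] = [((start,), 0)]
    while stack:
        symbols, matched = stack.pop()
        if not symbols:
            # A derivation ended; it is a full sentence only if it used the whole prefix.
            if matched == len(prefix):
                complete = True
            continue
        head, rest = symbols[0], symbols[1:]
        if head in grammar:
            # Nonterminal: try every production in its place.
            for production in grammar[head]:
                stack.append((tuple(production) + rest, matched))
        elif matched == len(prefix):
            # Terminal right after the prefix: a legal next menu item.
            choices.add(head)
        elif prefix[matched] == head:
            # Terminal agrees with what the user already chose.
            stack.append((rest, matched + 1))
    return choices, complete
```

Because the user can only pick from the returned set, every finished sentence is one the grammar accepts, which is the sense in which the approach avoids overshooting or undershooting the system's coverage.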

Technical Accomplishments

Under the DARPA CoABS contract, we extended MBNLI in the following ways:

Initial AgentGram Prototype

An initial AgentGram prototype, developed in 1999, was based on the notion of distributed agents that contain grammars which can be dynamically composed. Using cascading menus, users were able to construct complex sentences (commands/queries) that simultaneously involve the grammars of several agents. The grammars were loaded dynamically from web-based agents on demand. This first implementation focused on constructing restricted English phrases from the partial grammars of multiple distributed agents. The result is a readable sentence which represents a complex executable command. See screenshots of this AgentGram prototype. This implementation was somewhat simplistic, using tree grammars represented in XML. Later implementations extended the MBNLI toolkit, which supports attributed context-free grammars.

Web-ready MBNLI

This task was the heart of the AgentGram project. The objective was to enable humans anywhere on the semantic web to task and query remote agents and Internet resources using complex but understandable commands in constrained natural language. The interfaces appear as annotations on web pages. The system should scale to any number of users, grammars, webpages, and target resources. The system should be deployable with no effort by the user (no explicit downloading action). This is a step in making agent technology pervasive. Making MBNLI web-ready required re-engineering several parts of the original system:

We re-implemented the front-end user interface. We added the ability to support alternative interaction paradigms including cascaded menus (reimplementing the Java Swing menus placement algorithm for cascading menus) and phrase buttons (an alternative interaction paradigm to minimize screen real estate).
We worked on having thin MBNLI interfaces (little download and no install overhead, so no barrier to use). We considered several approaches - refreshing whole pages, applets, and downloading the entire parser. The first approach appears awkward. The last is OK for demos and for users that want full service but not for establishing widespread adoption by end-users.

We built a prototype applet that handles menu selection but where the parser is remote. Initially we used two-way RMI but found that it required signing the applets with certificates and violated browser security restrictions. We looked into executing the RMI version of the applet in IE (which required downloading and installing IE5.0), but the applet wouldn't run and IE's console window provided only a cryptic message. We redesigned the applet to eliminate two-way RMI. We downloaded and installed TinyWebServer, a 48k HTTP web server, so we could test the MBNLI applet. We completed an initial working applet with expert support. Different portable specs are accessed via different applets, which are parameterized with information about the portable spec.

We later implemented a stateless C-based (CGI) front-end to the parser which generates HTML/JavaScript, with no Java at all. It provides an interface much like the applet version and has support for experts, but it is smaller and faster.

We designed and implemented a web-based multi-threaded MBNLI parser farm enabling parsers to be instantiated on the fly. The parser farm manages the set of active parsers and routes user requests to the appropriate parser based on the grammar and lexicon requested. If such a parser doesn't exist, then a new parser is started. A basic security model was also implemented.
We developed a grammar-on-the-fly capability for MBNLI. This allows the user to select and change between grammars after MBNLI is up and running, rather than requiring this information at start-up. We added APIs to the parser and modified the NLI UI by adding menus and file dialogs to permit selection of portable specs.
We developed a browser-based XML-driven dynamic interface for MBNLI. Internet Explorer allows page updates without refresh by supporting data-driven components. This approach enables MBNLI to use a single static interface within a browser as opposed to refreshing for every change (involving the server).
We reimplemented the experts API, adding new associated classes, and creating several new experts (code and UIs to facilitate and constrain user input) which can be invoked via definitions in the portable spec.
We extended the web version of MBNLI to support remote query execution and local display of results using PHP to handle CGI <-> ODBC.
We modified MBNLI to work on Win95 and NT machines with Winsock2 installed. This involved converting the parser/lincoo from Unix/C++ to Win95/NT MSVC, eliminating dependence on cygwin DLLs, and developing code to support Win32 sockets to allow them to be treated exactly like files. This reduced the backend codebase from 900k to about 338k.
We benchmarked MBNLI and made various other improvements. However, more work is still needed here before we get a good overall picture of how to scale the design to 100s or 1000s of agents simultaneously using the parser across the web.
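
The routing rule of the parser farm described in the list above (reuse the parser for a requested grammar and lexicon, otherwise start a new one) can be sketched as follows. This is a minimal illustration under stated assumptions, not the farm's actual code: the class names Parser and ParserFarm, the method names, and the file names in the usage note are all hypothetical, and the real farm manages separate parser processes and a security model that are not shown.

```python
import threading
from typing import Dict, Tuple


class Parser:
    """Stand-in for one running MBNLI parser instance (hypothetical)."""

    def __init__(self, grammar: str, lexicon: str) -> None:
        self.grammar = grammar
        self.lexicon = lexicon


class ParserFarm:
    """Keeps one parser per (grammar, lexicon) pair and creates them on demand."""

    def __init__(self) -> None:
        self._parsers: Dict[Tuple[str, str], Parser] = {}
        # Requests arrive on several threads, so the table is guarded by a lock.
        self._lock = threading.Lock()

    def route(self, grammar: str, lexicon: str) -> Parser:
        """Return the parser for this grammar/lexicon, starting one if none exists."""
        key = (grammar, lexicon)
        with self._lock:
            parser = self._parsers.get(key)
            if parser is None:
                parser = Parser(grammar, lexicon)
                self._parsers[key] = parser
            return parser

    def active_count(self) -> int:
        """Number of parsers currently managed by the farm."""
        with self._lock:
            return len(self._parsers)
```

For example, two calls to route("elephant.spc", "elephant.lex") would return the same parser object, while a call naming a different grammar would start a second one.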

MBNLI Interface Generator for DBMSs

A subproblem in making MBNLI widely useful is generating new MBNLI interfaces. If this requires specialized knowledge, it will slow down the process of scaling the technology for widespread use. The initial AgentGram prototype described above provides a simple way to do this for very simple tree-structured grammars represented in XML. This is simple enough for many web developers to use as is. The original MBNLI prototype provided a grammar parameterized with DBMS elements stored in a .spc file in Lisp syntax, but creating such files was tedious, error-prone, and required a Lisp background.

(defrel Elephant
     :key-attrs (Name Time )
     :default-attrs (Name Location Altitude Velocity AirTemp Humidity BodyTemp
                          BloodPressure Pulse BasalSkinResponse Time Herd )
     :menu-string ((:default "elephants")
                   (of "of elephants")))
(defattr Elephant Name
     :type STRING
     :menu-string ((:default "elephant's name")
                   (:short "name")
                   (:plural "names")
                   (:whose-is-default "where the elephant's name is")
                   (:whose-is-short "whose name is"))
     :op-prop ( :comparable :groupable)
     :trx " Elephant.Name"
     :expert "DBCHOICE RelName=Elephant AttrName=Name")
...
A Fragment of a Portable Spec in Lisp from the CoAX TIE

A first step was to develop an equivalent XML representation, as shown in the following example.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE grammar PUBLIC "-//OBJS//DTD NLI-PSEditor//EN" "file://C:/MBNLI/PSE/portableSpec.dtd">
<portableSpec >
   <relation   name="Elephant">
      <attrInfo >
         <keyAttrs >
            <attrName   name="Name"/>
            <attrName   name="Time"/>
         </keyAttrs>
         <defaultAttrs >
            <attrName   name="Name"/> <attrName   name="Location"/> <attrName   name="Altitude"/> <attrName   name="Velocity"/> <attrName   name="AirTemp"/>
            <attrName   name="Humidity"/> <attrName   name="BodyTemp"/> <attrName   name="BloodPressure"/> <attrName   name="Pulse"/>
            <attrName   name="BasalSkinResponse"/> <attrName   name="Time"/> <attrName   name="Herd"/>
         </defaultAttrs>
      </attrInfo>
      <menuStrings >
         <menuEntry   type="default" name="elephants"/>
         <menuEntry   type="of" name="of elephants"/>
      </menuStrings>
      <relAttrs >
         <relAttrChild   name="Name" type="STRING" txString=" Elephant.Name" expert="DBCHOICE RelName=Elephant AttrName=Name">
            <menuStrings >
               <menuEntry   type="default" name="elephant's name"/>
               <menuEntry   type="short" name="name"/>
               <menuEntry   type="plural" name="names"/>
               <menuEntry   type="whose-is-default" name="where the elephant's name is"/>
               <menuEntry   type="whose-is-short" name="whose name is"/>
            </menuStrings>
            <operator   name="comparable"/>
            <operator   name="groupable"/>
         </relAttrChild>
...
A Corresponding Fragment of a Portable Spec represented in XML
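
One practical benefit of the XML form is that standard XML tools can read it. The following is a minimal sketch, assuming only the element and attribute names visible in the fragment above; the embedded sample is a shortened copy of that fragment with closing tags supplied so that it is well-formed, the DOCTYPE line is omitted, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed sample based on the fragment above (closing tags added).
SPEC = """<portableSpec>
  <relation name="Elephant">
    <menuStrings>
      <menuEntry type="default" name="elephants"/>
      <menuEntry type="of" name="of elephants"/>
    </menuStrings>
    <relAttrs>
      <relAttrChild name="Name" type="STRING" txString=" Elephant.Name"
                    expert="DBCHOICE RelName=Elephant AttrName=Name">
        <menuStrings>
          <menuEntry type="default" name="elephant's name"/>
          <menuEntry type="short" name="name"/>
        </menuStrings>
        <operator name="comparable"/>
        <operator name="groupable"/>
      </relAttrChild>
    </relAttrs>
  </relation>
</portableSpec>"""


def menu_strings(element):
    """Map menuEntry type -> phrase for the element's own <menuStrings> child."""
    own = element.find("menuStrings")  # direct child only, not nested ones
    if own is None:
        return {}
    return {entry.get("type"): entry.get("name") for entry in own.findall("menuEntry")}


def summarize(spec_xml):
    """Collect each relation's menu phrases and per-attribute details."""
    root = ET.fromstring(spec_xml)
    summary = {}
    for relation in root.findall("relation"):
        attrs = {}
        for attr in relation.findall("relAttrs/relAttrChild"):
            attrs[attr.get("name")] = {
                "type": attr.get("type"),
                "menu": menu_strings(attr),
                "operators": [op.get("name") for op in attr.findall("operator")],
            }
        summary[relation.get("name")] = {"menu": menu_strings(relation), "attrs": attrs}
    return summary
```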

Then, to largely automate the process of quickly developing MBNLI interfaces to DBMSs, we developed the PSEditor GUI. The PSEditor enables MBNLI users to quickly create and edit specifications used by MBNLI to generate MBNLI interfaces to tables in a relational DBMS. The PSEditor uses the XML-based format for SQL-related portable specifications. It also accepts the original Lisp syntax. PSEditor is composed of about 60 Java classes and is about 131K (source code size). PSEditor GUI can be useful for desktop or Internet-based deployment of MBNLI. A screenshot of the PSEditor editing the NEO TIE tables is shown below.

The final step was to develop a utility to read RDBMS catalogs. Surprisingly, a search of many documents, knowledge bases, and newsgroup posts turned up nothing of use, and there does not appear to be a standard catalog export format or utility. After considerable experimentation, we completed an ODBC schema import and translation capability so that, in a portable way (so far tested with Oracle 8 and Microsoft Access), database schemas (tables, columns, primary keys, and joins) exported from a relational DBMS can be used to automatically define initial natural language interfaces for use with MBNLI. The capability has been integrated with the MBNLI Portable Specification Editor, which allows editing of the generated interface and translation to the MBNLI portable specification format (see .avi movie). This allows the rapid creation of new AgentGram interfaces by relatively naive users.
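
The schema-to-interface step can be sketched roughly as follows. This is an illustration under stated assumptions rather than the actual import utility: the schema is passed in as a plain dictionary standing in for what an ODBC catalog query would return, joins are not handled, the function name is hypothetical, and the default phrases (lowercased table name plus "s", lowercased column name) are naive placeholders of the kind a user would then refine in the editor. The element names follow the XML portable spec fragment shown earlier.

```python
import xml.etree.ElementTree as ET


def spec_from_schema(schema):
    """Build an initial portable spec tree from {table: {"primary_key": [...], "columns": {...}}}."""
    root = ET.Element("portableSpec")
    for table, info in schema.items():
        relation = ET.SubElement(root, "relation", name=table)

        # Primary key columns become the relation's key attributes.
        attr_info = ET.SubElement(relation, "attrInfo")
        keys = ET.SubElement(attr_info, "keyAttrs")
        for column in info["primary_key"]:
            ET.SubElement(keys, "attrName", name=column)

        # Naive default phrase for the relation (assumption: just add "s").
        noun = table.lower()
        menu = ET.SubElement(relation, "menuStrings")
        ET.SubElement(menu, "menuEntry", type="default", name=noun + "s")

        # One attribute entry per column, with placeholder phrases.
        rel_attrs = ET.SubElement(relation, "relAttrs")
        for column, column_type in info["columns"].items():
            child = ET.SubElement(
                rel_attrs, "relAttrChild",
                name=column, type=column_type, txString=f" {table}.{column}",
            )
            column_menu = ET.SubElement(child, "menuStrings")
            ET.SubElement(column_menu, "menuEntry", type="default",
                          name=f"{noun}'s {column.lower()}")
            ET.SubElement(column_menu, "menuEntry", type="short", name=column.lower())
    return root
```

Calling ET.tostring(spec_from_schema(schema), encoding="unicode") on the result yields XML text that could serve as a starting point for manual editing.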

Speech Interface

We also wanted to support a speech interface to MBNLI so speakers could simply read the menu choices. We reviewed W3C Voice Browser standards and then considered several commercial speech interfaces, including JavaSoft's Java Speech API, IBM's ViaVoice Technology, and IBM's Speech for Java (the only one of these products to support a Java API at the time). We fairly rapidly completed a rough proof-of-concept integration of MBNLI and IBM ViaVoice based on IBM's Speech for Java API, with grammar rules dynamically defined using Sun's proposed grammar standard, JSGF. This enables users to compose sentences using speech or via menu selection.
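
To show what dynamically defining grammar rules in JSGF might look like, here is a minimal sketch that turns the current menu choices into a one-rule JSGF grammar. The function name, the grammar name, and the rule name are assumptions for illustration; the sketch only produces the grammar text and does not call any speech engine. It assumes the phrases consist of plain words, since phrases with punctuation would need JSGF quoting.

```python
from typing import Sequence


def jsgf_for_menu(grammar_name: str, choices: Sequence[str]) -> str:
    """Return JSGF text whose single public rule accepts exactly one menu phrase."""
    if not choices:
        raise ValueError("a JSGF rule needs at least one alternative")
    # Parenthesize each phrase so multi-word phrases stay grouped as alternatives.
    alternatives = " | ".join(f"({phrase})" for phrase in choices)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <choice> = {alternatives};\n"
    )
```

Regenerating this grammar each time the menu changes keeps the recognizer restricted to the phrases the user could otherwise select with the mouse.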

Gridifying MBNLI

CoABS Grid. The CoABS grid is a JINI-based implementation of an agent interoperability platform developed by GITI, the DARPA CoABS program integration contractor. It is an important, ongoing experiment in agent system interoperability. As described elsewhere, we contributed architectural ideas to the grid. In addition, we developed three standalone agent components (eGents, WebTrader, and AgentGram) that can play a role as grid components or services. As part of the Agility AgentGram project, we developed the grid-relevant capabilities described below. At the same time, we note that AgentGram can also stand alone as a potentially pervasive capability that could be tied into any future grid implementation.

MBNLI Interface to Grid Log. For the CoABS Science Fair in November 1999, we demonstrated an interface to the CoABS grid log that allowed users to query the log files via MBNLI. This was done by first defining a database import facility to import the CoABS grid log files from XML into an Access relational DBMS, then developing an associated schema.

MBNLI Grid Agents. We developed the following Grid agents, demonstrated at the CoABS Boston meeting:

MBNLIGridAgentTester - this agent has a GUI and can register or deregister itself on the grid. Once registered, it asks for all AgentGram agents on the grid and lets the user choose one, then establishes an AgentGram session
MBNLIGridAgents - these are remote agents, one per AgentGram interface (e.g., one for the DAVCO DBMS, another for the Grid Log interface). These agents offer a programmatic interface for controlling a session, accepting messages to get parse state, translation, and results of an execution.
BrowserAgent - pops up a Netscape or IE browser on the user's machine to permit the user to query the selected MBNLIGridAgent and see query results.

Launch Page. In October 2000, we created a launch page capability for the 24x7 grid - the page describes MBNLI and allows users to launch MBNLI demos. In November 2000, we converted MBNLI to function on the then-latest version of the grid. MBNLI agents were maintained for a year on the 24x7 grid (see the grid archives available at that web site).

Technology Transition

We demonstrated evolving versions of MBNLI at all CoABS PI Workshops and the CoABS Science Fair.

We applied MBNLI in the following DARPA CoABS Technology Integration Experiments (TIEs):

NEO TIE

The Non-combatant Evacuation Operation (NEO) TIE involved an urban rescue effort and served to organize many CoABS program activities in the first year of the CoABS program. The thrust was on agent interoperability and rapid assembly of heterogeneous agent systems to solve problems. Lessons learned and components from this exercise were later incorporated into the CoABS grid.

Paul Pazandak (OBJS) participated in a NEO TIE organizational meeting held at ISX in Agoura Hills (Los Angeles area) on 28-29 September 1998. Based on the meeting and subsequent discussions, we determined that MBNLI should play a role, and worked with Steve Minton (ISI Ariadne, TIE #2 coordinator) and Adam Cheyer (SRI Open Agent Architecture/Multimodal Map, aka MMM) to develop a vignette (TIE #2) involving Find Civilians, Get Them to Embassy. In this vignette, OBJS MBNLI was used to query relationally formatted data. OAA/MMM was used to provide a speech interface and as a general controller. USC/ISI Ariadne was used to extract data on civilian locations from various web resources into a relational format. Minton supplied a relational schema that we used to parameterize MBNLI to define a restricted language interface to the Ariadne data. The interactions between Ariadne, MMM, and MBNLI (as well as OBJS WebTrader) are shown in the figure below.

The TIE required the following extensions to MBNLI:

OAA-compatible MBNLI wrapper agent
limited class inheritance capability (IS-A) and improvements to MBNLI grammar to project all join attributes for TIE joins, e.g. making "List the people and their addresses" equivalent to "List the people who have addresses -format including name, phone, address, latitude, longitude"

The following figure shows the NEO TIE MBNLI interface:

CoAX TIE

In 2000 and 2001, we participated in the CoAX TIE, aimed at demonstrating CoABS technology in a coalition scenario. Our work on AgentGram was featured in the Laki Safari Park Vignette described below. In this vignette, OBJS eGents (agents that communicate using email) send biosurveillance reports (e.g., location of elephants threatened by a planned UN firestorm in Safari Park Binni Wildlife vignette) to a DBMS. AgentGram was used to find the location of elephants near the planned UN firestorm. This was demoed at the CoABS Workshops in Miami and Nashua. See CoAX TIE avi (.exe includes TechSmith TSCC Codec and viewer - 4.2MB).

This research is sponsored by the Defense Advanced Research Projects Agency and managed by the U.S. Air Force Research Laboratory under contract F30602-98-C-0159. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, U.S. Air Force Research Laboratory, or the United States Government.

© Copyright 1998, 1999, 2000, 2001 Object Services and Consulting, Inc. All rights reserved. Permission is granted to copy this document provided this copyright statement is retained in all copies. Disclaimer: OBJS does not warrant the accuracy or completeness of the information in this survey.

Last revised: June 2002. Send comments to Craig Thompson.

Acknowledgements: Paul Pazandak did most of the design and implementation with Craig Thompson providing brainstorming and review. Steve Ford installed and demoed MBNLI at CoABS Workshops that Pazandak did not attend. Thompson completed the CoAX TIE demo.