AgentGram: Natural Language Interface for Agents
Project Summary
Paul Pazandak
and Craig Thompson
Object Services and Consulting, Inc.
17 June 2002
Executive Summary
Objective
The objective of the AgentGram project was to develop
a modular menu-based natural language interface (MBNLI) component
that can be used in client-server web environments as a front-end to agents
and other Internet resources (e.g., data sources). From the point
of view of Agility's goal to demonstrate agent capabilities that scale
to mass markets, the impact we were aiming at was to develop a technology
that enables humans, anywhere on the semantic web, to task and query remote
agents and Internet resources using complex but understandable commands
in constrained natural language.
Specific scientific and engineering subgoals were:
-
enable MBNLI on any webpage as a way to communicate
with remote web resources, e.g., agents, databases, ...
-
semi-automate generation of MBNLI interfaces
-
prototype a companion speech interface to MBNLI
-
develop MBNLI as a component that can interface to
other components and can connect to the CoABS grid
-
demonstrate MBNLI in scenarios of interest to DoD
Our primary thesis is that MBNLI can operate on the desktop or the web to provide
naive users with natural language interfaces they can actually use.
Thus, MBNLI is a potentially significant step towards the development of a
more semantic web and towards "scaling agent systems to the masses."
Background
Menu-based natural language interface technology (MBNLI) combines
constrained grammars, a predictive parser, and interface technology to
provide users with a guided natural language query and command capability.
As explained in the MBNLI Overview, this approach bypasses the frustrating
habitability problems of other NL interface technologies, in which users
undershoot or overshoot the NL system's capabilities.
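The guided-completion idea can be illustrated with a minimal sketch. It is not the MBNLI parser: the toy grammar, the phrase strings, and the function names (expand, advance, next_choices, is_complete) are all assumptions made for illustration. It shows only how a predictive, top-down parser can offer exactly the phrases that keep a sentence inside the grammar, so the user can neither overshoot nor undershoot it.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy grammar. Keys are nonterminals; any other symbol is a
# menu phrase shown to the user. Not taken from the MBNLI toolkit.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["find", "NP"], ["count", "NP"]],
    "NP": [["elephants"], ["elephants", "MOD"]],
    "MOD": [["whose name is", "VALUE"], ["whose herd is", "VALUE"]],
    "VALUE": [["<name>"]],
}

Stack = Tuple[str, ...]  # symbols still to be matched, leftmost first


def expand(stack: Stack) -> Set[Stack]:
    """Rewrite leading nonterminals until each stack is empty or starts
    with a menu phrase. Assumes the grammar is not left-recursive."""
    if not stack or stack[0] not in GRAMMAR:
        return {stack}
    result: Set[Stack] = set()
    for production in GRAMMAR[stack[0]]:
        result |= expand(tuple(production) + stack[1:])
    return result


def start() -> Set[Stack]:
    """Parse states before the user has selected anything."""
    return expand(("S",))


def next_choices(states: Set[Stack]) -> List[str]:
    """Phrases the menu may legally offer next."""
    return sorted({stack[0] for stack in states if stack})


def advance(states: Set[Stack], phrase: str) -> Set[Stack]:
    """Consume one selected phrase and return the new parse states."""
    next_states: Set[Stack] = set()
    for stack in states:
        if stack and stack[0] == phrase:
            next_states |= expand(stack[1:])
    return next_states


def is_complete(states: Set[Stack]) -> bool:
    """True when the phrases chosen so far already form a full sentence."""
    return any(len(stack) == 0 for stack in states)
```

Starting from start(), a front-end would repeatedly display next_choices(states) and call advance(states, selection) until is_complete(states) holds and the user chooses to execute the sentence.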
Technical Accomplishments
Under the DARPA CoABS contract, we extended MBNLI in the following ways:
Initial AgentGram Prototype
An initial AgentGram prototype, developed in 1999,
was based on the notion of distributed agents which contain grammars that
can be dynamically composed. Users were able to construct (using
cascading menus) complex sentences (commands/queries) which simultaneously
involve the grammars of several agents. The grammars were dynamically loaded
from web-based agents on demand. This first implementation focused
on dynamically constructing restricted English phrases from the partial
grammars of multiple distributed agents simultaneously. The result is a
readable sentence which represents a complex executable command.
See screenshots of this AgentGram prototype. This implementation was somewhat
simplistic, using tree grammars represented in XML. Later implementations
extended the MBNLI toolkit, which permits attributed context-free grammars.
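As a rough illustration of composing tree grammars from several agents, the sketch below splices one agent's XML grammar into another's and enumerates the resulting sentences. The XML vocabulary (grammar, item, use), the agent names, and the in-memory AGENT_GRAMMARS table are hypothetical; the prototype's actual XML format is not shown in this summary, and a real system would fetch each grammar from its web-based agent on demand rather than from a dictionary.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List

# Hypothetical per-agent tree grammars. A <use agent="..."/> element marks
# the point where another agent's grammar is spliced in.
AGENT_GRAMMARS = {
    "scheduler": """
        <grammar>
          <item text="every hour"><use agent="sensor"/></item>
          <item text="once"><use agent="sensor"/></item>
        </grammar>""",
    "sensor": """
        <grammar>
          <item text="report temperature"/>
          <item text="report humidity"/>
        </grammar>""",
}


def load(agent: str) -> ET.Element:
    """Stand-in for fetching an agent's grammar over the web on demand."""
    return ET.fromstring(AGENT_GRAMMARS[agent])


def sentences(node: ET.Element) -> Iterator[List[str]]:
    """Yield every phrase sequence reachable beneath this node."""
    children = list(node)
    if not children:
        yield []  # leaf: nothing more to add to the sentence
        return
    for child in children:
        if child.tag == "use":
            # Splice in the referenced agent's grammar at this point.
            yield from sentences(load(child.get("agent")))
        else:
            for rest in sentences(child):
                yield [child.get("text")] + rest


def compose(agent: str) -> List[str]:
    """All readable sentences starting from one agent's grammar."""
    return [" ".join(phrases) for phrases in sentences(load(agent))]
```

Here compose("scheduler") yields sentences such as "every hour report temperature", each drawing on the grammars of two agents. A cascading-menu front-end would walk the same tree one level at a time instead of enumerating it.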
Web-ready MBNLI
This task was the heart of the AgentGram project. The objective was
to enable humans anywhere on the semantic web to task and query remote
agents and Internet resources using complex but understandable commands
in constrained natural language. The interfaces appear as annotations
on web pages. The system should scale to any number of users, grammars,
webpages, and target resources. The system should be deployable with
no effort by the user (no explicit downloading action). This is a
step in making agent technology pervasive. Making MBNLI web-ready
required re-engineering several parts of the original system:
-
We re-implemented the front-end user interface. We added the ability
to support alternative interaction paradigms, including cascaded menus (reimplementing
the Java Swing menu placement algorithm for cascading menus) and phrase
buttons (an alternative interaction paradigm that minimizes screen real estate).
-
We worked on making MBNLI interfaces thin (little download and no install
overhead, so no barrier to use). We considered several approaches:
refreshing whole pages, applets, and downloading the entire parser.
The first approach appears awkward. The last is acceptable for demos and
for users who want full service, but not for establishing widespread adoption
by end users.
-
We built a prototype applet that handles menu selection while the parser
runs remotely. Initially we used two-way RMI but found that it required
signed applets (certificates) and violated browser security restrictions. We
looked into executing the RMI version of the applet in IE (which required
downloading and installing IE5.0), but the applet would not run and IE's
console window provided only a cryptic message. We redesigned the
applet to eliminate two-way RMI. We downloaded and installed TinyWebServer,
a 48k HTTP web server, so that we could test the MBNLI applet. We completed
an initial working applet with support for experts. Different portable specs
are accessed via different applets, which are parameterized with information
about the portable spec.
-
We later implemented a stateless C-based CGI front-end to the parser
that generates HTML/JavaScript and uses no Java at all. It provides an interface
much like the applet version and supports experts, but it is smaller
and faster.
-
We designed and implemented a web-based multi-threaded MBNLI parser farm
enabling parsers to be instantiated on the fly. The parser farm manages
the set of active parsers and routes user requests to the appropriate parser
based on the grammar and lexicon requested. If such a parser doesn't
exist, then a new parser is started. A basic security model was also
implemented.
-
We developed a grammar-on-the-fly capability for MBNLI. This allows the
user to select and change between grammars after MBNLI is up and running,
rather than requiring this information at start-up. We added APIs to the
parser and modified the NLI UI by adding menus and file dialogs to permit
selection of portable specs.
-
We developed a browser-based XML-driven dynamic interface for MBNLI.
Internet Explorer allows page updates without refresh by supporting data-driven
components. This approach enables MBNLI to use a single static interface
within a browser as opposed to refreshing for every change (involving the
server).
-
We reimplemented the experts API, adding new associated classes, and creating
several new experts (code and UIs to facilitate and constrain user input)
which can be invoked via definitions in the portable spec.
-
We extended the web version of MBNLI to support remote query execution
and local display of results, using PHP to handle CGI <-> ODBC.
-
We modified MBNLI to work on Win95 and NT machines with Winsock2 installed.
This involved converting the parser/lincoo from Unix/C++ to Win95/NT MSVC,
eliminating dependence on cygwin DLLs, and developing code to support Win32
sockets to allow them to be treated exactly like files. This reduced
the backend codebase from 900k to about 338k.
-
We benchmarked MBNLI and made various other improvements. However,
more work is needed before we have a good overall picture of
how to scale the design to hundreds or thousands of agents simultaneously using
the parser across the web.
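The routing behavior of the parser farm described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the Parser and ParserFarm classes, their method names, and the echo-style handle() result are all hypothetical, and the security model is omitted. The one behavior taken from the text is that requests are routed by the (grammar, lexicon) pair and that a new parser is started when no matching one exists.

```python
import threading
from typing import Dict, Tuple


class Parser:
    """Hypothetical stand-in for one MBNLI parser instance."""

    def __init__(self, grammar: str, lexicon: str) -> None:
        self.grammar = grammar
        self.lexicon = lexicon

    def handle(self, request: str) -> str:
        # A real parser would advance its parse state here; this stub
        # only reports which parser received the request.
        return f"[{self.grammar}/{self.lexicon}] {request}"


class ParserFarm:
    """Keeps one active parser per (grammar, lexicon) pair."""

    def __init__(self) -> None:
        self._parsers: Dict[Tuple[str, str], Parser] = {}
        self._lock = threading.Lock()  # requests may arrive on many threads

    def route(self, grammar: str, lexicon: str, request: str) -> str:
        key = (grammar, lexicon)
        with self._lock:
            parser = self._parsers.get(key)
            if parser is None:
                # No such parser exists yet, so start one on the fly.
                parser = Parser(grammar, lexicon)
                self._parsers[key] = parser
        return parser.handle(request)

    def active(self) -> int:
        """Number of parsers currently managed by the farm."""
        with self._lock:
            return len(self._parsers)
```

Holding the lock only around the lookup keeps parser creation race-free while letting different parsers serve requests concurrently.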
MBNLI Interface Generator for DBMSs
A subproblem in making MBNLI widely useful is generating
new MBNLI interfaces. If this requires specialized knowledge, it
will slow down the process of scaling the technology for widespread use.
The initial AgentGram prototype described above provides a simple way to
do this for very simple tree structured grammars represented in XML.
This is simple enough for many web developers to use as is. The original
MBNLI prototype provided a grammar parameterized with DBMS elements stored
in a .spc file in Lisp syntax, but creating such files was tedious and
error-prone, and required a Lisp background.
(defrel Elephant
  :key-attrs
    (Name Time)
  :default-attrs
    (Name Location Altitude Velocity AirTemp Humidity BodyTemp
     BloodPressure Pulse BasalSkinResponse Time Herd)
  :menu-string
    ((:default "elephants")
     (of "of elephants")))

(defattr Elephant Name
  :type
    STRING
  :menu-string
    ((:default "elephant's name")
     (:short "name")
     (:plural "names")
     (:whose-is-default "where the elephant's name is")
     (:whose-is-short "whose name is"))
  :op-prop
    (:comparable :groupable)
  :trx " Elephant.Name"
  :expert
    "DBCHOICE RelName=Elephant AttrName=Name")
...
A Fragment of a Portable Spec in Lisp from the
CoAX TIE
A first step was to develop an equivalent XML representation, as shown
in the following example.
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE grammar PUBLIC "-//OBJS//DTD NLI-PSEditor//EN" "file://C:/MBNLI/PSE/portableSpec.dtd">
<portableSpec>
  <relation name="Elephant">
    <attrInfo>
      <keyAttrs>
        <attrName name="Name"/>
        <attrName name="Time"/>
      </keyAttrs>
      <defaultAttrs>
        <attrName name="Name"/>
        <attrName name="Location"/>
        <attrName name="Altitude"/>
        <attrName name="Velocity"/>
        <attrName name="AirTemp"/>
        <attrName name="Humidity"/>
        <attrName name="BodyTemp"/>
        <attrName name="BloodPressure"/>
        <attrName name="Pulse"/>
        <attrName name="BasalSkinResponse"/>
        <attrName name="Time"/>
        <attrName name="Herd"/>
      </defaultAttrs>
    </attrInfo>
    <menuStrings>
      <menuEntry type="default" name="elephants"/>
      <menuEntry type="of" name="of elephants"/>
    </menuStrings>
    <relAttrs>
      <relAttrChild name="Name" type="STRING" txString=" Elephant.Name" expert="DBCHOICE RelName=Elephant AttrName=Name">
        <menuStrings>
          <menuEntry type="default" name="elephant's name"/>
          <menuEntry type="short" name="name"/>
          <menuEntry type="plural" name="names"/>
          <menuEntry type="whose-is-default" name="where the elephant's name is"/>
          <menuEntry type="whose-is-short" name="whose name is"/>
        </menuStrings>
        <operator name="comparable"/>
        <operator name="groupable"/>
      </relAttrChild>
...
A Corresponding Fragment of a Portable Spec represented
in XML
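One practical benefit of the XML form is that any standard XML parser can read a portable spec. The sketch below is only an illustration: it embeds a shortened, closed copy of the fragment above (the DOCTYPE is dropped and the closing tags are added so that the string is well-formed), and the summarize and menu_strings helpers and the shape of the returned dictionary are assumptions, not part of MBNLI or the PSEditor.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Shortened, well-formed copy of the portable spec fragment shown above.
SPEC = """<portableSpec>
  <relation name="Elephant">
    <attrInfo>
      <keyAttrs>
        <attrName name="Name"/>
        <attrName name="Time"/>
      </keyAttrs>
    </attrInfo>
    <menuStrings>
      <menuEntry type="default" name="elephants"/>
      <menuEntry type="of" name="of elephants"/>
    </menuStrings>
    <relAttrs>
      <relAttrChild name="Name" type="STRING">
        <menuStrings>
          <menuEntry type="default" name="elephant's name"/>
          <menuEntry type="short" name="name"/>
        </menuStrings>
        <operator name="comparable"/>
        <operator name="groupable"/>
      </relAttrChild>
    </relAttrs>
  </relation>
</portableSpec>"""


def menu_strings(element: ET.Element) -> Dict[str, str]:
    """Map menuEntry type to display phrase for one relation or attribute."""
    block = element.find("menuStrings")  # direct child only
    if block is None:
        return {}
    return {e.get("type"): e.get("name") for e in block.findall("menuEntry")}


def summarize(spec_xml: str) -> dict:
    """Collect keys, menu phrases, and operators for every relation."""
    root = ET.fromstring(spec_xml)
    summary = {}
    for rel in root.findall("relation"):
        summary[rel.get("name")] = {
            "keys": [a.get("name")
                     for a in rel.findall("attrInfo/keyAttrs/attrName")],
            "menu": menu_strings(rel),
            "attrs": {
                a.get("name"): {
                    "type": a.get("type"),
                    "menu": menu_strings(a),
                    "operators": [o.get("name")
                                  for o in a.findall("operator")],
                }
                for a in rel.findall("relAttrs/relAttrChild")
            },
        }
    return summary
```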
Then, to largely automate the process of quickly
developing MBNLI interfaces to DBMSs, we developed the PSEditor GUI.
The PSEditor enables MBNLI users to quickly create and edit specifications
used by MBNLI to generate MBNLI interfaces to tables in a relational DBMS.
The PSEditor uses the XML-based format for SQL-related portable specifications.
It also accepts the original Lisp syntax. PSEditor is composed of about
60 Java classes, about 131K of source code. The PSEditor
GUI can be useful for desktop or Internet-based deployment of MBNLI.
A screenshot of the PSEditor editing the NEO TIE tables is shown below.
The final step was to develop a
utility to read RDBMS catalogs. Surprisingly, a search of documentation,
knowledge bases, and newsgroup posts turned up nothing of use, and there does not
appear to be a standard catalog export format or utility. After considerable
experimentation, we completed an ODBC schema import and translation capability
so that, in a portable way (so far tested with Oracle 8 and Microsoft Access),
database schemas (tables, columns, primary keys, and joins) exported from
a relational DBMS can be used to automatically define initial natural language
interfaces for use with MBNLI. The capability has been integrated with
the MBNLI Portable Specification Editor, which allows editing of the generated
interface and translation to the MBNLI portable specification format (see
.avi movie). This allows the rapid creation of new AgentGram interfaces
by relatively naive users.
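The schema-to-spec step can be sketched roughly as follows. To stay self-contained, the sketch swaps in Python's built-in sqlite3 catalog queries in place of ODBC catalog calls, so it is not the import utility described above. The function names, the type mapping, and the naive pluralized menu strings are placeholders of the kind a user would then refine in the PSEditor; joins are not covered.

```python
import sqlite3
import xml.etree.ElementTree as ET


def spec_type(sql_type: str) -> str:
    """Hypothetical mapping from a declared SQL type to a spec type name."""
    upper = sql_type.upper()
    return "STRING" if ("CHAR" in upper or "TEXT" in upper) else (upper or "STRING")


def initial_spec(conn: sqlite3.Connection) -> ET.Element:
    """Build a first-draft portable spec from the database catalog."""
    root = ET.Element("portableSpec")
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    for table in tables:
        relation = ET.SubElement(root, "relation", name=table)
        attr_info = ET.SubElement(relation, "attrInfo")
        key_attrs = ET.SubElement(attr_info, "keyAttrs")
        menu = ET.SubElement(relation, "menuStrings")
        # Placeholder phrase: naive plural of the table name.
        ET.SubElement(menu, "menuEntry", type="default",
                      name=table.lower() + "s")
        rel_attrs = ET.SubElement(relation, "relAttrs")
        # PRAGMA table_info rows: (cid, name, type, notnull, default, pk)
        for _cid, column, col_type, _notnull, _default, pk in conn.execute(
                f'PRAGMA table_info("{table}")'):
            if pk:  # nonzero for columns in the primary key
                ET.SubElement(key_attrs, "attrName", name=column)
            child = ET.SubElement(rel_attrs, "relAttrChild", name=column,
                                  type=spec_type(col_type),
                                  txString=f"{table}.{column}")
            child_menu = ET.SubElement(child, "menuStrings")
            ET.SubElement(child_menu, "menuEntry", type="default",
                          name=f"{table.lower()}'s {column.lower()}")
    return root


def demo_connection() -> sqlite3.Connection:
    """In-memory database with one table loosely modeled on the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Elephant (Name TEXT, Time TEXT, Herd TEXT, "
                 "PRIMARY KEY (Name, Time))")
    return conn
```

Calling ET.tostring(initial_spec(demo_connection()), encoding="unicode") produces a draft in the same element vocabulary as the XML fragment shown earlier.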
Speech Interface
We also wanted to support a speech interface to MBNLI
so speakers could simply read the menu choices aloud. We reviewed the W3C
Voice Browser standards and then considered several commercial
speech interfaces, including JavaSoft's Java
Speech API, IBM's ViaVoice Technology, and IBM's Speech
for Java (the only one of these products to support a Java API
at the time). We fairly rapidly completed a rough proof-of-concept
integration of MBNLI and IBM ViaVoice based on IBM's Speech for
Java API, with grammar rules dynamically defined using Sun's proposed grammar
standard, JSGF. This enables users to compose sentences using speech
or menu selection.
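Because the menu already lists every phrase that is legal at a given point, a speech grammar for that point can be generated mechanically. The sketch below is an assumption about how such rules might be produced, not the proof-of-concept code: the function name, the grammar name "menu", and the rule name <choice> are hypothetical. It emits JSGF text with one public rule whose alternatives are the current menu phrases.

```python
import re
from typing import List

# Characters with special meaning in JSGF rule expansions. Phrases that
# contain them are rejected here rather than quoted, to keep the sketch small.
_SPECIAL = re.compile(r'[;=|*+<>()\[\]{}"/\\]')


def jsgf_for_choices(grammar_name: str, choices: List[str]) -> str:
    """Return JSGF text accepting exactly one of the given menu phrases."""
    if not choices:
        raise ValueError("at least one menu phrase is required")
    for phrase in choices:
        if _SPECIAL.search(phrase):
            raise ValueError(f"phrase needs escaping: {phrase!r}")
    alternatives = " | ".join(choices)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <choice> = {alternatives};\n"
    )
```

A front-end could regenerate this grammar after each selection, so the recognizer only ever listens for the phrases currently shown on the menu.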
Gridifying MBNLI
CoABS Grid. The CoABS
grid is a Jini-based implementation of an agent interoperability platform
developed by GITI, the DARPA CoABS program integration contractor.
It is an important, ongoing experiment in agent system interoperability.
As described elsewhere, we contributed architectural ideas to the grid.
In addition, we developed three standalone agent components (eGents,
WebTrader, and AgentGram) that can play a role as grid components or services.
As part of the Agility AgentGram project, we developed the grid-relevant
capabilities described below. At the same time, we note that AgentGram
can also stand alone as a potentially pervasive capability that could be
tied into any future grid implementation.
-
MBNLI Interface to Grid Log. For
the CoABS Science Fair in November 1999, we demonstrated an interface to
the CoABS grid log that allowed users to query the log files via MBNLI.
This was done by first defining a database import facility to import the
CoABS grid log files from XML into an Access relational DBMS, then developing
an associated schema.
-
MBNLI Grid Agents. We
developed the following Grid agents, demonstrated at the CoABS Boston meeting:
-
MBNLIGridAgentTester - this agent has a GUI
and can register or deregister itself on the grid. Once registered,
it asks for all AgentGram agents on the grid and lets the user choose one,
then establishes an AgentGram session
-
MBNLIGridAgents - these are remote agents,
one per AgentGram interface (e.g., one for the DAVCO DBMS, another for
the Grid Log interface). These agents offer a programmatic interface
for controlling a session, accepting messages to get parse state, translation,
and results of an execution.
-
BrowserAgent - pops up a Netscape or IE browser
on the user's machine to permit the user to query the selected MBNLIGridAgent
and see query results.
-
Launch Page. In October 2000, we
created a launch page capability
for the 7x24 grid. The page describes MBNLI and allows users to launch
MBNLI demos. In November 2000, we converted MBNLI to
function on the then-latest version of the grid. MBNLI agents were
maintained for a year on the 24x7
Grid (see the grid archives available at that web site).
Technology Transition
We demonstrated evolving versions of MBNLI at all CoABS PI Workshops and
the CoABS Science Fair.
We applied MBNLI in the following DARPA CoABS Technology Integration
Experiments (TIEs):
NEO TIE
The Non-combatant Evacuation Operation (NEO) TIE
involved an urban rescue effort and served to organize many CoABS program
activities in the first year of the CoABS program. The thrust was
on agent interoperability and rapid assembly of heterogeneous agent systems
to solve problems. Lessons learned and components from this exercise
were later incorporated into the CoABS grid.
Paul Pazandak (OBJS) participated in a NEO TIE
organizational meeting held at ISX in Agoura Hills (Los Angeles area) on
28-29 September 1998. Based on the meeting and subsequent discussions,
we determined that MBNLI should play a role, and worked with Steve
Minton (ISI Ariadne, TIE#2 coordinator) and Adam
Cheyer (SRI Open Agent Architecture/Multimodal Map aka MMM) to develop
a vignette (TIE #2) involving Find
Civilians, Get Them to Embassy. In this vignette, OBJS MBNLI
was used to query relationally formatted data. OAA/MMM was used to
provide a speech interface and as a general controller. USC/ISI Ariadne
was used to extract data on civilian locations from various web resources
into a relational format. Minton supplied a relational schema that
we used to parameterize MBNLI to define a restricted language interface
to the Ariadne data. The interactions between Ariadne, MMM, and MBNLI
(as well as OBJS WebTrader) are shown in the figure below.
The TIE required the following extensions to MBNLI:
-
OAA-compatible MBNLI wrapper agent
-
limited class inheritance capability (IS-A) and improvements
to MBNLI grammar to project all join attributes for TIE joins,
e.g. making "List the people and their addresses" equivalent to "List the
people who have addresses -format including name, phone, address, latitude,
longitude"
The following figure shows the NEO TIE MBNLI interface:
CoAX TIE
In 2000 and 2001, we participated in the CoAX
TIE, aimed at demonstrating CoABS technology in a coalition
scenario. Our work on AgentGram was featured in the Laki Safari Park
Vignette described below. In this vignette, OBJS eGents
(agents that communicate using email) send biosurveillance reports (e.g.,
the location of elephants threatened by a planned UN firestorm in the Safari Park
Binni Wildlife vignette) to a DBMS. AgentGram was used to find the location
of elephants near the planned UN firestorm. This was demoed at the CoABS
Workshops in Miami and Nashua. See CoAX
TIE avi (.exe includes TechSmith TSCC Codec and viewer - 4.2MB).
This research is sponsored by the Defense Advanced Research
Projects Agency and managed by the U.S. Air Force Research Laboratory under
contract F30602-98-C-0159. The views and conclusions contained in this
document are those of the authors and should not be interpreted as representing
the official policies, either expressed or implied, of the Defense Advanced
Research Projects Agency, U.S. Air Force Research Laboratory, or the United
States Government.
© Copyright 1998, 1999, 2000, 2001 Object Services
and Consulting, Inc. All rights reserved. Permission is granted
to copy this document provided this copyright statement is retained in
all copies. Disclaimer: OBJS does not warrant the accuracy or completeness
of the information in this survey.
Last revised: June 2002. Send comments to Craig Thompson.
Acknowledgements: Paul Pazandak did most of the
design and implementation with Craig Thompson providing brainstorming and
review. Steve Ford installed and demoed MBNLI at CoABS Workshops
that Pazandak did not attend. Thompson completed the CoAX TIE demo.