Lessons Learned from Programming Languages will benefit Component-Based
Architectures
Andrew Tolmach, Dick Kieburtz, Tim Sheard
Pacific Software Research Center
Oregon Graduate Institute and Portland State University
{apt,dick,sheard}@cse.ogi.edu
Introduction
The programming language community has learned lessons about modular systems
and protocols that can help develop more robust and efficient component-based
architectures. This position paper suggests how.
Strongly Typed, Functional Glue Languages
The language and type discipline under which components are "glued together"
has a substantial impact on the character of the overall system. Most component-based
systems assume that components are implemented as stateful objects. But
global state is a notorious enemy of modularization, and distributed state
is difficult to maintain in the presence of failures. Purely functional
(i.e., side-effect-free) approaches are much better for building modular
distributed systems. The functional paradigm does not permit hidden state-based
interaction between components; all coupling between components must be
explicit. This restriction serves to discourage coupling and thus increase
modularity.
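As a minimal sketch of what explicit coupling looks like, consider two components that share no hidden state. The names Account, deposit, and audit are hypothetical and chosen only for illustration; Python is used here as neutral notation, although the argument above concerns functional languages generally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    """Immutable state value (hypothetical); nothing is stored inside a component."""
    balance: int


def deposit(account: Account, amount: int) -> Account:
    """Pure function: the new state is returned rather than written in place."""
    return Account(account.balance + amount)


def audit(account: Account) -> str:
    """A second component; it sees only the state that is passed to it."""
    return f"balance={account.balance}"


before = Account(100)
after = deposit(before, 50)   # 'before' is left unchanged
report = audit(after)         # the only coupling is the value handed over
```

Every dependency between deposit and audit is visible in the calling code, which is the property the functional discipline enforces.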
Languages supporting higher-order functional types, such as Standard
ML, Haskell, or Scheme, are particularly useful. At present, component
interfaces typically specify simple argument and return types for inter-component
communication, but leave more complex interactions between components unspecified.
For example, simple type systems lack a way to express the idea that a
particular interface function may "call back" the client component. This
deficiency leads to awkward notions like "outerface." In the higher-order
functional paradigm, the client simply specifies the call-back function
as a higher-order argument.
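The following sketch shows a call-back expressed as a higher-order argument, so that the interface type itself records that the component will call the client. The function fold_records and the client function longest are hypothetical names introduced for illustration, and Python type hints stand in for the static types of a language such as Standard ML or Haskell.

```python
from typing import Callable, Iterable, TypeVar

A = TypeVar("A")


def fold_records(
    records: Iterable[str],
    step: Callable[[A, str], A],   # the call-back is part of the interface type
    initial: A,
) -> A:
    """Component side: visit each record and hand it to the client's function."""
    acc = initial
    for record in records:
        acc = step(acc, record)
    return acc


def longest(current: str, record: str) -> str:
    """Client side: a pure call-back that keeps the longest record seen so far."""
    return record if len(record) > len(current) else current


result = fold_records(["ab", "abcd", "abc"], longest, "")
```

No separate "outerface" description is needed: the parameter type of step states exactly what the component expects to call and what it expects back.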
Specifying Physical Protocols
Ideally, physical protocols for component communication should be pitched
at the highest possible level, i.e., specialized to express just the information
that must be transferred. This maximizes encoding efficiency and helps
guarantee that only correct and meaningful communications are attempted.
In practice, however, successful component-based systems have generally
adopted a "lowest common denominator" format for physical data transfer.
For example, systems as diverse as Unix pipes, CGI, ActiveX Automation,
and KQML rely on character strings as their basic communication medium.
Using strings vastly lowers the technical barrier for integrating a component
implemented in a new technology, and makes it much easier to debug systems
by snooping on inter-component communications.
String-based communication is only superficially unstructured; the communicated
strings must be in some format that sender and receiver can both understand.
But adherence to this structure is typically enforced by ad-hoc mechanisms
which are prone to error. Viewing communicated strings as sentences
in a formally-defined language, with a grammar and an independently-specified
semantics, reduces protocol errors and greatly enhances the clarity and
security of the system. Parsers, semantic checkers, and
pretty printers for the protocol language can be generated automatically
and incorporated into the communicating components; application-level code
in these components can deal directly with structured data without having to interpret
or generate strings. The protocol language specification can also
double as a partial behavior specification for the components that use
it. Applications can be debugged more easily by inserting language checkers
in between communicating components to verify that protocols are being
obeyed.
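To make the idea concrete, here is a small sketch for a made-up two-message protocol whose grammar is message ::= "GET" key | "PUT" key value, with key matching [a-z]+ and value matching [0-9]+. The grammar, the names Get, Put, parse, render, and checked are all assumptions for illustration, and the parser and printer are written by hand here, whereas the text above envisages generating them from the specification.

```python
import re
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Get:
    key: str


@dataclass(frozen=True)
class Put:
    key: str
    value: int


Message = Union[Get, Put]


class ProtocolError(ValueError):
    """Raised when a string is not a sentence of the protocol language."""


_GET = re.compile(r"GET ([a-z]+)")
_PUT = re.compile(r"PUT ([a-z]+) ([0-9]+)")


def parse(text: str) -> Message:
    """Parser: string -> structured message, or ProtocolError."""
    match = _GET.fullmatch(text)
    if match:
        return Get(match.group(1))
    match = _PUT.fullmatch(text)
    if match:
        return Put(match.group(1), int(match.group(2)))
    raise ProtocolError(f"not a sentence of the protocol: {text!r}")


def render(message: Message) -> str:
    """Pretty printer: structured message -> string."""
    if isinstance(message, Get):
        return f"GET {message.key}"
    return f"PUT {message.key} {message.value}"


def checked(send: Callable[[str], str]) -> Callable[[str], str]:
    """Language checker placed between two components for debugging."""
    def wrapper(text: str) -> str:
        parse(text)          # rejects malformed requests before they are sent
        return send(text)
    return wrapper
```

Application code would work with Get and Put values only, and a channel wrapped with checked refuses any string that falls outside the grammar.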
Improving System Performance
Modular systems naturally behave well with respect to some of the "ilities,"
such as maintainability and manageability, but poorly with respect to others,
such as responsiveness, footprint size, and runtime performance. With few
exceptions, modular architectures trade away performance in return for
increased ease in construction and maintenance, and this should be more
frankly acknowledged by proponents of component-based systems. Still, improved
performance may be essential for some applications.
There has been considerable recent progress in whole-program optimization
of modular code based on the idea of specializing modules for use
at particular client sites. Unfortunately, this approach requires that
the module's code be available, and that module and client be written in
the same language (or at least be compilable to the same intermediate language).
In component-based systems not all of the code is available, so a different
approach is required. One promising possibility is to provide enhanced
specifications of components which indicate possibilities for optimized
use. Client code uses the simplest, cleanest specification, but the client
compiler has access to a lower-level, though still abstract, specification,
together with rules that indicate how to optimize high-level calls into
lower-level ones depending on context. For example, a database server component
might offer both one-off and (cheaper) incremental query facilities; the
compiler for a client making a series of related queries could optimize
client code to use the incremental facilities. Appropriate module specification
languages for this task are still a research question, but algebraic approaches
appear promising.
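The database example can be sketched as a single rewrite rule applied by the client compiler. Everything below is hypothetical: the call names Query, Open, Fetch, and Close, and the rule that a run of one-off queries on the same table may be replaced by one incremental session, are assumptions made for illustration rather than a proposed specification language.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Union


@dataclass(frozen=True)
class Query:
    """High-level, one-off call: what client code is written against."""
    table: str
    key: str


@dataclass(frozen=True)
class Open:
    """Lower-level call: start an incremental session on a table."""
    table: str


@dataclass(frozen=True)
class Fetch:
    """Lower-level call: cheaper lookup within an open session."""
    key: str


@dataclass(frozen=True)
class Close:
    """Lower-level call: end the current session."""


Call = Union[Query, Open, Fetch, Close]


def optimize(calls: List[Query]) -> List[Call]:
    """Rewrite each run of two or more adjacent queries on one table
    into Open, Fetch..., Close; leave isolated queries untouched."""
    result: List[Call] = []
    for table, run in groupby(calls, key=lambda call: call.table):
        batch = list(run)
        if len(batch) == 1:
            result.extend(batch)
        else:
            result.append(Open(table))
            result.extend(Fetch(query.key) for query in batch)
            result.append(Close())
    return result


plan = optimize([Query("t", "a"), Query("t", "b"), Query("u", "c")])
```

The client source still mentions only Query; the rule that justifies the rewrite would be supplied with the component's enhanced specification rather than derived from its code.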
Conclusion
Support for modular system development has been present in programming
languages for many years. Good ideas like functional programming, formal
specification of protocol languages, and optimization by specialization
are not new, but they can help solve old problems now reappearing in component-based
systems.
Andrew P. Tolmach
Fri Nov 21 20:05:36 PST 1997