Lessons Learned from Programming Languages will benefit Component-Based Architectures

Andrew Tolmach, Dick Kieburtz, Tim Sheard
Pacific Software Research Center
Oregon Graduate Institute and Portland State University
{apt,dick,sheard}@cse.ogi.edu

Introduction

The programming language community has learned lessons about modular systems and protocols that can help develop more robust and efficient component-based architectures. This position paper suggests how.

Strongly Typed, Functional Glue Languages

The language and type discipline under which components are "glued together" has substantial impact on the character of the overall system. Most component-based systems assume that components are implemented as stateful objects. But global state is a notorious enemy of modularization, and distributed state is difficult to maintain in the presence of failures. Purely functional (i.e., side-effect-free) approaches are much better for building modular distributed systems. The functional paradigm does not permit hidden state-based interaction between components; all coupling between components must be explicit. This restriction serves to discourage coupling and thus increase modularity.
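As a minimal sketch of what explicit coupling can look like, consider a component whose state is an immutable value that is passed in and handed back, rather than hidden inside an object. The names `Inventory` and `reserve` are hypothetical and chosen only for illustration; Python stands in here for a purely functional glue language.

```python
from typing import NamedTuple, Tuple


class Inventory(NamedTuple):
    """Immutable component state; every change yields a new value."""
    stock: int


def reserve(inv: Inventory, quantity: int) -> Tuple[Inventory, bool]:
    """Pure step: return the next state and whether the reservation succeeded."""
    if quantity <= inv.stock:
        return Inventory(inv.stock - quantity), True
    # Not enough stock: the state is returned unchanged.
    return inv, False
```

Because `reserve` has no side effects, the only way for two components to share the inventory is to pass the value between them, so the coupling is visible in their signatures.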

Languages supporting higher-order functional types, such as Standard ML, Haskell, or Scheme, are particularly useful. At present, component interfaces typically specify simple argument and return types for inter-component communication, but leave more complex interactions between components unspecified. For example, simple type systems lack a way to express the idea that a particular interface function may "call back" the client component. This deficiency leads to awkward notions like "outerface." In the higher-order functional paradigm, the client simply specifies the call-back function as a higher-order argument.
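The call-back idea can be sketched with a function-typed parameter. The interface below is hypothetical (`search`, `OnMatch`, and `collect_matches` are illustrative names, not part of any component standard), and Python type hints stand in for the richer type systems of Standard ML or Haskell.

```python
from typing import Callable, Iterable, List

# The type of the call-back: the component may invoke it once per match.
OnMatch = Callable[[str], None]


def search(documents: Iterable[str], keyword: str, on_match: OnMatch) -> int:
    """Report each matching document through the call-back; return the match count."""
    count = 0
    for doc in documents:
        if keyword in doc:
            on_match(doc)  # the component calls back into the client
            count += 1
    return count


def collect_matches(documents: Iterable[str], keyword: str) -> List[str]:
    """Client code: supplies the call-back as an ordinary argument."""
    found: List[str] = []
    search(documents, keyword, found.append)
    return found
```

The signature of `search` itself records that the component may call the client, so no separate description of outgoing calls is needed.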

Specifying Physical Protocols

Ideally, physical protocols for component communication should be pitched at the highest possible level, i.e., specialized to express just the information that must be transferred. This maximizes encoding efficiency and helps guarantee that only correct and meaningful communications are attempted. In practice, however, successful component-based systems have generally adopted a "lowest common denominator" format for physical data transfer. For example, systems as diverse as Unix pipes, CGI, ActiveX Automation, and KQML rely on character strings as their basic communication medium. Using strings vastly lowers the technical barrier for integrating a component implemented in a new technology, and makes it much easier to debug systems by snooping on inter-component communications.

String-based communication is only superficially unstructured; the communicated strings must be in some format that sender and receiver can both understand. But adherence to this structure is typically enforced by ad-hoc mechanisms that are prone to error. By viewing communicated strings as sentences in a formally defined language, with a grammar and an independently specified semantics, protocol errors can be reduced and the clarity and security of the system can be greatly enhanced. Parsers, semantic checkers, and pretty printers for the protocol language can be generated automatically and incorporated into the communicating components; application-level code in these components can then deal directly with structured data without having to interpret or generate strings. The protocol language specification can also double as a partial behavior specification for the components that use it. Applications can be debugged more easily by inserting language checkers between communicating components to verify that protocols are being obeyed.
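To make this concrete, here is a small sketch of a string protocol treated as a language. The grammar, the message types `Get` and `Put`, and the functions `parse`, `pretty`, and `checked` are all assumptions invented for illustration, and the parser is written by hand here, whereas in practice it would be generated from the grammar.

```python
import re
from dataclasses import dataclass
from typing import Union

# Hypothetical protocol grammar:
#   message ::= "GET" SP key | "PUT" SP key SP value
#   key     ::= [a-z]+
#   value   ::= [0-9]+


@dataclass(frozen=True)
class Get:
    key: str


@dataclass(frozen=True)
class Put:
    key: str
    value: int


Message = Union[Get, Put]

_GET = re.compile(r"GET ([a-z]+)")
_PUT = re.compile(r"PUT ([a-z]+) ([0-9]+)")


def parse(line: str) -> Message:
    """Turn a protocol sentence into structured data, or reject it."""
    m = _GET.fullmatch(line)
    if m:
        return Get(m.group(1))
    m = _PUT.fullmatch(line)
    if m:
        return Put(m.group(1), int(m.group(2)))
    raise ValueError(f"not a sentence of the protocol language: {line!r}")


def pretty(msg: Message) -> str:
    """Render structured data back into a protocol sentence."""
    if isinstance(msg, Get):
        return f"GET {msg.key}"
    return f"PUT {msg.key} {msg.value}"


def checked(line: str) -> str:
    """Checker to place between components: pass valid sentences, reject others."""
    parse(line)
    return line
```

Application code works only with `Get` and `Put` values, while `checked` can be inserted on the wire between two components during debugging.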

Improving System Performance

Modular systems naturally behave well with respect to some of the "ilities," such as maintainability and manageability, but poorly with respect to others, such as responsiveness, footprint size, and runtime performance. With few exceptions, modular architectures trade away performance in return for increased ease in construction and maintenance, and this should be more frankly acknowledged by proponents of component-based systems. Still, improved performance may be essential for some applications.

There has been considerable recent progress in whole-program optimization of modular code based on the idea of specializing modules for use at particular client sites. Unfortunately, this approach requires that the module's code be available, and that module and client be written in the same language (or at least be compilable to the same intermediate language). In component-based systems, not all of the code is available, so a different approach is required. One promising possibility is to provide enhanced specifications of components that indicate possibilities for optimized use. Client code uses the simplest, cleanest specification, but the client compiler has access to a lower-level, though still abstract, specification, together with rules that indicate how to optimize high-level calls into lower-level ones depending on context. For example, a database server component might offer both one-off and (cheaper) incremental query facilities; the compiler for a client making a series of related queries could optimize client code to use the incremental facilities. Appropriate module specification languages for this task are still a research question, but algebraic approaches appear promising.
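The database example can be sketched as a rewrite over a sequence of high-level calls. Everything below is hypothetical: the call vocabulary (`Query`, `OpenSession`, `Refine`, `CloseSession`) and the single rule applied by `optimize` are illustrative assumptions, not a real database interface or a proposed specification language.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Query:          # high-level, one-off query seen by client code
    table: str
    key: str


@dataclass(frozen=True)
class OpenSession:    # lower-level: begin an incremental session on a table
    table: str


@dataclass(frozen=True)
class Refine:         # lower-level: cheaper follow-up query inside a session
    key: str


@dataclass(frozen=True)
class CloseSession:   # lower-level: end the incremental session
    table: str


Call = Union[Query, OpenSession, Refine, CloseSession]


def optimize(calls: List[Query]) -> List[Call]:
    """Rewrite each run of two or more consecutive queries on the same table
    into one incremental session; leave isolated queries unchanged."""
    out: List[Call] = []
    i = 0
    while i < len(calls):
        # Find the end of the run of queries that target the same table.
        j = i
        while j < len(calls) and calls[j].table == calls[i].table:
            j += 1
        run = calls[i:j]
        if len(run) >= 2:
            out.append(OpenSession(run[0].table))
            out.extend(Refine(q.key) for q in run)
            out.append(CloseSession(run[0].table))
        else:
            out.extend(run)
        i = j
    return out
```

The client is written purely in terms of `Query`; only the compiler-side rule knows about sessions, which mirrors the split between the clean specification and the lower-level one.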

Conclusion

Support for modular system development has been present in programming languages for many years. Good ideas like functional programming, formal specification of protocol languages, and optimization by specialization are not new, but they can help solve old problems now reappearing in component-based systems.

Andrew P. Tolmach

Fri Nov 21 20:05:36 PST 1997