Survivability in Object Services Architectures (contract F30602-96-C-0330) is funded by DARPA ITO.
We are developing software mechanisms to ensure the survivability of systems built on Object Services Architectures (OSAs), mechanisms that go well beyond the traditional approaches of fault tolerance and replicated services. Those techniques, while valuable, are in themselves insufficient to respond to the full range of problems a system can face: they create "islands of availability" but do nothing to address system-wide concerns. The following two examples illustrate the kinds of issues addressed by our survivability work.
Mission planning for a sortie in a regional conflict with multiple coalition partners requires many resources, among them a map server. Assume the local map server becomes unavailable and that the backup map server is at a remote location, reachable only over slow communication lines. A coalition map server with good performance characteristics is available, but its data is considered to be of lower quality and its labels are in a foreign language. Under many circumstances it would be desirable to use the coalition map server, but existing systems cannot switch an active connection and are limited to exact substitutes for a service. A survivable system needs to be able to switch compatible services in an established connection and substitute acceptable alternatives.
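The map server example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's mechanism: the `MapService` record, the `score` function, its equal weighting of quality and latency, and all numeric values are hypothetical and do not come from the project reports.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MapService:
    """Hypothetical description of one candidate map server."""
    name: str
    available: bool
    latency_ms: float  # assumed performance measure
    quality: float     # assumed data-quality score in [0, 1]


def score(service: MapService, max_latency_ms: float = 2000.0) -> float:
    """Combine quality and responsiveness into one number (assumed 50/50 weights)."""
    latency_score = max(0.0, 1.0 - service.latency_ms / max_latency_ms)
    return 0.5 * service.quality + 0.5 * latency_score


def choose_service(candidates: Iterable[MapService],
                   min_quality: float = 0.0) -> Optional[MapService]:
    """Return the best available service that meets the quality floor, or None."""
    usable = [s for s in candidates if s.available and s.quality >= min_quality]
    return max(usable, key=score, default=None)


# Illustrative values only: local server down, backup slow, coalition fast but lower quality.
services = [
    MapService("local", available=False, latency_ms=50, quality=1.0),
    MapService("remote-backup", available=True, latency_ms=1800, quality=1.0),
    MapService("coalition", available=True, latency_ms=100, quality=0.6),
]
chosen = choose_service(services)
print(chosen.name if chosen else "no acceptable service")
```

With these illustrative numbers the coalition server is selected; raising `min_quality` to 0.9 excludes it and selects the remote backup instead, which is one way to express that an alternative is acceptable only under some circumstances.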
The ability to substitute services is only one aspect of survivability. Consider an information warfare attack focused on NT machines. As the NT machines begin to fail, essential processing must be moved to UNIX machines. This in turn requires terminating or delaying nonessential processing on those machines. However, there are many different threats, each with its own optimal response, and more than one threat may materialize at the same time. Addressing this in an ad hoc manner is not possible. A survivable system must be able to adapt dynamically to the threats in its environment and reallocate essential processing to the most robust resources.
This project is developing software mechanisms to make military and commercial software applications based on the popular Object Services Architecture (OSA) model (e.g., OMG's CORBA) far more survivable than is currently possible, while maintaining the flexibility and ease of construction that characterize OSA-based applications.
The keys to making systems survivable are:
OSA Survivability Project - September 1996
This report describes the goals, approach, and anticipated results
of the project "Survivability in Object Services Architectures".
OSA Composition Model - September 1996
This report describes the static properties of a survivable object
abstraction that extends standard object models in several ways to
allow objects and applications to be reconfigured to recover from or protect
against failures. Survivable objects are the basis for constructing
survivable applications.
OSA Evolution Model - September 1996
This report describes the dynamic properties of the survivable object
abstraction (introduced in OSA Composition Model) that make it possible
to safely migrate a running application
from one legitimate configuration into another legitimate configuration.
Both semantically identical and semantically similar transformations are
possible under this model, which allows applications to continue to survive
in degraded mode when system resources become unavailable due to attack
or failure.
OSA Survivability Service - January 1997
This report describes the architecture of a Survivability Service that
manages survivable objects and applications constructed using the survivable
object abstraction. The Survivability Service is compatible with existing
work in failure detection and classification, fault tolerance, and highly
available systems. Portions of the Survivability Service are being
prototyped as part of this project.
QoS & Survivability - March 1998, Revised August 1998
This report describes recent research in service-level quality of service
and the relationship between survivability and quality of service.
Notes on a Command Post Scenario - 1998
This report is a working paper describing the likely software environment
of a future military command post, the connectivity between a command post
and its outside environment, and a typical activity that takes place within
the command post. This will be used to derive the survivability requirements
of a command post and define a scenario for demonstrating the Survivability
Service.
Survivability is Utility - 1998
This paper explores how Utility Theory (a sub-discipline of microeconomics)
can be used to define metrics that evaluate the success of survivable
systems and that Survivability Management Systems can use to plan
actions to ensure system survivability.
Estimating Failure Probability - 1999
In the process of creating and maintaining survivable configurations,
the Survivability Service needs to predict the likelihood that, within
some time interval, a service will be damaged to an extent that it cannot
provide the required level of service. This paper discusses a basic model
of how services are provided by resources, how threats against those services
are modeled, and how the probability of service failure is computed from
the threat model.
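As a rough illustration of the kind of computation involved, the sketch below estimates the probability that a service fails within a time interval. It is not the model from the paper. It assumes that each threat disables a resource at a constant rate and independently of other threats (a Poisson assumption), and that a service fails only when every resource able to provide it has failed. The function names and all rates are hypothetical.

```python
import math
from typing import Sequence


def resource_failure_probability(threat_rates_per_hour: Sequence[float],
                                 interval_hours: float) -> float:
    """P(resource is disabled within the interval), assuming independent
    threats that each strike at a constant rate (Poisson assumption)."""
    total_rate = sum(threat_rates_per_hour)
    return 1.0 - math.exp(-total_rate * interval_hours)


def service_failure_probability(resource_failure_probs: Sequence[float]) -> float:
    """P(service fails), assuming it fails only if all of its redundant
    resources fail and that resource failures are independent."""
    return math.prod(resource_failure_probs)


# Illustrative values only: two hosts can provide the service over a 10-hour window.
host_a = resource_failure_probability([0.01, 0.02], interval_hours=10)
host_b = resource_failure_probability([0.005], interval_hours=10)
print(service_failure_probability([host_a, host_b]))
```

Under these assumptions, adding a second host that faces different threats lowers the estimated failure probability, which is the kind of comparison a planner could use when choosing among candidate configurations.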
This research is sponsored by the Defense Advanced Research Projects Agency and managed by Rome Laboratory under contract F30602-96-C-0330. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency, Rome Laboratory, or the United States Government.
Last updated 5/13/99 sjf