Wednesday, July 7, 1999

Catalog of Fault Mechanisms

In the previous blog entry, "What Could Fault Expectant Programming Mean?", I outlined a framework abstracted from the recurring themes in a list of error detection, recovery, and avoidance mechanisms that I had compiled. That list of mechanisms follows (in no particular order).
  • Distributed Systems: remove single point of failure by using multiple components
  • Loose Coupling: keeps failure from spreading via ripple effects or brittleness
  • Ack/Nak: verify results and, on failure, repeat request (e.g. protocols)
  • Fail Over: verify results and, on failure, repeat request but to different component
  • Do Over: set aside failed requests or out-of-bound results and retry them later
  • Auto Restart: on failure of a component, it should auto-reset/restart and continue
  • Leases: dead-man switch on any allocated resource of a component (including the attention of its partners a la protocol timeout)
  • Event Driven: handle non-deterministic order of returned results from components
  • Fault Tolerance: ignore failed requests or out-of-bound results and continue rather than generating errors/exceptions
  • Backtracking: processing that expects to hit dead-ends and so backtracks to try other approaches  (e.g. Prolog logic rules, parsers that employ backtracking algorithms)
  • Redundant Components: issue parallel requests to multiple components and take a majority-rules vote on the result (but must know if results are deterministic or not; i.e. if more than one result can be valid then different components may return different, but still valid, results)
  • Evolutionary Programming: issue parallel requests to multiple components and take the result from the most fit component, i.e. the one chosen by the quality of its result rather than by the result itself
  • Auctions: issue parallel requests to multiple components that compete on cost of service as a definition of most fit
  • Neural Nets: issue parallel requests to multiple components and take a vote biased by each component's dynamically adjusted success rating (i.e. each component votes its stock)
  • Fuzzy Logic: result returned by component is biased by a probability/quality rating
  • Transactions: state transitions are confined to successful atomic steps
  • Exceptions: asynchronous notification and response to problems where there is the ability to either continue or abort current operation.
  • Game Theory: multiple conflicting rule sets are projected via min-max trees to find a balanced result
  • Blackboard Systems: multiple workers each process as much of a problem as they are able and share a common result state, i.e. individual workers are not expected to produce a complete result or even any result at all (e.g. JavaSpaces)
  • Workflow Models: combination of state-transition models and PERT dependency models to keep track of progress at a global level given multiple parallel workers and to reset to some given state(s) if results are not converging
  • Belt & Suspenders: multiple independent methods of verifying results or progress
  • Mobile Agents: workers dynamically move to more appropriate environments e.g. load balancing, fail-over, seek more reliable communications, etc.
  • Design by Contract: assertions to detect failure on the part of either the requestor or the worker e.g. Cleanroom techniques, Component Interfaces, Strong Types, policy driven security managers, etc.
  • Pattern Matching: non-trivial matching of requestor interfaces with worker interfaces to allow more flexible and dynamic establishment of contracts e.g. KQML, JATLite
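
To make a few of the entries above concrete, here is a minimal Python sketch of three of them: Ack/Nak, Fail Over, and Redundant Components. It is only an illustration under stated assumptions, not a reference implementation. The `Component` type, the function names, and the `is_valid` check are all hypothetical, and sequential calls stand in for what would be parallel requests in a real system.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for a "component": takes a request, returns a result.
Component = Callable[[str], str]


def ack_nak(component: Component, request: str,
            is_valid: Callable[[str], bool], attempts: int = 3) -> Optional[str]:
    """Ack/Nak: verify the result and, on failure, repeat the same request."""
    for _ in range(attempts):
        try:
            result = component(request)
        except Exception:
            continue  # treat an exception as a Nak and ask again
        if is_valid(result):
            return result  # Ack
    return None  # every attempt was a Nak


def fail_over(components: Sequence[Component], request: str,
              is_valid: Callable[[str], bool]) -> Optional[str]:
    """Fail Over: on failure, repeat the request against a different component."""
    for component in components:
        result = ack_nak(component, request, is_valid, attempts=1)
        if result is not None:
            return result
    return None


def majority_vote(components: Sequence[Component], request: str) -> Optional[str]:
    """Redundant Components: ask every component and accept a strict majority.

    Only meaningful when results are deterministic, as noted in the list above.
    """
    results = []
    for component in components:
        try:
            results.append(component(request))
        except Exception:
            pass  # a failed component simply does not vote
    if not results:
        return None
    winner, count = Counter(results).most_common(1)[0]
    # Require more than half of all components, not just of those that answered.
    return winner if count > len(components) / 2 else None
```

Returning `None` instead of raising is a deliberate choice in this sketch: the caller decides whether to apply Do Over, Fault Tolerance, or some other mechanism from the list when no acceptable result comes back.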
