Erin Hsu
Hugo Miranda
Julie Zhuo
SoCo 2003
Fault Tolerant Computing
Annotated Bibliography
Anthes, Gary H. “Brave New OS.” Computerworld. 11 February 2002. 06 September 2003.
This article describes the Farsite project at Microsoft Corporation that embodies fault tolerance, self-tuning, and robust security. It also includes short descriptions of the Odyssey project at Carnegie Mellon University and the IBM Blue Gene research project.
“The Berkeley/Stanford Recovery-Oriented Computing (ROC) Project.” UC Berkeley. 06 September 2003.
This website provides an overview, recent news, contact information, and many publications regarding the Recovery-Oriented Computing project, which is headed mainly by UC Berkeley’s David Patterson and Stanford’s Armando Fox.
Candea, George. “Redundant Fault Tolerant Systems.” Stanford University. 17 November 1999. 11 September 2003.
This site consists of lecture slides from George Candea’s CS444 course, entitled Principles of Dependable Computer Systems. It mainly focuses on Tandem’s Guardian operating system.
Gibbs, W. Wayt. “Autonomic Computing.” Scientific American. 06 May 2002. 06 September 2003.
This article touches on solutions to the question raised by IBM's "manifesto" that are currently being researched by various universities and corporations, including IBM's Oceano experiment, Hewlett-Packard Labs' planetary computing project, and the "Recovery-Oriented Computing" (ROC) project led by David Patterson of UC Berkeley.
Haverkort, Boudewijn R. “Fault Tolerant Computer Systems.” Rheinisch-Westfälische Technische Hochschule Aachen. Summer 2002. 07 September 2003.
This source contains a clear introduction to fault-tolerant computing and gives some definitions regarding the subject.
Horn, Paul. “Autonomic Computing: IBM’s Perspective on the State of Information Technology.” IBM. October 2001. 09 September 2003.
This PDF file contains the 38-page "manifesto," which argues that the complexity of current computing systems has begun to outpace the ability of human administrators to cope with it. It proposes a solution in the form of Autonomic Computing, based on self-management.
McCluskey, Edward J. "Micros, Minis and Networks." Stanford, California, 1975.
McCluskey proposes fault-tolerance techniques for multiprocessor systems built from microprocessors, based on a study of multiprocessors that use either minicomputers or specially designed processors. The systems studied are the Plessey System 250, the Berkeley Prime System, the Carnegie-Mellon C.mmp, and the BBN Pluribus.
Mili, Ali. An Introduction to Program Fault Tolerance. New York: Prentice Hall, 1990.
This book introduces the basics of programming for dependable computing, beginning with the discrete mathematics needed for programming and ending with backward error recovery.
Proceedings of the 2002 International Conference on Dependable Systems & Networks. California: IEEE, Inc., 2002.
This 800-page resource is divided into 27 “sessions” that focus on different systems and aspects of dependable systems, such as “Consensus & Failure Detectors” and “Security and Fault Tolerance.” Within each session, there are four or more individual papers written on even more specific areas within the topic.
Roberts, Eric S. "Software Techniques for Practical Multiprocessors." Cambridge, Massachusetts, 1980.
As microprocessors become more readily available and less expensive, parallel hardware structures provide greater reliability and efficacy for computer systems, providing a recovery measure in case of failure. Roberts's thesis focuses particularly on the Pluribus system used in the ARPA network and explains the system's importance in maintaining a reliable network.
Robinson, John G., Roberts, Eric S. "Software fault-tolerance in the Pluribus." Cambridge, Massachusetts, 1978.
This paper explores the redundancy methods used in the BBN Pluribus system to improve reliability. In particular, the work focuses on the software-side implementation of a fault-tolerant system and proposes new applications for the Pluribus system.
“Strategies for Fault-Tolerant Computing.” Windows White Papers 2003. 07 September 2003.
This Microsoft document covers fault-tolerant servers, specifically the company's own server product. It is good background reading on what fault tolerance is.
“Tandem Computers.” Wikipedia. 09 September 2003.
This site provides a general overview of the life of Tandem Computers from the 1970s to the present and of its main product, NonStop, which is still in existence at Hewlett-Packard.
Candea, George. “Redundant Fault Tolerant Systems.” Stanford University. 17 November 1999. 13 September 2003.
This site contains lecture slides from the CS444a course at Stanford University. It focuses on Tandem’s NonStop technology.
Fox, Armando. “Toward Recovery-Oriented Computing.” Stanford University. 13 September 2003.
This PDF file contains a general overview of Recovery-Oriented Computing, written by one of the main directors of the research, Professor Armando Fox of Stanford University.
Krueger, Jeff. “A Look at Fault Tolerance: Tandem NonStop Systems.” 24 April 1997.
This 24-page term paper from an Operating System Design course contains a very clear and thorough overview of the processes behind Tandem NonStop Systems.