State University of New York - New Paltz

Department of Electrical and Computer Engineering

Course Title: Fault-Tolerant Design of Digital Systems

Course Number: CSE 40534

Credit: 3

Prerequisite: Graduate Standing or Permission of instructor

Instructor: Dr. Baback Izadi

Office: 203 Resnick Engineering Hall

Phone: (845) 257-3823

FAX: (845) 257-3730

Email: bai@engr.newpaltz.edu

URL: http://www.engr.newpaltz.edu/~bai

Meeting Days: Monday and Wednesday

Meeting Time: 5:00 PM - 6:15 PM

Meeting Room: REH 110

Office Hours:

Monday 2:00 - 4:00 PM

Wednesday 2:00 - 4:00 PM

And by appointment

This course covers the design and analysis of reliable digital systems. It examines several aspects of reliability in digital systems, including fault tolerance, fault detection, diagnosis, and reconfiguration. Topics include faults and their manifestations, fault avoidance techniques, hardware redundancy, error-detecting and error-correcting codes, time redundancy, software redundancy, reliability and availability analysis, Markov reliability modeling, system evaluation and performance-reliability tradeoffs, real-time fault tolerance, and examples of practical systems.

Text:

Fault-Tolerant Computer System Design, D. Pradhan, Prentice-Hall, 1996.

NOTE: You need to order this book from www.amazon.com as soon as possible.

Some papers from the literature will be distributed during the term.

References:

  1. Design and Analysis of Fault-Tolerant Digital Systems, B. W. Johnson: Addison-Wesley, 1989.
  2. Reliable Computer Systems: Design and Evaluation, 2nd edition, D. Siewiorek and R. Swarz: Digital Press - Butterworth, 1992.
  3. Fault Tolerance in Distributed Systems, P. Jalote: Prentice Hall, 1994.
  4. Performance and Reliability Analysis of Computer Systems, R. Sahner and K. Trivedi: Kluwer Academic, 1996.
  5. Fault Tolerance Through Reconfiguration of VLSI and WSI Arrays, R. Negrini: MIT Press, 1989.

Topics:

Subject

Faults and their manifestations
Performance and reliability evaluation techniques
System evaluation and performance-reliability tradeoffs
Hardware, software, code and time redundancy techniques
MIDTERM EXAM
Architecture of Fault-Tolerant Computers
Fault tolerance in distributed and multiprocessor systems
Real-time fault tolerance
Case study of fault-tolerant systems
Project presentations
FINAL EXAM

Grading Policy:

Homework 15 %
Course Project 25 %
3 tests 30 %
Final 30 %

Special dates:

Monday September 4: No class on this day.
Monday October 9: No class on this day.
Tuesday October 10: Class on this day.
Tuesday October 17: Mid-point of Fall 2000.
Wednesday November 22: No class on this day.
Friday December 1: Last day to withdraw.
Wednesday December 6: Last day of class.
Final exam: Friday December 15, 3:30 - 5:30 PM.

Presentations by students - possible topics: