Abstract
The paper proposes managing the complexity and cost of fault-tolerant programs not at the level of a single program but from the point of view of the whole set of such programs, which are to be run under hard time constraints. The concept of multiple-processor programs is used to model the structure of a fault-tolerant program. This model, in turn, is used to formulate the problem of scheduling fault-tolerant programs under hard time constraints. Since the problem is computationally difficult, three scheduling algorithms, each based on a different metaheuristic, are proposed. A computational experiment has been carried out to evaluate the proposed algorithms. The proposed global approach has also been compared with scheduling that does not search for the global optimum. The experimental results show that the approach could be advantageous by producing more reliable schedules within hard time constraints.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Czarnowski, I., Jedrzejowicz, P., Ratajczak, E. (1999). Scheduling Fault-Tolerant Programs on Multiple Processors to Maximize Schedule Reliability. In: Felici, M., Kanoun, K. (eds) Computer Safety, Reliability and Security. SAFECOMP 1999. Lecture Notes in Computer Science, vol 1698. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48249-0_33
DOI: https://doi.org/10.1007/3-540-48249-0_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66488-8
Online ISBN: 978-3-540-48249-9
eBook Packages: Springer Book Archive