The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Copyrights notice
The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. Copyrights notice
Neste artigo, propomos um novo algoritmo de escalonamento para alcançar tolerância a falhas em sistemas multiprocessadores. Este algoritmo primeiro particiona um programa paralelo em subconjuntos de tarefas, com base na noção de altura de um gráfico de tarefas. Para cada subconjunto, o algoritmo duplica e agenda as tarefas no subconjunto sucessivamente. Provamos que os escalonamentos obtidos pelo algoritmo proposto podem tolerar uma falha de um único processador e mostramos que a complexidade computacional do algoritmo é O(|V|4) Onde V é o conjunto de nós de um gráfico de tarefas. Conduzimos simulações aplicando o algoritmo a dois tipos de gráficos de tarefas práticas (eliminação gaussiana e decomposição LU). Os resultados deste experimento mostram que a tolerância a falhas pode ser alcançada ao custo de um pequeno grau de redundância de tempo e que o desempenho no caso de falha do processador é melhorado em comparação com um algoritmo anterior.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copiar
Koji HASHIMOTO, Tatsuhiro TSUCHIYA, Tohru KIKUNO, "Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems" in IEICE TRANSACTIONS on Information,
vol. E85-D, no. 3, pp. 525-534, March 2002, doi: .
Abstract: In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is O(|V|4) where V is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.
URL: https://global.ieice.org/en_transactions/information/10.1587/e85-d_3_525/_p
Copiar
@ARTICLE{e85-d_3_525,
author={Koji HASHIMOTO, Tatsuhiro TSUCHIYA, Tohru KIKUNO, },
journal={IEICE TRANSACTIONS on Information},
title={Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems},
year={2002},
volume={E85-D},
number={3},
pages={525-534},
abstract={In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is O(|V|4) where V is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.},
keywords={},
doi={},
ISSN={},
month={March},}
Copiar
TY - JOUR
TI - Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems
T2 - IEICE TRANSACTIONS on Information
SP - 525
EP - 534
AU - Koji HASHIMOTO
AU - Tatsuhiro TSUCHIYA
AU - Tohru KIKUNO
PY - 2002
DO -
JO - IEICE TRANSACTIONS on Information
SN -
VL - E85-D
IS - 3
JA - IEICE TRANSACTIONS on Information
Y1 - March 2002
AB - In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is O(|V|4) where V is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.
ER -