A Runtime Heuristic to Selectively Replicate Tasks for Application-Specific Reliability Targets

Open Access Color

Green Open Access

Yes

OpenAIRE Downloads

72

OpenAIRE Views

43

Publicly Funded

No
Impulse
Top 10%
Influence
Average
Popularity
Average

Abstract

In this paper, we propose a runtime-based selective task replication technique for task-parallel high-performance computing applications. Our technique is automatic and does not require modification or recompilation of the OS, compiler, or application code. Our heuristic, which we call App_FIT, selects tasks to replicate such that the specified reliability target for an application is achieved. In our experimental evaluation, we show that the App_FIT selective replication heuristic has low overhead and is highly scalable. In addition, the results indicate that complete task replication is overkill for achieving reliability targets. We show that with App_FIT, we can tolerate pessimistic exascale error rates with only 53% of the tasks being replicated.
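
The abstract does not spell out the selection rule, so the following is only an illustrative sketch of how a runtime could pick tasks to replicate against a reliability target. It is not the paper's App_FIT algorithm. It assumes each task has an estimated failure rate, that a replicated task contributes nothing to the application's unprotected failure rate, and that the total number of tasks is known in advance. The names `select_replicas`, `task_fits`, and `fit_target` are hypothetical.

```python
def select_replicas(task_fits, fit_target):
    """Decide, task by task, whether to replicate (illustrative only).

    task_fits  -- estimated failure rate of each task, in arrival order
                  (assumed to be known when the task is created)
    fit_target -- maximum unprotected failure rate allowed for the whole run
    Returns (decisions, unprotected), where decisions[i] is True if task i
    is replicated and unprotected is the accumulated unreplicated rate.
    """
    n = len(task_fits)
    unprotected = 0.0
    decisions = []
    for i, fit in enumerate(task_fits, start=1):
        # Share of the target that may be spent after i of n tasks.
        budget_so_far = fit_target * i / n
        if unprotected + fit > budget_so_far:
            # Running this task unprotected would overshoot: replicate it.
            decisions.append(True)
        else:
            # Cheap enough to run once; charge it against the budget.
            unprotected += fit
            decisions.append(False)
    return decisions, unprotected


decisions, unprotected = select_replicas([4.0, 1.0, 3.0, 2.0], fit_target=5.0)
print(decisions, unprotected)
```

Under these assumptions the accumulated unprotected rate never exceeds the target, and only the tasks that would break the running budget are replicated, which is the general idea behind replicating a subset of tasks rather than all of them.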

Description

Subasi, Omer (ORCID: 0000-0002-5373-7570)

Keywords

UPC subject areas::Computer science::Computer architecture, Parallel processing (Electronic computers), Selective replication, Task parallelism, Dataflow programming, HPC and exascale computing

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

OpenCitations Citation Count
6

Start Page

498

End Page

505
PlumX Metrics
Citations

CrossRef : 3

Scopus : 7

Captures

Mendeley Readers : 6

SCOPUS™ Citations

7

checked on Jun 03, 2026

Web of Science™ Citations

6

checked on Jun 03, 2026

Page Views

1

checked on Jun 03, 2026

Downloads

5

checked on Jun 03, 2026

OpenAlex FWCI
1.77
