Resiliency in Distributed Workflow Systems for Numerical Applications

This thesis aims to design a high-performance computing environment dedicated to numerical optimization applications. The design and optimization tools belong to different academic and industrial teams that collaborate within the same projects. These tools must be federated in a common environment in order to facilitate access for researchers and engineers. The environment that we propose to meet these requirements is composed of a workflow system and a distributed computing system. The goal of the first is to facilitate application design, while the second is in charge of executing the application on distributed resources. A set of communication services between the two systems must also be developed. The computation phase must be carried out efficiently with regard to the parallelism of some codes, the synchronous and asynchronous execution of tasks, data transfer, and the available hardware and software resources. Moreover, the environment must provide a sufficient level of fault tolerance against both hardware and software failures, to minimize their influence on the final result or the computation time. An important condition is to implement solutions for restarting the application after an error occurs, such that the time spent handling the error remains less than the time needed to restart the application from scratch. In our case, we focused on the Yawl workflow system, since it presents good characteristics in terms of i) hardware and software independence and ii) fault-tolerance mechanisms. Regarding the distributed execution, our tests were deployed on the Grid5000 platform, using up to 64 different machines located across 5 geographic sites. This document presents the design choices and the extensions made to Yawl so that it can run on a distributed platform.

Additional Info

Field Value
Source https://theses.hal.science/tel-00912491
Author Trifan, Laurentiu
Maintainer CCSD
Last Updated May 8, 2026, 00:15 (UTC)
Created May 8, 2026, 00:15 (UTC)
Identifier tel-00912491
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE (OPALE) ; Centre Inria d'Université Côte d'Azur ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Alexandre Dieudonné (LJAD) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)-Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)
creator Trifan, Laurentiu
date 2013-10-21T00:00:00
harvest_object_id cc9f9339-8c5c-4e11-92e6-4e5cfd7b487a
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2025-09-27T00:00:00
set_spec type:THESE