An Algebraic Approach for Scientific Workflows with Large Scale Data

Scientific workflows have emerged as a basic abstraction for structuring and executing scientific experiments based on computational simulations. These workflows are often both compute-intensive and data-intensive, and therefore need to run on large-scale parallel computers. However, parallelizing scientific workflows is low-level, ad hoc, and labor-intensive, which makes optimization opportunities hard to exploit. To address this problem, we propose an algebraic approach for representing workflows and a parallel execution model; together, they enable automatic optimization of parallel workflow execution. We validated the approach thoroughly using both real applications and synthetic data scenarios. The experiments were run in Chiron, a data-centric scientific workflow engine implemented to parallelize scientific workflow execution. Compared with ad hoc workflow implementations, the experiments showed excellent parallel performance improvements and revealed several optimization opportunities made available by the algebraic approach.

Additional Info

Field                 Value
Source                https://theses.hal.science/tel-00653661
Author                Ogasawara, Eduardo
Maintainer            CCSD
Last Updated          May 21, 2026, 12:40 (UTC)
Created               May 21, 2026, 12:40 (UTC)
Identifier            tel-00653661
Language              pt
Rights                https://about.hal.science/hal-authorisation-v1/
contributor           COPPE - SMT ; Universidade Federal do Rio de Janeiro [Brasil] = Federal University of Rio de Janeiro [Brazil] = Université fédérale de Rio de Janeiro [Brésil] (UFRJ)
creator               Ogasawara, Eduardo
date                  2011-12-19T00:00:00
harvest_object_id     b37f608c-1176-49f4-94b5-fba2687317b9
harvest_source_id     3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title  test moissonnage SELUNE
metadata_modified     2025-08-26T00:00:00
set_spec              type:THESE