Process Placement in Multicore Clusters: Algorithmic Issues and Practical Techniques

Current clusters of NUMA nodes feature multicore or manycore processors. Programming such architectures efficiently is challenging because numerous hardware characteristics, especially the memory hierarchy, have to be taken into account. One appealing way to improve the performance of parallel applications is to decrease their communication costs by matching the communication pattern to the underlying hardware architecture. In this report, we detail the algorithm and techniques proposed to achieve this result. First, we gather both the application's communication pattern and the hardware details. Then, we compute a reordering of the application's process ranks that fits this hardware. Finally, the new ranks are used to reduce the communication costs of the application.
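The three steps named in the abstract (gather, reorder, apply) can be illustrated with a small sketch. It is not the algorithm from the report, which this record does not describe; it is a simple greedy grouping heuristic chosen only for illustration. The communication matrix, the two-level distance model (cost 1 inside a NUMA node, 10 across nodes), and all function names are assumptions made for this example.

```python
from itertools import combinations


def core_distance(a, b, cores_per_node):
    """Hypothetical two-level cost model: 0 for the same core,
    1 inside a NUMA node, 10 between nodes (assumed values)."""
    if a == b:
        return 0
    return 1 if a // cores_per_node == b // cores_per_node else 10


def communication_cost(comm, placement, cores_per_node):
    """Sum of traffic volume times distance over all pairs of ranks.
    placement[i] is the core assigned to rank i."""
    n = len(comm)
    return sum(
        comm[i][j] * core_distance(placement[i], placement[j], cores_per_node)
        for i, j in combinations(range(n), 2)
    )


def greedy_placement(comm, cores_per_node):
    """Illustrative heuristic: fill one node at a time with the ranks
    that exchange the most data with the ranks already in that node."""
    n = len(comm)
    unplaced = set(range(n))
    placement = [None] * n
    next_core = 0
    while unplaced:
        # Seed the node with the unplaced rank that has the most traffic left.
        seed = max(sorted(unplaced),
                   key=lambda r: sum(comm[r][u] for u in unplaced))
        group = [seed]
        unplaced.remove(seed)
        while len(group) < cores_per_node and unplaced:
            best = max(sorted(unplaced),
                       key=lambda r: sum(comm[r][g] for g in group))
            group.append(best)
            unplaced.remove(best)
        for rank in group:
            placement[rank] = next_core
            next_core += 1
    return placement


# Step 1 (assumed input): symmetric traffic matrix for 4 ranks;
# ranks 0 and 2 talk heavily, as do ranks 1 and 3.
comm = [
    [0, 1, 100, 1],
    [1, 0, 1, 100],
    [100, 1, 0, 1],
    [1, 100, 1, 0],
]
identity = [0, 1, 2, 3]                       # rank i runs on core i
reordered = greedy_placement(comm, 2)         # Step 2: [0, 2, 1, 3]
print(communication_cost(comm, identity, 2))  # 2022
print(communication_cost(comm, reordered, 2)) # Step 3: 240
```

In this toy case the reordering places each heavily communicating pair on the same node, so the modeled cost drops from 2022 to 240. In a real MPI application the permutation would have to be applied through the runtime, for instance by building a reordered communicator; that step is outside this sketch.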

Additional Info

Field Value
Source https://inria.hal.science/hal-00803548
Author Jeannot, Emmanuel; Mercier, Guillaume; Tessier, François
Maintainer CCSD
Last Updated May 12, 2026, 04:47 (UTC)
Created May 12, 2026, 04:47 (UTC)
Identifier Report N°: RR-8269
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Laboratoire Bordelais de Recherche en Informatique (LaBRI) ; Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)
creator Jeannot, Emmanuel
date 2013-03-22T00:00:00
harvest_object_id b3d88a42-7929-44f0-8756-f1ccb244be9b
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE (SELUNE harvesting test)
metadata_modified 2025-05-26T00:00:00
set_spec type:REPORT