A strategy for mapping unstructured mesh computational mechanics programs onto distributed memory parallel architectures
McManus, Kevin (1996) A strategy for mapping unstructured mesh computational mechanics programs onto distributed memory parallel architectures. PhD thesis, University of Greenwich.
Available under License Creative Commons Attribution Non-commercial No Derivatives.
The motivation of this thesis was to develop strategies that would enable unstructured mesh based computational mechanics codes to exploit the computational advantages offered by distributed memory parallel processors. Strategies that successfully map structured mesh codes onto parallel machines have been developed over the previous decade and used to build a toolkit for automation of the parallelisation process. Extension of the capabilities of this toolkit to include unstructured mesh codes requires new strategies to be developed.
This thesis examines parallelisation by geometric domain decomposition using the single program, multiple data (SPMD) programming paradigm with explicit message passing. This technique involves splitting (decomposing) the problem definition into P parts that may be distributed over P processors in a parallel machine. Each processor runs the same program and operates only on its part of the problem. Messages passed between the processors exchange the data needed to maintain consistency with the original algorithm.
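The decomposition idea described above can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes a one-dimensional mesh, a simple averaging update, and P "processors" simulated inside a single Python process, with an `inbox` list standing in for explicit message passing. All function names are hypothetical. The point it shows is that every part runs the same update on its own nodes, and that exchanging boundary values reproduces the serial result.

```python
def serial_step(u):
    """One averaging sweep over the interior nodes of a 1D mesh."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def decompose(n, p):
    """Split node indices 0..n-1 into p contiguous, near-equal blocks."""
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for r in range(p):
        size = base + (1 if r < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def parallel_step(u, p):
    """Same sweep, with each simulated rank touching only its own part.

    Assumes len(u) >= p so that every block holds at least one node.
    """
    n = len(u)
    blocks = decompose(n, p)
    local = [[u[i] for i in b] for b in blocks]

    # Simulated message passing: each rank sends its boundary values
    # to its left and right neighbours.
    inbox = [dict() for _ in range(p)]
    for r in range(p):
        if r > 0:
            inbox[r - 1]["right"] = local[r][0]
        if r < p - 1:
            inbox[r + 1]["left"] = local[r][-1]

    # Every rank runs the same program on its own part plus received halo.
    result = []
    for r, block in enumerate(blocks):
        has_left = "left" in inbox[r]
        ext = list(local[r])
        if has_left:
            ext.insert(0, inbox[r]["left"])
        if "right" in inbox[r]:
            ext.append(inbox[r]["right"])
        offset = 1 if has_left else 0
        for k, i in enumerate(block):
            if i == 0 or i == n - 1:
                result.append(local[r][k])  # global boundary node, unchanged
            else:
                j = k + offset
                result.append(0.5 * (ext[j - 1] + ext[j + 1]))
    return result


u = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
same = parallel_step(u, 3) == serial_step(u)  # the parallel sweep matches the serial one
```

A real distributed memory code would replace the `inbox` list with calls to a message passing library, and an unstructured mesh would need an explicit list of shared nodes rather than left and right neighbours.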
The strategies developed to parallelise unstructured mesh codes should meet a number of requirements:
The algorithms are faithfully reproduced in parallel.
The code is largely unaltered in the parallel version.
The parallel efficiency is maximised.
The techniques should scale to highly parallel systems.
The parallelisation process should become automated.
Techniques and strategies that meet these requirements are developed and tested in this dissertation using a state-of-the-art integrated computational fluid dynamics and solid mechanics code. The results presented demonstrate the importance of the problem partition in the definition of inter-processor communication and hence parallel performance.
The classical measure of partition quality based on the number of cut edges in the mesh partition can be inadequate for real parallel machines. Consideration of the topology of the parallel machine in the mesh partition is demonstrated to be a more significant factor than the number of cut edges in the achieved parallel efficiency. It is shown to be advantageous to allow an increase in the volume of communication in order to achieve an efficient mapping dominated by localised communications. The limitation to parallel performance resulting from communication startup latency is clearly revealed together with strategies to minimise the effect.
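To make the distinction between cut edges and machine topology concrete, here is a small sketch. The cost model is an assumption chosen for illustration, not the model used in the thesis: each pair of communicating partitions pays a fixed startup `latency` plus a per-item cost, multiplied by the number of hops between the two processors. The function names, the chain topology and the numeric constants are all hypothetical.

```python
from collections import Counter


def cut_edges(edges, part):
    """Count mesh edges whose endpoints lie in different partitions."""
    return sum(1 for a, b in edges if part[a] != part[b])


def pair_volumes(edges, part):
    """Cut edges per unordered pair of partitions (a proxy for message size)."""
    vol = Counter()
    for a, b in edges:
        pa, pb = part[a], part[b]
        if pa != pb:
            vol[(min(pa, pb), max(pa, pb))] += 1
    return vol


def comm_cost(edges, part, hops, latency=50.0, per_item=1.0):
    """Assumed cost: hops * (startup latency + per_item * volume) per pair."""
    return sum(
        hops(p, q) * (latency + per_item * v)
        for (p, q), v in pair_volumes(edges, part).items()
    )


def chain_hops(p, q):
    """Hop count on an assumed chain of processors 0 - 1 - 2."""
    return abs(p - q)


# A six-node path mesh, partitioned two ways onto three processors.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
mapped_well = [0, 0, 1, 1, 2, 2]    # neighbouring parts on neighbouring processors
mapped_badly = [0, 0, 2, 2, 1, 1]   # same parts, different processor assignment

cuts = (cut_edges(edges, mapped_well), cut_edges(edges, mapped_badly))  # (2, 2)
cost_well = comm_cost(edges, mapped_well, chain_hops)    # 102.0
cost_badly = comm_cost(edges, mapped_badly, chain_hops)  # 153.0
```

Both partitions cut the same two edges, so the classical measure cannot tell them apart, yet under this assumed model the second mapping costs more because one message must cross two links. With a large `latency` term, the number of messages also matters more than their size, which is the startup effect the abstract refers to.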
The generic application of the techniques to other unstructured mesh codes is discussed in the context of automation of the parallelisation process. Automation of parallelisation based on the developed strategies is presented as possible through the use of run-time inspector loops that accurately determine the dependencies defining the necessary inter-processor communication.
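The general idea of a run-time inspector loop can be sketched as follows. This is an illustration of the inspector concept under stated assumptions, not the implementation from the thesis: the loop walks the same indirect addressing pattern that the computation would use (here, element-to-node connectivity) and records which referenced nodes are owned by another processor. The names `inspector`, `connectivity`, `owner` and `rank` are hypothetical.

```python
def inspector(my_elements, connectivity, owner, rank):
    """Record, per owning processor, the off-processor nodes this rank reads.

    my_elements  -- element ids assigned to this rank
    connectivity -- mapping from element id to the node ids it references
    owner        -- owner[node] is the rank that owns that node
    rank         -- the rank running the inspector
    """
    needed = {}
    for e in my_elements:
        for node in connectivity[e]:
            o = owner[node]
            if o != rank:
                needed.setdefault(o, set()).add(node)
    # Sorted lists give a deterministic receive schedule.
    return {o: sorted(nodes) for o, nodes in needed.items()}


# Three triangular elements sharing nodes, split over two ranks.
connectivity = {0: (0, 1, 2), 1: (1, 2, 3), 2: (2, 3, 4)}
owner = [0, 0, 0, 1, 1]

recv_rank0 = inspector([0, 1], connectivity, owner, rank=0)  # {1: [3]}
recv_rank1 = inspector([2], connectivity, owner, rank=1)     # {0: [2]}
```

Because the schedule is derived from the actual index arrays at run time, it does not depend on static analysis of the mesh structure, which is what makes the approach a candidate for automation.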
|Item Type:||Thesis (PhD)|
|Uncontrolled Keywords:||parallel computing, computational fluid dynamics, CFD, computer software, applied mathematics|
|Subjects:||Q Science > QA Mathematics > QA76 Computer software; Q Science > QC Physics|
|School / Department / Research Groups:||School of Computing & Mathematical Sciences; School of Computing & Mathematical Sciences > Centre for Numerical Modelling & Process Analysis|
|Last Modified:||27 Sep 2012 12:22|