Using Machine Descriptors to Select Parallelization Models and Strategies on Hierarchical Systems

Mark Yankelevsky, Walden Ko, Dimitrios S. Nikolopoulos, and Constantine Polychronopoulos
CSRD/University of Illinois

Abstract: Clusters present the programmer with a complex hierarchy of hardware components that exploit different levels of parallelism. The optimal parallelization strategy depends on several parameters, such as the number of nodes, the number of processors per node, memory and communication bandwidth, and the overhead of orchestrating parallelism. A compiler that combines a detailed machine descriptor with static performance analysis can automate the selection of the best strategy. Experiments with the NAS benchmarks, parallelized using a combination of MPI and OpenMP, revealed performance patterns that drive this selection. The results and the algorithms derived from them are presented in the poster and incorporated into the machine description of the PROMIS compiler (HTTP://promis.csrd.uiuc.edu).
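
To make the idea of descriptor-driven selection concrete, the following is a minimal sketch, not the PROMIS implementation and not the algorithms derived from the NAS experiments. The `MachineDescriptor` fields, the `estimate_time` cost model, and the three candidate strategies are all hypothetical names and assumptions introduced for illustration; the cost model is a toy that charges ideal compute time plus linear overhead terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineDescriptor:
    """Hypothetical descriptor; field names are illustrative, not PROMIS's."""
    nodes: int                 # number of cluster nodes
    procs_per_node: int        # processors sharing memory within one node
    msg_overhead: float        # assumed cost added per extra MPI rank (seconds)
    fork_join_overhead: float  # assumed cost added per extra OpenMP thread (seconds)


def estimate_time(work: float, ranks: int, threads: int,
                  m: MachineDescriptor) -> float:
    """Toy static cost model: ideal speedup plus linear overhead terms."""
    compute = work / (ranks * threads)
    comm = m.msg_overhead * (ranks - 1)
    fork = m.fork_join_overhead * (threads - 1)
    return compute + comm + fork


def select_strategy(work: float, m: MachineDescriptor) -> str:
    """Return the candidate strategy with the lowest estimated time."""
    # Each candidate maps to an assumed (MPI ranks, OpenMP threads per rank) layout.
    candidates = {
        "mpi": (m.nodes * m.procs_per_node, 1),  # one rank per processor
        "openmp": (1, m.procs_per_node),         # threads on a single node
        "hybrid": (m.nodes, m.procs_per_node),   # one rank per node, threads inside
    }
    return min(
        candidates,
        key=lambda name: estimate_time(work, *candidates[name], m),
    )
```

Under these assumptions, a descriptor with expensive messaging and cheap thread creation favors the hybrid layout, while cheap messaging and expensive fork/join favors pure MPI. A real compiler would replace the toy model with measured bandwidth and overhead parameters.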