A4 Refereed article in a conference publication
A novel multi-level integrated roofline model approach for performance characterization
Authors: Tuomas Koskela, Zakhar Matveev, Charlene Yang, Adetokunbo Adedoyin, Roman Belenov, Philippe Thierry, Zhengji Zhao, Rahulkumar Gayatri, Hongzhang Shan, Leonid Oliker, Jack Deslippe, Ron Green, Samuel Williams
Editors: Rio Yokota, Michèle Weiland, David Keyes, Carsten Trinitis
Conference name: International Conference on High Performance Computing
Publisher: Springer Verlag
Publication year: 2018
Journal: Lecture Notes in Computer Science
Book title: High Performance Computing
Journal name in source: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Series title: Lecture Notes in Computer Science
Volume: 10876
First page: 226
Last page: 245
Number of pages: 20
ISBN: 978-3-319-92039-9
eISBN: 978-3-319-92040-5
ISSN: 0302-9743
DOI: https://doi.org/10.1007/978-3-319-92040-5_12
Self-archived copy’s web address: https://research.utu.fi/converis/portal/detail/Publication/32081756
As energy-efficient architectures, including accelerators and many-core processors, gain traction, application developers face the challenge of optimizing their applications for multiple hardware features, including many-core parallelism, wide vector processing units and on-chip high-bandwidth memory. In this paper, we discuss the development and use of a new application performance tool based on an extension of the classical roofline model that profiles multiple levels of the cache-memory hierarchy simultaneously. The tool gives the developer a powerful visual aid and can be used to frame the many-dimensional optimization problem in a tractable way. We present case studies of real scientific applications in which the Integrated Roofline Model yielded performance insights.
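The idea behind a multi-level roofline can be illustrated with a minimal sketch. It is not taken from the paper's tool. It assumes the classical roofline bound, attainable performance = min(peak compute, arithmetic intensity × bandwidth), and applies it once per memory level using the data traffic observed at that level. The names `MemoryLevel`, `roofline_bound`, `multilevel_bounds` and `limiting_level`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryLevel:
    name: str
    bandwidth_gbs: float  # assumed peak bandwidth of this level, in GB/s
    traffic_gb: float     # assumed data moved through this level, in GB


def roofline_bound(peak_gflops: float, intensity: float, bandwidth_gbs: float) -> float:
    """Classical roofline: performance is capped by compute or by memory."""
    return min(peak_gflops, intensity * bandwidth_gbs)


def multilevel_bounds(peak_gflops: float, gflop: float, levels: list) -> dict:
    """Evaluate one roofline bound per memory level.

    The arithmetic intensity differs per level because each level
    sees a different amount of traffic for the same floating-point work.
    """
    bounds = {}
    for level in levels:
        intensity = gflop / level.traffic_gb  # FLOP per byte at this level
        bounds[level.name] = roofline_bound(peak_gflops, intensity, level.bandwidth_gbs)
    return bounds


def limiting_level(bounds: dict) -> str:
    """Return the level with the lowest bound, i.e. the likely bottleneck."""
    return min(bounds, key=bounds.get)


# Hypothetical kernel: 100 GFLOP of work on a machine with 1000 GFLOP/s peak.
levels = [
    MemoryLevel("L1", bandwidth_gbs=4000.0, traffic_gb=200.0),
    MemoryLevel("L2", bandwidth_gbs=2000.0, traffic_gb=100.0),
    MemoryLevel("DRAM", bandwidth_gbs=100.0, traffic_gb=20.0),
]
bounds = multilevel_bounds(peak_gflops=1000.0, gflop=100.0, levels=levels)
print(bounds)
print("Limiting level:", limiting_level(bounds))
```

Under these assumed numbers the per-level bounds differ, which is what makes a simultaneous view of all levels useful: the level with the lowest bound points to where optimization effort is most likely to pay off.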
Downloadable publication: This is an electronic reprint of the original article.