A1 Journal article – refereed
Performance tuning and sparse traversal technique for a cell-based fetch length algorithm on a GPU

List of Authors: Mika Murtojärvi, Olli S. Nevalainen, Ville Leppänen
Publisher: Wiley
Publication year: 2015
Journal: Concurrency and Computation: Practice and Experience
Volume number: 27


For determining distances (fetch lengths) from points to polygons in a two-dimensional Euclidean plane, cell-based algorithms provide a simple and effective solution. They divide the input area into a grid of cells that cover the area. The objects are stored into the appropriate cells, and the resulting structure is used for solving the problem. When the input objects are distributed unevenly or the cell size is small, most of the cells may be empty. The representation is then called sparse. In the method proposed in this work, each cell contains information about its distance to the nonempty cells. It is then possible to skip over several empty cells at a time without memory accesses. A cell-based fetch length algorithm is implemented on a graphics processing unit (GPU). Because control flow divergence reduces its performance, several methods to reduce the divergence are studied. While many of the explicit attempts turn out to be unsuccessful, sorting of the input data and sparse traversal are observed to greatly improve performance: compared with the initial GPU implementation, up to 45-fold speedup is reached. The speed improvement is greatest when the map is very sparse and the points are given in a random order.

Last updated on 2019-29-01 at 11:02