Reproduction of Numbers quoted in OpenCL Programming Guides by AMD and NVIDIA
Matthias Bach 795b15023e Regroup strides by offset from 64 bytes when plotting. 9 years ago
.gitignore Refactoring 10 years ago
README.md Script renaming 10 years ago
bandwidth.py Refactoring 10 years ago
kernels.cl Benchmark differend strides in SOA kernels. 10 years ago
runner.py Benchmark differend strides in SOA kernels. 10 years ago
sweepMemSize.py Script renaming 10 years ago
sweepStride.py Regroup strides by offset from 64 bytes when plotting. 9 years ago


OpenCL Bandwidth Measurements

This is a collection of OpenCL kernels intended to reproduce the global memory performance numbers given in the AMD Accelerated Parallel Processing OpenCL™ Programming Guide and NVIDIA's OpenCL Best Practices Guide. The kernels are wrapped by Python scripts that handle all the boilerplate and the actual measurements.
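As a rough illustration of what such a measurement boils down to, the sketch below converts a byte count and a pair of timestamps into an effective bandwidth. It is not taken from the scripts in this repository: the function name `bandwidth_gb_per_s` and the example numbers are hypothetical. It assumes the timestamps are in nanoseconds, as OpenCL event profiling reports them, and that a copy kernel reads and writes every byte of the buffer once.

```python
def bandwidth_gb_per_s(bytes_moved: int, start_ns: int, end_ns: int) -> float:
    """Return the effective bandwidth in GB/s (1 GB = 1e9 bytes).

    Hypothetical helper for illustration; not part of this repository.
    """
    elapsed_ns = end_ns - start_ns
    if elapsed_ns <= 0:
        raise ValueError("end_ns must be greater than start_ns")
    # Bytes per nanosecond is numerically equal to gigabytes per second.
    return bytes_moved / elapsed_ns


if __name__ == "__main__":
    # Assumed example values: a 64 MiB buffer copied in 1 ms.
    buffer_bytes = 64 * 1024 * 1024
    # A copy kernel reads the buffer once and writes it once.
    copy_bytes = 2 * buffer_bytes
    print(f"{bandwidth_gb_per_s(copy_bytes, 0, 1_000_000):.2f} GB/s")
```

Counting both the read and the write traffic is the usual convention when quoting copy bandwidth. A kernel that only reads or only writes would count the buffer size once.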

Requirements

Usage

Run any script with --help to see its invocation options. The following scripts are available:

  • bandwidth.py - Compare the bandwidth of multiple kernels for a given memory size.
  • sweepMemSize.py - Check the performance of a single kernel over a range of memory sizes.