Assignment 4 - Histogram

Due Monday, December 4 at 11:59:59 PM PST


The objective of this assignment is to implement a histogram kernel and demonstrate the usage of atomic operations.


  1. For this lab, we will be using Github Classroom.
    Please join the classroom by clicking the following link: Once you join the classroom, a private github repository will automatically be created with the starter code.
    Simply git clone to copy the starter code to Bender or the GPGPU-Sim container.

  2. Edit the source files to complete the functionality of histogram on the device. The size of input and number of bins can be any size, but we will not test your code with bin size larger than 4096 or size of input greater than 1M.
    Please use a block size of 512.

  3. There are three modes of operation for the application. Check main() for a description of the modes (repeated below). You will support each of these modes using Histogram with a privitization pattern.

    • No arguments: The application will create a randomly initialized vector of 1 million elements and 4096 bins. After the device histogram is invoked, it will compute the correct solution matrix using the CPU, and compare that solution with the device-computed solution. If it matches (within a certain tolerance), if will print out "Test PASSED" to the screen before exiting.
    • One argument: The application will create a randomly initialized vector of specificed size. The number of bins will be 4096.
    • Two arguments: The first argument will specify the data input size, and the second argument specify the number of bins.
  4. Commit and push your completed histogram code to the private repository.

GPGPU-Sim analysis

There is no GPGPU-Sim component to this assignment. Yay!

Answer the following questions:

Assume you perform a histogram operation on an array of size N, 16 blocks, block size of B and 1024 bins. Assume a privatized histogram implementation.

  1. How many global memory reads are performed during the histogram kernel?

  2. How many global memory writes are performed during the histogram kernel?

  3. How many total atomic operations are performed?


  1. Commit and push your completed histogram code to the github repository. (You only need to modify
  2. Answer the previous questions by including a report document in the repository in either PDF, Plain text, or Markdown format only. Please name your report FirstName-LastName.pdf or FirstName-LastName.txt or, etc.

    Please also include your name in the report.