CS203 Winter 2020 - Assignment 2


Due: Monday, February 10 midnight (11:59:59 pm)

This assignment will be turned in through Github Classroom.

Objective

The goal of assignment 2 is to design and implement various branch predictors, including simple 1-bit and 2-bit predictors, as well as correlating (m,n) predictors. If you have any questions, please post to Campuswire as others may be having the same issue.

For this assigment, we will be using Github Classroom.
You can accept the assignment invitation by clicking the following link: https://classroom.github.com/a/SqsLQ33H
Once you accept the assignment, a private github repository will automatically be created with the starter code.

Since some trace files are fairly large (>50mb), the traces are contained in another repository as a submodule within your assignment repository.
Therefore, you will need to clone your repository and the submodules as follow:

git clone --recurse-submodules https://github.com/UCR-CS203/assignment2-branchpredictor-<username>.git

or

git clone https://github.com/UCR-CS203/assignment2-branchpredictor-<username>.git
cd assignment2-branchpredictor-<username>
git submodule init
git submodule update

Branch Traces

To facilitate the testing of your branch predictor, we will utilize two sets of traces collected from a run of gcc. The traces gcc-10K.txt, and gcc-8M.txt are available in the traces directory. The traces contain ~10 thousand and ~8.5 million entries. The 10K trace can be used to test your simulator, while the 8M trace is representative of an entire gcc execution. A sample of the trace is given below:

48d1f9 T
48d237 N
48d244 N
48d248 T
48be4b T
48bec6 N
48bed2 T
48c402 T
48bf2d T
48ca60 T

Each line in the trace file contains the Program Counter of the branch instruction (in the form of a 24-bit hexadecimal address), and whether that branch was actually taken (T) or not taken (N).

Branchsim

Similar to assignment 1, Branchsim is a trace-driven simulator. Branchsim takes an input trace of branch addresses and branch outcomes, as shown above.

The goal of Branchsim is to be able to simulate the behavior of a (m,n) branch predictor. Recall, that a simple 1-bit predictor is a (0,1) predictor and simple 2-bit predictor is a (0,2) predictor with 0 global history.
Therefore, Branchsim can take in a branch predictor type in the form of (m,n) and simulate it.

The command-line parameters for Branchsim are:

  • -i : The input trace file
  • -m : The global history size
  • -n : n-bit predictor
  • -a : The number of address bits (LSB) used to index into the branch history table
  • -d: The debug flag prints out the results of each branch prediction

To simplify the assignment, we can make the following constraints:

  • Use the last 8 bits of the PC address to index into the branch history table (a=8).

Extending Branchsim

Currently, Branchsim only implements a simple branch predictor that keeps track of an 6-bit global history (m=6) and simply uses the last observed branch outcome (the last bit) as the next prediction.
Your goal for this assignment is to extend Branchsim in order to support the functional simulation of a (m,n) branch predictor.

Code walkthrough

The main simulation loop simply grabs the next branch in the input trace and then calls the branch predictor to make a prediction. The address is returned as a string and the expected branch outcome is a bool where 1 is taken and 0 is not taken.
main.cpp, L43-L49

    // Execute trace through Predictor
    while(trace.getNextBranch(address,expected)){
        predicted = pred.makePrediction(address,expected);
        if(debug){
            printf("Branch address: 0x%s, predicted: %d, expected: %d\n", address.c_str(), predicted, expected);
        }
    }

The branch predictor (predictor.cpp) makes a prediction with makePrediction() (predictor.cpp, L25-L44) by first converting the address from a hexadecimal string to an unsigned int. Note that this address isn't currently used in the starter code. However, you will be using it for your implementation.
predictor.cpp, L26-L27

    // Convert Hex address to integer address
    unsigned int address = truncateAddress(hexToInt(input));

Next, the predictor checks for the last observed branch outcome and uses this for the prediction. This is achieved by and masking the last bit of the globalHistory. The globalHistory contains a history of previously observed branch outcomes, with each bit representing a prior branch. Again, 0 represents a not taken branch and a 1 represents a taken branch.
predictor.cpp, L29-L32

    // Currently, this simple branch predictor simulator simply takes 
    // the previous observed branch direction as the next prediction.
    // Predict branch based on last observed branch
    bool predicted = globalHistory & 1; 

Once we make the prediction, we update the state of the branch predictor. In this baseline case, we only need to update the global history with updateGlobalHistory(). This function left shift the branch history and then adds the outcome of the latest branch to the least significant bit. Then we trim the globalHistory to the size of the history by and masking.
predictor.cpp, L18-L23

void Predictor::updateGlobalHistory(bool expected){
    globalHistory = globalHistory << 1;
    globalHistory = globalHistory | expected;
    unsigned int mask = (1 << this->historyBits) - 1;
    globalHistory = globalHistory & mask; 
}

Extra credit

In reality, simulators should be flexible and configurable. All parameters inthis assignment should be configurable. Add support in your simulator for any number of LSBs (max 12) from the PC address to index into the BHT.

Submission

Please include a printout of your simulation run when executing gcc-10K.txt and gcc-8M.txt.
In addition, answer the following questions by adding a report document in the repository.

Please name your report FirstName-LastName.pdf or FirstName-LastName.txt or FirstName-LastName.docx, etc. Commit and push your completed code to the github repository. (You only need to modify pipeline.cpp.)

Please also include your name in the report.

Questions

  1. Assuming a simple 2-bit branch predictor that uses 8 bits from the PC to index into the BHT, how many entries does each BHT have? What is the size of the BHT (in bits)? Note: This question is conceptual and can be done by hand.

  2. Assuming a (4,2) predictor, how many total entries does this predictor have? What is the total size of this correlating predictor (in bits)? Note: This question is conceptual and can be done by hand.

  3. What is the misprediction rate for the given traces and predictor?

Trace (0,1) (0,2) (4,1) (4,2) (6,1) (6,2)
gcc-10K.txt            
gcc-8M.txt            
  1. Is every entry in our branch predictor utilized? For our (4,1) predictor, how many entries are utilized? In order to answer this question, you will have to instrument your simulator to keep track of how many entries are used. Please update the following print statement in the printStats() function in predictor.cpp to print out the result:

    // Update the following line to print out the number of BHT entires used.
    printf("BHT used %lu entries\n",0); 
  2. How does the global branch history help improve branch prediction rates? Will all applications benefit from using global branch history?

  3. What is a local predictor? How does it help with branch prediction? Can it be combined with the predictors that we implemented in assignment 2?