What is Branch Prediction

The branch prediction unit, as the name suggests, needs to perform a basic task—branch prediction. Before delving into the branch prediction unit, it is necessary to understand what branch prediction is.

Why Do We Need Branch Prediction?

Branch prediction is needed for two main reasons: the program’s execution flow contains branch instructions, and high-performance processors use a pipelined design.

Program’s Execution Flow Contains Branch Instructions

int x = 10;
int y = 20;
int result = 0;

if (x >= y) {
    result = x + y;
} else {
    result = x - y;
}

The above is a piece of C code. It first defines three variables x, y, and result, and then assigns a value to result based on the comparison of x and y. The first three lines assign values to the variables in sequence. At the 5th line, however, the if statement makes execution branch: because x >= y is false here, the program jumps directly from the 5th line to the 8th line and continues from there.

After translating into RISC-V assembly code, the code is as follows:

li  a0, 10               # x = 10
li  a1, 20               # y = 20
li  a2, 0                # result = 0

blt a0, a1, else_branch  # Jump to else_branch if x < y
add a2, a0, a1           # Execute result = x + y
j end                    # Jump to end
else_branch:
sub a2, a0, a1           # Execute result = x - y
end:

It can be seen that the program still maintains the previous branching behavior. In the first three lines of the code, instructions are executed in sequence. Then, in the 5th line of the program, a special instruction blt appears, which we call a branch instruction. Based on the relationship between x and y, it decides whether execution continues with the next instruction or jumps to else_branch, and the appearance of this instruction causes a branch in the program’s execution.

High-performance Processors Use Pipeline Design

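In a pipelined processor, the next instruction is fetched before the current one has finished executing. The outcome of a branch instruction, however, is only known once the branch has executed, so without further help the processor would either have to stall instruction fetch until then or risk fetching instructions from the wrong path.

Therefore, the concept of branch prediction arises. If we can accurately predict the address of the next instruction before the execution result is generated, the processor can continue to run efficiently.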

Feasibility of Branch Prediction

Why can branch prediction be done? Let’s look at an example:

if (data >= 128)
    sum += data;

Assuming that this instruction will be executed repeatedly, and data is incremented from 0, i.e., data = 0, 1, 2, 3 … 128, 129…, let’s analyze the behavior of executing this instruction each time.

T = branch taken
N = branch not taken

data   = 0, 1, 2, 3, 4, ... 126, 127, 128, 129, 130, ... 250, 251, 252, ...
branch = N  N  N  N  N  ...   N    N    T    T    T  ...   T    T    T  ...

       = NNNNNNNNNNNN ... NNNNNNNTTTTTTTTT ... TTTTTTTTTT  (easy to predict)

It can be seen that in the first 128 times, the branch is always Not Taken (the condition is not met), but after 128 times, the branch is always Taken (the condition is met). If we predict whether it is Taken based on whether it was Taken last time, we will only make one prediction error throughout the prediction process.

The occurrence of the above phenomenon is due to a basic fact—whether a branch instruction jumps is related to the past jumping behavior of that instruction. By summarizing the past jumping rules, we can make a relatively accurate prediction for this jump, which also makes branch prediction possible.

In fact, the jump of branch instructions is also related to factors such as the jumping situation of other branch instructions. Fully exploiting effective information to produce accurate prediction results is one of the main tasks of branch prediction.
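The "same as last time" rule from the example above can be checked with a short simulation. This is only a sketch: the function name is made up for illustration, and it assumes the very first prediction is "not taken".

```c
#include <stdbool.h>

/* Count mispredictions of a "same as last time" predictor for the branch
 * `if (data >= 128)` while data runs from 0 to n - 1.
 * Assumption: before any outcome is known, the prediction is "not taken". */
int count_last_outcome_mispredictions(int n)
{
    bool last_taken = false;          /* assumed initial state */
    int mispredictions = 0;

    for (int data = 0; data < n; data++) {
        bool predicted = last_taken;  /* predict what happened last time */
        bool actual = (data >= 128);  /* real outcome of the branch */
        if (predicted != actual) {
            mispredictions++;
        }
        last_taken = actual;          /* remember the outcome for next time */
    }
    return mispredictions;
}
```

For data = 0 … 255, the only mismatch occurs at data = 128, where the pattern switches from N to T, so the function returns 1.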

Basic Types of Branch Prediction

In RISC-V, branch instructions include two types:

- Conditional Branch Instructions (beq, bne, blt, bge, bltu, bgeu): For these instructions, whether to jump is determined by the condition in the instruction, and the jump target can be obtained from the instruction. Therefore, we need to predict whether the instruction will jump.
- Unconditional Jump Instructions (jal, jalr): For these instructions, the jump is always executed when encountered, but the jump target may be specified by a register. Therefore, we need to predict the jump target of the instruction.

Fortunately, due to the concise design of the RISC-V architecture, we do not need to handle conditional jumps whose target comes from a register. Every instruction whose target we need to predict from a register value is unconditional, which is also convenient for our design.

From the above analysis, we can summarize the two basic types of branch prediction—direction prediction and target address prediction.

Direction Prediction of Branch Instructions

Direction prediction of branch instructions corresponds to conditional branch instructions in RISC-V instructions. We need to predict whether it needs to jump, which is called direction prediction.

Two-Bit Saturating Counters

Direction prediction has a very simple and efficient prediction method called the two-bit saturating counter. The basic idea can be seen in the figure below.

The two-bit saturating counter is regarded as a state machine, and we maintain such a state machine for each branch instruction. When a branch instruction is taken, the corresponding state in the diagram moves to the right; otherwise, it moves to the left. So, the next time we encounter this branch instruction, we first look up its two-bit saturating counter. If the state is more biased to the right, we predict it to be taken; otherwise, we predict it not to be taken.

Of course, it’s impractical to maintain a two-bit saturating counter for each branch instruction. Therefore, in practice, we usually use part of the PC or a hash method to index the two-bit saturating counter, as shown in the diagram below.
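As a code sketch of the idea above, the following keeps a small table of two-bit saturating counters indexed by low PC bits. The table size, the function names, and the choice of index bits are assumptions made for illustration, not the design of any particular processor.

```c
#include <stdbool.h>
#include <stdint.h>

#define COUNTER_TABLE_BITS 10                        /* assumed: 1024 counters */
#define COUNTER_TABLE_SIZE (1u << COUNTER_TABLE_BITS)

/* Each counter holds 0..3: 0 and 1 predict not taken, 2 and 3 predict taken.
 * The table is zero-initialized, so every counter starts at "strongly not taken". */
static uint8_t counter_table[COUNTER_TABLE_SIZE];

/* Index with low PC bits. Bit 0 is dropped because RISC-V instructions
 * are at least 2-byte aligned. Different PCs may share one counter. */
static uint32_t counter_index(uint64_t pc)
{
    return (uint32_t)(pc >> 1) & (COUNTER_TABLE_SIZE - 1u);
}

/* Predict taken when the counter is in the upper half of its range. */
bool predict_taken(uint64_t pc)
{
    return counter_table[counter_index(pc)] >= 2;
}

/* Move the counter toward "taken" or "not taken", saturating at 3 and 0. */
void update_counter(uint64_t pc, bool taken)
{
    uint8_t *c = &counter_table[counter_index(pc)];
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
}
```

Because the counter saturates, a single unusual outcome in a long run of identical outcomes does not flip the prediction; it takes two consecutive outcomes in the other direction to do so.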

Branch History

Branch history is one of the most commonly used kinds of data in branch prediction and the basis of most branch prediction algorithms, because it directly records the past jumping behavior of instructions.

There are two basic types of branch history:

Local Branch History: Maintain a set of registers for each branch instruction, recording the historical jumping behavior of that instruction.
- For example: 0101000000101 (0 means not taken, 1 means taken)
Global Branch History: All instructions share a set of registers, recording the branching behavior during program execution.
- For example:
```
    beq a0, a1, label1          not taken  record 0
    bne a1, a2, label2          not taken  record 0
->  beq a2, a3, label4          taken      record 1
```
  After executing these three different branch instructions, the global branch history becomes 001.
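A history register of this kind is commonly kept as a shift register. The sketch below shows one possible update step; the function name and the fixed history length are assumptions for illustration. The same routine could serve local history (one register per branch) or global history (one shared register).

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift one branch outcome into a history register. The newest outcome
 * becomes the lowest bit, and outcomes older than `length` bits fall off. */
uint32_t push_history(uint32_t history, bool taken, unsigned length)
{
    uint32_t mask = (length >= 32) ? 0xFFFFFFFFu : ((1u << length) - 1u);
    return ((history << 1) | (taken ? 1u : 0u)) & mask;
}
```

Starting from an empty 3-bit history and pushing not taken, not taken, taken yields the bit pattern 001, matching the example above.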

Branch Target Address Prediction

In the RISC-V architecture, branch target address prediction refers to predicting the target address of unconditional jump instructions (e.g., jal, jalr). Since these instructions always perform a jump operation, we need to predict their target address.

Branch Target Buffer (BTB)

BTB is a common method for predicting target addresses. Its core idea is to use a cache to store the target addresses of past unconditional jump instructions. When encountering the same unconditional jump instruction again, the BTB can be checked to see if there is a record for that instruction. If so, the recorded target address is used as the predicted target address for the current execution.
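The lookup and update just described can be sketched as a small direct-mapped table. The entry layout, the table size, and the use of the full PC as the tag are simplifying assumptions for illustration; real designs typically store partial tags and use set-associative structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define BTB_ENTRIES 256u   /* assumed size, must be a power of two */

typedef struct {
    bool     valid;
    uint64_t tag;      /* full PC stored as the tag, for simplicity */
    uint64_t target;   /* target address observed last time */
} BtbEntry;

static BtbEntry btb[BTB_ENTRIES];

static uint32_t btb_index(uint64_t pc)
{
    return (uint32_t)(pc >> 1) & (BTB_ENTRIES - 1u);
}

/* Returns true on a hit and writes the predicted target to *target. */
bool btb_lookup(uint64_t pc, uint64_t *target)
{
    const BtbEntry *e = &btb[btb_index(pc)];
    if (e->valid && e->tag == pc) {
        *target = e->target;
        return true;
    }
    return false;
}

/* Record the actual target once the jump has executed,
 * replacing whatever entry previously occupied the slot. */
void btb_update(uint64_t pc, uint64_t actual_target)
{
    BtbEntry *e = &btb[btb_index(pc)];
    e->valid = true;
    e->tag = pc;
    e->target = actual_target;
}
```

The tag comparison matters: two different PCs can map to the same slot, and without the tag the BTB would return the target of an unrelated jump.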

The diagram below illustrates this:

Predicting Instruction Types

As we know, in branch prediction, for conditional branch instructions, we need to predict their direction, and for unconditional jump instructions, we need to predict their target. However, there is a problem: when we get a PC that needs to be predicted, we don’t know whether the corresponding instruction is a normal instruction or a branch instruction, so we do not know which kind of prediction, if any, to make.

How to solve this? One way is to predict the behavior of the instruction after fetching it. But fetching from ICache or Memory may take several cycles, which is a major drawback of this method.

A better way is to directly predict the type of instruction. After getting a PC, we can directly predict whether this instruction is a branch instruction and predict its behavior. In this way, we don’t have to wait for fetching to complete, and the predicted result can also guide the CPU to fetch from the correct location.

The method of type prediction can be similar to BTB, where a field in the cache contains the type of instruction for use in the next prediction.

General Steps of Branch Prediction

Through the introduction in this section, we can summarize the general steps of branch prediction:

1. Get the PC.
2. Predict whether it is a branch instruction.
   - If it is a conditional branch instruction, predict its direction and target.
   - If it is an unconditional jump instruction, predict its target.

Note that because the prediction is made from the PC alone, before the instruction itself has been fetched, we cannot read the jump target from the instruction. Predicting the target of a conditional branch instruction therefore also becomes our task.
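The steps above can be sketched as a single table that stores a predicted type, a target, and a two-bit counter per entry. All names, the table size, and the initial counter value are assumptions for illustration; this is a minimal model of the flow, not the structure of an actual branch prediction unit.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRED_ENTRIES 256u   /* assumed size, power of two */

typedef enum { INSN_NOT_BRANCH, INSN_COND_BRANCH, INSN_UNCOND_JUMP } InsnType;

typedef struct {
    bool     valid;
    uint64_t tag;       /* PC of the recorded instruction */
    InsnType type;      /* predicted instruction type */
    uint64_t target;    /* predicted target address */
    uint8_t  counter;   /* two-bit saturating counter for conditional branches */
} PredEntry;

static PredEntry pred_table[PRED_ENTRIES];

static PredEntry *pred_slot(uint64_t pc)
{
    return &pred_table[(pc >> 1) & (PRED_ENTRIES - 1u)];
}

/* Predict the next fetch address from the PC alone.
 * `fallthrough` is the sequentially next address, supplied by the caller. */
uint64_t predict_next_pc(uint64_t pc, uint64_t fallthrough)
{
    const PredEntry *e = pred_slot(pc);
    if (!e->valid || e->tag != pc) {
        return fallthrough;               /* no record: assume not a branch */
    }
    switch (e->type) {
    case INSN_COND_BRANCH:                /* direction and target */
        return (e->counter >= 2) ? e->target : fallthrough;
    case INSN_UNCOND_JUMP:                /* target only */
        return e->target;
    default:
        return fallthrough;
    }
}

/* Train the table with the resolved type, direction, and target of the
 * instruction at `pc`. `target` is the jump target even if not taken. */
void train(uint64_t pc, InsnType type, bool taken, uint64_t target)
{
    PredEntry *e = pred_slot(pc);
    if (!e->valid || e->tag != pc) {      /* allocate a fresh entry */
        e->valid = true;
        e->tag = pc;
        e->counter = 1;                   /* assumed start: weakly not taken */
    }
    e->type = type;
    e->target = target;
    if (taken) {
        if (e->counter < 3) e->counter++;
    } else {
        if (e->counter > 0) e->counter--;
    }
}
```

A PC with no matching entry is treated as a normal instruction, which mirrors the type-prediction idea: the table itself answers the question of whether the instruction is a branch.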

Last modified September 13, 2024: Update the picture of BPU Top. (431c050)