riscv-boom's loop

The Loop Predictor specializes in predicting loop exit branches. It learns a loop's iteration count and predicts the exit once the learned count is reached.

Structure:

Small table (16 sets by default)
Each entry stores: tag, confidence, age, predicted count, speculative count
When speculative count matches predicted count with high confidence,
predict the opposite of the base predictor (i.e., exit the loop)

Update Policy:

On misprediction at loop exit: update predicted count, adjust confidence
Age-based replacement for conflicting entries

Loop Predictor

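Role

Detects loops with a fixed trip count
Corrects TAGE's prediction at the loop boundary
Handles patterns such as for (i = 0; i < N; i++)

Core problem

Where TAGE falls short:

for (int i = 0; i < 100; i++) {
    if (i < 99) {  // the branch in question
        // taken 99 times
    } else {
        // not taken once
    }
}

History as seen by TAGE: TTTTTTT...TTTN (99 T, 1 N)

99% of the history is T, so TAGE leans strongly toward Taken
On the 100th execution (which should be Not Taken), TAGE may mispredict

What the loop predictor does:

Remembers that this branch goes Not Taken on its 100th execution, after 99 Taken outcomes
Flips TAGE's prediction on that 100th execution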
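
To make the limitation concrete, the sketch below replays the branch against a single 2-bit saturating counter. This is a deliberately simplified stand-in for a counter-based predictor, not a model of TAGE itself; the function name and its parameters are illustrative assumptions.

```python
def count_mispredictions(trip_count: int, runs: int) -> int:
    """Replay a loop branch that is taken (trip_count - 1) times and then
    not taken once, repeated `runs` times, against a 2-bit counter."""
    counter = 3  # range 0..3; values >= 2 predict Taken (start strongly Taken)
    mispredictions = 0
    for _ in range(runs):
        for i in range(trip_count):
            taken = i < trip_count - 1
            predicted_taken = counter >= 2
            if predicted_taken != taken:
                mispredictions += 1
            # Saturating update toward the actual outcome.
            counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return mispredictions


print(count_mispredictions(trip_count=100, runs=10))  # 10: one miss per loop exit
```

The counter never recovers the exit: every run of the loop costs exactly one misprediction, which is the case the loop predictor targets.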

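Storage structure

Entry definition (loop.scala:31-37):

class LoopEntry extends Bundle {
  val tag   = UInt(10.W)   // PC tag
  val conf  = UInt(3.W)    // confidence (0-7)
  val age   = UInt(3.W)    // age (used for replacement)
  val p_cnt = UInt(10.W)   // predicted loop count (0-1023)
  val s_cnt = UInt(10.W)   // current (speculative) loop count (0-1023)
}

Organization:

16 sets, replicated in bankWidth independent columns
Each column reads a single entry, entries(idx), and compares its tag with the PC tag (a direct-indexed lookup, not a fully associative search)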
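
As a quick sanity check on the sizes, the following sketch adds up the field widths from the bundle above. The helper names are made up for illustration; only the widths and the default of 16 sets come from the text.

```python
# Field widths copied from the LoopEntry bundle above.
FIELD_BITS = {"tag": 10, "conf": 3, "age": 3, "p_cnt": 10, "s_cnt": 10}
N_SETS = 16  # default number of sets per column


def entry_bits() -> int:
    """Storage needed by one LoopEntry, in bits."""
    return sum(FIELD_BITS.values())


def column_bits(n_sets: int = N_SETS) -> int:
    """Storage needed by one column of the table, in bits."""
    return n_sets * entry_bits()


def max_count() -> int:
    """Largest value that the 10-bit p_cnt / s_cnt fields can hold."""
    return (1 << FIELD_BITS["p_cnt"]) - 1


print(entry_bits(), column_bits(), max_count())  # 36 576 1023
```

Loops whose exit count exceeds 1023 do not fit in the 10-bit counters, so they cannot be captured by an entry of this shape.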

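How it works

Prediction stage (loop.scala:81-85):

// Default: pass through the previous predictor's result
io.f3_pred := io.f3_pred_in

// Check for a hit in the loop table
when (entry.tag === current_tag) {
  // Check whether the loop boundary has been reached
  when (s_cnt === entry.p_cnt && entry.conf === 7.U) {
    // Flip the prediction
    io.f3_pred := !io.f3_pred_in
  }
}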
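
The same decision can be written as a small pure function. This is a behavioral sketch of the excerpt above, not the Chisel source; the function and argument names are assumptions chosen for readability.

```python
def loop_predict(base_taken: bool, entry_tag: int, pc_tag: int,
                 s_cnt: int, p_cnt: int, conf: int) -> bool:
    """Return the final prediction given the base predictor's answer."""
    hit = entry_tag == pc_tag
    at_boundary = s_cnt == p_cnt and conf == 7
    # Only a fully trained entry sitting at its learned count may flip.
    return (not base_taken) if (hit and at_boundary) else base_taken


# Trained entry at its learned count: the base prediction is inverted.
print(loop_predict(True, entry_tag=0x2A, pc_tag=0x2A, s_cnt=99, p_cnt=99, conf=7))  # False
```

Note that the predictor inverts whatever the base predictor says rather than forcing Not Taken, so it relies on the base predictor being right about the loop body.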

Update logic (loop.scala:95-106):

// F4 stage (the cycle after the prediction)
when (f4_fire && entry.tag === tag) {
  when (f4_scnt === entry.p_cnt && entry.conf === 7.U) {
    // Loop boundary detected: reset the counter
    entries(idx).s_cnt := 0.U
    entries(idx).age   := 7.U
  } .otherwise {
    // Inside the loop: increment the counter
    entries(idx).s_cnt := f4_scnt + 1.U
    entries(idx).age   := Mux(age === 7.U, 7.U, age + 1.U)
  }
}
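
A minimal software model of this speculative update, assuming a tag hit and f4_fire, is shown below. The names and the trace helper are illustrative; the 10-bit mask mirrors the width of s_cnt.

```python
COUNT_MASK = (1 << 10) - 1  # s_cnt is a 10-bit field, so increments wrap


def f4_update(s_cnt: int, age: int, p_cnt: int, conf: int) -> tuple:
    """Return the (s_cnt, age) pair written back on an F4 tag hit."""
    if s_cnt == p_cnt and conf == 7:
        return 0, 7  # loop boundary: restart the count, refresh the age
    return (s_cnt + 1) & COUNT_MASK, min(age + 1, 7)


def trace(p_cnt: int, conf: int, executions: int) -> tuple:
    """Feed `executions` predictions through f4_update, starting from zero."""
    s_cnt, age, seen = 0, 0, []
    for _ in range(executions):
        seen.append(s_cnt)  # the count each prediction observed
        s_cnt, age = f4_update(s_cnt, age, p_cnt, conf)
    return seen, s_cnt, age


seen, s_cnt, age = trace(p_cnt=99, conf=7, executions=100)
print(seen[-1], s_cnt, age)  # 99 0 7
```

With a trained entry, the 100th prediction sees s_cnt = 99 and the counter returns to 0 for the next run of the loop.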

Learning on a misprediction (loop.scala:114-167):

when (mispredict && !doing_reset) {
  val entry = entries(update_idx)
  val tag_match = entry.tag === update_tag
  val ctr_match = entry.p_cnt === update_scnt

  when (entry.conf === 7.U && tag_match) {
    // Learned, but mispredicted → lower the confidence
    wentry.conf := entry.conf - 1.U
    wentry.s_cnt := 0.U

  } .elsewhen (entry.conf === 7.U && !tag_match) {
    // Learned, tag mismatch → no replacement (protect high-confidence entries)

  } .elsewhen (entry.conf =/= 0.U && tag_match && ctr_match) {
    // Learning, count matches → raise the confidence
    wentry.conf := entry.conf + 1.U
    wentry.s_cnt := 0.U

  } .elsewhen (entry.conf =/= 0.U && tag_match && !ctr_match) {
    // Learning, count mismatch → start learning again
    wentry.conf := 0.U
    wentry.p_cnt := update_scnt  // update the expected loop count
    wentry.s_cnt := 0.U

  } .elsewhen (entry.conf =/= 0.U && !tag_match && entry.age === 0.U) {
    // Learning, tag mismatch, age is zero → replace
    wentry.tag := update_tag
    wentry.conf := 1.U
    wentry.p_cnt := update_scnt
    wentry.s_cnt := 0.U

  } .elsewhen (entry.conf =/= 0.U && !tag_match && entry.age =/= 0.U) {
    // Learning, tag mismatch, age is nonzero → decrement the age
    wentry.age := entry.age - 1.U

  } .elsewhen (entry.conf === 0.U && tag_match) {
    // Uninitialized, tag matches
    when (ctr_match) {
      wentry.conf := 1.U  // start learning
      wentry.s_cnt := 0.U
    } .otherwise {
      wentry.p_cnt := update_scnt  // record the loop count
      wentry.s_cnt := 0.U
    }

  } .elsewhen (entry.conf === 0.U && !tag_match) {
    // Uninitialized: allocate a new entry
    wentry.tag := update_tag
    wentry.conf := 1.U
    wentry.p_cnt := update_scnt
    wentry.s_cnt := 0.U
  }
}
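
The priority chain above can be mirrored in a few lines of Python to check how an entry evolves. This sketch follows the excerpt as printed (it writes only the fields the excerpt writes); the Entry class and function name are assumptions, not identifiers from loop.scala.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    tag: int = 0
    conf: int = 0
    age: int = 0
    p_cnt: int = 0
    s_cnt: int = 0


def update_on_mispredict(e: Entry, tag: int, meta_s_cnt: int) -> Entry:
    """Return the entry written back after a misprediction."""
    tag_match = e.tag == tag
    ctr_match = e.p_cnt == meta_s_cnt
    if e.conf == 7:
        # Trained entry: lose confidence on a hit, stay protected otherwise.
        return replace(e, conf=6, s_cnt=0) if tag_match else e
    if e.conf != 0:
        if tag_match and ctr_match:
            return replace(e, conf=e.conf + 1, s_cnt=0)
        if tag_match:
            return replace(e, conf=0, p_cnt=meta_s_cnt, s_cnt=0)
        if e.age == 0:
            return replace(e, tag=tag, conf=1, p_cnt=meta_s_cnt, s_cnt=0)
        return replace(e, age=e.age - 1)
    if tag_match:
        if ctr_match:
            return replace(e, conf=1, s_cnt=0)
        return replace(e, p_cnt=meta_s_cnt, s_cnt=0)
    return replace(e, tag=tag, conf=1, p_cnt=meta_s_cnt, s_cnt=0)


# Seven exits mispredicted at the same count train an empty entry to conf = 7.
entry = Entry()
for _ in range(7):
    entry = update_on_mispredict(entry, tag=0x2A, meta_s_cnt=99)
print(entry.conf, entry.p_cnt)  # 7 99
```

Under these rules a loop has to exit at the same count several times in a row before the predictor starts overriding, which keeps irregular loops from triggering flips.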

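State machine

Learning process of a loop entry:

[conf=0, uninitialized]
         ↓ first misprediction: record the loop count N
[conf=1, learning, p_cnt=N]
         ↓ mispredicted again at count N
[conf=2, learning]
         ↓ the count keeps matching
      ...
         ↓
[conf=7, learned] ← takes effect and flips the prediction
         ↓ on another misprediction
[conf=6, confidence lowered]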

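Why it works

Example:

for (int i = 0; i < 100; i++) {
    if (i < 99) { ... }  // branch A
}

TAGE's view:

History: TTTTTT...TTTTN (99 T, 1 N)
Prediction: always Taken (99% of the history is T)
100th execution: predicts Taken, actually Not Taken → misprediction

The loop predictor's view:

Record: this branch hits a boundary once every 100 executions (learned p_cnt = 99)
s_cnt: 0→1→2→…→99
When s_cnt=99: boundary detected, TAGE's prediction is flipped
Prediction: Not Taken → correct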

Combinational cascade of the predictors

Implementation

Wiring of all predictors (config-mixins.scala:603-609):

// Initial input (all-zero prediction)
val resp_in = (0.U).asTypeOf(new BranchPredictionBankResponse)

// Combinational chain
ubtb.io.resp_in(0)  := resp_in
bim.io.resp_in(0)   := ubtb.io.resp      // combinational
btb.io.resp_in(0)   := bim.io.resp       // combinational
tage.io.resp_in(0)  := btb.io.resp       // combinational
loop.io.resp_in(0)  := tage.io.resp      // combinational

// Final output
final_resp = loop.io.resp
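
The chaining idea can be illustrated with a toy model in which each stage either refines the incoming response or forwards it unchanged. Everything here is hypothetical (the dictionary tables, make_stage, predict); it only shows the "later stage has the last word" structure, not how the real predictors compute their responses.

```python
def make_stage(name, table):
    """Build a stage that overrides the incoming response when it has an
    entry for the given pc (a stand-in for a real predictor's lookup)."""
    def stage(resp_in, pc):
        if pc in table:
            return {"taken": table[pc], "source": name}
        return resp_in  # no opinion: forward the earlier stage's answer
    return stage


def predict(stages, pc):
    resp = {"taken": False, "source": "default"}  # all-zero initial response
    for stage in stages:  # same order as the wiring above
        resp = stage(resp, pc)
    return resp


stages = [
    make_stage("ubtb", {0x100: True}),
    make_stage("bim", {0x100: True, 0x200: False}),
    make_stage("btb", {}),
    make_stage("tage", {0x200: True}),
    make_stage("loop", {}),
]
print(predict(stages, 0x200))  # {'taken': True, 'source': 'tage'}
```

Because the loop predictor sits last in the chain, its decision is the one that reaches the final response whenever it chooses to override.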

Loop Predictor: Detailed Workflow

1. Architecture Overview


┌────────────────────────────────────────────────────────────────────────┐
│                         Loop Branch Predictor                          │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │ Multi-column structure (bankWidth independent columns)           │  │
│  │                                                                  │  │
│  │    Column 0       Column 1       Column 2    ...  Column N-1     │  │
│  │  ┌───────────┐  ┌───────────┐  ┌───────────┐     ┌───────────┐   │  │
│  │  │entries[16]│  │entries[16]│  │entries[16]│ ... │entries[16]│   │  │
│  │  │ LoopEntry │  │ LoopEntry │  │ LoopEntry │     │ LoopEntry │   │  │
│  │  │ registers │  │ registers │  │ registers │     │ registers │   │  │
│  │  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘     └─────┬─────┘   │  │
│  │        │              │              │                 │         │  │
│  │        └──────────────┴──────────────┼─────────────────┘         │  │
│  │                                      ▼                           │  │
│  │                       ┌─────────────────────────────┐            │  │
│  │                       │     Loop override logic     │            │  │
│  │                       │(flip prediction when conf=7)│            │  │
│  │                       └─────────────────────────────┘            │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                        │
│ Base predictor: io.resp_in(0) supplies the initial prediction          │
│                 (usually from TAGE/BIM)                                │
│ Loop predictor: overrides it once a stable loop pattern is detected    │
└────────────────────────────────────────────────────────────────────────┘

2. Data Structures


┌────────────────────────────────────────────────────────────────────────┐
│                         LoopEntry (loop entry)                         │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│                            LoopEntry Bundle                            │
├─────────────┬────────────┬─────────────┬───────────────┬───────────────┤
│ tag (10b)   │ conf (3b)  │ age (3b)    │ p_cnt (10b)   │ s_cnt (10b)   │
│ address tag │ confidence │ age counter │ pred. count   │ spec. count   │
└─────────────┴────────────┴─────────────┴───────────────┴───────────────┘

Field details:
┌───────────┬────────────────────────────────────────────────────────────┐
│    tag    │ High-order branch-address bits, used for the tag match     │
│           │ Width: 10 bits                                             │
├───────────┼────────────────────────────────────────────────────────────┤
│   conf    │ Confidence counter (0-7)                                   │
│           │ conf = 7: fully learned, may override the prediction       │
│           │ conf = 0: not learned / uncertain                          │
├───────────┼────────────────────────────────────────────────────────────┤
│    age    │ Age counter (0-7), used by the replacement policy          │
│           │ age = 0: may be replaced                                   │
│           │ age = 7: recently active, should not be replaced           │
├───────────┼────────────────────────────────────────────────────────────┤
│   p_cnt   │ Predicted Count: the learned loop iteration count          │
│           │ Range: 0-1023 (10 bits)                                    │
├───────────┼────────────────────────────────────────────────────────────┤
│   s_cnt   │ Speculative Count: iterations executed so far in this run  │
│           │ When s_cnt == p_cnt, the loop exit is predicted            │
└───────────┴────────────────────────────────────────────────────────────┘


LoopMeta (prediction metadata):
┌────────────────────────────────────────────────────────────────────────┐
│                            LoopMeta Bundle                             │
├────────────────────────────────────────────────────────────────────────┤
│                            s_cnt (10 bits)                             │
│             Speculative count recorded at prediction time              │
│               Used to repair state after a misprediction               │
└────────────────────────────────────────────────────────────────────────┘

3. Core Principle of Loop Prediction


┌────────────────────────────────────────────────────────────────────────┐
│                       Loop prediction principle                        │
└────────────────────────────────────────────────────────────────────────┘

Typical loop code:
    for (int i = 0; i < 100; i++) {
        // loop body
    }

Corresponding branch behavior:
    Iteration 0:   Taken  (continue looping)
    Iteration 1:   Taken  (continue looping)
    ...
    Iteration 98:  Taken  (continue looping)
    Iteration 99:  Not Taken (exit the loop)  ← the key point!

Problem: conventional predictors (BIM/TAGE) struggle to predict the final exit,
         because the first 99 Taken outcomes train their counters toward Taken

Loop Predictor solution:
    1. Learn the exact count (p_cnt = 99, the s_cnt value observed at the exit)
    2. Track the current iteration (s_cnt = 0, 1, 2, ... 99)
    3. When s_cnt == p_cnt, flip the base prediction to Not Taken

    Iteration: 0     1     2    ...   98    99
               │     │     │          │     │
    s_cnt:     0     1     2    ...   98    99
               │     │     │          │     │
    p_cnt:    99    99    99    ...   99    99
               │     │     │          │     │
    Match?:   No    No    No    ...   No    Yes!
               │     │     │          │     │
    Action:  pass  pass  pass   ...  pass  flip!

4. Confidence State Machine


┌────────────────────────────────────────────────────────────────────────┐
│                    Confidence (conf) state machine                     │
└────────────────────────────────────────────────────────────────────────┘

                    Confidence levels and behavior
    ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐
    │  0   │  1   │  2   │  3   │  4   │  5   │  6   │  7   │
    │ none │ low  │ low  │ mid  │ mid  │ high │ high │ full │
    └──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘
                                                        │
                                                        ▼
                                    Only conf=7 overrides the base prediction

State transitions:

    ┌────────────────────────────────────────────────────────────────────────┐
    │                                                                        │
    │    new branch                count match                learned        │
    │        │                         │                         │           │
    │        ▼                         ▼                         ▼           │
    │    ┌────────┐ tag+count match┌────────┐ tag+count match┌────────┐      │
    │    │ conf=0 │ ─────────────► │ conf=1 │ ─────────────► │ conf=7 │      │
    │    │ none   │                │ start  │      ...       │ done   │      │
    │    └────────┘                └────────┘                └────────┘      │
    │        ▲                         │                         │           │
    │        │                         │ count mismatch          │ mispredict│
    │        │                         ▼                         ▼           │
    │        │                 p_cnt = new count          conf = conf - 1    │
    │        │                     conf = 0                                  │
    │        │                         │                                     │
    │        └─────────────────────────┘                                     │
    │    full reset when the loop pattern changes                            │
    └────────────────────────────────────────────────────────────────────────┘

5. Prediction Pipeline Timing


┌────────────────────────────────────────────────────────────────────────┐
│                       Prediction pipeline timing                       │
└────────────────────────────────────────────────────────────────────────┘

  Cycle    F2 (lookup)      F3 (predict)      F4 (update)
   │          │                │                 │
   ▼          ▼                ▼                 ▼
 ─────────────────────────────────────────────────────────────────
   │                           │                 │
   │  ┌──────────────────┐     │                 │
   │  │ f2_req_valid     │     │                 │
   │  │ f2_req_idx       │     │                 │
   │  │                  │     │                 │
   │  │ read entries     │─────│                 │
   │  │ f2_entry =       │ Reg │                 │
   │  │ entries(idx)     │     │                 │
   │  │                  │     │                 │
   │  │ bypass check:    │     │                 │
   │  │ repair/mispredict│     │                 │
   │  └──────────────────┘     │                 │
   │                           │                 │
   │                    ┌──────▼──────┐          │
   │                    │ f3_entry    │          │
   │                    │ f3_scnt     │          │
   │                    │ f3_tag      │          │
   │                    │             │          │
   │                    │ tag match?  │          │
   │                    │ tag == f3_tag│         │
   │                    │             │          │
   │                    │ flip when:  │          │
   │                    │ s_cnt==p_cnt│          │
   │                    │ && conf==7  │──────────│
   │                    │             │   Reg    │
   │                    │ outputs:    │          │
   │                    │ f3_pred     │          │
   │                    │ f3_meta     │          │
   │                    └─────────────┘          │
   │                                             │
   │                                     ┌───────▼───────┐
   │                                     │ f4_fire       │
   │                                     │ f4_entry      │
   │                                     │ f4_scnt       │
   │                                     │               │
   │                                     │ update entries│
   │                                     │ s_cnt++       │
   │                                     │ age update    │
   │                                     └───────────────┘
 ─────────────────────────────────────────────────────────────────

Key points:

F2: read the entry and handle bypasses
F3: match the tag and make the prediction decision
F4: speculatively update s_cnt (after the prediction has been used)

6. Prediction Decision Logic


┌────────────────────────────────────────────────────────────────────────┐
│                        Prediction decision flow                        │
└────────────────────────────────────────────────────────────────────────┘

                         ┌─────────────────────────┐
                         │   f3_pred_in            │
                         │ (from base predictor)   │
                         └────────────┬────────────┘
                                      │
                                      ▼
                         ┌─────────────────────────┐
                         │ f3_entry.tag == f3_tag? │
                         │    (tag match check)    │
                         └────────────┬────────────┘
                                      │
                    ┌─────────────────┴─────────────────┐
                    │                                   │
                    ▼                                   ▼
                tag match                         tag mismatch
                    │                                   │
                    ▼                                   ▼
    ┌───────────────────────────────┐      ┌────────────────────────┐
    │  f3_scnt == f3_entry.p_cnt?   │      │  pass base prediction  │
    │  && f3_entry.conf == 7?       │      │  f3_pred = f3_pred_in  │
    └───────────────┬───────────────┘      └────────────────────────┘
                    │
       ┌────────────┴────────────┐
       │                         │
       ▼                         ▼
 condition met           condition not met
       │                         │
       ▼                         ▼
┌──────────────────┐    ┌──────────────────┐
│ flip prediction! │    │ pass base pred.  │
│ f3_pred =        │    │ f3_pred =        │
│   !f3_pred_in    │    │   f3_pred_in     │
└──────────────────┘    └──────────────────┘


Corresponding code (loop.scala:78-85):

    io.f3_pred := io.f3_pred_in    // default: pass through

    when (f3_entry.tag === f3_tag) {
      when (f3_scnt === f3_entry.p_cnt && f3_entry.conf === 7.U) {
        io.f3_pred := !io.f3_pred_in    // flip!
      }
    }

7. Speculative Update Logic (F4 Stage)

┌────────────────────────────────────────────────────────────────────────┐
│                         F4 speculative update                          │
└────────────────────────────────────────────────────────────────────────┘

Purpose: update the loop counter once the front end has used the prediction

Trigger: f4_fire (the prediction was sent and accepted)

                         ┌─────────────────────────┐
                         │       f4_fire           │
                         │ (prediction accepted)   │
                         └────────────┬────────────┘
                                      │
                                      ▼
                         ┌─────────────────────────┐
                         │ f4_entry.tag == f4_tag? │
                         └────────────┬────────────┘
                                      │
                    ┌─────────────────┴─────────────────┐
                    │                                   │
                    ▼                                   ▼
                tag match                         tag mismatch
                    │                                   │
                    ▼                                   ▼
    ┌───────────────────────────────┐      ┌────────────────────────┐
    │  End of loop?                 │      │  no update             │
    │  s_cnt == p_cnt && conf == 7  │      │                        │
    └───────────────┬───────────────┘      └────────────────────────┘
                    │
       ┌────────────┴────────────┐
       │                         │
       ▼                         ▼
   loop ends              loop continues
       │                         │
       ▼                         ▼
┌──────────────────┐    ┌────────────────────────┐
│ reset counter:   │    │ increment counter:     │
│ s_cnt = 0        │    │ s_cnt = s_cnt + 1      │
│ age = 7 (refresh)│    │ age = min(age+1, 7)    │
└──────────────────┘    └────────────────────────┘


Example (branch taken 99 times and then not taken; trained entry with p_cnt = 99, conf = 7):

    Execution  s_cnt    p_cnt    Prediction      Action
    ──────────────────────────────────────────────────────────
        0        0       99      pass-through    s_cnt++ → 1
        1        1       99      pass-through    s_cnt++ → 2
        2        2       99      pass-through    s_cnt++ → 3
       ...      ...     ...      ...             ...
       98       98       99      pass-through    s_cnt++ → 99
       99       99       99      flipped → NT    s_cnt = 0 (reset)
    ──────────────────────────────────────────────────────────

    p_cnt holds the s_cnt value observed at the exit, so it equals the
    number of Taken executions (99), not the total number of executions (100).
    The flip and the reset both happen on the execution where s_cnt == p_cnt.

8. Misprediction Update State Machine

┌────────────────────────────────────────────────────────────────────────┐
│                   Misprediction update state machine                   │
└────────────────────────────────────────────────────────────────────────┘

Input signals:
  - update_mispredict: the branch was mispredicted
  - tag_match: whether the tag matches
  - ctr_match: whether the loop count matches (p_cnt == update_meta.s_cnt)

State transition table:

┌────────────────────┬───────────┬───────────┬───────────────────────────────────────────┐
│   Current state    │ tag_match │ ctr_match │                  Action                   │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf=7 (learned)   │    Yes    │     -     │ conf--, s_cnt=0 (lower confidence)        │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf=7 (learned)   │    No     │     -     │ no action (protect high-confidence entry) │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf∈[1,6]         │    Yes    │    Yes    │ conf++, s_cnt=0 (raise confidence)        │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf∈[1,6]         │    Yes    │    No     │ conf=0, s_cnt=0, p_cnt=meta.s_cnt         │
│                    │           │           │ (trip count changed, relearn)             │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf∈[1,6]         │    No     │     -     │ if age==0: replace the entry              │
│                    │           │           │ else: age-- (aging)                       │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf=0 (unlearned) │    Yes    │    Yes    │ conf=1, age=7, s_cnt=0 (start learning)   │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf=0 (unlearned) │    Yes    │    No     │ p_cnt=meta.s_cnt, age=7, s_cnt=0          │
│                    │           │           │ (update the predicted count)              │
├────────────────────┼───────────┼───────────┼───────────────────────────────────────────┤
│ conf=0 (unlearned) │    No     │     -     │ allocate: tag, conf=1, age=7,             │
│                    │           │           │ s_cnt=0, p_cnt=meta.s_cnt                 │
└────────────────────┴───────────┴───────────┴───────────────────────────────────────────┘


State machine visualization:

    misprediction event
     │
     ├─ conf == 7
     │    ├─ tag_match       → conf--, s_cnt = 0
     │    └─ !tag_match      → no action (entry protected)
     │
     ├─ conf in [1,6]
     │    ├─ tag_match
     │    │    ├─ ctr_match  → conf++
     │    │    └─ !ctr_match → conf = 0, update p_cnt
     │    └─ !tag_match
     │         ├─ age == 0   → replace the entry
     │         └─ age > 0    → age--
     │
     └─ conf == 0
          ├─ tag_match
          │    ├─ ctr_match  → conf = 1, age = 7
          │    └─ !ctr_match → update p_cnt
          └─ !tag_match      → allocate a new entry

9. Repair Update


┌────────────────────────────────────────────────────────────────────────┐
│                             Repair update                              │
└────────────────────────────────────────────────────────────────────────┘

Scenario: speculative execution was squashed, so s_cnt must be restored to the correct state

    misprediction occurs
         │
         ▼
    ┌─────────────────────────────────────────────────────────────┐
    │  is_repair_update = true                                    │
    │  update_meta.s_cnt = the correct loop count                 │
    └─────────────────────────────────────────────────────────────┘
         │
         ▼
    ┌─────────────────────────────────────────────────────────────┐
    │  Conditions checked:                                        │
    │  1. tag_match (make sure it is the same loop)               │
    │  2. !(f4_fire && update_idx == f4_idx)                      │
    │     (avoid a conflict with the speculative update)          │
    └─────────────────────────────────────────────────────────────┘
         │
         ▼
    ┌─────────────────────────────────────────────────────────────┐
    │  entries(update_idx).s_cnt := update_meta.s_cnt             │
    │  restore the count to its value before the misprediction    │
    └─────────────────────────────────────────────────────────────┘


Bypass handling in F2:

    The entry being read may already be affected by a repair or
    misprediction update, so the new value is bypassed in the same cycle:

    f2_entry.s_cnt :=
        if (update_repair && idx_match)      → update_meta.s_cnt
        elif (update_mispredict && idx_match) → 0
        else                                  → entries(idx).s_cnt
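
The bypass priority above amounts to a three-way mux. The sketch below spells it out as a function; the argument names are assumptions that follow the pseudocode, and idx_match is taken to mean that the update targets the index being read.

```python
def f2_s_cnt(stored_s_cnt: int, f2_idx: int, update_idx: int,
             update_repair: bool, update_mispredict: bool,
             update_meta_s_cnt: int) -> int:
    """Select the s_cnt value that F2 forwards to F3."""
    idx_match = f2_idx == update_idx
    if update_repair and idx_match:
        return update_meta_s_cnt  # a repair restores the saved count
    if update_mispredict and idx_match:
        return 0                  # a misprediction restarts the count
    return stored_s_cnt           # no conflicting update: use the table value


# A repair to the same index wins over the stale table value.
print(f2_s_cnt(42, f2_idx=3, update_idx=3, update_repair=True,
               update_mispredict=False, update_meta_s_cnt=7))  # 7
```

Without this bypass, F3 would compare a stale s_cnt against p_cnt for one cycle after every repair or misprediction that targets the same index.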

10. Age-Based Replacement Policy


┌────────────────────────────────────────────────────────────────────────┐
│                      Age-based replacement policy                      │
└────────────────────────────────────────────────────────────────────────┘

Purpose: manage the lifetime of entries and balance old loops against new ones

Age counter behavior:

    ┌────────────────────────────────────────────────────────────────────────┐
    │                                                                        │
    │   Increase age (protect the entry):                                    │
    │   ├─ each prediction that uses the entry: age = min(age + 1, 7)        │
    │   ├─ newly allocated entry: age = 7                                    │
    │   └─ reset at the end of the loop: age = 7                             │
    │                                                                        │
    │   Decrease age (aging):                                                │
    │   └─ misprediction with a tag mismatch: age = age - 1 (if age > 0)     │
    │                                                                        │
    │   Replacement condition:                                               │
    │   └─ conf ∈ [1,6] && !tag_match && age == 0                            │
    │                                                                        │
    └────────────────────────────────────────────────────────────────────────┘

Replacement flow:

    A new loop branch needs an entry
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │  inspect the entry at this index    │
    │  entry = entries(hash(pc))          │
    └──────────────────┬──────────────────┘
                       │
                       ▼
    ┌─────────────────────────────────────┐
    │  conf == 7?                         │
    └──────────────────┬──────────────────┘
                       │
         ┌─────────────┴─────────────┐
         │                           │
         ▼                           ▼
     conf == 7                   conf < 7
  (fully learned)           (not fully learned)
         │                           │
         ▼                           ▼
 ┌──────────────────┐     ┌─────────────────────┐
 │ no replacement!  │     │ age == 0?           │
 │ (high conf kept) │     └──────────┬──────────┘
 └──────────────────┘                │
                            ┌────────┴────────┐
                            │                 │
                            ▼                 ▼
                        age == 0           age > 0
                            │                 │
                            ▼                 ▼
                  ┌────────────────┐  ┌─────────────────┐
                  │ replace!       │  │ age--           │
                  │ allocate entry │  │ maybe next time │
                  └────────────────┘  └─────────────────┘

11. Complete Data Flow


┌────────────────────────────────────────────────────────────────────────┐
│                   Loop Predictor: complete data flow                   │
└────────────────────────────────────────────────────────────────────────┘

                              ┌─────────────────┐
                              │  fetch request  │
                              │  s2_idx         │
                              │  s2_valid       │
                              └────────┬────────┘
                                       │
                                       ▼
    ┌─────────────────────────────────────────────────────────────────────────────────────┐
    │                              F2: Entry Read                                         │
    │                                                                                     │
    │   ┌─────────────────────────────────────────────────────────────────────────────┐   │
    │   │  For each column (0..bankWidth-1):                                          │   │
    │   │                                                                             │   │
    │   │    f2_entry = entries(f2_req_idx)                                           │   │
    │   │                                                                             │   │
    │   │    Bypass check (update to the same index):                                 │   │
    │   │    if (update_repair && idx_match):                                         │   │
    │   │        f2_entry.s_cnt = update_meta.s_cnt                                   │   │
    │   │    elif (update_mispredict && idx_match):                                   │   │
    │   │        f2_entry.s_cnt = 0                                                   │   │
    │   └─────────────────────────────────────────────────────────────────────────────┘   │
    │                                                                                     │
    └───────────────────────────────────────────────┬─────────────────────────────────────┘
                                                    │ RegNext
                                                    ▼
    ┌─────────────────────────────────────────────────────────────────────────────────────┐
    │                              F3: Prediction Decision                                │
    │                                                                                     │
    │   ┌─────────────────────────────────────────────────────────────────────────────┐   │
    │   │  f3_entry = RegNext(f2_entry)                                               │   │
    │   │  f3_tag = high-order PC tag bits                                            │   │
    │   │  f3_scnt = s_cnt (possibly updated via bypass)                              │   │
    │   │                                                                             │   │
    │   │  io.f3_pred = io.f3_pred_in   // default: pass through                      │   │
    │   │                                                                             │   │
    │   │  when (f3_entry.tag == f3_tag):                                             │   │
    │   │      when (f3_scnt == f3_entry.p_cnt && f3_entry.conf == 7):                │   │
    │   │          io.f3_pred = !io.f3_pred_in   // flip the prediction!              │   │
    │   │                                                                             │   │
    │   │  io.f3_meta.s_cnt = f3_scnt   // saved for recovery                         │   │
    │   └─────────────────────────────────────────────────────────────────────────────┘   │
    │                                                                                     │
    └───────────────────────────────────────────────┬─────────────────────────────────────┘
                                                    │ RegNext
                                                    ▼
    ┌─────────────────────────────────────────────────────────────────────────────────────┐
    │                              F4: speculative update                                 │
    │                                                                                     │
    │   ┌─────────────────────────────────────────────────────────────────────────────┐   │
    │   │  when (f4_fire && f4_entry.tag == f4_tag):                                  │   │
    │   │                                                                             │   │
    │   │      when (f4_scnt == f4_entry.p_cnt && f4_entry.conf == 7):                │   │
    │   │          // loop exit                                                       │   │
    │   │          entries(f4_idx).s_cnt = 0                                          │   │
    │   │          entries(f4_idx).age = 7                                            │   │
    │   │                                                                             │   │
    │   │      otherwise:                                                             │   │
    │   │          // loop continues                                                  │   │
    │   │          entries(f4_idx).s_cnt = f4_scnt + 1                                │   │
    │   │          entries(f4_idx).age = min(f4_entry.age + 1, 7)                     │   │
    │   └─────────────────────────────────────────────────────────────────────────────┘   │
    │                                                                                     │
    └─────────────────────────────────────────────────────────────────────────────────────┘

                              ┌─────────────────┐
                              │  Update path    │
                              │  s1_update      │
                              └────────┬────────┘
                                       │
    ┌──────────────────────────────────┴──────────────────────────────────────────────────┐
    │                                                                                     │
    │   ┌────────────────────────────────┐      ┌────────────────────────────────┐        │
    │   │     update_mispredict          │      │       update_repair            │        │
    │   │     (misprediction update)     │      │       (repair update)          │        │
    │   └───────────────┬────────────────┘      └───────────────┬────────────────┘        │
    │                   │                                       │                         │
    │                   ▼                                       ▼                         │
    │   ┌─────────────────────────────────────────────────────────────────────────────┐   │
    │   │                                                                             │   │
    │   │  Mispredict: run the state machine on conf, tag_match, ctr_match            │   │
    │   │              and update conf, age, p_cnt, s_cnt, tag                        │   │
    │   │                                                                             │   │
    │   │  Repair: only restore s_cnt = update_meta.s_cnt                             │   │
    │   │          (if tag_match and there is no F4 conflict)                         │   │
    │   │                                                                             │   │
    │   └─────────────────────────────────────────────────────────────────────────────┘   │
    │                                                                                     │
    └─────────────────────────────────────────────────────────────────────────────────────┘
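
The F2/F3/F4 pipeline and the repair path above can be approximated in software. The following is a minimal behavioral sketch in Python, not the Chisel implementation: it models one entry of one column, collapses the pipeline registers into sequential function calls, and leaves out the F4 conflict check and the misprediction state machine. The function names (`f2_read`, `f3_predict`, `f4_update`, `repair`) and the tag value `0x2A` are made up for illustration.

```python
from dataclasses import dataclass, replace

CNT_MASK = (1 << 10) - 1  # s_cnt and p_cnt are 10 bits wide


@dataclass
class LoopEntry:
    tag: int = 0
    conf: int = 0    # 3-bit confidence, 0..7
    age: int = 0     # 3-bit age, 0..7
    p_cnt: int = 0   # learned iteration count
    s_cnt: int = 0   # speculative iteration counter


def f2_read(entry, req_idx, upd_idx, upd_repair, upd_mispredict, upd_s_cnt):
    """F2: read the entry, bypassing an update that targets the same index."""
    s_cnt = entry.s_cnt
    if upd_repair and upd_idx == req_idx:
        s_cnt = upd_s_cnt          # repair takes priority over mispredict
    elif upd_mispredict and upd_idx == req_idx:
        s_cnt = 0
    return replace(entry, s_cnt=s_cnt)


def f3_predict(entry, tag, base_taken):
    """F3: return (prediction, s_cnt snapshot kept as metadata)."""
    at_exit = entry.s_cnt == entry.p_cnt and entry.conf == 7
    flip = entry.tag == tag and at_exit
    pred = (not base_taken) if flip else base_taken
    return pred, entry.s_cnt


def f4_update(entry, tag):
    """F4: speculative counter update (mutates the entry in place)."""
    if entry.tag != tag:
        return
    if entry.s_cnt == entry.p_cnt and entry.conf == 7:
        entry.s_cnt = 0            # loop exit: restart the count
        entry.age = 7
    else:
        entry.s_cnt = (entry.s_cnt + 1) & CNT_MASK
        entry.age = min(entry.age + 1, 7)


def repair(entry, tag, saved_s_cnt):
    """Repair: restore s_cnt only (F4 conflict check omitted in this sketch)."""
    if entry.tag == tag:
        entry.s_cnt = saved_s_cnt


# Hypothetical trained entry: 3 taken iterations, then the exit.
entry = LoopEntry(tag=0x2A, conf=7, age=3, p_cnt=3)
preds = []
for _ in range(4):
    pred, _snap = f3_predict(entry, 0x2A, base_taken=True)
    preds.append(pred)
    f4_update(entry, 0x2A)

# Wrong-path fetches advance s_cnt; repair rolls it back to the snapshot.
_, snapshot = f3_predict(entry, 0x2A, base_taken=True)
f4_update(entry, 0x2A)
f4_update(entry, 0x2A)
s_cnt_before_repair = entry.s_cnt
repair(entry, 0x2A, snapshot)
```

With a base predictor that always says taken, `preds` comes out as three taken predictions followed by one flipped prediction, and the counter wraps back to 0 at the exit. The second half shows why `s_cnt` travels with the prediction as metadata: without the snapshot there would be nothing to restore after wrong-path fetches had advanced the counter.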

12. How the Loop Predictor cooperates with other predictors


┌─────────────────────────────────────────────────────────────────────────────────┐
│            Where the Loop Predictor sits in the predictor hierarchy             │
└─────────────────────────────────────────────────────────────────────────────────┘

                              ┌─────────────────┐
                              │    Fetch PC     │
                              └────────┬────────┘
                                       │
                    ┌──────────────────┴──────────────────┐
                    │                                     │
                    ▼                                     ▼
            ┌───────────────┐                     ┌───────────────┐
            │     BTB       │                     │   BIM/TAGE    │
            │ (target addr) │                     │  (direction)  │
            └───────┬───────┘                     └───────┬───────┘
                    │                                     │
                    │                                     ▼
                    │                             ┌───────────────┐
                    │                             │    Loop       │
                    │                             │  Predictor    │
                    │                             │  (override)   │
                    │                             └───────┬───────┘
                    │                                     │
                    │                                     │
                    └──────────────────┬──────────────────┘
                                       │
                                       ▼
                              ┌───────────────────┐
                              │ Final prediction  │
                              │                   │
                              │ target = BTB      │
                              │ taken = Loop/BIM  │
                              └───────────────────┘

Data flow:

    io.resp_in(0)          io.resp
         │                    │
         │  ┌─────────────┐   │
         └─►│    Loop     │───┘
            │  Predictor  │
            └─────────────┘

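    The Loop Predictor receives the base prediction (f3_pred_in),
    flips its output (f3_pred) when the conditions hold,
    and otherwise passes the base prediction through.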
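
To make the effect of this pass-through-or-flip arrangement concrete, here is a small Python sketch that counts mispredictions on a branch that is taken 99 times and then not taken once. It assumes an already trained entry (`conf == 7`, `p_cnt == 99`) and a hypothetical base predictor that is saturated toward taken; neither `loop_override` nor `count_mispredicts` exists in BOOM.

```python
def loop_override(base_taken, tag_hit, s_cnt, p_cnt, conf):
    """Flip the base prediction only on a confident hit at the learned count."""
    if tag_hit and conf == 7 and s_cnt == p_cnt:
        return not base_taken
    return base_taken


def count_mispredicts(trip_count, runs, use_loop):
    """Per run, the branch is taken trip_count times, then not taken once."""
    wrong = 0
    for _ in range(runs):
        for s_cnt in range(trip_count + 1):
            actual = s_cnt < trip_count
            base = True  # assumed base predictor: always predicts taken
            if use_loop:
                pred = loop_override(base, True, s_cnt, trip_count, 7)
            else:
                pred = base
            wrong += pred != actual
    return wrong


base_only = count_mispredicts(99, runs=10, use_loop=False)
with_loop = count_mispredicts(99, runs=10, use_loop=True)
```

Under these assumptions the base predictor alone misses the exit once per run (`base_only` is 10), while the override removes those misses (`with_loop` is 0). A real predictor would first spend several runs training the entry, which this sketch skips.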

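Advantages:

BIM/TAGE handles general branch patterns well
The Loop Predictor specializes in loop exits
The two complement each other and improve overall prediction accuracy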

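13. Key parameters

Parameter	Default	Description
nSets	16	Entries per column
tagSz	10	Tag width in bits
nWays	4	(defined in the parameters but unused)
threshold	7	(defined in the parameters but unused)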

LoopEntry field	Width (bits)	Description
tag	10	Address tag
conf	3	Confidence (0-7)
age	3	Age counter (0-7)
p_cnt	10	Predicted iteration count (0-1023)
s_cnt	10	Current iteration count (0-1023)
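
The field widths above add up to a very small structure. As a quick back-of-the-envelope check, the Python sketch below sums the entry bits and scales by the number of sets and columns. The `bank_width = 4` used in the example is an assumption for illustration only, since the number of columns depends on the core configuration.

```python
# Field widths taken from the LoopEntry table above.
ENTRY_FIELD_BITS = {"tag": 10, "conf": 3, "age": 3, "p_cnt": 10, "s_cnt": 10}
N_SETS = 16


def entry_bits():
    """Bits stored per LoopEntry."""
    return sum(ENTRY_FIELD_BITS.values())


def column_bits(n_sets=N_SETS):
    """Bits stored per column (one entry per set)."""
    return n_sets * entry_bits()


def bank_bits(bank_width, n_sets=N_SETS):
    """Bits stored per bank, with one column per fetch slot."""
    return bank_width * column_bits(n_sets)


per_entry = entry_bits()             # 36 bits
per_column = column_bits()           # 576 bits
per_bank = bank_bits(bank_width=4)   # 2304 bits, assuming bankWidth = 4
max_count = (1 << ENTRY_FIELD_BITS["p_cnt"]) - 1  # largest learnable count: 1023
```

At 36 bits per entry, a column holds 576 bits, which is consistent with keeping the table in registers. The 10-bit counters also bound the loops that can be learned: an iteration count above 1023 does not fit in `p_cnt`.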

14. Design summary


┌─────────────────────────────────────────────────────────────────────────────────┐
│                      Loop Predictor design characteristics                      │
└─────────────────────────────────────────────────────────────────────────────────┘

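Specialization:
- Designed specifically for loop branches
- Does not try to predict every branch; it only handles stable loops it has learned
Conservatism:
- Overrides the base prediction only when conf = 7
- Confidence builds up only after several correct observations
- Limits the performance loss caused by wrong overrides
Speculative execution support:
- s_cnt is updated speculatively in F4, after the prediction has used it
- Repair restores state that was modified on a mispredicted path
- A bypass keeps the most recent data visible
Aging and replacement:
- The age counter keeps useful entries from being replaced too quickly
- High-confidence entries (conf = 7) are protected
- Inactive entries age gradually and become replaceable
Resource efficiency:
- Uses registers rather than SRAM (small, 16 entries)
- Each bank has its own independent columns, which avoids conflicts
- Simple tag matching, with no complex way selection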

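The Loop Predictor is a special-purpose predictor aimed at one problem: predicting the exit of loop branches. It learns a loop's exact iteration count and overrides the base predictor's prediction when the loop is about to end, which improves branch prediction accuracy for loop-heavy code.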

http://blog.luliang.online/2025/12/02/分支预测器loop/

Author: Luyoung

Published: December 2, 2025
