
v2 — GNN surrogate cost-model

netlist → (area, power, leakage) regression on 26 OpenCores sky130 modules

Pipeline

  1. 10 OpenCores Verilog repos cloned (picorv32, serv, aes, sha256, sha1, chacha, uart, i2c, trng, siphash)
  2. 117 Verilog modules extracted; filtered to self-contained + has-FFs + has-clock-port: 41 candidates
  3. Per-module flow: yosys+abc (sky130 hd techmap) → Verilator (RTL random-vector sim, 500 cycles, .vcd dump) → OpenROAD (floorplan/place/parasitics estimate) → OpenSTA (read_vcd vector-based power)
  4. 26 successful end-to-end records (synth+pnr+power); 16 with vector-based switching activity
  5. Graph extraction: cells = nodes, shared bit-nets = edges (clock/power/reset nets dropped; star-topology to avoid O(N²))
  6. Custom 2-layer mean-aggregation GNN, sum-pool + cell-type-histogram concat, MLP head → 4-channel log10 PPA
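Steps 5 and 6 can be illustrated with a minimal NumPy sketch. It is not the project's actual code. The dropped net names, the one-hot cell-type input features, the hidden width, the ReLU activations, and the choice of the first listed cell as the star hub are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical names for the global nets dropped before graph extraction.
DROPPED_NETS = ("clk", "rst", "VPWR", "VGND")

def star_edges(nets, dropped=DROPPED_NETS):
    """Expand each net into a star instead of a clique.

    nets maps net name -> list of cell indices, with the hub cell listed
    first (assumed here to be the driver). A net with k cells yields k-1
    undirected edges rather than k*(k-1)/2.
    """
    edges = set()
    for name, cells in nets.items():
        if name in dropped or len(cells) < 2:
            continue
        hub = cells[0]
        for cell in cells[1:]:
            if cell != hub:
                edges.add((hub, cell))
                edges.add((cell, hub))  # store both directions
    return sorted(edges)

def mean_aggregate_layer(h, edges, w_self, w_nbr):
    """One message-passing layer: ReLU(h @ w_self + mean(neighbours) @ w_nbr)."""
    agg = np.zeros_like(h)
    deg = np.zeros(h.shape[0])
    for src, dst in edges:
        agg[dst] += h[src]
        deg[dst] += 1
    agg /= np.maximum(deg, 1.0)[:, None]  # isolated cells keep a zero message
    return np.maximum(h @ w_self + agg @ w_nbr, 0.0)

def init_params(n_types, hidden=16, n_out=4, seed=0):
    """Random weights, only so the sketch runs end to end."""
    rng = np.random.default_rng(seed)
    w = lambda a, b: rng.normal(scale=0.1, size=(a, b))
    return {
        "layer1": (w(n_types, hidden), w(n_types, hidden)),
        "layer2": (w(hidden, hidden), w(hidden, hidden)),
        "head": (w(hidden + n_types, hidden), w(hidden, n_out)),
    }

def forward(cell_types, edges, n_types, params):
    """Two layers, sum-pool, concat cell-type histogram, MLP head -> 4 outputs."""
    x = np.eye(n_types)[np.asarray(cell_types)]  # one-hot cell types (assumed features)
    h = mean_aggregate_layer(x, edges, *params["layer1"])
    h = mean_aggregate_layer(h, edges, *params["layer2"])
    pooled = h.sum(axis=0)   # sum-pool over all cells
    hist = x.sum(axis=0)     # cell-type histogram
    w1, w2 = params["head"]
    return np.maximum(np.concatenate([pooled, hist]) @ w1, 0.0) @ w2

# Toy netlist: 4 cells, 3 cell types, one clock net that gets dropped.
nets = {"clk": [0, 1, 2], "n1": [0, 1, 2], "n2": [2, 3]}
edges = star_edges(nets)
out = forward([0, 1, 1, 2], edges, n_types=3, params=init_params(3))
```

Because the head sees the cell-type histogram directly, cell count is available to the model without any message passing.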

Held-out correlations (full dataset, 21 train + 5 test)

channel               pearson r   interpretation
log10(area_um2)       +0.78       strong; cell count dominates
log10(pwr_total_w)    +0.66       real signal; vector-power records lift this
log10(pwr_leakage_w)  +0.19       weak; leakage is ~0 at sky130 typical w/o per-cell Vt features
log10(n_cells)        +0.81       trivial; model recovers cell count from structural features
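For reference, the per-channel figures above are Pearson correlations computed in log10 space. The sketch below shows one way to compute them. It is a hedged illustration: the function names and the `floor` parameter (a clamp applied before the logarithm so that near-zero leakage values stay finite) are assumptions, not part of the original flow.

```python
import numpy as np

def pearson_r(pred, true):
    """Pearson correlation between two 1-D sequences."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    dp = pred - pred.mean()
    dt = true - true.mean()
    return float((dp @ dt) / np.sqrt((dp @ dp) * (dt @ dt)))

def per_channel_r(pred_log10, true_raw, channels, floor=1e-15):
    """Correlate log10 predictions with log10 of measured values, per channel.

    pred_log10: (n_modules, n_channels) model outputs, already in log10.
    true_raw:   (n_modules, n_channels) measured values in linear units.
    floor:      hypothetical clamp so log10 stays finite for ~0 values.
    """
    pred_log10 = np.asarray(pred_log10, dtype=float)
    true_log10 = np.log10(np.maximum(np.asarray(true_raw, dtype=float), floor))
    return {
        name: pearson_r(pred_log10[:, i], true_log10[:, i])
        for i, name in enumerate(channels)
    }

# Toy check: predictions that match the truth exactly give r = 1 per channel.
true_raw = np.array([[10.0, 1e-3], [100.0, 1e-2], [1000.0, 1e-4]])
scores = per_channel_r(np.log10(true_raw), true_raw, ["area_um2", "pwr_total_w"])
```

Pearson r measures linear association only, so a model with a constant offset or scale error in log10 space can still score highly.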

Scatter (pred vs true, log10)

[Figure: four scatter panels of predicted vs true values in log10 space, n = 26 each. Panels: log10(area_um2), r = +0.778; log10(pwr_total_w), r = +0.659; log10(pwr_leakage_w), r = +0.191; log10(n_cells), r = +0.807.]

Honest caveats

Decision for Phase 3 (RL fine-tune with GNN reward)

This surrogate is good enough to demonstrate the end-to-end RL plumbing (matmul Verilog → yosys → graph → GNN → reward → GRPO update). It is not good enough to steer the policy: the total-power correlation (+0.66) and the nearly uninformative leakage channel (+0.19) are too noisy to drive Qwen toward higher-throughput matmul designs any better than the real-synth reward did in our prior session. Two pre-Phase-3 builds are expected to lift correlations into the 0.85+ range, where RL would actually benefit: (a) a per-module dependency tracker to grow the dataset to ~150 modules, and (b) leakage augmentation via Vt-mix synth runs.
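To make the "GNN → reward → GRPO update" step concrete, here is a minimal sketch of how surrogate predictions could feed group-relative advantages. The reward weighting is entirely hypothetical (the source does not specify a reward function). The advantage normalisation follows the usual GRPO form of subtracting the group mean and dividing by the group standard deviation.

```python
import numpy as np

# Hypothetical weights over (log10 area, log10 total power, log10 leakage,
# log10 n_cells): penalise area and total power, ignore the other channels.
REWARD_WEIGHTS = (-1.0, -1.0, 0.0, 0.0)

def surrogate_reward(log10_ppa, weights=REWARD_WEIGHTS):
    """Scalar reward from the 4-channel GNN output (illustrative only)."""
    return float(np.dot(weights, log10_ppa))

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardise rewards within one sampled group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, three sampled designs, each scored by the surrogate.
group = [
    [3.0, -4.0, -9.0, 2.0],
    [3.5, -3.5, -9.0, 2.4],
    [2.5, -4.5, -9.0, 1.8],
]
rewards = [surrogate_reward(p) for p in group]
advantages = group_relative_advantages(rewards)
```

Only the ranking and spread of rewards within a group matter here, so surrogate noise comparable to the spread between candidate designs translates directly into noisy advantages.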