We read what Powell says. Markets price what they think he meant. We price the gap.

A speaker-corpus probability engine for FOMC and central-bank communication. 8 years of Powell transcripts indexed at the utterance level, validated walk-forward on 43 FOMC meetings. Generalizes cleanly to Lagarde at the ECB (n=46, validated). Draghi DAX edge replicates at +16.0pp on n=41 (2014–2019, including QE-launch era) — second ECB chair, same architecture, same direction. Early-evidence on BoE Bailey and Carney. BoJ per-chair honestly weak — disclosed below.

Live call — Next FOMC (June 16–17, 2026)

SEER posterior
Outcome | SEER | FedWatch | Gap
Cut | 6.5% | 0.0% | +6.5pp
Hold | 84.4% | 90.8% | -6.4pp
Hike | 9.2% | 9.2% | 0.0pp

based on April 29 2026 presser

SEER thesis (the gap)

FedWatch's path-implied curve prices a 0% cut probability for June. The model's features from the April 29 presser assign 6.5%: small in absolute terms, but a non-trivial gap on a tail FedWatch can't see. That asymmetric cut tail is the trade. We're long that tail; the call resolves June 17.
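The gap itself is plain arithmetic: the model posterior minus the market-implied probability, per outcome. A minimal sketch using the figures quoted above; the function name `posterior_gap` is illustrative and not part of the SEER pipeline.

```python
def posterior_gap(model, market):
    """Model-minus-market gap in percentage points for each outcome."""
    return {outcome: round(model[outcome] - market[outcome], 1) for outcome in model}


# Figures quoted on this page (percent); the SEER values are rounded,
# which is why they sum to 100.1 rather than exactly 100.
seer = {"cut": 6.5, "hold": 84.4, "hike": 9.2}
fedwatch = {"cut": 0.0, "hold": 90.8, "hike": 9.2}

gaps = posterior_gap(seer, fedwatch)
print(gaps)
```

A positive gap marks an outcome the model rates as more likely than the market does; here the only positive gap is the cut tail.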

Four central banks, one architecture

Same lexicon engine, same walk-forward backtest protocol, same cross-asset position-sizing. Fed (Powell, n=43) + ECB (Lagarde n=46, Draghi n=41) + BoE (Bailey n=17, Carney n=10) + BoJ (Kuroda+Ueda pooled n=130). The verbal-alpha pattern reproduces on Powell and Lagarde at validated scale. Draghi DAX replicates at +16.0pp on n=41 — second ECB chair, same direction (Sharpe CI still crosses zero, treated as early-evidence). Bailey and Carney show early-evidence edges at small n. BoJ per-chair honestly weak — disclosed in the table below.

Central bank | Chair | Target | Edge vs momentum | Pred ↔ actual ρ | Sharpe (ann) | n | Status
Federal Reserve | Powell | S&P 500 | +12.9pp | +0.40 | 0.66 | 43 | validated
Federal Reserve | Powell | 10y Treasury | +5.4pp | +0.41 | 1.32 | 43 | validated
Federal Reserve | Powell | 5y Treasury | +0.6pp | +0.46 | 1.62 | 43 | validated
ECB | Lagarde | EuroStoxx 50 | +14.2pp | +0.06 | 0.74 | 46 | validated
ECB | Lagarde | DAX | +12.2pp | +0.04 | 0.56 | 46 | validated
ECB | Draghi | DAX | +16.0pp | +0.02 | 0.20 | 41 | early
Bank of England | Bailey | FTSE 250 | +27.2pp | -0.20 | 0.31 | 17 | early
Bank of England | Bailey | GBP/USD | +39.3pp | +0.28 | 0.92 | 17 | early
Bank of England | Carney | GBP/USD | +36.7pp | +0.56 | 1.06 | 10 | early
Bank of Japan | Pooled (Kuroda+Ueda) | Nikkei | +5.8pp | +0.06 | 0.57 | 130 | research
Bank of Japan | Kuroda | Nikkei | +20.6pp | -0.14 | -0.63 | 18 | weak
Bank of Japan | Ueda | Nikkei | -16.9pp | -0.12 | -0.79 | 21 | weak

Status legend:
  • validated = n ≥ 30 with stable CI
  • early = n = 10–20, point estimates positive but CIs wide
  • thin = n < 10, suggestive only
  • research = pooled signal with weak per-chair components; a modest edge survives but we don't claim more than that
  • weak = per-chair signal honestly does not separate from noise

The Fed-built lexicon does not transfer cleanly to BoJ at the per-chair level.

Why isn't this a single “speech → SPX” product?

We tested it. The result: Powell → SPX is real (Sharpe 0.66, n=43). Lagarde → SPX is also real (Sharpe 0.91, n=43, edge +4.7pp). But BoE chairs and BoJ chairs do not predict SPX — every BoE/BoJ-on-SPX backtest is null or actively negative.

That's the architecture working as designed: BoE words move sterling and FTSE, not SPX. BoJ words move Nikkei and yen, not SPX. The right product is “right asset per bank” — not “everything → SPX.” Forcing a unified SPX target would throw away ~70% of the cross-bank signal we've measured.

The edge — right tool per asset class

Different asset reactions need different feature sets. Rates pricing is dominated by macro regime × verbal interaction (gradient-boosted on 76 features incl. CPI / NFP / unemployment + verbal). Equity and FX reaction is dominated by verbal cadence alone (linear on 22 verbal features). Honest split, principled, both validated walk-forward.

Powell → US treasuries (V2: gradient-boosted, macro + verbal)

Target | Edge over momentum | SEER | Momentum | Sharpe (ann.) | 95% CI | n meetings
Powell → 5y treasury yield | +0.6pp | 74.4% | 73.8% | 1.62 | [0.94, 2.38] | 43
Powell → 10y treasury yield | +5.4pp | 74.4% | 69.1% | 1.32 | [0.49, 2.26] | 43

Powell → equities + Lagarde generalization (V1: verbal-only ridge)

Target | Edge over momentum | SEER | Momentum | Sharpe (ann.) | 95% CI | n meetings
Powell → S&P 500 | +12.9pp | 60.5% | 47.6% | 0.66 | [-0.18, 1.42] | 43
Lagarde → DAX (generalization test) | +12.2pp | 52.2% | 40.0% | 0.56 | [-0.27, 1.19] | 46

Edge over momentum baseline (predict same sign as last meeting). Walk-forward, no leakage, n=43 Powell meetings (2018–2026), n=46 Lagarde meetings (2020–2026). Sharpe 95% CI from 5,000-sample bootstrap on per-meeting signal-driven P&L.
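The two statistics in that footnote can be sketched in a few lines. This is an illustration under stated assumptions, not the production backtest: the function names are hypothetical, the annualization factor of 8 meetings per year is an assumption, and the percentile bootstrap is one common choice (the page does not say which bootstrap variant is used).

```python
import math
import random
import statistics


def hit_rate(pred_signs, actual_signs):
    """Fraction of meetings where the predicted sign matches the realized sign."""
    hits = sum(1 for p, a in zip(pred_signs, actual_signs) if p == a)
    return hits / len(actual_signs)


def edge_over_momentum(model_signs, actual_signs):
    """Model hit rate minus momentum hit rate, in percentage points.

    Momentum predicts the previous meeting's realized sign, so the first
    meeting (which has no predecessor) is dropped for both models.
    """
    target = actual_signs[1:]
    momentum = actual_signs[:-1]
    model = model_signs[1:]
    return 100.0 * (hit_rate(model, target) - hit_rate(momentum, target))


def annualized_sharpe(pnl, meetings_per_year=8):
    """Per-meeting Sharpe scaled by sqrt(meetings per year); 8 is an assumption."""
    return statistics.mean(pnl) / statistics.stdev(pnl) * math.sqrt(meetings_per_year)


def bootstrap_sharpe_ci(pnl, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample meetings with replacement."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        sample = [rng.choice(pnl) for _ in pnl]
        if statistics.stdev(sample) > 0:  # skip degenerate resamples
            draws.append(annualized_sharpe(sample))
    draws.sort()
    lo = draws[int(alpha / 2 * len(draws))]
    hi = draws[int((1 - alpha / 2) * len(draws)) - 1]
    return lo, hi


# Toy inputs for illustration only (not SEER data).
actual = [1, -1, 1, 1, -1]
model = [1, -1, 1, 1, -1]
toy_edge = edge_over_momentum(model, actual)
toy_pnl = [0.5, -0.2, 0.3, 0.1, -0.4, 0.6, 0.2, -0.1]
ci_lo, ci_hi = bootstrap_sharpe_ci(toy_pnl)
```

With roughly 40 meetings the resulting interval is wide, which is consistent with the equity CIs above crossing zero even when the point estimate is positive.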

Per-speaker signal heterogeneity — pooled fails, individual speakers work

We built a corpus of 520 Fed governor speeches across 2018–2026 (1.47M tokens, 7 governors), and ran the same-day cross-asset model both pooled and per-speaker. The architectural finding: pooling all governors into one signal fails (negative edges across all targets — speakers are not interchangeable). Per-speaker models recover real but modest verbal alpha, with the strongest signal on Waller's rate predictions.

Target | Edge over momentum | SEER | Momentum | n meetings
Waller → 5y treasury yield (same day) | +4.2pp | 41.3% | 37.1% | 63
Waller → 10y treasury yield (same day) | +7.3pp | 49.2% | 41.9% | 63

Pearson correlation between Waller's speech-day lexicon score and the realized 5y move = +0.260 on n=63 walk-forward predictions. That's real but modest — what desk research has long suggested about Waller specifically. An earlier 2024-2026-only sample (n=28) showed a stronger +27pp/+0.605 reading; backfilling to the full 8-year history reduced it to the +0.260 above. We report the more conservative number.
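For reference, the Pearson coefficient quoted above is the covariance of the two series divided by the product of their standard deviations. A minimal standard-library sketch follows; the toy inputs are illustrative, not Waller data, and `statistics.correlation` (Python 3.10+) computes the same quantity.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy example: lexicon scores vs realized moves (made-up numbers).
scores = [1.0, 2.0, 3.0, 4.0]
moves = [1.0, 3.0, 2.0, 4.0]
rho = pearson(scores, moves)
```

A reading of +0.26 on n=63 means the lexicon score explains only a small share of the variance in the realized move (about 7%, since 0.26 squared is roughly 0.07), which is why the text calls it real but modest.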

The product is per-speaker. Every voting member gets their own corpus + their own model. The desk subscribes to a ranked feed of speaker signals. Pooled-committee aggregation is the wrong architecture and the data confirms it.

vocmarkets — the full verbal-alpha portfolio

The four headline targets above aren't the whole story. Same engine, run against the macro complex: 9 assets across 7 asset classes show positive verbal-alpha edge from Powell's prior-presser features. Credit (LQD/HYG), oil, BTC, and VIX join the original rates + equities + FX. Sector rotation (XLF/XLE/XLK) and gold are null: those moves require fundamentals or supply shocks the verbal signal can't see.

Asset | Class | Edge over momentum | Sharpe (ann) | 95% CI
5y treasury (FVX) | Rates | +0.6pp | 1.62 | [0.94, 2.38]
10y treasury (TNX) | Rates | +5.4pp | 1.32 | [0.49, 2.26]
S&P 500 | Equities | +12.9pp | 0.66 | [-0.18, 1.42]
DXY | FX | +3.1pp | 0.71 | [-0.12, 1.77]
LQD (IG credit) | Credit | +7.9pp | 0.74 | [-0.08, 1.66]
HYG (HY credit) | Credit | +5.8pp | 0.53 | [-0.32, 1.30]
WTI Oil | Commod | +10.6pp | 0.38 | [-0.44, 1.27]
BTC | Crypto | +8.4pp | 0.05 | [-0.86, 0.86]
VIX | Vol | +3.5pp | 0.45 | [-0.47, 1.05]

Honest null results: what the engine doesn't predict

Asset | Class | Edge over momentum | Verdict
Energy ETF (XLE) | Sectors | -10.6pp | verbal alpha doesn't predict sector rotation
Tech ETF (XLK) | Sectors | -1.3pp | null
Gold | Commod | +3.3pp | edge present but Sharpe ~0; too noisy

Each row is a separate walk-forward backtest, n=43, no leakage. The expanded universe converts a single-asset signal into a portfolio of edges — and the honest null results signal which moves the engine isn't designed to capture.

Direction vs magnitude — two distinct trading-desk products

A direction signal tells you which way to lean. A magnitude prediction tells you how big to size. We measured both honestly. Result: rates predictions are tradeable on magnitude (real MAE improvement over a naive zero baseline). Equity/credit/commodity predictions are tradeable on direction only — magnitudes are correctly-signed but ~2x too aggressive, requiring trader-side shrinkage.

Magnitude product (rates)

SEER predicts the bp move of the day. MAE is meaningfully below naive zero baseline.

Asset | SEER MAE | Naive MAE | Lift
10y treasury (TNX) | 4.19bp | 5.13bp | +18.3%
5y treasury (FVX) | 5.81bp | 6.67bp | +12.8%

A trading desk receives a numerical bp forecast they can size against directly.
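The lift column is the relative MAE reduction against a baseline that always predicts a zero move. A short sketch, with hypothetical function names; the 10y figures from the table are used as a check.

```python
def mae(predicted, realized):
    """Mean absolute error between predicted and realized moves (same units)."""
    return sum(abs(p - r) for p, r in zip(predicted, realized)) / len(realized)


def lift_from_maes(model_mae, naive_mae):
    """Relative improvement of the model MAE over the naive MAE."""
    return (naive_mae - model_mae) / naive_mae


def mae_lift(predicted, realized):
    """Lift versus the naive baseline that predicts a 0bp move every day."""
    naive = mae([0.0] * len(realized), realized)
    return lift_from_maes(mae(predicted, realized), naive)


# Check against the 10y row above: (5.13 - 4.19) / 5.13 is about 0.183.
tnx_lift = lift_from_maes(4.19, 5.13)

# Toy example (made-up bp moves): forecasts half the realized size.
toy_lift = mae_lift([2.0, -2.0], [4.0, -4.0])
```

The naive MAE is simply the average absolute move, so a positive lift means the forecast is closer to the realized move than assuming no move at all.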

Direction product (equities, credit, FX, commod)

Direction is correct (see edge table above) but magnitude shrinkage required. Calibration factor ~0.5 fixes it post-hoc.

Asset | Direction edge | Magnitude verdict
S&P 500 | +12.9pp | magnitudes ~2x too aggressive
DXY | +3.1pp | magnitudes over-predicted
LQD (IG credit) | +7.9pp | magnitudes over-predicted
WTI Oil | +10.6pp | magnitudes over-predicted

A trading desk receives a directional signal + a recommended position-size cap.
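One way a desk could estimate the post-hoc calibration factor is a least-squares fit of realized moves on predicted moves through the origin. This is an assumed method for illustration; the page states only that a factor of about 0.5 corrects the over-prediction, not how it was derived.

```python
def shrinkage_factor(predicted, realized):
    """Least-squares slope k (no intercept) minimizing sum((realized - k * predicted)^2)."""
    numerator = sum(p * r for p, r in zip(predicted, realized))
    denominator = sum(p * p for p in predicted)
    return numerator / denominator


def shrink(predicted, k):
    """Scale raw magnitude forecasts by the calibration factor."""
    return [k * p for p in predicted]


# Toy example (made-up % moves): forecasts are correctly signed but twice too large.
raw_forecasts = [2.0, -4.0, 6.0]
realized_moves = [1.0, -2.0, 3.0]
k = shrinkage_factor(raw_forecasts, realized_moves)
calibrated = shrink(raw_forecasts, k)
```

To stay consistent with the walk-forward protocol, the factor would need to be fit only on meetings before the one being sized.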

We disclose where the model is well-calibrated and where it isn't. The two products map to different desk subscribers: the rates desk consumes magnitude forecasts; the equity / credit / FX desk consumes direction signals.

Which words actually move equities

For every word in the lexicon, we correlate Powell’s usage frequency at the prior presser with the realized FOMC-day SPX move. The pattern is durable: confident / committed language predicts rallies, hedge / topic-naming language predicts selloffs.

Bullish words

Word | Correlation | Category
appropriate | +0.534 | confident
well positioned | +0.234 | confident
we expect | +0.156 | commitment
warranted | +0.152 | confident
accommodative | +0.179 | dovish

Bearish words

Word | Correlation | Category
fairly | -0.308 | hedge
PCE | -0.272 | topic-naming
it depends | -0.241 | hedge
could | -0.201 | hedge
GDP | -0.200 | topic-naming

Pearson correlation between word frequency at Powell’s prior presser and the realized FOMC-day S&P 500 move (n=63 meetings).
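The input to that correlation is a per-presser usage frequency for each lexicon entry. A minimal sketch of one way to compute it, assuming frequency is measured per 1,000 tokens with simple alphabetic tokenization; the actual SEER tokenizer and normalization are not described on this page, and `phrase_rate` is a hypothetical name.

```python
import re


def phrase_rate(text, phrase, per=1000):
    """Occurrences of a word or multi-word phrase per `per` tokens of transcript."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    # Match on the normalized token stream so punctuation does not block phrases.
    # Note: this also lets a phrase match across a sentence boundary.
    normalized = " ".join(tokens)
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    hits = len(re.findall(pattern, normalized))
    return per * hits / len(tokens)


# Made-up snippet for illustration, not a real transcript.
snippet = (
    "We expect inflation to fall. It depends on the data, "
    "and we expect growth could slow."
)
rate_we_expect = phrase_rate(snippet, "we expect")
rate_could = phrase_rate(snippet, "could")
```

Each lexicon entry then yields one rate per presser, and that series is what gets correlated with the realized FOMC-day move.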

Call ledger — 3 live, 4 walk-forward backtests

Three live pre-registered calls — BoE Bailey May 8, ECB Lagarde June 5, and Fed Powell June 17 — plus four walk-forward backtest replays of prior pressers. Backtest replays use only data available at the source presser's timestamp — no leakage. They are not pre-registered live calls; they are honest model replays, regenerated from the same `python -m seer_fomc.model.posterior` pipeline. Walk-forward backtest on these 4 entries: 2 hits, 2 misses. The model's broader walk-forward record across all n=43 FOMC meetings is 74.4% argmax-correct (see Calibration section). Bailey resolves in 3 days, Lagarde in 1 month, Powell in 6 weeks.

Live · pre-registered
FOMC June 16–17, 2026
2026-05-02 21:51 UTC (pre-registered)
cut 7% · hold 84% · hike 9%

FedWatch prices a 0% cut tail. The model's prior-presser features assign 6.5% — a non-trivial cut tail FedWatch's path-implied curve cannot price. Resolves June 17.

FedWatch at the time: cut 0.0% · hold 90.8% · hike 9.2%  |  SEER cut tail = +6.5pp gap

Target: next FOMC decision

Walk-forward backtest · ✗ miss
FOMC March 18 → April 29, 2026
Backtest replay (used only data ≤ March 18)
cut 83% · hold 17% · hike 0%

March 18 presser features pushed the model toward predicting a cut. The committee held — model was over-confident on the cut tail given the still-contested dissent split.

Outcome (April 29): HOLD (with 4 dissents)

MISS — predicted cut (83%), outcome hold.

Target: April 29 outcome from March 18 presser features

Walk-forward backtest · ✓ hit
FOMC January 28 → March 18, 2026
Backtest replay (used only data ≤ January 28)
cut 46% · hold 54% · hike 0%

Hold edged out cut on the lexicon features; the model carried real cut-tail uncertainty (46%) that resolved in favor of hold.

Outcome (March 18): HOLD

HIT — argmax hold (54%), outcome hold.

Target: March 18 outcome from January 28 presser features

Walk-forward backtest · ✓ hit
FOMC December 10, 2025 → January 28, 2026
Backtest replay (used only data ≤ December 10)
cut 47% · hold 52% · hike 0%

Tightly-bracketed cut/hold call — hold won by 5pp. The dissent split visible in the December presser left real ambiguity that the model captured honestly.

Outcome (January 28): HOLD

HIT — argmax hold (52%), outcome hold.

Target: January 28 outcome from December 10 presser features

Walk-forward backtest · ✗ miss
FOMC September 17 → October 29, 2025
Backtest replay (used only data ≤ September 17)
cut 8% · hold 92% · hike 0%

Model predicted hold (92%) from the September features. The committee cut. This is the cut-tail under-confidence problem disclosed in the calibration section — the model needed more cut-cycle history than it had at this point.

Outcome (October 29): CUT (-25bp)

MISS — predicted hold (92%), outcome cut.

Target: October 29 outcome from September 17 presser features

The June 17 call is committed in this deploy and resolves publicly when the meeting happens. The four backtest entries are walk-forward replays of the model on prior pressers using only data available at the source timestamp (no future leakage); they are not pre-registered live calls. Going forward, every Lagarde / Bailey / Powell call will be pre-registered here before the relevant meeting.
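The no-leakage constraint on the replays amounts to a loop in which each prediction is made by a model fit only on strictly earlier events. A generic sketch follows; the `fit` and `predict` callables are placeholders, and the majority-class model used here is a stand-in for illustration, not the SEER posterior model.

```python
from collections import Counter


def walk_forward(events, fit, predict, min_train=2):
    """Replay events in date order, refitting before each prediction.

    `events` is a list of (label, features, outcome) tuples sorted by date.
    """
    results = []
    for i in range(min_train, len(events)):
        model = fit(events[:i])  # only events strictly before event i
        label, features, outcome = events[i]
        results.append((label, predict(model, features), outcome))
    return results


def fit_majority(train):
    """Stand-in model: remember the most common past outcome."""
    return Counter(outcome for _, _, outcome in train).most_common(1)[0][0]


def predict_majority(model, features):
    """Stand-in prediction: always the remembered outcome."""
    return model


# Toy event sequence (labels and outcomes are made up).
events = [
    ("m1", {}, "hold"),
    ("m2", {}, "hold"),
    ("m3", {}, "cut"),
    ("m4", {}, "hold"),
    ("m5", {}, "hold"),
]
replay = walk_forward(events, fit_majority, predict_majority)
hits = sum(1 for _, predicted, actual in replay if predicted == actual)
```

The slice `events[:i]` is what enforces the timestamp cutoff: nothing from event i or later can reach the fit.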

Live · pre-registered
BoE Bailey · May 8, 2026
2026-05-05 (pre-registered, deployed)

Cross-asset directional call from the Ridge model (fit on n=21 prior events), predicting from Feb 5 2026 presser features. Resolves on May 8 close.

Target | Predicted move | Note
FTSE 100 | -1.03% | model's strongest signal
FTSE 250 | -0.42% | early-evidence edge target (walk-forward +27.2pp, n=17)
GBP/USD | -0.04% | model uncertain; prediction near flat

Live · pre-registered
ECB Lagarde · June 5, 2026
2026-05-06 (pre-registered, deployed)

Cross-asset directional call from the Ridge model (fit on n=51 prior events), predicting from April 30 2026 presser features. Resolves on June 5 close.

Target | Predicted move | Note
DAX | -2.01% | validated edge target (walk-forward +12.2pp, n=46)
EuroStoxx 50 | -1.69% | validated edge target (walk-forward +14.2pp, n=46)
EUR/USD | +0.04% | model uncertain; prediction near flat

Decision posterior (cut / hold / hike)

Combined Draghi+Lagarde multinomial logistic, n=101 training events. Walk-forward backtest details + per-class calibration on the methodology page.

Outcome | Posterior | Base rate
cut | 9.9% | 42%
hold | 88.6% | 58%
hike | 1.5% | 0%

Caveat: Cross-asset directional calls. Lagarde June 5 also includes a decision posterior (cut/hold/hike) trained on combined Draghi+Lagarde history. BoE Bailey decision posterior is next-session work; the May 8 call remains cross-asset only.

Pre-registration calendar
  • May 8, 2026  ·  BoE Bailey  ·  pre-registered above (cross-asset)
  • June 5, 2026  ·  ECB Lagarde  ·  pre-registered above (cross-asset)
  • June 16–17, 2026  ·  Fed Powell  ·  pre-registered (decision posterior + 6.5% cut tail)

Calibration — when the model says 84% hold, is it actually 84%?

Brier score and log loss measure probability calibration directly. Walk-forward, n=43 FOMC meetings, scored against a trailing base-rate baseline (predict the empirical decision frequency over the prior 12 meetings).

Metric (lower is better) | SEER | Base rate | Improvement
Brier score | 0.456 | 0.623 | 27%
Log loss | 0.882 | 1.737 | 49%
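Both scores and the baseline can be sketched compactly. The exact definitions used on this page are not stated, so two assumptions are made here: the Brier score is the multiclass form (squared error summed over the three outcomes), and log loss clips probabilities at a small epsilon so that a zero-probability outcome does not produce an infinite loss.

```python
import math

CLASSES = ("cut", "hold", "hike")


def brier(probs, outcome):
    """Multiclass Brier score for one meeting: sum of squared errors over classes."""
    return sum((probs[c] - (1.0 if c == outcome else 0.0)) ** 2 for c in CLASSES)


def log_loss(probs, outcome, eps=1e-12):
    """Negative log probability assigned to the realized outcome (clipped at eps)."""
    return -math.log(max(probs[outcome], eps))


def trailing_base_rate(history, window=12):
    """Empirical decision frequencies over the most recent `window` meetings."""
    recent = history[-window:]
    if not recent:
        return {c: 1.0 / len(CLASSES) for c in CLASSES}  # no history: uniform
    return {c: recent.count(c) / len(recent) for c in CLASSES}


# The September 17 -> October 29 replay from the ledger: 92% hold, outcome cut.
miss = {"cut": 0.08, "hold": 0.92, "hike": 0.0}
miss_brier = brier(miss, "cut")

# Toy history for the baseline (made-up sequence).
baseline = trailing_base_rate(["hold", "hold", "cut", "hike"])
```

Averaging each score over all walk-forward meetings, once for the model and once for the trailing baseline, gives the two columns being compared; the improvement is the relative reduction.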
Per-decision-class calibration

Outcome | n | Argmax acc | Avg P(correct outcome) | Calibration
cut | 6 | 33.3% | 0.361 | under-confident on cuts (model needs more cut-cycle history)
hold | 26 | 76.9% | 0.682 | well calibrated
hike | 11 | 90.9% | 0.772 | well calibrated

Honest weakness: the model is well-calibrated on hold and hike (the dominant regime in 2018–2024) but under-confident on cuts. It missed the start of the September / October 2025 cut cycle — predicted hold both times when the outcome was a 25bp cut. The cut tail is the part of the distribution that needs more training history, and it's the part of the live June 17 call most worth watching.
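Per-class argmax accuracy groups meetings by realized outcome and asks how often the highest-probability class matched. The sketch below applies that to the four ledger replays above, then checks that the per-class table is consistent with the 74.4% overall figure. Function names are illustrative.

```python
def argmax_class(probs):
    """Outcome with the highest predicted probability."""
    return max(probs, key=probs.get)


def per_class_accuracy(records):
    """Argmax accuracy grouped by realized outcome."""
    totals, hits = {}, {}
    for probs, outcome in records:
        totals[outcome] = totals.get(outcome, 0) + 1
        hits[outcome] = hits.get(outcome, 0) + int(argmax_class(probs) == outcome)
    return {outcome: hits[outcome] / totals[outcome] for outcome in totals}


# The four backtest replays from the ledger (probabilities as published, rounded).
ledger = [
    ({"cut": 0.83, "hold": 0.17, "hike": 0.0}, "hold"),  # March 18 -> April 29
    ({"cut": 0.46, "hold": 0.54, "hike": 0.0}, "hold"),  # January 28 -> March 18
    ({"cut": 0.47, "hold": 0.52, "hike": 0.0}, "hold"),  # December 10 -> January 28
    ({"cut": 0.08, "hold": 0.92, "hike": 0.0}, "cut"),   # September 17 -> October 29
]
ledger_acc = per_class_accuracy(ledger)

# Consistency check on the per-class table: (n, argmax accuracy) per outcome.
table = {"cut": (6, 0.333), "hold": (26, 0.769), "hike": (11, 0.909)}
total_hits = sum(round(n * acc) for n, acc in table.values())
total_n = sum(n for n, _ in table.values())
overall_pct = round(100 * total_hits / total_n, 1)
```

The table implies 2 + 20 + 10 = 32 correct calls out of 43, which reproduces the 74.4% argmax-correct figure quoted in the call ledger.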

Reliability diagram — predicted vs empirical (pooled across cut/hold/hike)

For each predicted-probability bucket, what fraction of the time was the model actually right? Perfect calibration = predicted equals empirical. Pooled across all three classes (one-vs-rest), 43 walk-forward meetings × 3 classes = 129 (predicted, outcome) pairs.

Predicted bucket | n | Mean predicted | Empirical
0–10% | 67 | 2.4% | 10.4%
10–30% | 11 | 16.1% | 18.2%
30–60% | 14 | 47.7% | 42.9%
60–90% | 15 | 80.1% | 86.7%
90–100% | 22 | 95.2% | 68.2%

Grey bar = mean predicted probability in the bucket. Amber line = empirical frequency (where the actual outcome landed). Mid-range buckets (10–90%) are well-calibrated within ±10pp. The 90–100% bucket is over-confident: when the model assigns 95% confidence, the empirical hit rate is 68%. This is the same cut-tail miss problem visible in the per-class table above.
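The bucketing behind the reliability table can be sketched as follows: every meeting contributes one (predicted probability, hit or miss) pair per class, and pairs are grouped by predicted probability. The bucket edges match the table; treating each bucket's lower edge as inclusive is an assumption.

```python
import bisect

INTERIOR_EDGES = [0.10, 0.30, 0.60, 0.90]
LABELS = ["0-10%", "10-30%", "30-60%", "60-90%", "90-100%"]


def reliability_table(records):
    """Rows of (bucket, n, mean predicted, empirical frequency), one-vs-rest pooled."""
    buckets = {label: [] for label in LABELS}
    for probs, outcome in records:
        for cls, p in probs.items():
            label = LABELS[bisect.bisect_right(INTERIOR_EDGES, p)]
            buckets[label].append((p, 1.0 if cls == outcome else 0.0))
    rows = []
    for label in LABELS:
        pairs = buckets[label]
        if not pairs:
            continue  # omit empty buckets
        n = len(pairs)
        mean_predicted = sum(p for p, _ in pairs) / n
        empirical = sum(hit for _, hit in pairs) / n
        rows.append((label, n, mean_predicted, empirical))
    return rows


# Two ledger replays as a toy input: 2 meetings x 3 classes = 6 pairs.
records = [
    ({"cut": 0.08, "hold": 0.92, "hike": 0.0}, "cut"),
    ({"cut": 0.46, "hold": 0.54, "hike": 0.0}, "hold"),
]
rows = reliability_table(records)
```

In the published table the bucket counts sum to 129 and the implied hits sum to 43, one realized outcome per meeting, which is what one-vs-rest pooling over 43 meetings should produce.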

The product roadmap

Now
Powell + ECB + BoE + BoJ

FOMC engine validated walk-forward (n=43). ECB Lagarde validated at scale (n=46). Early evidence on Draghi (n=41) and BoE (Bailey n=17, Carney n=10). BoJ per-chair honestly weak. June 17 FOMC pre-registered live.

Next 30 days
Pre-register every chair

June 17 (Powell), June 5 (Lagarde), May 8 (Bailey) all pre-registered before their meetings. Public timestamps.

Next 90 days
Williams + regional Feds

Williams (NY Fed, permanent voter, separate corpus). Regional Fed presidents (Daly, Logan, Goolsbee, Kashkari, Bostic). Each speaker is its own corpus + walk-forward test.

Live, priced on every screen, next to everything. Powell, Lagarde, Bailey, Ueda. Tim Cook on earnings. Trump at a rally. Any speaker. Any market. The probability of what’s actually going to happen, calibrated, in real time, in plain English.