SEER — The pricing layer for what's actually happening

SEER / FOMC

We read what Powell says. Markets price what they think he meant. We price the gap.

A speaker-corpus probability engine for FOMC and central-bank communication. 8 years of Powell transcripts indexed at the utterance level, validated walk-forward on 43 FOMC meetings. Generalizes cleanly to Lagarde at the ECB (n=46, validated). Draghi DAX edge replicates at +16.0pp on n=41 (2014–2019, including QE-launch era) — second ECB chair, same architecture, same direction. Early-evidence on BoE Bailey and Carney. BoJ per-chair honestly weak — disclosed below.

Drill down — per speaker

Powell · Fed · n=43

Every walk-forward prediction, the full cross-asset table, the words that move equities.

71 pressers parsed. Last 10 walk-forward predictions with hits and misses. 9-asset cross-asset edge table. Top bullish / bearish words.

Lagarde · ECB · n=46

The cross-bank generalization test. EuroStoxx Sharpe 0.74, DAX +12.2pp.

52 pressers parsed (Dec 2019–Apr 2026). Cross-asset table for EuroStoxx, DAX, EUR/USD with honest weak read on FX. Last 8 EuroStoxx walk-forward predictions, top DAX bullish / bearish words.

Bailey · BoE · n=17

Early-evidence sterling signal. GBP/USD +39.3pp, Sharpe 0.92.

22 pressers parsed (Nov 2020–Feb 2026). Cross-asset table for GBP/USD, FTSE 250, FTSE 100. Last 10 GBP/USD walk-forward predictions. Wide CIs, treated as early-evidence not validated.

Methodology

Architecture, validation protocol, honest weaknesses, roadmap.

Partner-readable seven-section summary. Walk-forward protocol, lexicon construction, calibration, the cut-tail under-confidence we disclose, and what we won’t claim until n ≥ 30.

Live call — Next FOMC (June 16–17, 2026)

SEER posterior

Cut 6.5% vs FedWatch 0.0% · +6.5pp gap

Hold 84.4% vs FedWatch 90.8% · -6.4pp gap

Hike 9.2% vs FedWatch 9.2% · 0.0pp gap

based on April 29 2026 presser

SEER thesis (the gap)

FedWatch's path-implied curve prices a 0% cut probability for June. The model's features from the April 29 presser assign 6.5%: small in absolute terms, but a non-trivial gap on a tail FedWatch can't see. That asymmetric cut tail is the trade. We're long it, and the call resolves June 17.
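
The gap arithmetic behind the thesis can be reproduced in a few lines. This is a minimal sketch using only the probabilities quoted above; the function name `posterior_gap` is illustrative and not part of the SEER codebase.

```python
# SEER posterior and FedWatch probabilities for the June 16-17, 2026 FOMC,
# copied from the figures above (in percent).
seer = {"cut": 6.5, "hold": 84.4, "hike": 9.2}
fedwatch = {"cut": 0.0, "hold": 90.8, "hike": 9.2}


def posterior_gap(model: dict, market: dict) -> dict:
    """Model-minus-market probability gap per outcome, in percentage points."""
    return {k: round(model[k] - market[k], 1) for k in model}


gaps = posterior_gap(seer, fedwatch)
for outcome, gap in gaps.items():
    print(f"{outcome}: {gap:+.1f}pp")
```

A positive gap means the model assigns more probability to that outcome than the market-implied curve does.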

Four central banks, one architecture

Same lexicon engine, same walk-forward backtest protocol, same cross-asset position-sizing. Fed (Powell, n=43) + ECB (Lagarde n=46, Draghi n=41) + BoE (Bailey n=17, Carney n=10) + BoJ (Kuroda+Ueda pooled n=130). The verbal-alpha pattern reproduces on Powell and Lagarde at validated scale. Draghi DAX replicates at +16.0pp on n=41 — second ECB chair, same direction (Sharpe CI still crosses zero, treated as early-evidence). Bailey and Carney show early-evidence edges at small n. BoJ per-chair honestly weak — disclosed in the table below.

Central bank	Target	Edge vs momentum	Pred ↔ actual ρ	Sharpe (ann)	n	Status
Federal Reserve Powell	S&P 500	+12.9pp	+0.40	0.66	43	validated
Federal Reserve Powell	10y Treasury	+5.4pp	+0.41	1.32	43	validated
Federal Reserve Powell	5y Treasury	+0.6pp	+0.46	1.62	43	validated
ECB Lagarde	EuroStoxx 50	+14.2pp	+0.06	0.74	46	validated
ECB Lagarde	DAX	+12.2pp	+0.04	0.56	46	validated
ECB Draghi	DAX	+16.0pp	+0.02	0.20	41	early
Bank of England Bailey	FTSE 250	+27.2pp	-0.20	0.31	17	early
Bank of England Bailey	GBP/USD	+39.3pp	+0.28	0.92	17	early
Bank of England Carney	GBP/USD	+36.7pp	+0.56	1.06	10	early
Bank of Japan Pooled (Kuroda+Ueda)	Nikkei	+5.8pp	+0.06	0.57	130	research
Bank of Japan Kuroda	Nikkei	+20.6pp	-0.14	-0.63	18	weak
Bank of Japan Ueda	Nikkei	-16.9pp	-0.12	-0.79	21	weak

Status legend: validated = n ≥ 30 with stable CI · early = n = 10–20, point estimates positive but CIs wide · thin = n < 10, suggestive only · research = pooled signal with weak per-chair components; a modest edge survives but we don't claim more than that · weak = per-chair signal honestly does not separate from noise. The Fed-built lexicon does not transfer cleanly to BoJ at the per-chair level.

Why isn't this a single “speech → SPX” product?

We tested it. The result: Powell → SPX is real (Sharpe 0.66, n=43). Lagarde → SPX is also real (Sharpe 0.91, n=43, edge +4.7pp). But BoE chairs and BoJ chairs do not predict SPX — every BoE/BoJ-on-SPX backtest is null or actively negative.

That's the architecture working as designed: BoE words move sterling and FTSE, not SPX. BoJ words move Nikkei and yen, not SPX. The right product is “right asset per bank” — not “everything → SPX.” Forcing a unified SPX target would throw away ~70% of the cross-bank signal we've measured.

The edge — right tool per asset class

Different asset reactions need different feature sets. Rates pricing is dominated by macro regime × verbal interaction (gradient-boosted on 76 features incl. CPI / NFP / unemployment + verbal). Equity and FX reaction is dominated by verbal cadence alone (linear on 22 verbal features). The split is principled, and both models are validated walk-forward.

Powell → US treasuries (V2: gradient-boosted, macro + verbal)

Powell → 5y treasury yield

+0.6pp

edge over momentum

SEER 74.4% · Momentum 73.8% · Sharpe (ann.) 1.62 · 95% CI [0.94, 2.38] · n meetings 43

Powell → 10y treasury yield

+5.4pp

edge over momentum

SEER 74.4% · Momentum 69.1% · Sharpe (ann.) 1.32 · 95% CI [0.49, 2.26] · n meetings 43

Powell → equities + Lagarde generalization (V1: verbal-only ridge)

Powell → S&P 500

+12.9pp

edge over momentum

SEER 60.5% · Momentum 47.6% · Sharpe (ann.) 0.66 · 95% CI [-0.18, 1.42] · n meetings 43

Lagarde → DAX (generalization test)

+12.2pp

edge over momentum

SEER 52.2% · Momentum 40.0% · Sharpe (ann.) 0.56 · 95% CI [-0.27, 1.19] · n meetings 46

Edge over momentum baseline (predict same sign as last meeting). Walk-forward, no leakage, n=43 Powell meetings (2018–2026), n=46 Lagarde meetings (2020–2026). Sharpe 95% CI from 5,000-sample bootstrap on per-meeting signal-driven P&L.
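
The two measurements named above, edge over a momentum baseline and a bootstrap CI on Sharpe, can be sketched as follows. This is an illustration under stated assumptions, not the production pipeline: the annualization factor (8 meetings per year), the percentile bootstrap, scoring momentum only on meetings that have a predecessor, and all function names are assumptions. The moves below are toy numbers.

```python
import random
import statistics

MEETINGS_PER_YEAR = 8  # assumption: eight scheduled FOMC meetings a year


def hit_rate(predicted_signs, realized_moves):
    """Fraction of events where the predicted sign matches the realized sign."""
    hits = sum(1 for s, r in zip(predicted_signs, realized_moves) if s * r > 0)
    return hits / len(realized_moves)


def momentum_hit_rate(realized_moves):
    """Momentum baseline: predict the same sign as the previous meeting.

    Scored on meetings 2..n, since the first meeting has no predecessor.
    """
    pairs = list(zip(realized_moves[:-1], realized_moves[1:]))
    hits = sum(1 for prev, cur in pairs if prev * cur > 0)
    return hits / len(pairs)


def annualized_sharpe(pnl):
    """Per-event mean / stdev, scaled by sqrt(events per year)."""
    sd = statistics.stdev(pnl)
    return 0.0 if sd == 0 else statistics.mean(pnl) / sd * MEETINGS_PER_YEAR ** 0.5


def bootstrap_sharpe_ci(pnl, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample per-meeting P&L with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        annualized_sharpe(rng.choices(pnl, k=len(pnl))) for _ in range(n_boot)
    )
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Toy example (illustrative numbers only, not SEER data).
moves = [0.8, -0.4, 0.3, 0.5, -0.6, 0.2, 0.4, -0.1]   # realized % moves
signals = [1, -1, 1, -1, -1, 1, 1, 1]                 # model's predicted signs
pnl = [s * m for s, m in zip(signals, moves)]         # signal-driven P&L

edge_pp = (hit_rate(signals, moves) - momentum_hit_rate(moves)) * 100
print(f"edge over momentum: {edge_pp:+.1f}pp")
print("Sharpe 95% CI:", bootstrap_sharpe_ci(pnl))
```

With only a few dozen meetings the bootstrap interval is wide, which is why several Sharpe CIs in the tables cross zero even when the point estimate is positive.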

Per-speaker signal heterogeneity — pooled fails, individual speakers work

We built a corpus of 520 Fed governor speeches across 2018–2026 (1.47M tokens, 7 governors), and ran the same-day cross-asset model both pooled and per-speaker. The architectural finding: pooling all governors into one signal fails (negative edges across all targets — speakers are not interchangeable). Per-speaker models recover real but modest verbal alpha, with the strongest signal on Waller's rate predictions.

Waller → 5y treasury yield (same day)

+4.2pp

edge over momentum

SEER 41.3% · Momentum 37.1% · n meetings 63

Waller → 10y treasury yield (same day)

+7.3pp

edge over momentum

SEER 49.2% · Momentum 41.9% · n meetings 63

Pearson correlation between Waller's speech-day lexicon score and the realized 5y move = +0.260 on n=63 walk-forward predictions. That's real but modest — what desk research has long suggested about Waller specifically. An earlier 2024-2026-only sample (n=28) showed a stronger +27pp/+0.605 reading; backfilling to the full 8-year history reduced it to the +0.260 above. We report the more conservative number.
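
For readers who want to reproduce this kind of reading on their own series, a Pearson correlation between a per-speech lexicon score and the realized same-day yield move can be computed as below. This is a plain-Python sketch; the variable names and the toy numbers are illustrative, and nothing here reproduces the +0.260 figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy example: walk-forward lexicon scores vs realized 5y moves in bp.
lexicon_scores = [0.4, -0.1, 0.7, -0.5, 0.2, 0.0]
realized_bp = [3.0, 1.5, 2.0, -4.0, -1.0, 0.5]
print(f"rho = {pearson(lexicon_scores, realized_bp):+.3f}")
```

A shorter sample can show a much higher correlation by chance, which is the effect the backfill from n=28 to n=63 exposed.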

The product is per-speaker. Every voting member gets their own corpus + their own model. The desk subscribes to a ranked feed of speaker signals. Pooled-committee aggregation is the wrong architecture and the data confirms it.

vocmarkets — the full verbal-alpha portfolio

The four headline targets above aren't the whole story. Same engine, run against the macro complex: 9 distinct asset classes show positive verbal-alpha edge from Powell's prior-presser features. Credit (LQD/HYG), oil, and BTC join the original rates + equities + FX. Sector rotation (XLF/XLE/XLK) and gold are null — those moves require fundamentals or supply shocks the verbal signal can't see.

Asset	Class	Edge over momentum	Sharpe (ann)	95% CI
5y treasury (FVX)	Rates	+0.6pp	1.62	[0.94, 2.38]
10y treasury (TNX)	Rates	+5.4pp	1.32	[0.49, 2.26]
S&P 500	Equities	+12.9pp	0.66	[-0.18, 1.42]
DXY	FX	+3.1pp	0.71	[-0.12, 1.77]
LQD (IG credit)	Credit	+7.9pp	0.74	[-0.08, 1.66]
HYG (HY credit)	Credit	+5.8pp	0.53	[-0.32, 1.30]
WTI Oil	Commod	+10.6pp	0.38	[-0.44, 1.27]
BTC	Crypto	+8.4pp	0.05	[-0.86, 0.86]
VIX	Vol	+3.5pp	0.45	[-0.47, 1.05]

Honest null results — what the engine doesn't predict

Asset	Class	Edge over momentum	Note
Energy ETF (XLE)	Sectors	-10.6pp	verbal alpha doesn't predict sector rotation
Tech ETF (XLK)	Sectors	-1.3pp	null
Gold	Commod	+3.3pp	edge present but Sharpe ~0; too noisy

Each row is a separate walk-forward backtest, n=43, no leakage. The expanded universe converts a single-asset signal into a portfolio of edges — and the honest null results signal which moves the engine isn't designed to capture.
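
"Walk-forward, no leakage" means every prediction is made by a model fit only on events that came before it. One common way to generate such splits is an expanding window, sketched below. The expanding-window choice and the `min_train` parameter are assumptions; the page does not state the exact split rule.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(n_events: int, min_train: int = 10) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_indices, test_index) pairs in time order.

    Event t is predicted from events 0..t-1 only, so no future
    information can reach the model that scores event t.
    """
    for t in range(min_train, n_events):
        yield list(range(t)), t


# Usage: fit on `train`, predict `test`, then move one event forward.
for train, test in walk_forward_splits(n_events=6, min_train=3):
    print(f"fit on events {train} -> predict event {test}")
```

Each asset in the table would run this loop separately with its own target series.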

Direction vs magnitude — two distinct trading-desk products

A direction signal tells you which way to lean. A magnitude prediction tells you how big to size. We measured both honestly. Result: rates predictions are tradeable on magnitude (real MAE improvement over a naive zero baseline). Equity/credit/commodity predictions are tradeable on direction only — magnitudes are correctly-signed but ~2x too aggressive, requiring trader-side shrinkage.

Magnitude product (rates)

SEER predicts the day's move in bp. MAE is meaningfully below the naive zero baseline (always predict a 0bp move).

Asset	SEER MAE	Naive MAE	Lift
10y treasury (TNX)	4.19bp	5.13bp	+18.3%
5y treasury (FVX)	5.81bp	6.67bp	+12.8%

A trading desk receives a numerical bp forecast they can size against directly.
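
The lift column is the percentage reduction in mean absolute error relative to the naive baseline. A short sketch of that arithmetic, using the MAE values from the table (function names are illustrative):

```python
def mae(predicted, realized):
    """Mean absolute error between predicted and realized moves (same units)."""
    return sum(abs(p - r) for p, r in zip(predicted, realized)) / len(realized)


def lift_vs_naive(model_mae, naive_mae):
    """Percent reduction in MAE relative to the naive baseline."""
    return (naive_mae - model_mae) / naive_mae * 100


# The naive zero baseline predicts a 0bp move every time, so its MAE is
# simply the mean absolute realized move.
print(f"10y lift: {lift_vs_naive(4.19, 5.13):+.1f}%")
# From the rounded table values the 5y figure comes out near 12.9%, versus
# the 12.8% shown in the table; the difference is presumably rounding of
# the published MAEs.
print(f"5y lift:  {lift_vs_naive(5.81, 6.67):+.1f}%")
```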

Direction product (equities, credit, FX, commod)

Direction is correct (see edge table above), but magnitudes need shrinkage. A post-hoc calibration factor of ~0.5 corrects them.

Asset	Direction edge	Magnitude verdict
S&P 500	see table above	direction +12.9pp; magnitudes ~2x too aggressive
DXY	see table above	direction +3.1pp; magnitudes over-predicted
LQD (IG credit)	see table above	direction +7.9pp; magnitudes over-predicted
WTI Oil	see table above	direction +10.6pp; magnitudes over-predicted

A trading desk receives a directional signal + a recommended position-size cap.
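
One way a desk could apply the shrinkage described above is to fit a single scale factor by least squares through the origin, then cap the resulting size. This is a sketch, not SEER's calibration code: the no-intercept fit, the `cap` value, and the function names are all assumptions.

```python
def fit_shrinkage(predicted, realized):
    """Scale k minimizing sum((realized - k * predicted)^2), with no intercept."""
    num = sum(p * r for p, r in zip(predicted, realized))
    den = sum(p * p for p in predicted)
    return num / den


def sized_signal(predicted_move, shrink=0.5, cap=1.0):
    """Return (direction, size): sign of the call and a shrunk, capped magnitude."""
    direction = 1 if predicted_move > 0 else -1 if predicted_move < 0 else 0
    return direction, min(abs(predicted_move) * shrink, cap)


# Toy check: predictions that are exactly 2x too aggressive give k = 0.5.
print(fit_shrinkage([2.0, -1.0, 4.0], [1.0, -0.5, 2.0]))

# Applying the ~0.5 factor to a raw predicted move of -1.03%.
print(sized_signal(-1.03))
```

Direction is untouched by the shrinkage; only the size the desk trades against changes.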

We disclose where the model is well-calibrated and where it isn't. The two products map to different desk subscribers: the rates desk consumes magnitude forecasts; the equity / credit / FX desk consumes direction signals.

Which words actually move equities

For every word in the lexicon, we correlate Powell’s usage frequency at the prior presser with the realized FOMC-day SPX move. The pattern is durable: confident / committed language predicts rallies, hedge / topic-naming language predicts selloffs.

Bullish words

“appropriate” · +0.534 · confident

“well positioned” · +0.234 · confident

“we expect” · +0.156 · commitment

“warranted” · +0.152 · confident

“accommodative” · +0.179 · dovish

Bearish words

“fairly” · -0.308 · hedge

“PCE” · -0.272 · topic-naming

“it depends” · -0.241 · hedge

“could” · -0.201 · hedge

“GDP” · -0.200 · topic-naming

Pearson correlation between word frequency at Powell’s prior presser and the realized FOMC-day S&P 500 move (n=63 meetings).
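
The input to these correlations is a usage frequency per presser for each lexicon entry. A minimal way to compute one is shown below. The normalization (occurrences per 1,000 word tokens), the tokenizer, and the function name are assumptions; the page does not specify how frequency is measured.

```python
import re


def phrase_rate(transcript: str, phrase: str, per: int = 1000) -> float:
    """Occurrences of `phrase` per `per` word tokens (case-insensitive, whole words)."""
    text = transcript.lower()
    tokens = re.findall(r"[a-z']+", text)
    if not tokens:
        return 0.0
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, text)) / len(tokens) * per


# Toy transcript fragment (invented for illustration, not a real quote).
sample = "We expect policy is appropriate. It could change, it depends."
for entry in ("appropriate", "it depends", "PCE"):
    print(entry, phrase_rate(sample, entry))
```

Each presser then contributes one frequency per lexicon entry, and that series is what gets correlated with the FOMC-day move.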

Call ledger — 3 live, 4 walk-forward backtests

Three live pre-registered calls — BoE Bailey May 8, ECB Lagarde June 5, and Fed Powell June 17 — plus four walk-forward backtest replays of prior pressers. Backtest replays use only data available at the source presser's timestamp — no leakage. They are not pre-registered live calls; they are honest model replays, regenerated from the same `python -m seer_fomc.model.posterior` pipeline. Walk-forward backtest on these 4 entries: 2 hits, 2 misses. The model's broader walk-forward record across all n=43 FOMC meetings is 74.4% argmax-correct (see Calibration section). Bailey resolves in 3 days, Lagarde in 1 month, Powell in 6 weeks.

Live · pre-registered · FOMC June 16–17, 2026 · 2026-05-02 21:51 UTC (pre-registered)

cut 7% · hold 84% · hike 9%

FedWatch prices a 0% cut tail. The model's prior-presser features assign 6.5% — a non-trivial cut tail FedWatch's path-implied curve cannot price. Resolves June 17.

FedWatch at the time: cut 0.0% · hold 90.8% · hike 9.2% | SEER cut tail = +6.5pp gap

Target: next FOMC decision

Walk-forward backtest · ✗ miss · FOMC March 18 → April 29, 2026 · Backtest replay (used only data ≤ March 18)

cut 83% · hold 17% · hike 0%

March 18 presser features pushed the model toward predicting a cut. The committee held — model was over-confident on the cut tail given the still-contested dissent split.

Outcome (April 29): HOLD (with 4 dissents)

MISS — predicted cut (83%), outcome hold.

Target: April 29 outcome from March 18 presser features

Walk-forward backtest · ✓ hit · FOMC January 28 → March 18, 2026 · Backtest replay (used only data ≤ January 28)

cut 46% · hold 54% · hike 0%

Hold edged out cut on the lexicon features; the model carried real cut-tail uncertainty (46%) that resolved in favor of hold.

Outcome (March 18): HOLD

HIT — argmax hold (54%), outcome hold.

Target: March 18 outcome from January 28 presser features

Walk-forward backtest · ✓ hit · FOMC December 10, 2025 → January 28, 2026 · Backtest replay (used only data ≤ December 10)

cut 47% · hold 52% · hike 0%

Tightly-bracketed cut/hold call — hold won by 5pp. The dissent split visible in the December presser left real ambiguity that the model captured honestly.

Outcome (January 28): HOLD

HIT — argmax hold (52%), outcome hold.

Target: January 28 outcome from December 10 presser features

Walk-forward backtest · ✗ miss · FOMC September 17 → October 29, 2025 · Backtest replay (used only data ≤ September 17)

cut 8% · hold 92% · hike 0%

Model predicted hold (92%) from the September features. The committee cut. This is the cut-tail under-confidence problem disclosed in the calibration section — the model needed more cut-cycle history than it had at this point.

Outcome (October 29): CUT (-25bp)

MISS — predicted hold (92%), outcome cut.

Target: October 29 outcome from September 17 presser features

The June 17 call is committed in this deploy and resolves publicly when the meeting happens. The four backtest entries are walk-forward replays of the model on prior pressers using only data available at the source timestamp (no future leakage); they are not pre-registered live calls. Going forward, every Lagarde / Bailey / Powell call will be pre-registered here before the relevant meeting.
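
The hit/miss labels on the four backtest entries follow from an argmax rule: the call is the outcome with the highest posterior, and it is a hit when that outcome occurred. The sketch below re-scores the ledger from the probabilities and outcomes listed above; the data structure and function name are illustrative.

```python
# (source presser -> target meeting, posterior in %, realized outcome)
ledger = [
    ("Mar 18 -> Apr 29, 2026", {"cut": 83, "hold": 17, "hike": 0}, "hold"),
    ("Jan 28 -> Mar 18, 2026", {"cut": 46, "hold": 54, "hike": 0}, "hold"),
    ("Dec 10 -> Jan 28, 2026", {"cut": 47, "hold": 52, "hike": 0}, "hold"),
    ("Sep 17 -> Oct 29, 2025", {"cut": 8, "hold": 92, "hike": 0}, "cut"),
]


def score(posterior, outcome):
    """Return (argmax call, whether it matched the realized outcome)."""
    call = max(posterior, key=posterior.get)
    return call, call == outcome


results = [(label, *score(p, o)) for label, p, o in ledger]
for label, call, hit in results:
    print(f"{label}: called {call} -> {'HIT' if hit else 'MISS'}")

hits = sum(hit for _, _, hit in results)
print(f"{hits} hits, {len(results) - hits} misses")
```

Argmax scoring ignores how confident the call was; the Brier and log-loss figures in the calibration section are what penalize an 83% miss more than a 54% one.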

Live · pre-registered · BoE Bailey · May 8, 2026

2026-05-05 (pre-registered, deployed)

Cross-asset directional call from the Ridge model (fit on n=21 prior events), predicting from Feb 5 2026 presser features. Resolves on May 8 close.

Target	Predicted move	Note
FTSE 100	-1.03%	model's strongest signal
FTSE 250	-0.42%	early-evidence edge target (walk-forward +27.2pp, n=17)
GBP/USD	-0.04%	model uncertain — prediction near flat

Live · pre-registered · ECB Lagarde · June 5, 2026

2026-05-06 (pre-registered, deployed)

Cross-asset directional call from the Ridge model (fit on n=51 prior events), predicting from April 30 2026 presser features. Resolves on June 5 close.

Target	Predicted move	Note
DAX	-2.01%	validated edge target (walk-forward +12.2pp, n=46)
EuroStoxx 50	-1.69%	validated edge target (walk-forward +14.2pp, n=46)
EUR/USD	+0.04%	model uncertain — prediction near flat

Decision posterior (cut / hold / hike)

Combined Draghi+Lagarde multinomial logistic, n=101 training events. Walk-forward backtest details + per-class calibration on the methodology page.

Decision	Posterior	Base rate
cut	9.9%	42%
hold	88.6%	58%
hike	1.5%	0%

Caveat: Cross-asset directional calls. Lagarde June 5 also includes a decision posterior (cut/hold/hike) trained on combined Draghi+Lagarde history. BoE Bailey decision posterior is next-session work; the May 8 call remains cross-asset only.

Pre-registration calendar

May 8, 2026 · BoE Bailey · pre-registered above (cross-asset)
June 5, 2026 · ECB Lagarde · pre-registered above (cross-asset + decision posterior)
June 16–17, 2026 · Fed Powell · pre-registered (decision posterior + 6.5% cut tail)

Calibration — when the model says 84% hold, is it actually 84%?

Brier score and log loss measure probability calibration directly. Walk-forward, n=43 FOMC meetings, scored against a trailing base-rate baseline (predict the empirical decision frequency over the prior 12 meetings).
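
The trailing base-rate baseline described above can be sketched as follows. The 12-meeting window comes from the text; the optional `smoothing` parameter is an assumption added here because a class with zero trailing frequency would otherwise give an infinite log loss, and the page does not say how that case is handled.

```python
from collections import Counter

CLASSES = ("cut", "hold", "hike")


def trailing_base_rate(past_outcomes, window=12, smoothing=0.0):
    """Empirical decision frequency over the last `window` outcomes.

    `smoothing` adds a pseudo-count to every class (hypothetical parameter).
    """
    recent = past_outcomes[-window:]
    if not recent and smoothing == 0:
        return {c: 1 / len(CLASSES) for c in CLASSES}  # no history: uniform
    counts = Counter(recent)
    total = len(recent) + smoothing * len(CLASSES)
    return {c: (counts[c] + smoothing) / total for c in CLASSES}


# Toy history: only decisions made before the meeting being scored.
history = ["hike"] * 4 + ["hold"] * 9 + ["cut"] * 3
print(trailing_base_rate(history))
print(trailing_base_rate(history, smoothing=0.5))
```

Only outcomes known before the scored meeting go into `past_outcomes`, which keeps the baseline walk-forward like the model it is compared against.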

Brier score (lower is better)

SEER 0.456 · Base rate 0.623 · Improvement −27%

Log loss (lower is better)

SEER 0.882 · Base rate 1.737 · Improvement −49%
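
Both scores can be computed directly from a list of three-class forecasts and realized outcomes. The sketch below assumes the multi-class form of the Brier score (squared error summed over cut / hold / hike, averaged over meetings) and clips probabilities at a small `eps` for log loss; the page does not state either convention, so treat both as assumptions.

```python
import math

CLASSES = ("cut", "hold", "hike")


def brier_score(forecasts, outcomes):
    """Mean over events of sum_k (p_k - o_k)^2, where o_k is 1 for the realized class."""
    total = 0.0
    for p, y in zip(forecasts, outcomes):
        total += sum((p[c] - (1.0 if c == y else 0.0)) ** 2 for c in CLASSES)
    return total / len(outcomes)


def log_loss(forecasts, outcomes, eps=1e-6):
    """Mean negative log of the probability assigned to the realized outcome."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(forecasts, outcomes)) / len(outcomes)


def relative_change(model, baseline):
    """Percent change of the model score versus the baseline (negative = better)."""
    return (model - baseline) / baseline * 100


# The improvement figures above follow from the published scores.
print(f"Brier:    {relative_change(0.456, 0.623):.0f}%")
print(f"Log loss: {relative_change(0.882, 1.737):.0f}%")
```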

Per-decision-class calibration

Outcome	n	Argmax acc	Avg P(correct outcome)	Calibration
cut	6	33.3%	0.361	under-confident on cuts (model needs more cut-cycle history)
hold	26	76.9%	0.682	well calibrated
hike	11	90.9%	0.772	well calibrated

Honest weakness: the model is well-calibrated on hold and hike (the dominant regime in 2018–2024) but under-confident on cuts. It missed the start of the September / October 2025 cut cycle, predicting hold both times when the outcome was a 25bp cut. The cut tail is the part of the distribution that needs more training history, and it's the part of the live June 17 call most worth watching.
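
The per-class table and the headline 74.4% argmax accuracy are the same numbers viewed two ways. A short check, using only the n and accuracy columns from the table above:

```python
# (n meetings, argmax accuracy) per realized outcome, from the table above.
per_class = {"cut": (6, 0.333), "hold": (26, 0.769), "hike": (11, 0.909)}

# Recover the number of correct calls in each class.
correct = {c: round(n * acc) for c, (n, acc) in per_class.items()}
total_n = sum(n for n, _ in per_class.values())
overall = sum(correct.values()) / total_n

print(correct)
print(f"overall argmax accuracy: {overall:.1%} on n={total_n}")
```

The headline figure is carried by the 37 hold and hike meetings, so it says little about the six cuts, where only two calls were right.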

Reliability diagram — predicted vs empirical (pooled across cut/hold/hike)

For each predicted-probability bucket, what fraction of the time was the model actually right? Perfect calibration = predicted equals empirical. Pooled across all three classes (one-vs-rest), 43 walk-forward meetings × 3 classes = 129 (predicted, outcome) pairs.

Predicted bucket	n	Mean predicted	Empirical
0–10%	67	2.4%	10.4%
10–30%	11	16.1%	18.2%
30–60%	14	47.7%	42.9%
60–90%	15	80.1%	86.7%
90–100%	22	95.2%	68.2%

Grey bar = mean predicted probability in the bucket. Amber line = empirical frequency (where the actual outcome landed). Mid-range buckets (10–90%) are well-calibrated within ±10pp. The 0–10% bucket under-predicts: a mean predicted 2.4% against a 10.4% empirical rate. The 90–100% bucket is over-confident: when the model assigns 95% confidence, the empirical hit rate is 68%. This is the same cut-tail miss problem visible in the per-class table above.
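
The bucketed table above can be built from raw forecasts in two steps: expand each three-class forecast into one-vs-rest pairs, then group the pairs by predicted probability. The sketch below uses the bucket edges from the table; how boundary values are assigned (left-inclusive, with 100% in the top bucket) is an assumption, and the function names are illustrative.

```python
CLASSES = ("cut", "hold", "hike")
BUCKET_EDGES = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0]


def one_vs_rest_pairs(forecasts, outcomes):
    """Each meeting yields three (predicted probability, occurred 0/1) pairs."""
    return [(p[c], 1 if c == y else 0) for p, y in zip(forecasts, outcomes) for c in CLASSES]


def reliability_table(pairs):
    """Rows of (low, high, n, mean predicted, empirical frequency) per non-empty bucket."""
    pairs = list(pairs)
    rows = []
    for lo, hi in zip(BUCKET_EDGES[:-1], BUCKET_EDGES[1:]):
        top = hi == BUCKET_EDGES[-1]
        bucket = [(p, o) for p, o in pairs if lo <= p < hi or (top and p == hi)]
        if bucket:
            n = len(bucket)
            rows.append((lo, hi, n, sum(p for p, _ in bucket) / n, sum(o for _, o in bucket) / n))
    return rows


# Toy forecasts (illustrative only).
forecasts = [
    {"cut": 0.05, "hold": 0.90, "hike": 0.05},
    {"cut": 0.50, "hold": 0.50, "hike": 0.00},
]
outcomes = ["hold", "cut"]
for row in reliability_table(one_vs_rest_pairs(forecasts, outcomes)):
    print(row)
```

With 43 meetings this produces the 129 pairs counted above; a well-calibrated model has mean predicted and empirical frequency close together in every row.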

The product roadmap

Now

Powell + ECB + BoE + BoJ

FOMC engine validated walk-forward (n=43). ECB Lagarde validated at scale (n=46). BoE early evidence (Bailey n=17, Carney n=10). Draghi DAX early evidence (n=41, Sharpe CI crosses zero). BoJ per-chair honestly weak. June 17 FOMC pre-registered live.

Next 30 days

Pre-register every chair

June 17 (Powell), June 5 (Lagarde), May 8 (Bailey) all pre-registered before their meetings. Public timestamps.

Next 90 days

Williams + regional Feds

Williams (NY Fed, permanent voter, separate corpus). Regional Fed presidents (Daly, Logan, Goolsbee, Kashkari, Bostic). Each speaker is its own corpus + walk-forward test.

Live, priced on every screen, next to everything. Powell, Lagarde, Bailey, Ueda. Tim Cook on earnings. Trump at a rally. Any speaker. Any market. The probability of what’s actually going to happen, calibrated, in real time, in plain English.