Strategy

How the engine produces an edge

This page describes the methodology behind every recommended bet on the Today page. The walk-forward discipline is cloned from kalshi-edge; the Monte Carlo pattern is adapted from Thoosie's park simulator.

The TL;DR

  1. For every starting pitcher tonight, a Monte Carlo simulator runs 16,000 trials of their start, modelling K outcomes PA-by-PA.
  2. The K probability per PA blends six signals: pitcher rolling K% (recency-weighted), opponent's K% vs that handedness, ballpark K-factor, confirmed lineup's per-batter K rate, home-plate umpire bias, and the times-through-order penalty.
  3. The model's predicted distribution is compared to real K-prop lines from DraftKings / FanDuel / Bovada / Pinnacle (via PropLine API).
  4. For each real (book, line, side) combination, we compute Expected Value vs the book's price. Any combination with positive EV becomes a candidate bet.
  5. Stake size = quarter-Kelly off the model's win probability vs the book's payout, capped at your max-single-bet setting.
  6. The walk-forward harness (2022-2024 cached Statcast) validates each model variant before it goes live. Current champion: baseline-v3-ump at a 64.76% hit rate, MAE 1.84 Ks.

The Monte Carlo simulator

The simulator is cloned from ~/crowd-sim/simulate.py (the Thoosie park sim). For each starter we simulate K outcomes plate appearance by plate appearance across 4 seeds × 4,000 trials = 16,000 sample games per pitcher.

per-PA K probability =
    sigmoid(
        logit(pitcher.k_rate)
        + 0.5 × (logit(opp_k_vs_hand) − logit(league_k))
        + ln(park_k_factor)
        + ln(weather_k_mult)    # reserved
        + ln(ump_k_mult)
        + 0.05 × opp_top4_out
    ) × tto_mult[pa_index // 9]

per-PA blends with batter-specific K% (50/50 in log-odds) when lineup confirmed
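A minimal Python sketch of the per-PA formula above, for illustration only. The function name, argument defaults, and the clamp on the times-through-order index are our assumptions, as is the choice to apply the 50/50 batter blend in log-odds before the TTO multiplier; the source formula does not pin down that order.

```python
import math

# Times-through-order multipliers quoted on this page.
TTO_MULT = (1.00, 0.885, 0.826, 0.703)


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def per_pa_k_prob(pitcher_k_rate, opp_k_vs_hand, league_k,
                  park_k_factor=1.0, weather_k_mult=1.0, ump_k_mult=1.0,
                  opp_top4_out=0, pa_index=0, batter_k_rate=None):
    """Hypothetical helper: K probability for one plate appearance."""
    z = (logit(pitcher_k_rate)
         + 0.5 * (logit(opp_k_vs_hand) - logit(league_k))
         + math.log(park_k_factor)
         + math.log(weather_k_mult)   # reserved, so 1.0 by default
         + math.log(ump_k_mult)
         + 0.05 * opp_top4_out)
    if batter_k_rate is not None:
        # Assumption: the 50/50 log-odds blend happens before the TTO multiplier.
        z = 0.5 * z + 0.5 * logit(batter_k_rate)
    # Assumption: clamp so a PA index past the 4th trip reuses the last multiplier.
    trip = min(pa_index // 9, len(TTO_MULT) - 1)
    return sigmoid(z) * TTO_MULT[trip]


# Neutral context, third time through the order (pa_index 18 to 26).
print(round(per_pa_k_prob(0.25, 0.22, 0.22, pa_index=18), 4))
```

With every context multiplier at 1.0 and the opponent at league average, the function returns the pitcher's own K rate scaled by the TTO multiplier, which is a quick sanity check on any real implementation.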

The median of the 16,000 trials is our predicted K total. p10 and p90 form the credible interval shown on every Today card (the central 80% of simulated outcomes). The times-through-order curve (1.00, 0.885, 0.826, 0.703) is calibrated from 2024 Statcast: pitchers really do strike out fewer hitters their third time through.
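The trial loop and the summary statistics can be sketched as follows. This is an illustration under stated assumptions, not the contents of simulate.py: the function names are hypothetical, and the number of batters faced is simply the length of the probability list passed in (this page does not say how the real simulator decides when a start ends).

```python
import random
import statistics


def simulate_start(pa_probs, seeds=(0, 1, 2, 3), trials_per_seed=4000):
    """Return one simulated K total per trial (4 seeds x 4,000 = 16,000 by default).

    pa_probs: per-PA strikeout probabilities, one entry per batter faced.
    """
    totals = []
    for seed in seeds:
        rng = random.Random(seed)  # independent, reproducible stream per seed
        for _ in range(trials_per_seed):
            # Each PA is an independent Bernoulli draw in this sketch.
            totals.append(sum(rng.random() < p for p in pa_probs))
    return totals


def summarize(totals):
    """Median plus the p10/p90 bounds shown on each card."""
    deciles = statistics.quantiles(totals, n=10)  # 9 cut points: p10 ... p90
    return {"median": statistics.median(totals),
            "p10": deciles[0],
            "p90": deciles[-1]}


# Hypothetical start: 24 batters faced, flat 25% K probability per PA.
print(summarize(simulate_start([0.25] * 24)))
```

Fixing the seeds makes a given pitcher's card reproducible between page loads, which is one plausible reason to run a few seeded batches rather than one unseeded run.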

Features (in order of measured lift)

  1. Pitcher recency-weighted K% (EWMA with span 5, blended 60/40 with season-to-date): captures mid-season form changes. Enters the simulator directly.
  2. Opponent K% vs handedness: 60-day prior, split L/R. Half-weighted in log-odds so the opponent lift is real but not dominant.
  3. Ballpark K-factor: prior season's home-park K rate vs league, e.g. Comerica at +0.3 K/9.
  4. Confirmed lineup × per-batter K%: when the day's batting order is posted, the sim runs against each batter's actual K rate vs the pitcher's handedness, cycling through positions 1-9.
  5. Home-plate umpire bias: prior-season K rate per PA with this ump behind the plate. Mike Estabrook (highest in 2024): 20% above league. Carlos Torres (lowest): 12% below league.
  6. Times-through-order curve: empirical K% drop from the 1st (23.7%) to the 3rd (19.6%) trip through the batting order, applied PA-by-PA in the sim.
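As an illustration of feature 1, here is one way the recency-weighted K% could be computed. Several details are assumptions rather than facts from this page: that the EWMA runs over per-start K rates, that "span 5" follows the common alpha = 2 / (span + 1) convention, and that the 60% weight goes to the EWMA and 40% to the season-to-date rate. The function name is hypothetical.

```python
def recency_weighted_k_rate(starts, span=5, recent_weight=0.6):
    """Blend an EWMA of per-start K rates with the season-to-date K rate.

    starts: list of (strikeouts, batters_faced) tuples, oldest first,
            containing only starts before the prediction date.
    """
    if not starts:
        raise ValueError("need at least one prior start")
    alpha = 2.0 / (span + 1)  # assumption: pandas-style span convention
    ewma = None
    for strikeouts, batters_faced in starts:
        rate = strikeouts / batters_faced
        # Recursive EWMA: newer starts get geometrically more weight.
        ewma = rate if ewma is None else alpha * rate + (1.0 - alpha) * ewma
    season = sum(k for k, _ in starts) / sum(bf for _, bf in starts)
    # Assumption: 60% recent form, 40% season-to-date.
    return recent_weight * ewma + (1.0 - recent_weight) * season


# Hypothetical pitcher whose strikeout rate jumped in the latest start.
print(round(recency_weighted_k_rate([(4, 20), (8, 20)]), 4))
```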

Deferred to the next iteration: catcher framing (needs a separate framing-runs dataset), pitch-mix matchup (whiff% by pitch type × lineup weakness), weather coupling, and an XGBoost replacement for the sigmoid blend (a 1-2 day project, gated on accumulating more data).

Walk-forward validation

Discipline: every feature at prediction time T uses only data with game_date < T. This is non-negotiable — a prior MLB moneyline attempt died because full-season aggregates leaked future data into "historical" predictions.
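The game_date < T rule is easy to state and easy to break, so a sketch of how a harness can enforce it may help. The names below are hypothetical and the row format (dicts with a game_date key) is an assumption; the point is only that the predictor never sees a row dated on or after the game it is predicting.

```python
from datetime import date


def history_before(games, prediction_date):
    """Keep only rows strictly earlier than the prediction date (game_date < T)."""
    return [g for g in games if g["game_date"] < prediction_date]


def walk_forward(games, predict):
    """Yield (game, prediction) pairs, handing `predict` only prior-date history."""
    ordered = sorted(games, key=lambda g: g["game_date"])
    for game in ordered:
        past = history_before(ordered, game["game_date"])
        yield game, predict(past, game)


# Hypothetical rows; the toy predictor just reports how much history it saw.
games = [
    {"game_date": date(2024, 4, 1), "ks": 5},
    {"game_date": date(2024, 4, 2), "ks": 7},
    {"game_date": date(2024, 4, 2), "ks": 6},
]
for game, seen in walk_forward(games, lambda past, g: len(past)):
    print(game["game_date"], seen)
```

Because the comparison is strict, two games on the same date cannot see each other's results, which also covers doubleheaders.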

Methodology: 10,537 starter-games across 2022-2024. For each game, the model predicts the K total using only data prior to first pitch. The "edge" is measured against a naive book proxy (pitcher K% × opp K% lift × park K-factor) — what a vanilla market would price. Real walk-forward against Vegas closing lines is the next milestone (we're now accumulating real PropLine prices daily).

Model progression (overall hit %, MAE Ks)

Model                              Hit %    MAE (Ks)   Notes
baseline-v0 (original)             61.40%   1.90
baseline-v0-strict-naive           61.46%   1.90       calibration only
baseline-v1 (+TTO + recency)       64.21%   1.85       +2.81 pp
baseline-v2 (+per-batter lineup)   64.79%   1.84       +0.58 pp
baseline-v3 (+umpire bias)         64.76%   1.84       marginal

Bonferroni α = 0.0100 across 5 non-empty divergence tiers; all material tiers clear it.
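The Bonferroni threshold is plain arithmetic: divide the family-wise error rate by the number of tests. The sketch below assumes a family-wise α of 0.05, which is consistent with 0.0100 across 5 tiers but is not stated on this page; the helper names are hypothetical and the per-tier p-values must come from the harness.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / n_tests


def tiers_clearing(p_values, family_alpha=0.05):
    """Map each tier name to whether its p-value beats the corrected threshold."""
    alpha = bonferroni_alpha(family_alpha, len(p_values))
    return {tier: p < alpha for tier, p in p_values.items()}


# Assumed family-wise alpha of 0.05 over the 5 non-empty divergence tiers.
print(f"{bonferroni_alpha(0.05, 5):.4f}")
```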

Sizing: quarter-Kelly with hard cap

For each candidate bet we compute the empirical hit rate in its divergence tier from walk-forward, then size using:

kelly_full = (p · b − (1 − p)) / b
where p = model win prob, b = decimal_odds − 1
stake_% = min(kelly_full × user_kelly_fraction, user_max_single_bet_pct)
stake_$ = bankroll × stake_%

Default user_kelly_fraction = 0.25 (¼-Kelly), capped at 2% of bankroll per single bet. For high-edge picks the 2% cap binds; for low-edge picks the quarter-Kelly stake falls below the cap and is used as is. Both values are tweakable in Settings.
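The sizing rule translates almost line for line into Python. The function name and the zero-stake guard for non-positive Kelly values are our assumptions; the formula and the defaults (0.25 fraction, 2% cap) are the ones given above.

```python
def kelly_stake(p, decimal_odds, bankroll,
                kelly_fraction=0.25, max_single_bet_pct=0.02):
    """Dollar stake from fractional Kelly with a hard per-bet cap."""
    b = decimal_odds - 1.0                      # net payout per unit staked
    kelly_full = (p * b - (1.0 - p)) / b
    if kelly_full <= 0:
        return 0.0                              # assumption: never stake a non-positive edge
    stake_pct = min(kelly_full * kelly_fraction, max_single_bet_pct)
    return bankroll * stake_pct


# Hypothetical $1,000 bankroll at decimal odds of 1.91.
print(round(kelly_stake(0.55, 1.91, 1000), 2))  # low edge: quarter-Kelly is under the cap
print(round(kelly_stake(0.65, 1.91, 1000), 2))  # high edge: the 2% cap binds
```

In the first call full Kelly is about 5.5% of bankroll, so quarter-Kelly is about 1.4% and the cap is not reached; in the second, quarter-Kelly would be about 6.6%, so the stake is cut to 2%.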

Real book lines via PropLine

For each game with open K-prop markets we query PropLine (free tier, 1,000 req/day) and pull every book's price at every line. Our model's P(side) is compared against each (book × line × side) combination; the one with the highest EV becomes the recommended bet.
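A sketch of the comparison step, assuming prices arrive as decimal odds and that P(side) is read off a list of simulated K totals. The quote tuple layout and the function names are hypothetical and do not describe the PropLine response format; EV here is the standard per-unit-stake definition, p × (odds − 1) − (1 − p).

```python
def prob_side(totals, line, side):
    """Share of simulated K totals landing on the given side of the line."""
    if side == "over":
        hits = sum(t > line for t in totals)
    else:
        hits = sum(t < line for t in totals)
    return hits / len(totals)  # a total equal to the line counts for neither side


def expected_value(p, decimal_odds):
    """Expected profit per unit staked at the book's price."""
    return p * (decimal_odds - 1.0) - (1.0 - p)


def best_bet(totals, quotes):
    """Return (ev, book, line, side) with the highest positive EV, or None.

    quotes: iterable of (book, line, side, decimal_odds) tuples.
    """
    scored = [(expected_value(prob_side(totals, line, side), odds), book, line, side)
              for book, line, side, odds in quotes]
    positive = [s for s in scored if s[0] > 0]
    return max(positive, key=lambda s: s[0]) if positive else None


# Toy distribution and two made-up quotes, purely for illustration.
totals = [5, 6, 7, 8]
quotes = [("BookA", 6.5, "over", 2.20), ("BookB", 6.5, "under", 1.80)]
print(best_bet(totals, quotes))
```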

Pinnacle's no-vig fair probability (computed from their over/under prices via the proportional de-vig method) is shown as a sharp-market anchor. When the model and Pinnacle agree, conviction is higher; when they disagree, the play is either a real model edge or a model error — we flag both cases for review.
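The proportional de-vig method converts each price to an implied probability and rescales the pair so they sum to 1. A minimal sketch, assuming decimal odds; the function name is hypothetical.

```python
def no_vig_probs(over_odds, under_odds):
    """Proportional de-vig: normalise implied probabilities to sum to 1."""
    raw_over = 1.0 / over_odds
    raw_under = 1.0 / under_odds
    total = raw_over + raw_under      # exceeds 1.0 by the book's margin
    return raw_over / total, raw_under / total


# Made-up two-way market for illustration.
fair_over, fair_under = no_vig_probs(1.80, 2.05)
print(round(fair_over, 4), round(fair_under, 4))
```

A symmetric market such as 1.91 / 1.91 de-vigs to exactly 50/50, which makes a convenient check.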

★ BET / WATCH / INFO — the recommendation filter

The engine produces a Monte Carlo prediction for every starter on the slate, but only flags a small subset as actual bets. Every other game is shown for context only. Criteria for each tier:

★ BET
All four required: (1) real book line available (we can actually place it); (2) real EV between +3% and +25%; (3) Pinnacle's no-vig fair probability within 30pp of our model (sharp-market agreement); (4) model divergence from naive baseline ≥ 0.5 Ks. Sized via quarter-Kelly.
WATCH
Real book line exists, but at least one of the other BET criteria failed. Most common case: EV above the +25% ceiling. Books don't normally leave 50%+ edges sitting around, so an apparent +50% EV almost always means our model is wrong about that pitcher (stale form, missing feature, lineup quirk). Listed so we can monitor and learn. No stake recommended.
INFO
No open book line for this pitcher (market closed, late starter, or PropLine doesn't carry that book). Model prediction shown for context but cannot be bet from this surface.

Today's forecast tiles (tonight P/L, 7-day, 30-day, 90-day projections) include only ★ BET picks. The WATCH and INFO tiers don't contribute to the bankroll forecast; they're for visibility and post-game analysis.
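The tier rules can be read as a small decision function. The sketch below is one interpretation: the function and argument names are hypothetical, the bounds are treated as inclusive, divergence is compared in absolute value, and a missing Pinnacle price is assumed to fail the agreement check. None of those edge cases is specified on this page.

```python
def classify(has_book_line, ev=None, model_p=None,
             pinnacle_fair_p=None, divergence_ks=None):
    """Return 'BET', 'WATCH', or 'INFO' for one pitcher's best candidate."""
    if not has_book_line:
        return "INFO"                                  # nothing to bet against
    ev_ok = ev is not None and 0.03 <= ev <= 0.25      # +3% to +25% EV window
    sharp_ok = (model_p is not None and pinnacle_fair_p is not None
                and abs(model_p - pinnacle_fair_p) <= 0.30)   # within 30 pp
    div_ok = divergence_ks is not None and abs(divergence_ks) >= 0.5
    return "BET" if (ev_ok and sharp_ok and div_ok) else "WATCH"


print(classify(True, ev=0.08, model_p=0.58, pinnacle_fair_p=0.52, divergence_ks=0.9))
print(classify(True, ev=0.50, model_p=0.80, pinnacle_fair_p=0.55, divergence_ks=2.1))
print(classify(False))
```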

Honest caveats — please read before staking

Code references