Inside The Model: LightGBM Totals, Poisson Strikeout Props

Model Card | June 6, 2026

Yankees Under 4.5, Dodgers Under 5.5, Ashcraft Over 5.5 K

Residual LightGBM team totals + Poisson strikeout layer

Most betting content tells you the pick. This one tells you how the number got made. Today's card runs through two engines, a residual gradient-boosted model for team totals and a Poisson layer for strikeout props, and the most honest thing a model can do is show its work, including the spots where it found nothing. So before the plays, a note on the day's quietest result: every no-run-first-inning and yes-run-first-inning candidate on the board graded negative expected value, and the model passed on all of them. That is not a missing bet. That is market efficiency caught on camera.

How A Residual LightGBM Reads A Team Total

The team-total engine does not predict runs directly. It predicts the residual, the gap between what the market line implies and what the inputs say the run column should be. That framing matters. A model that tries to out-forecast the closing total in raw runs is fighting the single sharpest number in the sport and will lose. A model that learns where the line systematically misses, given the starter, the lineup handedness splits, the park factor and the bullpen workload, is doing something the market structurally underweights. The LightGBM is trained on thousands of games to map those features onto the signed residual, and a play only reaches the card when the predicted residual clears the juice on the available price.
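The residual framing above can be sketched in a few lines. This is a minimal illustration rather than the production pipeline: the `Game` record, the function names, and the sample history are hypothetical, and defining the target as actual runs minus the posted line is an assumption about one common way to build a signed residual. The LightGBM training step itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class Game:
    market_line: float  # posted team total for the game
    actual_runs: int    # runs the team actually scored


def signed_residual(game: Game) -> float:
    """Training target (assumed definition): result minus line.

    Negative means the team landed under the number.
    """
    return game.actual_runs - game.market_line


def implied_projection(market_line: float, predicted_residual: float) -> float:
    """Turn a predicted residual back into a projected run total."""
    return market_line + predicted_residual


# Hypothetical history, for illustration only.
history = [Game(4.5, 3), Game(5.5, 7), Game(3.5, 2)]
targets = [signed_residual(g) for g in history]
print(targets)

# A predicted residual of -1.1 on a 4.5 line is a 3.4-run projection.
print(round(implied_projection(4.5, -1.1), 2))
```

The point of the design is that the regressor only has to learn where the line misses, so the market's own information stays in the projection through `market_line`.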

Two team totals cleared today. The first is the Yankees under 4.5 at -140, where the model projects the New York run column at 3.40 against Ranger Suarez. Suarez carries a low-2.00s ERA, a sub-1.00 WHIP and a strikeout-to-walk ratio above three to one, and the feature that drives the residual here is his walk suppression. The model has learned that team totals are a function of baserunner traffic far more than raw contact quality, and a lineup that cannot draw free passes against a command lefty loses the cheapest path to a crooked number. A 1.1-run gap between projection and line at -140 is a clean residual, and it earns the largest stake on the card at three units.

The second is the Dodgers under 5.5 at -115, projection 4.79. This is a smaller residual, roughly seven-tenths of a run, against a higher-variance lineup, so the model sizes it down to 1.5 units. The interesting part is the asymmetry. The Dodgers are a stronger offense than the Yankees in aggregate, yet the model lays a smaller stake, because the residual is the thing being bet, not the team. A great lineup priced fairly is a no-bet. A streaky lineup priced a tick high is the edge, and the stake follows the size of the gap, not the prestige of the logo.
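The arithmetic behind the two totals is easy to check. The sketch below uses the lines, projections and prices quoted above. The helper name `breakeven_probability` is illustrative, and the conversion is the standard American-odds formula, not anything specific to this model.

```python
def breakeven_probability(american_odds: int) -> float:
    """Win probability needed to break even at a given American price."""
    if american_odds < 0:
        risk = -american_odds
        return risk / (risk + 100)
    return 100 / (american_odds + 100)


# (play, line, model projection, price) as quoted on the card.
plays = [
    ("Yankees under 4.5", 4.5, 3.40, -140),
    ("Dodgers under 5.5", 5.5, 4.79, -115),
]

for name, line, projection, odds in plays:
    gap = line - projection  # positive gap favors the under
    needed = breakeven_probability(odds)
    print(f"{name}: gap {gap:.2f} runs, break-even {needed:.3f}")
```

The gap is what the stake tracks. The break-even figure is the bar the under has to clear before the juice stops eating the edge.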

The Poisson Strikeout Layer, Applied

Strikeout props live on a different engine. A pitcher's strikeout total in a start is reasonably modeled as a Poisson count whose rate parameter, the expected number of strikeouts, is built from the pitcher's strikeout rate per batter faced, the opposing lineup's strikeout rate against the relevant handedness, and the expected number of batters faced given projected innings. Once you have that rate, the Poisson distribution gives you the full probability mass across every possible strikeout total, and the probability of the over or under on a given line is just the sum of that mass on the relevant side of the line. The edge is the difference between that model probability and the break-even probability implied by the price the book is charging.
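A minimal sketch of that calculation follows. The Poisson mass function is standard. The way the three inputs combine in `strikeout_rate_parameter` (scaling the pitcher's rate by the lineup's rate relative to a league baseline) is an assumption for illustration, not the model's documented formula, and the sample inputs are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson count with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def prob_over(line: float, lam: float) -> float:
    """P(X > line). For a half-point line such as 5.5 this is P(X >= 6)."""
    highest_losing_total = math.floor(line)
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(highest_losing_total + 1))


def strikeout_rate_parameter(
    pitcher_k_rate: float,       # strikeouts per batter faced
    lineup_k_rate: float,        # lineup strikeout rate vs. this handedness
    league_k_rate: float,        # league baseline (assumed input)
    expected_batters_faced: float,
) -> float:
    # ASSUMPTION: a simple multiplicative matchup adjustment.
    per_batter = pitcher_k_rate * lineup_k_rate / league_k_rate
    return per_batter * expected_batters_faced


# Hypothetical inputs, for illustration only.
lam = strikeout_rate_parameter(0.28, 0.23, 0.22, 23)
print(f"rate parameter {lam:.2f}, P(over 5.5) = {prob_over(5.5, lam):.3f}")
```

Because the rate is a product that includes expected batters faced, a shorter projected outing lowers it even when the per-batter strikeout rate is unchanged.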

Prop	Line (price)	Model projection (K)	Stake
Braxton Ashcraft (PIT)	Over 5.5 K (-102)	6.75	2 units
Tatsuya Imai (HOU)	Under 4.5 K (+118)	4.0	1.5 units
Jack Leiter (TEX)	Over 5.5 K (+124)	6.17	1 unit

Ashcraft is the cleanest of the three. He has paired a strikeout rate in the high-20s as a percentage of batters faced with a sub-3.00 ERA across a dozen starts, and his recent form includes an eleven-strikeout outing. Feed his rate against the opposing lineup's whiff tendency into the Poisson, project his batters faced from his innings trend, and the rate parameter lands at 6.75. With a line of 5.5 and a near-even price of -102, the over tail is large enough to carry a two-unit stake. The Poisson does not need Ashcraft to be dominant. It needs his central tendency to sit a full strikeout above the line, which it does.

Imai is the inverse play, and it is a lesson in expected batters faced. The under on his strikeout prop is not a bet against his stuff; it is a bet against his volume, driven by his command volatility. The model projects his rate parameter at 4.0 against a 4.5 line, and at +118 you are getting plus money on the side the distribution favors. When a pitcher's expected batters faced compresses because of inefficiency and a shorter projected outing, the central mass of the strikeout distribution slides under the line even if the per-batter whiff rate is respectable.

Leiter is the small-stake, high-upside entry. He misses bats at an elite clip, with a strikeout-per-nine rate in double digits, but his walk and home-run issues shorten some starts and add variance to his batters faced. The Poisson rate lands at 6.17 against a 5.5 line, a real edge, but the wider error bar on his innings keeps it a one-unit play even at the attractive +124. The price is plus money, the projection clears the line, and the stake respects the variance. That is the system working as designed.
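To see how the three props compare once price enters the picture, the sketch below runs the quoted rate parameters, lines and prices through a plain Poisson tail and a standard expected-value calculation. It treats each rate parameter as exactly right, which is a simplifying assumption. The function names are illustrative.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for a Poisson count with rate lam."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


def profit_per_unit(american_odds: int) -> float:
    """Profit on a one-unit winning stake at an American price."""
    return 100 / -american_odds if american_odds < 0 else american_odds / 100


def evaluate(side: str, line: float, lam: float, odds: int):
    p_under = poisson_cdf(math.floor(line), lam)
    p_win = p_under if side == "under" else 1.0 - p_under
    payout = profit_per_unit(odds)
    breakeven = 1.0 / (1.0 + payout)
    ev = p_win * payout - (1.0 - p_win)  # expected profit per unit staked
    return p_win, breakeven, ev


# (pitcher, side, line, rate parameter, price) as quoted on the card.
props = [
    ("Ashcraft", "over", 5.5, 6.75, -102),
    ("Imai", "under", 4.5, 4.0, 118),
    ("Leiter", "over", 5.5, 6.17, 124),
]

results = {}
for name, side, line, lam, odds in props:
    results[name] = evaluate(side, line, lam, odds)
    p_win, breakeven, ev = results[name]
    print(f"{name} {side} {line}: model {p_win:.3f}, break-even {breakeven:.3f}, EV {ev:+.3f}")
```

A plain Poisson tail ignores uncertainty in the rate parameter itself. That is why the stakes on the card do not simply rank by this raw expected value: the wider error bar on a pitcher's innings is handled in the sizing.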

Why Every NRFI Graded Negative EV Today

The first-inning market is where model discipline gets tested, because it is tempting to force a play every day. The no-run-first-inning bet is a Poisson-style event too, governed by the probability that neither half of the first inning produces a run, and the books price it tightly because it is a popular, heavily modeled market. Today the engine ran every game's first-inning run expectation against the posted NRFI and YRFI prices, and not one of them returned positive expected value. The lines were efficient. The implied probabilities matched the model's own first-inning estimates closely enough that the juice ate any theoretical edge.

The correct response to that is to bet nothing in the market, and that is what the card does. A model that manufactures a first-inning play on a day when the prices are fair is not a model, it is a gambler with a spreadsheet. The honest no-bet is a result, and on an efficient board it is the most valuable one.
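The no-bet result can be illustrated with a small sketch. Under a Poisson-style view, the chance that a half inning is scoreless is the zero term of the distribution, and the NRFI probability is the product of the two halves. The scoring rates and the -125/+105 prices below are hypothetical, chosen only to show how both sides of a fairly priced market can grade negative. The rates should be read as effective first-inning scoring rates, not raw run averages.

```python
import math


def profit_per_unit(american_odds: int) -> float:
    """Profit on a one-unit winning stake at an American price."""
    return 100 / -american_odds if american_odds < 0 else american_odds / 100


def nrfi_probability(top_rate: float, bottom_rate: float) -> float:
    """P(no run in either half): product of two Poisson zero terms."""
    return math.exp(-top_rate) * math.exp(-bottom_rate)


def expected_value(p_win: float, american_odds: int) -> float:
    """Expected profit per unit staked."""
    return p_win * profit_per_unit(american_odds) - (1.0 - p_win)


# Hypothetical inputs, for illustration only.
p_nrfi = nrfi_probability(0.30, 0.30)
ev_nrfi = expected_value(p_nrfi, -125)
ev_yrfi = expected_value(1.0 - p_nrfi, 105)

print(f"P(NRFI) {p_nrfi:.3f}, NRFI EV {ev_nrfi:+.3f}, YRFI EV {ev_yrfi:+.3f}")
should_bet = ev_nrfi > 0 or ev_yrfi > 0
print("play" if should_bet else "pass")
```

When the model's probability sits between the two break-even marks set by the book's prices, neither side clears the juice and the pass is the only positive-discipline outcome.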

What Beats These Numbers

Team totals die on one swing in a homer-friendly park, and both the Yankees and the Dodgers can clear their numbers without a rally. The strikeout props live and die on projected batters faced, so an early exit, a rain delay, or a quick hook turns an over into a loser regardless of the per-batter rate. The Poisson assumes a clean, expected-length start, and managers do not owe the model that. Lineups were projected, not confirmed, at publication, and any change to the handedness split shifts the rate parameters. The model prices these risks into the stake sizing, which is why the conviction grades range from one unit to three.

Final Verdict

The June 6 model card is the Yankees under 4.5 (3u), the Dodgers under 5.5 (1.5u), Ashcraft over 5.5 strikeouts (2u), Imai under 4.5 strikeouts (1.5u) and Leiter over 5.5 strikeouts (1u), with every first-inning market passed for lack of edge. The team totals come from a residual LightGBM that bets the gap between line and projection rather than raw runs, the strikeout props come from a Poisson layer that bets the tail of a modeled rate parameter, and the empty NRFI column is the day's quiet proof that an efficient market deserves a no-bet. For more model walkthroughs, see the Rockies team total prediction, the White Sox under against Joe Ryan, and the full prediction archive.