Prediction Without Disruption

The recent Stanford paper on Outcome-Based Reinforcement Learning to Predict the Future¹ (RLVR) could be seen as both a product of and a contributor to the cycle of misinterpreting disruption, as I discussed in Why We Keep Misreading Disruption.² It advances tools that improve prediction without necessarily addressing or understanding the foundational shifts that disruption entails.

Technically, it’s an impressive feat: reinforcement learning tuned to verifiable outcomes, calibrated on 110,000 real-world events, and reportedly matching the predictive performance of frontier-scale models. Conceptually, though, it’s telling. The entire proposition rests on a deep confidence in system legibility—that the world is sufficiently stable, observable, and mappable that we can optimise our way into foresight.

But disruption doesn’t work like that: navigating it means seeing around corners, not mapping trends. The kind of change that matters most (when value migrates, systems realign, and categories dissolve) tends to defy prediction. It’s not just that the models aren’t accurate enough; it’s that the assumptions beneath them (what counts as a relevant outcome, which variables matter, how cause and effect operate) start to come unstuck.

That’s the paradox: the more energy we pour into refining our forecasts, the more we risk entrenching the very mindset disruption invalidates. The RLVR paper isn’t flawed on its own terms—it’s elegant, rigorous, and well-aimed at what it’s trying to do. But what it’s trying to do may itself be part of the problem. It reinforces the idea that the future is best approached as a prediction challenge rather than a strategic puzzle—solvable with more data, better incentives, smarter models.

In Why We Keep Misreading Disruption, I argued that we keep misreading disruption because we mistake surface volatility for system change, and tools for understanding. RLVR makes that dynamic visible in a particularly sharp way: a breakthrough in prediction that may leave us even less prepared for transformation.

  1. Turtel, Benjamin, Danny Franklin, Kris Skotheim, Luke Hewitt, and Philipp Schoenegger. “Outcome-Based Reinforcement Learning to Predict the Future.” arXiv, May 26, 2025. https://doi.org/10.48550/arXiv.2505.17989.
  2. PEG. “Why We Keep Misreading Disruption.” The Puzzle and Its Pieces (Substack newsletter), May 27, 2025. https://thepuzzleanditspieces.substack.com/p/why-we-keep-misreading-disruption.