
“Each frontier model we evaluated lost money over the season and many experienced ruin,” the paper’s authors concluded, with AI “systematically underperforming humans” in this scenario.
| AI model | Average ROI | Best attempt | Worst attempt | Average final bankroll |
|---|---|---|---|---|
| Anthropic Claude Opus 4.6 | -11.0% | -0.2% | -18.8% | £89,035 |
| OpenAI GPT-5.4 | -13.6% | -4.1% | -31.6% | £86,365 |
| Google Gemini 3.1 Pro | -43.3% | +33.7% | -100.0% | £56,715 |
| Google Gemini Flash 3.1 LP | -58.4% | +24.7% | -100.0% | £41,605 |
| Z.ai GLM-5 | -58.8% | -14.3% | -100.0% | £41,221 |
| Moonshot Kimi K2.5 | -68.3% | -27.0% | -100.0% | £7,420 |
| xAI Grok 4.20 | -100.0% | -100.0% | -100.0% | £0 |
| Arcee Trinity | -100.0% | -100.0% | -100.0% | £0 |

Note: Each model started with a normalized bankroll of £100,000. Return on investment and final bankroll are averaged over three attempts. Grok and Trinity did not complete every attempt.

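
For readers who want to see how the table's columns relate to one another, here is a minimal sketch of the arithmetic, assuming return on investment is measured against the £100,000 starting bankroll. The function names and the three final bankrolls are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
START_BANKROLL = 100_000  # normalized starting bankroll in GBP, per the table note


def roi(final_bankroll: float, start: float = START_BANKROLL) -> float:
    """Return on investment as a fraction of the starting bankroll."""
    return (final_bankroll - start) / start


def summarize(final_bankrolls: list[float]) -> dict[str, float]:
    """Compute the table's four columns from one model's attempts."""
    rois = [roi(b) for b in final_bankrolls]
    return {
        "average_roi": sum(rois) / len(rois),
        "best_roi": max(rois),
        "worst_roi": min(rois),
        "average_final_bankroll": sum(final_bankrolls) / len(final_bankrolls),
    }


# Hypothetical final bankrolls for three attempts (illustrative only).
# A final bankroll of 0 is "ruin" and corresponds to an ROI of -100%.
attempts = [95_000.0, 90_000.0, 0.0]
summary = summarize(attempts)

for name, value in summary.items():
    print(f"{name}: {value:,.4f}")
```

Under this assumption, a single ruined attempt drags the average sharply down even when the other attempts lose only a few percent, which is the pattern several rows of the table show.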
The results offer some comfort to white-collar professionals and businesses worried that AI could take over their jobs and upend industries from finance to marketing.
Ross Taylor, one of the study’s authors and CEO of General Reasoning, said there was a lot of hype about AI automation but not a lot of measurement of how AI performs over long time horizons.
He added that many of the benchmarks commonly used to test AI are flawed because they are set in “very static environments” that bear little resemblance to the chaos and complexity of the real world.
General Reasoning’s paper, which has yet to be peer reviewed, cuts against growing excitement in Silicon Valley about AI’s ability to perform computer programming tasks with little or no human intervention.
Taylor, a former Meta AI researcher, said: “If you… try AI on some real jobs, it does really badly… Yes, software engineering is very important and economically valuable, but there are many other activities that need to be looked at longer term.”
© 2026 The Financial Times Ltd. All rights reserved. May not be redistributed, copied or modified in any way.




