AI Downside Misalignment - Why Most AI Systems Never Make It to Production

· 2 min read
AI Scientist & PM

Why do so many AI systems never make it to production?

I call this downside misalignment: situations where AI can fail in ways humans cannot. The pattern seems clear when you compare some of the more exciting technological milestones of the past and present:

Well-accepted and aligned systems:

  • Self-driving cars: crashes, same as human drivers
  • Robo-advising & investing: financial loss, same as human advisors

Systems stuck in "pilot hell":

  • Most LLM-based systems: wrong answers/actions PLUS authoritative hallucinations (e.g., Deloitte's recent fiasco), bad permissions, etc.
  • Customer chatbots: failed resolutions PLUS jailbreak vulnerabilities (e.g., a Chevy dealership's bot agreeing to sell a car for $1)

The production problem we face today is that we have to prove TWO claims with AI systems:

  1. it works, and
  2. it doesn't fail in new ways.

That second claim is singlehandedly responsible for our reliance on backtests, human-in-the-loop systems, eval datasets, LLM judges, and months of testing. We're fighting an uphill battle against novel risks.
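In practice, this two-claim burden shows up as two distinct families of tests in an eval suite: one measuring whether the system works, another probing for novel failure modes. A minimal sketch in Python, where `model`, the eval set, and the probe questions are all hypothetical stand-ins for a real LLM-backed system:

```python
def model(question: str) -> str:
    # Stub standing in for a real LLM call; answers are illustrative.
    answers = {"What is 2 + 2?": "4"}
    return answers.get(question, "I don't know.")

# Claim 1: "it works" -- accuracy on a labeled eval dataset.
eval_set = [("What is 2 + 2?", "4")]
accuracy = sum(model(q) == a for q, a in eval_set) / len(eval_set)

# Claim 2: "it doesn't fail in new ways" -- probe with unanswerable
# questions; an authoritative answer here is a hallucination, a failure
# mode a careful human expert wouldn't exhibit.
probes = ["Cite the court ruling in Smith v. Jupiter (2091)."]
hallucinations = [q for q in probes if model(q) != "I don't know."]

print(f"accuracy: {accuracy:.0%}, hallucinations: {len(hallucinations)}")
```

The point of the sketch is the asymmetry: the first check has a bounded, human-comparable failure mode (a wrong answer), while the second requires enumerating failures humans don't have, which is why the probe list keeps growing.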

The uncomfortable truth is that the solution is not technological at all: it's a people and product problem. We need to get better at choosing use cases where LLM failure modes are aligned with the failure modes humans already face.