The Eval Harness Problem: Why Your AI Demo Won't Survive Week 2
Most AI projects look great in a Loom demo and fall apart in week 2 of real traffic. The reason is the same every time: no eval harness. Here's what that means and how to build one.
Antor
Founder, NextBangla Ltd
I've reviewed enough AI proofs-of-concept by now to recognize the pattern. The Loom demo is great. The user testing video is great. The first twenty real-traffic conversations are great. By week two, the team is in a war room debugging why the model is suddenly worse. But the model didn't change; the inputs did, in ways nobody was watching.
What is an eval harness?
An eval harness is a measurable definition of 'good' for your AI feature, plus the infrastructure to run that measurement on every change. Without it, you're picking models on vibes and hoping the production input distribution matches what you tested on.
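To make that definition concrete, here is a minimal sketch of the two halves: a fixed set of cases that encode what 'good' means, and a runner you can call on every change. Everything in it is an assumption for illustration, not a prescribed design: `EvalCase`, `run_evals`, the `fake_model` stand-in, the substring checks, and the 0.9 threshold are all hypothetical, and a real harness would call your actual model and likely use richer scoring.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One fixed input plus the check that defines 'good' for it."""
    name: str
    prompt: str
    check: Callable[[str], bool]


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the model and return the pass rate (0.0 to 1.0)."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = 0
    for case in cases:
        ok = case.check(model(case.prompt))
        print(f"{'PASS' if ok else 'FAIL'}  {case.name}")
        passed += ok
    return passed / len(cases)


# Hypothetical stand-in for a real model call, so the sketch runs offline.
def fake_model(prompt: str) -> str:
    if "refund" in prompt.lower():
        return "Refunds are accepted within 30 days."
    return "I don't know."


# Hypothetical cases: two demo-style inputs and one messy, production-style input.
CASES = [
    EvalCase("refund_window", "What is your refund policy?",
             lambda out: "30 days" in out),
    EvalCase("unknown_topic", "What is the CEO's shoe size?",
             lambda out: "don't know" in out.lower()),
    EvalCase("misspelled_refund", "can i get a refnd??",
             lambda out: "30 days" in out),
]

THRESHOLD = 0.9  # assumed bar; pick one that fits the feature

pass_rate = run_evals(fake_model, CASES)
release_ok = pass_rate >= THRESHOLD
```

In this toy run the two clean prompts pass and the misspelled one fails, so the pass rate falls below the threshold and `release_ok` is false. That is the point of keeping production-shaped inputs in the set: the failure shows up before the change ships, instead of in week two.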
Written by
Antor
Md. Ersaduzzaman Antor — founder of NextBangla Ltd and 10 AI startups. Building from Nilphamari, Bangladesh, with team experience across the UK and Luxembourg.
