
I Built Kodebase in a Month with AI: What I'd Do Differently

The metrics look impressive. But behind them are mistakes, wrong turns, and lessons learned the hard way. Here's my honest retrospective.


Miguel Carvalho

Founder


You've probably seen the stats. 382 artifacts. 199k lines of code. 10 days. Elite-tier DORA metrics.

What you haven't seen is everything I got wrong along the way building Kodebase in November 2025.

This isn't a victory lap. It's a post-mortem on the mistakes I made while building Kodebase with AI, and what I'd do differently if I started over tomorrow.


Mistake #1: I Underestimated Planning Time

I thought the hard part would be execution. I was wrong.

The hard part was writing good artifacts. Each one required thinking through acceptance criteria, edge cases, integration points, and validation steps. A "simple" feature often took 30-45 minutes just to specify properly.
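To make that concrete, here's a hypothetical sketch of what specifying a single artifact involved. The feature and details are invented for illustration, not taken from Kodebase:

```
Artifact: rate-limit the public API

Acceptance criteria:
  - Requests beyond N per minute per API key return 429
  - The limit is configurable without a redeploy

Edge cases: burst traffic at window boundaries; requests with no API key
Integration points: auth middleware, metrics pipeline
Validation: unit tests for the window math; load test at 2x the limit
```

Multiply that by 382 artifacts and the planning burden becomes obvious.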

Early in the sprint, I rushed this. I wrote vague artifacts thinking the AI would figure it out. It didn't. I got back vague implementations that needed extensive revision.

What I'd do differently: Budget 40% of your time for planning. It feels like you're not making progress, but every minute spent on a clear artifact saves three minutes of rework later.


Mistake #2: I Ignored Architectural Context Until Day 7

By day 5, I had three different patterns for the same problem scattered across the codebase. The AI was implementing each artifact in isolation, making locally reasonable decisions that created global inconsistency.

The fix was obvious in hindsight: write down the architectural conventions before you start building. What patterns do we use? How do modules communicate? What's the error handling strategy?

I didn't do this until day 7, after the damage was done, and then spent hours refactoring.

What I'd do differently: Create a standards/ directory on day 1. Document the conventions you want before the AI has a chance to invent its own.
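What that might look like (the file names are my illustration, not Kodebase's actual layout):

```
standards/
  architecture.md      # module boundaries and how layers communicate
  error-handling.md    # the one error strategy everything follows
  naming.md            # file, function, and API naming conventions
  patterns.md          # approved solutions for recurring problems
```

Even a page or two per file is enough. The point is that the AI reads your conventions instead of inventing its own.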


Mistake #3: I Trusted the AI Too Much (At First)

Day 2 was intoxicating. The AI produced beautiful, working code. I merged PRs quickly, barely skimming the implementations. Look at this velocity!

On day 4, I found a race condition the AI had introduced. It was subtle: the code looked correct, but it failed under concurrent load. The AI had written technically valid code that didn't survive the real world.
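For illustration, here's a minimal Python sketch of that class of bug: a check-then-act race. It's a stand-in I wrote for this post, not the actual Kodebase code:

```python
import threading
import time

balance = 100

def withdraw(amount):
    global balance
    # Each thread's logic is locally valid: check, then act.
    if balance >= amount:   # both threads can pass this check...
        time.sleep(0.01)    # ...before either one deducts (stands in for real work)
        balance -= amount   # ...so the shared balance goes negative

threads = [threading.Thread(target=withdraw, args=(80,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)  # a skim review expects 20; concurrency delivers -60
```

Read top to bottom, `withdraw` looks fine. The failure only exists in the interleaving of two threads, which is exactly why it survives a quick PR skim.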

After that, I reviewed everything more carefully. Found two more subtle bugs hiding in code I'd already merged.

What I'd do differently: The AI writes code that looks correct, not code that's robust. Always review. Always test under realistic conditions. Never assume that because it compiles, it works.


Mistake #4: I Didn't Trust the AI Enough (Later)

After the bug scare, I overcorrected. I started second-guessing everything. Rewrote implementations that were actually fine. Added unnecessary defensive code. Slowed way down.

It took a few days to find the balance: trust the AI for standard patterns, scrutinize anything involving state, concurrency, or external systems.

What I'd do differently: Develop a mental model for "high-risk" vs. "low-risk" code. Review intensity should scale with risk, not be uniform.


Mistake #5: I Built Features Nobody Asked For

Around day 6, I had a burst of inspiration. "What if we added X? What if we supported Y?" I created artifacts for features that seemed cool but weren't in the original scope.

A week later, I deleted most of that code. It was technically impressive and completely unnecessary.

AI makes it so easy to build things that you can lose sight of whether you should build them.

What I'd do differently: Stick to the plan. Write down the MVP scope before you start and resist the urge to expand it mid-sprint. The speed is a trap: it makes scope creep feel free when it's not.


Mistake #6: I Didn't Take Breaks

Day 6 almost broke me. I was exhausted, overwhelmed, and seriously considered quitting. Looking back, the warning signs were there on days 4 and 5: shorter temper, sloppier reviews, declining judgment.

AI-led development has a different exhaustion profile than traditional coding. You're not tired from typing. You're tired from deciding. Every artifact is a decision. Every review is a decision. Every merge is a decision.

Decision fatigue is real, and I didn't respect it.

What I'd do differently: Schedule rest days. After 3-4 intense days, take a day off. Your judgment on day 6 depends on it.


Mistake #7: I Worked Alone

I told myself this was a solo experiment. No distractions. Pure focus.

But working alone meant no one questioned my decisions. No one noticed when I was too tired to review properly. No one pushed back on unnecessary features.

The methodology worked, but it would have worked better with a second set of eyes.

What I'd do differently: Even in a solo sprint, find someone to check in with daily. Share what you're building. Get feedback. Isolation isn't focus. It's a vulnerability.


The Stuff I Got Right

It wasn't all mistakes. Some things worked well:

Starting with the hardest problem. I tackled the core artifact system first, not the easy edges. This meant the foundation was solid when I built everything else on top.

Atomic artifacts. Each artifact was one thing. One feature. One fix. No "and also" creep. This made parallelization possible.

Git-native from day 1. Every artifact, every decision, every change, version controlled. When I needed to understand why something was built a certain way, git blame told me.

Honest acceptance criteria. I wrote criteria I could actually test, not vague "it should work well" nonsense. This made review binary: either the criteria were met or they weren't.
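A made-up example of the difference (the endpoint and numbers are invented, not from a real Kodebase artifact):

```
Vague:    "Search should be fast and handle errors gracefully."
Testable: "GET /search returns within 200ms on the seed dataset;
           an empty query returns HTTP 400 with an error body."
```

The first invites debate at review time. The second is a pass/fail check.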


Would I Do It Again?

Yes. Despite everything.

The mistakes were painful but fixable. The methodology itself is sound. What failed was my execution of it.

If I ran the same sprint again with these lessons, I'd cut the timeline to 7 days and have fewer bugs to show for it.

The real insight isn't that AI can build software fast. It's that humans are still the bottleneck, just in different ways than before.

We used to be bottlenecked on typing speed and debugging time. Now we're bottlenecked on specification clarity, review quality, and decision-making stamina.

Different problems. Same need for discipline.


For Those Starting Out

If you're about to try AI-led development for the first time:

  1. Invest heavily in planning. More than feels comfortable.
  2. Write your standards before you write your first artifact.
  3. Review everything, but calibrate intensity to risk.
  4. Resist scope creep. The speed is a trap.
  5. Take breaks. Your judgment depends on it.
  6. Find someone to check in with.

And most importantly: expect to make mistakes. The methodology reduces errors in code. It doesn't eliminate errors in judgment. You're still human. Act like it.

Tags: retrospective, lessons-learned, ai-development, honesty