AI coding tools are as good (and bad) as human developers

Variants of this post have been written many times over the past couple of years, but I feel compelled to write up my recent experience on the subject.

I should open by saying that using coding agents with decent models behind them has to be the future. The speed at which we can prototype, update, document, and deliver functional code is quite something. It certainly feels like magic.

It also has to be said that the usual view is "garbage in, garbage out", and it has ever been thus.

My experience yesterday, however, was subtly different.

I went to a good-quality agent, backed by a paid-for, powerful coding foundation model, and described what I needed, with all the expected behaviours. I pointed it at the API for the system the small web app needed to talk to, explained which parts were needed, and gave it the relevant credentials to build and test the app.

What it created looked good. No, it looked great. It then ran a bunch of unit tests and claimed the app worked. When it hit an error that is documented (the API has rate limiting built in), it claimed the error was due to excessive testing (LOL) and that I should simply wait.

I then spent hours debugging.

Why did it not work? Firstly, the agent had missed an important detail in the API documentation: when a particular call succeeds, the API simply returns HTTP 200 OK with no body content. The agent's code expected a JSON message in return, and treated its absence as an error. Secondly, the agent's code made a status request and then immediately sent a request for the API to do something, breaching the stated rate limit of 1 request per second. So, by design, it would fail. And the agent's response was to hallucinate a reason for the error.
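For illustration, here is a minimal Python sketch of how those two problems could be handled. It is not the app's actual code: the names `Throttle` and `parse_body` are hypothetical, the choice of Python is mine for the example, and the only details taken from the API documentation are the 1 request per second limit and the empty-bodied 200 OK.

```python
import json
import time


class Throttle:
    """Space calls at least min_interval seconds apart (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first

    def wait(self):
        """Call this immediately before every API request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def parse_body(status, body):
    """Return decoded JSON, or None when a 200 OK arrives with no body."""
    if status != 200:
        raise RuntimeError(f"unexpected HTTP status {status}")
    if not body or not body.strip():
        # Success with an empty body is valid here, not an error condition.
        return None
    return json.loads(body)
```

Calling `wait()` before both the status request and the follow-up action request keeps the pair inside the rate limit, and `parse_body` treats an empty 200 response as success instead of a failed JSON decode.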

I had to make the API calls manually to confirm that the API itself was working correctly, and it was my own experience that led me to the root cause.

Now, whether I could build better prompts to push the agent to where I managed to get, I do not yet know. But it was amusing to see the same mistakes, the same assumptions, and the same problems that human coders face. I found my way through this, but how would a non-coder trying to vibe-code their way to a prototype fix it?

This aside, at least I managed to cheer myself up, as I got to mutter under my breath to the agent: RTFM. Just RTFM...
