In this blog post I want to capture my evolving experience with what is called "vibe coding": a collaborative, conversational approach to programming alongside large language models (LLMs). Think of it as pair programming with a non-human partner that remembers nothing between sessions, makes odd mistakes - but still amazes you daily.
The term vibe coding captures the feel of it: You don't just write code, you vibe with your assistant. You explain, it responds. You discuss, it restructures. You get stuck, it offers insights. Sometimes it nails a design pattern. Sometimes it hallucinates. But if you approach it with a blend of structure, clarity, and improvisation, it works.
This article will grow over time (hopefully). I'll update it regularly as I use more tools, tackle different kinds of projects, and see how my workflow changes.

Initial Skepticism and Model Choices
Like many engineers, I was skeptical at first - especially after more than 30 years of programming experience and having seen many beginners and their typical mistakes. Could an AI really help me write real code? Not just toy examples or StackOverflow-style snippets, but full-blown applications?
To my surprise, the answer is: Yes, but with caveats.
Today, I use several models depending on the task:
- Anthropic Claude Sonnet 4 is currently my favorite for coding. It's fast, competent, and balances reasoning and structure well.
- GPT-4o and GPT-4.5 are excellent for documentation, creative writing, and structured texts.
- Local models with up to 96B parameters can be useful when privacy is essential or latency matters - but they're not competitive with the top-tier commercial models. And for most companies, let alone individuals, buying hardware to run large models is economically out of reach.
Keep in mind: you are sending your code to a third party unless you're using on-premise models. I do most vibe coding on personal projects. At work, I switch to local gateways or air-gapped setups with smaller models to avoid privacy or NDA issues. Limited hardware means reduced context windows and more friction - but it's workable.
My current stack includes:
- Code-OSS (VSCode) with the Kilo extension - lightweight and scriptable
- Cursor - polished UI, very good for LLM-assisted coding
In the Kilo extension, I configure multiple backends:
- OpenAI and Anthropic APIs for public model access
- Local Ollama or gateway models at work (with reduced context, unfortunately)
- MCP extensions to access project-specific documentation, databases, etc.
Context size is a massive factor. Having 200k tokens available makes everything smoother. Working with a 16k context on some on-premises models, or a 64k context with some Ollama models, feels like driving blind. Except for very small tasks it's not worth the hassle.
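To get a feel for whether a set of files will fit a given window, a common rule of thumb is roughly four characters per token. Here is a small sketch using that heuristic (an approximation only - real tokenizers differ per model):

```python
def approx_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token on average."""
    return max(1, len(text) // 4)

def fits_context(files: dict[str, str], window: int, reserve: int = 4_000) -> bool:
    """Check whether all file contents, plus a reserve for the model's
    reply, fit into the given context window."""
    total = sum(approx_tokens(src) for src in files.values())
    return total + reserve <= window

# Two medium-sized source files (contents stubbed out for the example):
sources = {"main.py": "x" * 40_000, "util.py": "y" * 20_000}
print(fits_context(sources, window=16_000))   # tight 16k window -> False
print(fits_context(sources, window=200_000))  # spacious 200k window -> True
```

Even this crude estimate makes it obvious why a 16k window forces you to hand-pick files while 200k lets you be generous.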

Project Architecture: Don't Skip the Planning Phase
Starting a new project? Don't jump straight into writing code - just like you wouldn't with a human team or when working alone.
The architecture mode in Kilo is especially good. Here, I:
- Outline requirements, languages, frameworks, and constraints
- Specify forbidden patterns or libraries
- Describe external APIs or systems
The agent will often generate a Markdown structure (README, spec descriptions, module breakdowns) and even Mermaid diagrams. From there, it's a conversation. You refine. It updates. You modularize.
But be warned: architecture mode can lead to massive duplication if you're not careful. The agent also sometimes overshoots with modularization or with patterns unsuited to the application at hand (there is no need to implement a full-blown MVC pattern with abstract controllers and hierarchical executors for a simple single-shot weekend application). You must actively guide the LLM to avoid bloated or redundant module design.
Clear technical descriptions are vital. The LLM is only as good as your instructions. It cannot guess technological decisions if you only give it plain everyday-English descriptions of your application. You need to know how software is properly built.

Code Mode: Boilerplate Heaven, Debugging Hell
Once the architecture is set, code mode helps with:
- Initial project scaffolding (creating all directories and initial files)
- Boilerplate generation
- Editing isolated functions or files
For small projects or clean modules, it's magic. You can fix bugs, navigate stack traces, and refactor with ease. But when your codebase grows, context becomes extremely relevant.
You must:
- Keep files small
- Provide context (via the generated Markdown files, manually written descriptions or preambles in the code)
- Avoid relying on the LLM to search the entire repo - it's inefficient and costly
Each new request starts from scratch unless you prepare documentation files the LLM can ingest. Starting over every time becomes slow and expensive extremely fast.
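One cheap way to provide that context is a short preamble at the top of each file, so a single file carries enough information to be edited in isolation. A sketch of the idea in Python - the module, API, and helper names here are hypothetical, purely for illustration:

```python
"""payment_gateway.py - handles card payments via a (hypothetical) AcmePay API.

Context preamble for LLM sessions:
- Called only from order_service.py after stock has been reserved.
- Never logs card numbers; tokens come from acme_client.tokenize().
- All amounts are integer cents to avoid float rounding issues.
"""

def to_cents(amount_str: str) -> int:
    """Convert a decimal string like '19.99' into integer cents."""
    whole, _, frac = amount_str.partition(".")
    frac = (frac + "00")[:2]  # pad or truncate to exactly two decimal places
    return int(whole) * 100 + int(frac)
```

A few lines like this at the top of every file cost almost nothing in tokens but save the model from guessing (wrongly) how the module fits into the system.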

Error Hunting: Flow Before Fix
Debugging is where the magic can happen - if you do it right.
Here's the trick: Before asking the LLM to fix an issue, describe it in detail and ask it to explain how the system works.
- How does control flow?
- How does data flow?
- What assumptions does each module make?
Only then does it have enough structure in its working memory to spot real bugs. If you skip this step, you'll pay the price - in API tokens and your own patience.
This is where tools like Kilo shine: They embed this reasoning pattern into preconfigured prompts.
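As a sketch, such a "flow before fix" prompt might look like the following - the wording is my own, not Kilo's actual built-in prompt:

```python
def flow_before_fix_prompt(bug_report: str, files: list[str]) -> str:
    """Build a debugging prompt that forces the model to explain the
    system before it is allowed to propose a fix."""
    file_list = "\n".join(f"- {f}" for f in files)
    return (
        "Before proposing any fix, answer these questions:\n"
        "1. How does control flow through the modules below?\n"
        "2. How does data flow between them?\n"
        "3. What assumptions does each module make about its inputs?\n\n"
        f"Relevant files:\n{file_list}\n\n"
        f"Only after answering, analyze this bug:\n{bug_report}\n"
    )

print(flow_before_fix_prompt("crash on empty cart", ["cart.py", "checkout.py"]))
```

The point is not the exact wording but the ordering: the model must reconstruct control flow, data flow, and assumptions before it touches the bug.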

Cost vs Value
Letās talk money.
- A typical weekend project (done fully with vibe coding) costs me around $10-20. The more you do manually, the cheaper it gets.
- A single debugging session on a large project can eat $3-5 quickly if not scoped well.
- An average developer spends around $100 per month at Anthropic according to their own statistics.
This is cheaper than a night at the pub, but more expensive than doing it all yourself. So it makes sense only if you save time **or** gain value - e.g., if you can move faster and deliver something that earns or matters (or, of course, just for the fun of it).
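For intuition, here is a back-of-envelope cost sketch in Python. The per-token prices are illustrative assumptions, not current vendor pricing:

```python
# Rough back-of-envelope for an LLM session. Prices below are
# illustrative assumptions, not any vendor's actual pricing.
PRICE_IN_PER_MTOK = 3.00    # $ per million input tokens (assumed)
PRICE_OUT_PER_MTOK = 15.00  # $ per million output tokens (assumed)

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one session from token counts."""
    return (input_tokens / 1e6) * PRICE_IN_PER_MTOK \
         + (output_tokens / 1e6) * PRICE_OUT_PER_MTOK

# Ten debugging turns, each re-sending ~100k tokens of context
# and producing ~5k tokens of output:
print(round(session_cost(10 * 100_000, 10 * 5_000), 2))  # -> 3.75
```

Note how the cost is dominated by re-sent input context, which is exactly why poorly scoped debugging sessions get expensive so quickly.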

Practical Tips for Working Efficiently
A few patterns I've learned:
- Always read diffs. Even if the LLM wrote the code, you are the reviewer. Never trust, always verify. Never accept code that you do not truly understand.
- Disable automatic writes unless you're in a controlled context. Edits should be reviewed manually at all times.
- Avoid automatic command execution unless side effects are impossible (e.g., formatting code, searching, fetching git logs, ...). Don't let an LLM run programs or configure systems without oversight. You've heard the stories of an AI agent deleting a production database. That is a sign that you made a huge mistake, not the AI agent (and it goes further - who would allow access to a production system from their development setup in the first place?). These tools amplify your capabilities, but they also amplify your mistakes if used carelessly.
- Always let the assistant write unit tests for your code. Always. It can execute them automatically (my approach: only after it asks for permission) and, in a loop, automatically interpret any runtime errors or failed assertions. It can then use those results to iterate on fixing the code with minimal effort on your side.
- Decide how polite you want to be. Saying "please" and "thanks" costs tokens. But it also influences the tone of generated documentation.
- Keep API docs and specs available. Whether in context or via tool plugins, LLMs need this info to stay grounded.
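The test-and-iterate pattern from the list above can be sketched in a few lines of Python. All function and test names here are illustrative, not from any real project:

```python
import unittest

# Code under test - in practice this would be what the assistant
# generated or edited for you.
def clamp(value: int, low: int, high: int) -> int:
    """Clamp value into the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value

class ClampTests(unittest.TestCase):
    """The kind of edge-case tests an assistant typically generates."""
    def test_inside_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)
    def test_boundaries(self):
        self.assertEqual(clamp(0, 0, 10), 0)
        self.assertEqual(clamp(10, 0, 10), 10)
    def test_outside_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(99, 0, 10), 10)

# Run the suite programmatically; the failure summary is what gets
# fed back to the assistant for the next fix iteration.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(f"failures: {len(result.failures)}, errors: {len(result.errors)}")
```

In practice the assistant writes both the tests and the fixes; the failure summary from the runner is what you (or the tool, after asking permission) feed back into the next turn.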

Ask Mode and Iterative Ingestion
One of my favorite features is the "Ask" mode. Instead of ingesting all code at once, the LLM steps through files and builds understanding iteratively.
This is incredibly efficient when, say, reverse engineering a protocol or untangling legacy code. But yes, cost adds up the more files it has to parse.

Final Thoughts (So Far)
This whole journey has changed how I work on some projects. Coding with an AI assistant is not about replacing humans (yet). It's about:
- Boosting productivity
- Reducing friction on repetitive tasks
- Providing second opinions
- Helping you think more clearly when stuck
- Accelerating learning for experienced users (beginners should not skip the fundamentals)
That said, it's not magic.
- It doesn't understand your entire repo unless you spoon-feed it.
- It doesn't persist memory across tasks.
- It can hallucinate with confidence (and we are talking about massive unwarranted confidence here, sometimes communicating authoritative nonsense).
- It cannot replace a skilled programmer, and it cannot magically give you the skills you require (beginners beware: code manually and learn the craft before using magical tools!)
- And yes, it costs money.
But all in all? I'm fascinated. This technology has evolved at near-superluminal speed over the last months and years - and development is accelerating further. There is certainly huge hype around AI in general and vibe coding in particular - but it is nothing one should ignore, and it will transform the way we work.
