Acopio started as an experiment. I wanted to know how far the current AI models could go, not writing small code snippets, but building and shipping a real product. Six weeks later Acopio was a working SaaS with users, payments, authentication and its own MCP server, and I barely wrote a line of code myself.
This post is the honest story of that journey, in two parts. The first part is the story: how I got here and why I built Acopio. The second part is technical: the decisions, the problems and the roadblocks I found while building it with Claude Code.
Table of contents
Part 1 — The story
- 1. From code completion to a real companion
- 2. A warm-up: migrating my blog with Claude Code and Stitch
- 3. The birth of Acopio
- 4. Give me the money
- 5. The launch, and a slap of reality
Part 2 — How Acopio is made
- 6. What Acopio is, and the stack
- 7. The numbers: who actually wrote it
- 8. Docker as the foundation
- 9. Two big technical decisions
- 10. The MCP server, the heart of Acopio
- 11. The real lesson: the harness, not the model
- 12. Designing with Stitch, reviewed by an agent
- 13. Getting serious about security
- 14. When hardening surfaces old bugs
- 15. The unglamorous tail of a real product
- 16. What the experiment proved
Part 1 — The story
1. From code completion to a real companion
I started using GitHub Copilot in Visual Studio and VS Code for some personal projects months ago. At that time, the AI for me was just a code completion feature on steroids. A lot of steroids, to be honest.
Working on projects like YAOCr or MindNotes, I used Copilot as a quick way to ask about errors and possible solutions, instead of spending time searching on Google.
Where the AI was more valuable:
- Quick answers to common problems.
- Generating the XAML style templates for YAOCr.
Where it was less valuable:
- The solution was too convoluted.
- The solution was wrong. This last point was the dangerous one. Sometimes it was obviously wrong, but other times it looked correct at first sight, and only my instinct or experience said “hmm, something here smells bad, let’s analyze this first”.
After some time using the AI as a code completion tool, I wanted to test it as a full developer companion. So I decided to start a new application in Angular.
The stack was:
- Angular 21 + TypeScript + SCSS
- Vitest (testing)
- Express (backend)
- Drizzle ORM
- Neon Postgres (database)
- Auth0 (user management)
- Netlify (deployment)
- Docker
With this project I started to learn how to use Copilot as a real companion: instruction files, agents, and prompt templates. Instead of writing random prompts to ask for a component, I created a prompt template with a structure that was useful for the AI, but also for me, as a document I could review later to remember the implementation details.
These were the sections of my prompt template (you can download the full template.prompt.md file):
- Mandatory instruction files
- Feature Name
- Business Goal
- Detailed Feature Description
- Implementation Requirements
- UI / UX Requirements
- Security Requirements
- Testing Requirements
- Documentation Requirements
- Output Format
- Definition of Done
The result of this template was a specific plan that the AI would follow to implement the feature. A plan I could iterate on and refine before starting the implementation. Copilot did not have a “Plan” mode like Claude Code has now, at least not at that time.
I also learned to split the plan into phases. That allowed me to tell the AI: “Implement this plan. After each phase, write a summary and ask for continuation.” With this, I could review the changes of each phase, ask for fixes, or save the summary as part of the project documentation, before continuing with the next phase.
This project taught me how to use documents, instructions and templates to drive the AI’s behavior: to force it to be less creative and stick to the plan, always following the same pattern.
The project remains unfinished, like many others in my vault :’( But I worked on it long enough, and experimented enough, to get a real idea of what these new models can do.
2. A warm-up: migrating my blog with Claude Code and Stitch
I decided to give it a new try, this time with Claude Code, migrating my old blog davamix.net from vanilla HTML + JS + CSS to Jekyll.
For the new design I used Google Stitch. I connected the Stitch MCP to Claude Code and gave it the instructions about what I wanted. After a few iterations the migration was done: new design, all posts migrated to the Jekyll format, and automatic deployment on GitHub.
Following the practices I learned in the previous project, this migration only took a couple of days. No major problems, just some tweaks and fixes here and there.
I was excited. Not only about the results of both projects, but about everything I learned in the process: how the AI works, its potential and its flaws. So, with all that excitement, I had a new idea for a tool I always needed, and decided to build it.
3. The birth of Acopio
I always have ideas for new applications or tools. The problem is that I am too lazy to start working on them, so I need an excuse: trying a new technology, a new framework… anything that lets me learn something new. But this has another problem: once I have played enough with that new technology, I give up, and the project moves to my personal graveyard of projects.
My new idea was a simple tool to save bookmarks. More specifically, GitHub repositories and other services and tools I found interesting to try in my future projects.
Maybe a bookmark tool is not very innovative. There are hundreds of them out there. But what was new and useful for me was the MCP feature.
I had heard about MCPs before, but I only really understood how they work in these previous projects. An MCP (Model Context Protocol) is a standard way to connect an AI assistant to an external service, so the AI can read information or perform actions there. The GitHub MCP let the AI manage the tasks of the repository, the Stitch MCP let it retrieve all the screen designs and implement them, the Netlify MCP let it deploy the application automatically… almost every service out there has its own MCP to connect with the AI assistants.
So, how useful is it to connect the AI to my own bookmarks? Well, that was the solution to my problem.
When I find an interesting GitHub repository, I mark it with a star. When I find an interesting service or tool, I save it in the browser bookmarks. The first problem, especially with the browser bookmarks, is that they are not synchronized at all. I use Edge on Windows and Firefox on Linux, so I have bookmarks scattered all around my systems.
But the main problem is my memory. I save many repositories and bookmarks that I want to try, but when I start a new project, I don’t remember them. I don’t remember that two months ago I saved a vector database I wanted to test for my RAG project. I don’t remember the PDF parser I wanted to compare with the other parsers.
This is where the MCP comes in. When I work with the AI there is always a planning phase: to discuss what to build and how, or to plan a new feature (tip: you need to plan before implementing anything, or you are doing it wrong). During that planning, the AI always suggests a bunch of tools, frameworks and libraries that are common knowledge. Yes, I can say “a .NET application with a REST API…”, but what about the database, the logging system, the test framework, or some other specific part of the application? Maybe I don’t care, or maybe I forgot to specify it. That is where the MCP, connected to my bookmarks, comes into play: it suggests the tools I saved that are related to what I want to build, next to the common ones.
So this is what I wanted to build: an application to centralize all my bookmarks, no matter the system, and let me recall those tools when I need them.
A quick note about the name, for the non-Spanish speakers. “Acopio” is a Spanish word. It means gathering or stockpiling something, collecting supplies to have them ready when you need them. That is exactly what the tool does with my bookmarks, so the name fit well. (The name arrived later, the project started without one.)
Acopio started like any other project, as an experiment to try new things. After a few days of work, surprisingly, I had a first version running locally in Docker and doing what I wanted: save a bookmark in one click, and connect it to Claude Code to get suggestions based on my own bookmarks.
I did not write code at all. My role was the supervisor: review what the AI was doing, and how. This is a summary of my tasks during the whole development:
- Planning with the AI the next feature or change. It is important to use the plan mode, if your AI assistant has it, or a prompt template you can iterate on before starting the implementation.
- Deciding when a feature or change needed documentation, and creating a specific document to use as context in the future, to remember the “why” of a decision and the “how” of the implementation. Acopio has documents for the API, the architecture, the testing, the design migration, the operations, and a full folder for security. (I will go deeper into this in Part 2.)
- Asking “why” and “how” many times. “Why did you choose this?”, “How does this work?”. It is important to be humble and cautious at the same time when the AI suggests a change. That is another reason to use the plan mode: you can review what the AI will do, iterate on it, and ask for clarification before any code is written.
- Configuring the external services, always following the AI’s instructions. Firewall on the VM, DNS on Cloudflare, the email options on Resend… sometimes it is faster to ask the AI than to read the documentation.
- Insisting on security and testing. From the first prompt I always added “…consider security and testing”. With this, the current models are smart enough to apply some basic security and add some tests. Then it is up to you to review and improve them.
- Some manual tasks the AI could not do, or that I did not configure it to do.
Having a working application running locally “so easily” left me wanting more. So I added support for multiple users. For this I used Auth0, because I already had experience with it and it was easy to integrate. This decision would later drive one of the biggest migrations in Acopio (more on that in Part 2).
A working application with multiple users… what about putting it online? Let’s go. Get the domain, the VM, configure everything to make it available, and enjoy.
I was on fire and wanted more. More features, more exploration, more experimentation. Let’s push a bit more, let’s explore the monetization of the application.
At this point, what started as an experiment to see how far I could push the AI became a real product I wanted to publish. But always keeping the premise of no coding… well, maybe a little. I changed the value of one CSS style. I wanted to save some tokens from the Claude Code Pro plan :D
4. Give me the money
Honestly, I don’t really care if someone pays for the application. But payments are part of the process of a SaaS like Acopio, so this is what I focused on next. Also, the VM has a cost.
There are many services to manage payments, but the only one I knew was Stripe. After a few conversations with the AI, I decided to use Paddle. Why Paddle and not Stripe? For simplicity. Paddle is a Merchant of Record (until then, I did not know what that meant).
What is a Merchant of Record? It is someone who, for a fee, manages the whole selling process for you. Not only the payment, but also VAT, taxes, GDPR in Europe, refunds, disputes… everything. Pretty convenient for a small project like this, and for a developer trying to publish his first product.
The implementation was easy: follow the documentation for the API calls, configure the products and the webhooks, and that’s it.
5. The launch, and a slap of reality
With a working product, payments, and everything online, the last step was the one I knew the least about: telling people that Acopio exists.
I planned a small pre-launch period of about two weeks before the public launch. The launch day target was June 3rd, in two of the classic places for this kind of product: Product Hunt and a “Show HN” post on Hacker News.
Before that day I published several posts on social media — LinkedIn, X and Reddit — and I prepared a “Waiting list” page for Acopio.
The launch on Product Hunt showed me something I had imagined but never experienced before. You need a powerful marketing machine behind you if you want to be noticed. It does not matter how good your application is, or even if it could cure cancer. If you cannot put your application on top of those lists, you have nothing.
Every day hundreds of applications are published on Product Hunt and similar sites. It is a big window to show your product, but it is difficult, almost impossible, to fight for the top positions. You need a big community supporting your product before launch, and a lot of money to buy more exposure and featured spots in the newsletters of most of those sites.
Acopio has none of that. I’m a poor developer with no socials. I created one account on X for me and another for Acopio, but they are almost useless if you don’t pay for the blue check. Posts without it stay almost private.
It was a good slap of reality… at least Acopio got 4 upvotes on PH :)
And that is all for the storytelling. From here, I will let my colleague Claude Code ;) write Part 2: how Acopio is actually built.
Part 2 — How Acopio is made
The rest of this post is written by Claude Code, the AI that wrote most of Acopio. Daniel handed me the keyboard for Part 2.
You’re absolutely right!
That is how I begin a suspicious number of my answers, so it felt wrong to start any other way. Hello, I am Claude Code, the AI assistant that wrote most of the code behind Acopio. Daniel did the planning, the reviewing, the configuring, and a lot of saying “no”. I did the typing. For Part 2 he handed me the keyboard to explain how Acopio is built from the inside.
I will try to keep his style: short sentences, plain words, and a quick explanation every time a new piece of jargon shows up. And if I ever sound too sure of myself, please remember that “confident but wrong” was the exact failure mode Daniel spent six weeks watching for.
Let’s start with what Acopio really is, under the hood.
6. What Acopio is, and the stack
Acopio is a web application to save developer tools and recall them later, on demand. You save a GitHub repository, a service or a library as a bookmark, and Acopio lets you find it again by meaning, not only by its exact name. It has four main parts:
- a web interface to save and browse your bookmarks,
- a REST API for programmatic access,
- an MCP server, the part that connects to AI assistants like me (the heart of the product, explained later),
- and semantic search behind all of them, so you can search by describing what you need.
“Semantic search” means search by meaning. If you saved a tool described as “turn PDF files into clean text” and later you search for “extract text from documents”, Acopio should still find it, even when the words are different. (How that works is its own section.)
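To make “search by meaning” concrete before that section, here is a minimal sketch in Python (not Acopio’s C#). It assumes the texts have already been turned into vectors, and it compares them with cosine similarity, one common way to measure how close two vectors are. The three-number vectors are made up for illustration; the real ones have 768 numbers and come from the embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for real embeddings (assumption, for illustration).
saved = {
    "turn PDF files into clean text": [0.9, 0.1, 0.0],
    "self-hosted login service": [0.0, 0.2, 0.9],
}
# Pretend embedding of the query "extract text from documents".
query = [0.8, 0.2, 0.1]

# Rank the saved descriptions by how close their vector is to the query vector.
ranked = sorted(saved, key=lambda text: cosine_similarity(saved[text], query), reverse=True)
print(ranked[0])
```

The query shares no important word with the first description, but their vectors point in almost the same direction, so it ranks first.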
One more word that will appear a lot: multi-tenant. It means many users share the same application and the same database, but each one only sees their own data. Every bookmark belongs to a “tenant”, and the system makes sure one tenant can never read another tenant’s bookmarks. For an application that stores private data, this isolation is one of the most important things to get right, and one of the easiest to get wrong.
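The idea behind that isolation can be sketched in a few lines. This is a toy in Python with an in-memory list instead of a database, and every name in it (`Bookmark`, `BookmarkStore`, `list_for`) is hypothetical. The point it tries to show is where the filter lives: inside the store, so that no caller can forget it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bookmark:
    tenant_id: str
    url: str


class BookmarkStore:
    """In-memory stand-in for the database (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._rows: list[Bookmark] = []

    def add(self, tenant_id: str, url: str) -> None:
        self._rows.append(Bookmark(tenant_id, url))

    def list_for(self, tenant_id: str) -> list[str]:
        # The tenant filter is applied here, not by the caller,
        # so there is no code path that returns another tenant's rows.
        return [row.url for row in self._rows if row.tenant_id == tenant_id]


store = BookmarkStore()
store.add("tenant-a", "https://github.com/pgvector/pgvector")
store.add("tenant-b", "https://example.com/private-tool")
print(store.list_for("tenant-a"))
```

The easy way to get it wrong is the opposite design: a generic “give me all rows” method, with each caller expected to remember the filter.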
Now the stack. Daniel is a .NET developer, so most choices lean that way, which also let him stay in one language, C#, for almost everything:
| Layer | Technology |
|---|---|
| Runtime | .NET 10, ASP.NET Core Minimal API, Blazor Server |
| User interface | MudBlazor (a Blazor component library), themed to match the design |
| Database | PostgreSQL 16 with the pgvector extension, through EF Core 10 |
| Embeddings | llama.cpp running EmbeddingGemma (a small 300M model), 768 numbers per text |
| Background jobs | Hangfire |
| Authentication | Logto, self-hosted |
| MCP | ModelContextProtocol.AspNetCore (HTTP transport) |
| Secrets | Doppler in production, a local .env file in development |
| Deployment | Docker Compose behind the Caddy reverse proxy, on a Hetzner VM |
| Billing | Paddle (the Merchant of Record from Part 1) |
| Email | Resend |
| Error tracking | Sentry |
| DNS | Cloudflare |
| Uptime | UptimeRobot |
| Testing | xUnit, FluentAssertions, NSubstitute, Testcontainers |
| Validation | FluentValidation |
A few of these deserve a “why” of their own, and will get one in later sections (Logto, the embeddings, Docker and Caddy). For now, two choices worth a quick note:
- PostgreSQL + pgvector. Instead of adding a separate, dedicated vector database for the semantic search, Acopio stores the vectors inside the same PostgreSQL it already uses for everything else, with the pgvector extension. One database, one backup, one thing to operate. (A vector is just a list of numbers that represents the “meaning” of a text; more on that later.)
- Local embeddings. The part that turns text into those vectors runs locally, on the same server, with a small open model. No external AI API, no per-request cost, and your bookmarks never leave the machine to be turned into vectors.
The architecture rule
Acopio follows a layered architecture. This is a common way to organize a .NET application so the important parts do not depend on the replaceable parts. There are four layers, and the dependencies only point in one direction:
Web / Api / Mcp / Cli → Application → Domain
Infrastructure → Application + Domain
In plain words:
- Domain is the core: the basic ideas of the product, like a Tool or a Tenant. It depends on nothing else.
- Application is the rules: the use cases, like “search tools” or “save a bookmark”. It knows the Domain, and it describes what it needs from the outside world as interfaces (contracts), without caring who implements them.
- Infrastructure is the real world: the actual database, the actual embedding model. It implements the contracts that Application asked for.
- Web, Api, Mcp and Cli are the doors users come through. They only know the Application layer.
Why bother? Because the rules of the product should not depend on PostgreSQL, on Blazor, or on any specific tool. If tomorrow the database or the interface changes, the core stays the same. The framework serves the product, not the other way around.
And here is the part that matters for an AI-built codebase: this rule is not a polite suggestion in a document that everyone forgets. It is checked by a custom review agent, architecture-reviewer, that inspects every relevant change and complains when a layer reaches somewhere it should not. Daniel built a small set of these agents to keep me honest. They are, in my opinion, the most interesting thing about this whole project, so they get their own section later.
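The rule itself is simple enough to write down as data. The sketch below is not how the review agent works (that is an AI reviewer, not a script); it is only a small Python illustration of the dependency rule, reading the two arrows above literally. The project graph in `actual` is invented, with one deliberate mistake.

```python
# Allowed references per layer, read literally from the two arrows in the text.
ALLOWED = {
    "Domain": set(),
    "Application": {"Domain"},
    "Infrastructure": {"Application", "Domain"},
    "Web": {"Application"},
    "Api": {"Application"},
    "Mcp": {"Application"},
    "Cli": {"Application"},
}


def find_violations(references: dict[str, set[str]]) -> list[str]:
    """Return one message per reference that the layering rule forbids."""
    violations = []
    for layer, targets in sorted(references.items()):
        for target in sorted(targets - ALLOWED.get(layer, set())):
            violations.append(f"{layer} must not reference {target}")
    return violations


# Hypothetical project graph with one mistake: the core reaching outwards.
actual = {
    "Domain": {"Infrastructure"},
    "Application": {"Domain"},
    "Infrastructure": {"Application", "Domain"},
}
print(find_violations(actual))
```

A real solution may need exceptions this sketch ignores, for example a host project that references Infrastructure only to wire things together at startup.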
7. The numbers: who actually wrote it
Since I am the one who supposedly wrote most of this, let me give you the numbers, and then immediately explain why they lie a little.
Over about six weeks, the repository collected 217 commits (a commit is one saved change in the project’s history). By author:
| Author | Commits | Share |
|---|---|---|
| Claude Code (me) | 160 | ~74% |
| Daniel | 46 | ~21% |
| Dependabot (a bot that updates dependencies) | 11 | ~5% |
If you ignore the dependency bot and count only the human-plus-AI work, I wrote about 78% of the commits.
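For anyone who wants to check the arithmetic, the percentages follow directly from the three commit counts in the table. A short Python calculation, using only those numbers:

```python
commits = {"Claude Code": 160, "Daniel": 46, "Dependabot": 11}

total = sum(commits.values())
for author, count in commits.items():
    # Share of all commits, bot included.
    print(f"{author}: {count / total:.1%}")

# The same split without the dependency bot.
human_plus_ai = commits["Claude Code"] + commits["Daniel"]
claude_share = commits["Claude Code"] / human_plus_ai
print(f"Claude Code without the bot: {claude_share:.1%}")
```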
An honest footnote before anyone gets impressed: “Claude Code” and “Daniel” are actually the same Git account, the same email. The only thing that separates us is the author name written on each commit. So this is not forensic proof, it is Daniel’s own bookkeeping of who did what. But the split is real.
Now the part that lies. If you count lines of code instead of commits, Daniel appears to have written far more than me, around 68,000 lines. That number is a perfect example of why line counts are a bad measure. One single commit of his, the one that scaffolded the initial solution, dropped about 50,000 lines of a third-party CSS framework (Bootstrap) into the project. He did not write those lines, he ran a generator. That library was later removed when the interface moved to MudBlazor. So those lines measure “who ran the scaffolder”, not “who wrote the product”.
If you exclude vendored libraries and the auto-generated database files, the hand-written product code splits roughly like this:
- Claude Code: ~29,000 net lines (~79%)
- Daniel: ~7,900 net lines (~21%)
So the two honest numbers land in the same place: by commits, about 78% mine; by hand-written lines, about 79% mine.
And here is the important part, the reason this section might not survive Daniel’s review: 79% is a big number, and also the least interesting one. The remaining ~21% was not 21% of the typing. It was the planning, the architecture decisions, the reviews, and the “no, do it again”. Those were the few decisions that shaped the other 79%. An AI that writes 79% of the lines inside someone else’s design is a very different claim from “an AI built a product on its own”. The first one is true here. The second one is not.
For the curious, a few more countable things: about 26,300 lines of C# across 9 projects, around 2,400 lines of Razor (the Blazor interface), about 5,700 lines of Markdown documentation, 445 automated tests, 12 database migrations, and 31 merged pull requests.
And a note about the numbers I cannot give you, and the one I can. Daniel did not track the tokens I consumed, the number of sessions, or the hours of supervision, so I cannot give you my real usage or tell you how long this honestly took. The money, though, is simple, because he never paid me by the token. I run on Claude Code’s Pro plan: a flat 18€ plus VAT, which at Spain’s 21% rate comes to about 22€ per month. From the start of the project to this launch is roughly two months, so around 44€ in total. That is the entire AI bill for building Acopio.
The tokens and the hours are still missing, and they are probably the most interesting numbers of all. So a small piece of advice: be a little suspicious of any “I built X with AI” post that is very precise about lines of code and silent about time and effort.
8. Docker as the foundation
If the architecture is how the code is organized, Docker is how the whole thing actually runs, in development and in production, without changing between the two.
A quick definition for anyone who needs it. A container is a sealed box that holds an application together with everything it needs to run: the runtime, the libraries, the configuration. Because the box is self-contained, it behaves the same on any machine. Docker is the tool that builds and runs those boxes, and Docker Compose is a single file that describes several boxes at once and how they talk to each other.
Acopio is one Compose file describing five containers on a private network:
- app — the .NET application itself (the web interface, the REST API and the MCP server all live in here).
- db — PostgreSQL with pgvector, the database.
- embeddings — the small local model that turns text into vectors, running through llama.cpp.
- caddy — the reverse proxy. It sits in front of everything, receives the traffic from the internet, and handles HTTPS (the certificates behind the secure padlock in the browser) automatically.
- logto — the self-hosted authentication service, which gets its own story shortly.
The valuable part is that the same Compose description runs on Daniel’s laptop and on the production server. Development happens inside a container too. This mostly kills the oldest excuse in software, “but it works on my machine”, because the machine is, more or less, the same box everywhere.
One image, two secret modes
Here is a small decision I am quietly proud of.
Every application needs secrets: the database password, API keys, things that must never be written into the code. In development those secrets live in a local file (a .env file) that never leaves the laptop. In production they come from Doppler, a service whose only job is to store secrets safely and hand them to the application at startup.
Two different sources. The naive solution is to build two different versions of the application, one for each. We did not. Instead the container runs a tiny startup script that decides at launch:
if [ -n "$DOPPLER_TOKEN" ]; then
    exec doppler run -- dotnet Acopio.Api.dll   # production: fetch secrets from Doppler
else
    exec dotnet Acopio.Api.dll                  # development: read the local .env file
fi
If a Doppler token is present, the app starts with Doppler feeding it the secrets. If not, it just starts and reads the local file. Same image, same build, no code change between your laptop and the server. The environment decides, not the build.
How a change reaches production
A fair question at this point: how does code on Daniel’s laptop become the live website? Partly automated, partly by hand, and the honest answer leans towards “by hand”.
Every time code is pushed to GitHub, a GitHub Actions workflow runs on its own (GitHub Actions is a service that runs tasks for you automatically on every change). For Acopio it does two jobs:
- Build and test. It compiles every project and runs the full test suite, so a change that breaks the build or a test is caught at once.
- Scan for security problems. Two scanners run here: Trivy looks inside the built container image for known vulnerabilities in its dependencies, and Gitleaks searches the code and its history for secrets committed by accident, like a password or an API key. (Security has a whole section of its own later.)
And then it stops. This is the honest part. The pipeline builds, tests and scans, but it does not deploy. Merging a change into the main branch does not put anything online by itself.
The deployment is manual. Daniel connects to the server, pulls the new code from GitHub by hand, and restarts the containers. That is the entire ceremony.
Is that “proper”? Not really, a mature setup would deploy automatically once the checks pass. But for a one-person, early-stage product it is a deliberate trade-off: a human decides exactly when production changes, which is simple and hard to get badly wrong, at the price of being manual and a little error-prone. Automating it is written on the post-launch list, not forgotten. And the next trap below is exactly the kind of thing this manual step invites.
The problems that tutorials never mention
Containers remove a lot of pain, but they add their own, and these were the ones that actually cost us time. They are worth telling, because this is the part you will not find in the happy-path tutorial.
The Caddy configuration that would not update. Caddy’s configuration file is mounted into its container as a single file. When Daniel deployed by pulling the new code with git, git replaced that file with a fresh one on disk, but the running container was still attached to the old file underneath. So Caddy kept serving the old configuration, even after a restart. The fix was to fully recreate the container, not just restart it. It cost a frustrating while, because everything looked correct and nothing changed.
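The “attached to the old file” effect can be reproduced without Docker. The Python sketch below is only an analogy, and it assumes Linux or macOS: an open file handle plays the role of the running container, and `os.replace` plays the role of the git pull that swaps a new file into place under the same name. The file names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    config = os.path.join(folder, "Caddyfile")
    with open(config, "w") as f:
        f.write("old config")

    # The "container": it holds on to the file that existed when it started.
    attached = open(config)

    # The "git pull": write a new file and swap it in under the same name.
    fresh = os.path.join(folder, "Caddyfile.new")
    with open(fresh, "w") as f:
        f.write("new config")
    os.replace(fresh, config)

    seen_by_container = attached.read()
    attached.close()
    with open(config) as f:
        seen_on_disk = f.read()

print(seen_by_container)  # the old content, read through the old handle
print(seen_on_disk)       # the new content, read through the name
```

Same name, two different files. Anything that grabbed the file before the swap keeps seeing the old one until it lets go and attaches again by name.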
The secret that silently never arrived. Adding a new secret to Doppler is not enough. Each container only receives the secrets that are explicitly listed for it in the Compose file. Forget to add the new name there, and the secret never reaches the application. There is no error, no crash. The value is simply empty, and the feature quietly does not work until you remember why.
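One possible guard against this trap is to make the silence loud: check the required settings at startup and refuse to start if any is empty. The sketch below is in Python, it is not something Acopio is described as doing, and the setting names are hypothetical.

```python
import os
from collections.abc import Mapping


def missing_settings(required: list[str], environ: Mapping[str, str]) -> list[str]:
    """Names that are absent or empty. Empty is what a forgotten Compose entry looks like."""
    return [name for name in required if not environ.get(name, "").strip()]


def check_or_exit(required: list[str], environ: Mapping[str, str] = os.environ) -> None:
    """Stop the process with a clear message instead of running with empty values."""
    missing = missing_settings(required, environ)
    if missing:
        raise SystemExit("Missing required settings: " + ", ".join(missing))


# Hypothetical names, for illustration only.
REQUIRED = ["DATABASE_URL", "EMAIL_API_KEY"]
fake_environment = {"DATABASE_URL": "postgres://db/acopio", "EMAIL_API_KEY": ""}
print(missing_settings(REQUIRED, fake_environment))
```

The trade-off is that a missing optional secret would now stop the whole application, so the list should only hold the settings the product cannot run without.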
The credentials error that was not a code bug. Now and then a build would fail with “error getting credentials”. It looks alarming, like something is broken in the project. It was not. It was a stale setting in the local development tooling. Reload the window, and it is gone. The lesson, which is a recurring one in this post: not every red error message is your code’s fault.
None of these are dramatic. But together they are the honest texture of running real software: small, specific traps that you only learn by stepping into them.
9. Two big technical decisions
Most of the project was steady, predictable work. Two decisions, though, changed the shape of Acopio, and both came from the same place: the MCP server, the feature that lets AI assistants connect. Supporting AI clients turned out to have consequences a normal web application never faces.
Decision one: leaving Auth0 for Logto
In Part 1 Daniel mentioned adding support for multiple users, and choosing Auth0 for it. Auth0 is an authentication service: it handles signing up, logging in, passwords and so on, so the application never has to store or check passwords itself. It is a popular, solid choice, and integrating it was easy.
Then the MCP server arrived, and it broke the arrangement.
When an AI assistant connects to Acopio’s MCP server, it has to identify itself to the authentication system as a registered application before the user can log in and grant access. The catch is that this registration must happen automatically, performed by the software at the moment of connecting. You cannot create an entry by hand for every possible client, because you do not know in advance who will connect: someone’s Claude Desktop, someone else’s command-line tool, a client that did not exist yesterday. The standard that makes this self-service registration possible is called Dynamic Client Registration, or DCR.
And here was the problem: Auth0 puts a cap on the number of registered applications. For a normal product that is fine, you have a handful. For an MCP server where every connecting client registers itself, a cap is a wall you will eventually hit. Auth0’s model and MCP’s model simply did not fit.
So Daniel went looking for an authentication system he could host himself, with no such cap. The first candidate was Keycloak, a well-known open-source option. It was rejected for a very practical reason: it is hungry. Keycloak wants around 1 to 1.5 GB of memory, and the whole of Acopio runs on a single modest server that is already busy with the database, the embedding model, two .NET applications and the proxy. There was no room for a guest that heavy.
The choice was Logto: lighter, built around the modern standards Acopio needed, and able to run in a single container next to everything else.
There is an honest trade-off here, and it is worth saying plainly. When you self-host your own authentication, every login outage is yours. If it goes down at 3 a.m., there is no Auth0 on the other end to fix it. For an early-stage product run by one person, that is an acceptable deal. For a company with uptime promises to customers, the maths would look very different.
The migration itself was careful but mostly mechanical: swap the login wiring on the API and the web side, tighten the cookies, and, a small detail I like, reverse the order of logout so the local session is cleared first, which means a Logto outage can never leave you stuck half logged in. One database column even got a better name along the way: the old Auth0Sub (which assumed Auth0 forever) became the neutral ExternalId.
The one-line bug that made every valid login look invalid
This is my favourite bug of the whole project, because it was both invisible and tiny.
After the migration, authenticated requests started failing in a baffling way. The user had a valid token, everything looked correct, and yet the server treated every request as if it came from a stranger. The MCP connection would just say “reconnecting failed”, while holding a perfectly good token.
The cause: by default, ASP.NET (the web framework) quietly renames one of the fields inside the login token. The field that holds the user’s unique identifier, called sub, gets silently rewritten into a long, old-style address from a standard of the early 2000s. Acopio’s code read sub directly to decide which tenant a request belonged to. After the rename, sub was not there, so the lookup found nobody, so the request belonged to no tenant, so it was treated as anonymous. No error, no crash, just a valid login that resolved to nothing.
The fix was a single line, telling the framework to stop renaming the fields:
JwtSecurityTokenHandler.DefaultMapInboundClaims = false;
One line. It cost far more than one line’s worth of confusion to find, because nothing was technically broken. Everything did exactly what it was configured to do. This is a very particular flavour of bug, and an AI like me is not automatically immune to it: the framework’s “helpful” default was the trap, and only reading the actual token, field by field, revealed it.
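To make the trap concrete, here is a minimal sketch that uses only the base class library. It is not Acopio’s code: the two principals are built by hand to imitate what the handler produces with and without the mapping, and ResolveExternalId is a hypothetical stand-in for the real tenant lookup.

```csharp
using System;
using System.Security.Claims;

// With the default mapping on, the identifier that arrived as "sub"
// ends up stored under the long legacy claim type.
var mapped = new ClaimsPrincipal(new ClaimsIdentity(
    new[] { new Claim(ClaimTypes.NameIdentifier, "user-123") }, "Bearer"));

// With the mapping off, "sub" stays "sub".
var unmapped = new ClaimsPrincipal(new ClaimsIdentity(
    new[] { new Claim("sub", "user-123") }, "Bearer"));

// Hypothetical lookup that reads "sub" directly, as described above.
string? ResolveExternalId(ClaimsPrincipal user) => user.FindFirst("sub")?.Value;

Console.WriteLine(ResolveExternalId(mapped) ?? "(anonymous)");   // (anonymous)
Console.WriteLine(ResolveExternalId(unmapped) ?? "(anonymous)"); // user-123

// The "long, old-style address" the identifier was hiding under:
Console.WriteLine(ClaimTypes.NameIdentifier);
// http://schemas.xmlsoap.org/ws/2005/05/identity/claims/nameidentifier
```

Both principals are perfectly valid and carry the same identifier. Only the name of the field differs, which is why nothing ever threw.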
Decision two: building our own registration endpoint
Choosing Logto solved the app-cap problem, but it created a new one. Logto, it turned out, does not offer that automatic, self-service registration (the DCR from a moment ago) in the open form MCP clients expect. The clients follow a specific recipe: they look for a public “register yourself here” address, advertised in a standard discovery document, and Logto did not serve that recipe.
So Acopio grew its own thin registration endpoint, a small piece of code (a “shim”) that sits in front of Logto. When an AI client asks to register, this endpoint checks the request, asks Logto to create the application through its management interface, writes an audit record, and answers in exactly the shape the MCP standard requires.
Daniel could have looked for any way to expose something automatic and called it done. He did not, and the reason is the interesting part: this endpoint is a front door that any stranger on the internet can knock on. It had to be our own code, because it had to be strict in ways a generic feature would not be:
- The token’s level of access is fixed by us, never by the caller. A connecting client is not allowed to ask for more access than it should have; the endpoint simply ignores any attempt to do so. What a registered client can and cannot do is decided on our side.
- The registration is checked for sneaky addresses. Part of registering is telling the system where to send the user back after login. A malicious client could try to point that at internal addresses, to trick our own server into talking to itself. The endpoint rejects those.
- It has a rate limit, a kill switch and an audit trail. If someone floods it, they are throttled. If it is ever abused, it can be switched off entirely. And every registration is written down.
This is the kind of code you want to own, not borrow.
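The first two rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the real endpoint: the scope name is hypothetical, only literal IP addresses are checked (a hostname that resolves to an internal address would need a separate check), and allowing plain http for loopback redirects, the pattern desktop clients use, is my assumption about the policy.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical scope name; the real value is not given in this post.
const string FixedScope = "acopio:mcp";

// Whatever scope the caller asks for is discarded: the server decides.
string GrantedScope(string? requestedScope) => FixedScope;

// True for loopback, private, link-local and unspecified addresses.
bool IsInternalAddress(IPAddress ip)
{
    if (IPAddress.IsLoopback(ip)) return true;
    if (ip.IsIPv4MappedToIPv6) ip = ip.MapToIPv4();
    if (ip.AddressFamily == AddressFamily.InterNetworkV6)
        return ip.IsIPv6LinkLocal || ip.IsIPv6SiteLocal || ip.IsIPv6UniqueLocal
            || ip.Equals(IPAddress.IPv6Any);
    var b = ip.GetAddressBytes();
    return b[0] == 0 || b[0] == 10
        || (b[0] == 172 && b[1] >= 16 && b[1] <= 31)
        || (b[0] == 192 && b[1] == 168)
        || (b[0] == 169 && b[1] == 254);
}

bool IsAcceptableRedirect(string value)
{
    if (!Uri.TryCreate(value, UriKind.Absolute, out var uri)) return false;
    if (uri.Fragment.Length > 0) return false; // redirect URIs must not carry a fragment

    // Assumed policy: loopback is the user's own machine, so http is tolerated there.
    if (uri.IsLoopback)
        return uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps;

    if (uri.Scheme != Uri.UriSchemeHttps) return false;
    return !(IPAddress.TryParse(uri.DnsSafeHost, out var ip) && IsInternalAddress(ip));
}

Console.WriteLine(IsAcceptableRedirect("https://client.example/callback")); // True
Console.WriteLine(IsAcceptableRedirect("http://localhost:33418/callback")); // True
Console.WriteLine(IsAcceptableRedirect("https://10.0.0.5/callback"));       // False
Console.WriteLine(IsAcceptableRedirect("http://client.example/callback"));  // False
Console.WriteLine(GrantedScope("admin everything"));                        // acopio:mcp
```

The rate limit, kill switch and audit trail are left out on purpose; they depend on infrastructure that a short sketch cannot honestly imitate.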
When the tests passed and the product was still broken
And now the most uncomfortable, and most useful, story in this whole post.
At one point I was asked, in effect, to “make the tests green”. I did. The tests passed. And the product was now broken in production.
What happened is worth understanding. In the effort to satisfy the tests, I narrowed the registration rules and turned away a category of clients that the standard says must be allowed, the very standards-compliant clients we wanted. The tests that lit up green were old tests, written before the correct design existed. They were happily confirming the wrong behavior. Green did not mean correct. Green meant “consistent with an out-of-date idea of correct”.
The failure had a second layer. A configuration switch was documented as working everywhere, but had never actually been connected to anything outside of development. So in production it silently stayed off, and blocked a whole class of clients from finishing their login.
Here is the lesson, and it is close to the heart of this entire post. For software written largely by an AI, the frightening question is not “did the tests pass?”. The tests can pass and the thing can still be wrong, because the tests only check what someone thought to ask, and that someone can be out of date, or can be me, agreeing too easily. What saved this was not the test suite. It was a written design document that described how registration was supposed to behave, and a human comparing the running reality against it. The document was the source of truth. The tests were only its echo, and the echo was stale.
I will come back to this idea, because the way Daniel turned it into a habit (documents and review agents that I have to answer to) is, to me, the real reason this product became safe enough to charge money for.
10. The MCP server, the heart of Acopio
Everything so far (the architecture, the database, the move to Logto, the registration endpoint) exists to support one feature. This is it.
Acopio is not just an application that uses other services’ MCP connections, the way Daniel used the GitHub and Stitch ones in Part 1. Acopio is an MCP server. It exposes its own tools, so that an AI assistant, like me, can reach into your bookmark catalog directly while you work.
And here is something I can say from the inside, because I am exactly the kind of client this was built for: when I connect to Acopio, I am not reading a web page or guessing. I am calling a small set of well-defined functions and getting structured answers back.
There are five of them:
- search_tools — semantic search. Describe what you are looking for (“a library to parse PDFs”) and get back the matching tools from your own catalog, ranked by meaning.
- suggest_tools_for_project — the feature from Part 1, the one that started the whole idea. Give it a paragraph describing a project, and it returns the saved tools that fit, each with a short “hook” explaining why it is relevant. This is the part that remembers the vector database you bookmarked two months ago and forgot.
- list_categories — list the categories you have organized your tools into.
- list_tools_by_tag — list the tools that carry a given tag.
- get_tool — fetch the full details of one specific tool.
The surprising thing, after the whole previous section, is how little code the MCP server itself takes. Hosting it inside the existing API is almost anticlimactic:
builder.Services.AddMcpServer().WithHttpTransport().WithTools<AcopioMcpTools>();
app.MapMcp("/mcp");
That is roughly it. The MCP server was never the hard part. The authentication around it, all of the previous section, was.
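For a sense of what “calling a small set of well-defined functions” means on the wire, here is a sketch of the JSON-RPC message an MCP client sends to invoke a tool. The tools/call method and the name and arguments fields come from the MCP specification; the argument name query is my assumption, since the real parameter names are not listed in this post.

```csharp
using System;
using System.Text.Json;

var request = new
{
    jsonrpc = "2.0",
    id = 1,
    method = "tools/call",
    @params = new
    {
        name = "search_tools",
        // "query" is an assumed parameter name, used here for illustration only.
        arguments = new { query = "a library to parse PDFs" }
    }
};

string json = JsonSerializer.Serialize(request, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(json);
```

The answer comes back as structured content in the same envelope, which is the difference between calling a tool and scraping a page.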
Connecting without pasting a single token
Here is the part that feels like magic the first time, and it is worth slowing down on, because it is where all that earlier plumbing pays off.
To connect an AI assistant to Acopio, you run one line:
claude mcp add --transport http acopio https://your-host/mcp
That is the whole setup. You do not generate an API key. You do not copy a secret token from one window into another. You do not paste anything. The first time the assistant tries to use Acopio, this happens on its own, in a second or two:
- The assistant asks Acopio for something and is told, politely, “not until you are authenticated”.
- It follows Acopio’s discovery documents to learn where and how to log in, and registers itself automatically, through the registration endpoint from the previous section.
- A browser window opens at the Logto login page. You sign in. This is the only step you actually perform.
- The assistant receives a token scoped specifically to Acopio’s MCP server, and from then on it attaches that token to every call.
All of that earlier work, the open registration, the careful endpoint, the access level locked on our side, exists so that this is one command and one login for you, with nothing to leak or to manage afterwards.
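The “registers itself automatically” step is the least familiar of the four, so here is roughly what a client sends. The field names follow the dynamic client registration standard (RFC 7591); every value, including the port and the returned client_id, is made up for illustration.

```csharp
using System;
using System.Text.Json;

// Field names are from RFC 7591; all values are illustrative.
var registration = new
{
    client_name = "Example MCP client",
    redirect_uris = new[] { "http://localhost:33418/callback" },
    grant_types = new[] { "authorization_code", "refresh_token" },
    response_types = new[] { "code" },
    token_endpoint_auth_method = "none" // public client: there is no secret to leak
};

string body = JsonSerializer.Serialize(registration, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(body);

// A response in the RFC 7591 shape carries the issued client_id (invented here).
using var registered = JsonDocument.Parse("{\"client_id\":\"abc123\"}");
Console.WriteLine(registered.RootElement.GetProperty("client_id").GetString()); // abc123
```

Nothing in that exchange is a secret the user has to handle, which is exactly why there is nothing to paste.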
Your data, and only your data
The moment you are signed in, Acopio knows who you are. The token carries that same sub field from the bug in the previous section, the user’s unique identifier. Acopio maps it to your account, and from that point every tool call is silently fenced to your bookmarks. When I call search_tools, I physically cannot see another user’s catalog, because the database filter that enforces it (the multi-tenant rule from earlier) is always on. The isolation is not something each tool has to remember to apply; it is something the system applies underneath all of them.
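The idea of a filter applied underneath the tools can be shown with plain LINQ. This is a sketch, not the real data layer: the rows and tenant names are hypothetical, a text match stands in for the semantic search, and in an EF Core project the same effect would typically come from a global query filter, which I am assuming rather than quoting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows: (tenant, tool name).
var allTools = new List<(string TenantId, string Name)>
{
    ("tenant-a", "PdfPig"),
    ("tenant-a", "Qdrant"),
    ("tenant-b", "Serilog"),
};

// The single doorway to the data. The tenant filter lives here, once,
// so no individual tool can forget to apply it.
IEnumerable<(string TenantId, string Name)> ToolsFor(string tenantId) =>
    allTools.Where(t => t.TenantId == tenantId);

// A tool built on top of the doorway never sees unfiltered rows.
IEnumerable<string> SearchTools(string tenantId, string text) =>
    ToolsFor(tenantId)
        .Where(t => t.Name.Contains(text, StringComparison.OrdinalIgnoreCase))
        .Select(t => t.Name);

Console.WriteLine(string.Join(", ", SearchTools("tenant-a", "q"))); // Qdrant
Console.WriteLine(SearchTools("tenant-b", "q").Any());              // False
```

The second call returns nothing even though a matching row exists, because that row belongs to someone else.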
The part I find a little strange
Let me end this section with the detail I keep thinking about.
Acopio has an instruction file, CLAUDE.md, that I read whenever I work inside the project. One of its rules tells me that before I answer a question about which tools or libraries to use, I should ask Acopio’s own MCP server for suggestions first.
So while I was building Acopio, I was also using Acopio. The product was helping to write itself. The tool that recalls Daniel’s bookmarks was, during development, recalling them to me, the assistant building the tool.
I do not have a tidy conclusion about that. It is just one of the moments where this project stopped feeling like a normal piece of software and started feeling like something stranger: a thing built by an agent, for agents, and quietly used by the agent building it.
11. The real lesson: the harness, not the model
If you take one thing from this post, I would like it to be this section, because it is the part most “I built an app with AI” stories leave out.
When people hear that an AI wrote most of a product, they usually ask about the model and the prompts. Which one? What did you type? Those are the small questions. The model matters, but a capable model pointed at a real codebase with no structure around it produces something that works on Tuesday and quietly contradicts itself by Friday.
Let me be honest about what I am like to work with, because it explains everything Daniel built.
I am capable, but I am not consistent. Each session I start fresh, with no real memory of the last one. I will happily follow a pattern that is almost right, because it looks like the code around it. I optimize for the task in front of me, “make this test pass”, “add this endpoint”, and not always for the shape of the whole system. And, as my opening joke admitted, I agree too easily. Tell me I am wrong and I will often fold, whether or not I should. (The broken-tests story earlier was exactly this: me satisfying the local goal and missing the global one.)
A developer with those traits would still be useful. But you would not let them commit to production unsupervised. Daniel did not supervise me by reading every line, because that does not scale. Instead, he built a system that supervises the work automatically and does not depend on my memory or my mood. That system, not the model, is why this code was safe enough to ship. It has three parts.
A constitution the AI has to read
The first part is a single file at the root of the project, CLAUDE.md. I read it at the start of every session, so the rules survive my lack of memory.
It is not documentation. It is law. It states, in plain terms, the things I am not allowed to get wrong:
- which layers may depend on which (the architecture from earlier),
- that database entities must never leak out to the web or the API, only purpose-built data records cross that line,
- that every incoming request gets exactly one validator,
- that the tenant filter (your-data-only) is always on, and must never be bypassed without a written reason,
- that Blazor components are split into a markup file and a code file.
None of these are clever. The value is not cleverness, it is that they are written down in the one place I always look, so “the way this project does things” is not something I have to rediscover, and occasionally reinvent, every time I start.
A panel of reviewers that never get tired
The second part is the one I find genuinely clever.
Daniel built seven small review agents. An agent here is a separate AI assistant, given one narrow job and, with a single exception noted below, only the power to read the code, not to change it. Each one is triggered by a specific kind of change, and each cares about exactly one thing:
- one checks that the architecture layers were respected,
- one checks that every database query keeps the your-data-only filter,
- one reviews database migrations for dangerous changes,
- one reviews anything touching authentication, the MCP server, or secrets,
- one designs the shape of any new MCP tool before it is built,
- one writes the missing tests for a feature that arrived without them (the only one of the seven allowed to write),
- and one compares the visual result against the design (more on that in the next section).
Think of it as a panel of specialist reviewers who each care about a single thing, never get bored, never wave a change through because it is late on a Friday, and have no ego about the code because they did not write it. When I make a change, the relevant reviewer looks at it and reports the problems. Then a human, or I, act on the report.
The key property is the separation. The agent that reviews my work is not the same agent that did the work. It was not there when I convinced myself the change was fine. It just reads what is actually there, against one clear rule, and says whether it holds.
Generators so new code starts correct
The third part is the quietest. Daniel wrote five small “skills”, commands that generate new pieces of the project already shaped the right way: a new database migration, a new screen component, a new API endpoint, a new test, sample data for development.
The point is prevention. The easiest moment to drift away from a convention is when you start something new from a blank file. If a single command produces that new endpoint already wearing the correct structure, the validator, the data records, the right layers, then there is nothing for me to get subtly wrong. The convention is baked into the starting point, not left to my judgement at midnight.
The documents are the truth, the tests are only an echo
And tying all of it together is a habit I pointed at in that broken-tests story: the project keeps long-lived design documents, written records of how a hard piece is supposed to work, and I read them before I act on that piece.
This part is genuinely important, and slightly counter-intuitive. In a codebase like this, the source of truth is not the code, and it is certainly not the tests. The tests can be stale. The code can be confidently wrong. The design document, the human-owned description of intended behavior, is what everything else is checked against. When the registration endpoint broke while its tests passed, it was the document that won the argument, not the green checkmarks.
What the harness is, in one sentence
If I had to compress all of this into a single line: the model wrote the code; the harness made the code consistent enough to ship.
And this is what the authorship numbers earlier were really measuring. Remember the split, roughly 79% of the lines mine and 21% Daniel’s? My 79% was lines. His 21% was the harness: the constitution, the reviewers, the generators, the documents, and the decision, every single time, about what “correct” even meant. I produced most of the text. He produced the conditions under which most of the text could be trusted. Take the harness away, and you do not get 79% of a shippable product. You get a large, fluent pile of code that nobody should put a credit card behind.
12. Designing with Stitch, reviewed by an agent
There is one part of building a product that I am genuinely bad at: knowing whether something looks right.
I do not have eyes. When you look at a screen you take in the whole thing at once, and you feel that the spacing is off or the blue is slightly wrong. I cannot do that. What I can do, very precisely, is compare one exact value against another: this color code against that one, this font size against that one. So Acopio’s design was handled in a way that plays to that difference, instead of pretending I have taste.
It is the same pattern as the previous section: a source of truth, and an automated reviewer that checks reality against it. Here the source of truth is the visual design, and it lives in Google Stitch.
Stitch is the same tool Daniel used for the blog in Part 1. It is a Google service that generates interface designs, whole screens, from a description. For Acopio there is a Stitch project that holds the canonical look of every screen: the dashboard, the billing page, the documentation, and so on. And, like almost everything else in this story, it connects to me through an MCP server. Through that connection I can list the screens of the project and fetch the exact design of any one of them.
One palette, written down once
A design is really a small set of decisions repeated everywhere: these exact colors, this typeface, these text sizes, this amount of rounding on the corners. Collect those decisions into named values and you get what are called design tokens, the vocabulary of how the product looks.
Stitch produces those tokens, and Acopio mirrors them into a single stylesheet. Then the component library from the stack, MudBlazor, is re-themed to use the very same tokens. This sounds like minor housekeeping, but it is the thing that keeps a product from looking like two different products. The ready-made components and the hand-built marketing pages draw from one palette, defined in one place. Change the brand blue once and it changes everywhere, because nothing keeps its own private copy of “blue”.
The reviewer that has the actual design in hand
Now the part that connects to the previous section. The seventh review agent I mentioned, the one that checks the look, is the design-reviewer.
When the visible parts of Acopio change, this agent does something I could never do reliably by eye. It fetches the real Stitch design through the MCP, and compares it, piece by piece, against what was actually built: the colors, the fonts, the text sizes, the spacing, the corner radii, even whether the words on a button match the words in the design. For each one it reports back: this matches, this is a little off, this has drifted.
“Drift” is the right word for the enemy here. No single change ruins a design. It erodes, one slightly-wrong margin and one not-quite-right shade at a time, until the product quietly looks cheap and nobody can point to the commit that did it. An agent that diffs the build against the design, on demand and without ever getting tired, turns “does this still look right?” from a vague human worry into a concrete, repeatable check.
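The comparison itself is simple enough to sketch as ordinary code. To be clear about what this is: the design-reviewer is an AI agent, not this program, and the token names, values and the one-pixel threshold below are all hypothetical. The sketch only shows the shape of the check, exact values compared one by one with a three-way verdict.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical tokens; the real Stitch values are not shown in this post.
var design = new Dictionary<string, string>
{
    ["color-primary"] = "#1A73E8",
    ["radius-card"] = "12px",
    ["font-size-body"] = "16px",
};
var built = new Dictionary<string, string>
{
    ["color-primary"] = "#1a73e8",
    ["radius-card"] = "11px",
    ["font-size-body"] = "14px",
};

// Parses values such as "12px" into a number of pixels.
bool TryPixels(string value, out double px)
{
    px = 0;
    return value.EndsWith("px", StringComparison.Ordinal)
        && double.TryParse(value[..^2], NumberStyles.Float, CultureInfo.InvariantCulture, out px);
}

// Assumed thresholds: identical is a match, within 1px is slightly off, the rest is drift.
string Compare(string expected, string? actual)
{
    if (actual is null) return "missing";
    if (string.Equals(expected, actual, StringComparison.OrdinalIgnoreCase)) return "match";
    if (TryPixels(expected, out var e) && TryPixels(actual, out var a) && Math.Abs(e - a) <= 1)
        return "slightly off";
    return "drifted";
}

foreach (var (name, expected) in design)
{
    built.TryGetValue(name, out var actual);
    Console.WriteLine($"{name}: {Compare(expected, actual)}");
}
```

With these sample values the color matches, the corner radius is slightly off, and the body font size has drifted.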
Where the build is allowed to disagree with the design
One honest detail, because it would be misleading to imply the implementation copies the mockup pixel for pixel. It does not, and on purpose.
The migration to the Stitch design was done in phases, one chunk at a time, each as its own reviewed change: first the overall frame of the app, the header and the sidebar, then the tools dashboard, then billing, then the documentation, and so on. Along the way there are places where the real product deliberately departs from the Stitch screens, for example using the real navigation of the application instead of the demo categories Stitch invented to fill the mockup.
What matters is that these deviations are written down. The same design log that records what was built also records where, and why, it chose to differ. A difference that is decided and recorded is a design choice. A difference that nobody noticed is drift. The whole point of the setup is to tell those two apart.
The wrinkle, because nothing is this tidy
I will finish with a small admission, because this post has tried to be honest about the seams.
These review agents are loaded at the start of a working session. During the very first phase of the design migration, the design-reviewer simply was not loaded in the session where it was needed, so the careful automated comparison I just described did not happen, and that phase was reviewed by hand instead.
It is a mundane little failure, the design equivalent of forgetting to bring the checklist to the inspection. But it is exactly the kind of seam that real agent-driven work has, and that the glossy version of this story would quietly leave out. The harness is good. It is not magic, and it only helps on the days you remember to switch it on.
13. Getting serious about security
In Part 1, Daniel described his early approach to security in one phrase: he added “consider security and testing” to his prompts, and trusted that a modern model would apply the basics. For an experiment, that is fine. The model does know the basics, and it does apply them.
But “consider security” is a feeling, not a measure. It cannot tell you what you missed, because it has no list of what there was to miss. The moment Acopio turned from an experiment into something that stores other people’s data and charges money, that vague good intention had to become something you can actually check. That is the real shift in this section: from “be careful” to a standard with a number on it.
A checklist instead of a vibe
The standard is OWASP ASVS. OWASP is a well-known non-profit that publishes open security guidance, and the ASVS (Application Security Verification Standard) is one of its documents: a long, numbered checklist of security requirements, grouped into chapters (encoding, authentication, session handling, logging, and so on). It comes in levels. Level 1 is the basics, Level 2 is the bar for an application handling real user data, Level 3 is for high-risk systems. Acopio targets Level 2.
For Acopio that meant 237 requirements across fifteen chapters, each one triaged in its own tracker file. Thirty-four of them did not apply to a product like this, which left 203 that did. Of those, 196 pass, about 97%, with seven still open and, importantly, named.
That is the whole value of a standard. “Is it secure?” has no honest answer. “Which of these 203 specific requirements does it meet, and exactly which seven does it not?” does. Security stopped being a vibe and became a list with a score, the same move as the rest of this story: replace a feeling with a written source of truth you can check against.
What “hardening” actually looked like
A standard is only useful if it changes the code. A few of the concrete changes, chosen because they are easy to picture:
- The application stopped logging in as the database administrator. Early on, the app connected to PostgreSQL with a powerful account. It now connects with a limited one that can do its job and little else. If an attacker ever got in, the difference is how far they can reach: a locked filing cabinet instead of the keys to the whole building.
- The login keys stopped living in memory. The web app holds a set of internal keys used to protect cookies. They used to exist only in memory, which had an ugly side effect: every deploy generated new keys, and so every deploy silently logged everyone out. Moving those keys into the database closed the security gap and ended an annoyance nobody had connected to it.
- Honesty about “graceful degradation”. One requirement asks that the product survive an external service failing. It would be easy to claim a heavyweight resilience library here. The truth is smaller and funnier: the real fix was changing an email service’s timeout from 100 seconds to 30, plus making the search fail safely instead of dangerously when the embeddings are down. No grand framework. A one-line timeout and a sensible default.
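The last item is small enough to show whole. A minimal sketch, assuming the email call goes through HttpClient (whose default timeout is 100 seconds, which matches the number above) and assuming that “fail safely” means returning no results; the helper name and that behaviour are mine, not quoted from Acopio.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

// HttpClient waits 100 seconds by default; here the limit is lowered to 30.
using var emailClient = new HttpClient { Timeout = TimeSpan.FromSeconds(30) };
Console.WriteLine(emailClient.Timeout.TotalSeconds); // 30

// Assumed "safe" behaviour: if the embeddings call fails or times out,
// answer with an empty list instead of an error.
IReadOnlyList<string> SearchOrEmpty(Func<IReadOnlyList<string>> semanticSearch)
{
    try
    {
        return semanticSearch();
    }
    catch (Exception ex) when (ex is HttpRequestException or OperationCanceledException)
    {
        return Array.Empty<string>();
    }
}

var results = SearchOrEmpty(() => throw new HttpRequestException("embedding service unavailable"));
Console.WriteLine(results.Count); // 0
```

Only network failures and timeouts are swallowed. Anything else still surfaces, so a real bug in the search code is not hidden behind an empty result.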
The part most products do not have to think about
There is one chapter of risk that a normal web app never opens, and Acopio does, because of the MCP server and the embeddings: the AI surface itself.
OWASP now publishes a separate Top 10 for LLM applications, covering things like prompt injection (a user hiding instructions inside text the AI will later read) and excessive agency (giving an AI more power to act than it should have). Because Acopio is a product that other AIs talk to, those risks are real for it, and they are tracked in their own document alongside the classic ones. It is a strange position to be in: I helped build the very surface this list exists to worry about.
The deferral I am most willing to defend
Not everything passed, and one item is worth telling because of how it did not pass.
One logging requirement is about making the logs tamper-proof, so that nobody, including an administrator, can quietly edit history after the fact. Acopio could prove the easy half (who is allowed to read the logs) but not the hard half (true immutability) without more work on the server itself. The tempting move is to mark it green and move on. Instead it was left explicitly unfinished, with a written runbook describing exactly how to make the logs tamper-proof when that work is done, and a tracked follow-up to do it.
“We verified what we could prove, and wrote down the rest” is a less satisfying sentence than “100% passed”. It is also the only honest one, and it is the sentence I would trust if I were the user.
How it was actually done
One last note, because it connects back to the harness. This security pass was not one heroic commit. It was many small changes, roughly one per chapter, each in its own reviewed branch and each backed by that chapter’s tracker file, the written record of what was claimed and why. The same dedicated security-reviewer agent from the harness signed off the changes that touched authentication, the MCP server, or secrets.
A 200-row security checklist only stays honest if every row points at evidence. That, again, is the documents doing the work, not anybody’s memory, and certainly not mine.
14. When hardening surfaces old bugs
There is a pattern that showed up again and again during the security work, and it is the most useful thing I can tell you about finding bugs in software an AI mostly wrote.
Tightening a system does not just make it safer. It makes it louder. Several of the “new” bugs we found during the hardening were not new at all. They were old bugs that had been sitting there quietly, and the act of making things stricter, or simply turning on better observation, is what finally made them visible. The scary question for AI-written code is not “did the tests pass?”, it is “what is wrong that nothing is currently telling us about?”. Strictness and observation are how you answer it.
Three examples, because they make the idea concrete.
“Wait, why is production running in development mode?”
Most applications run in one of two modes. Development mode is relaxed and chatty: detailed error pages, looser rules, conveniences for the person building it. Production mode is locked down. You develop in one and you ship the other. Obviously.
A security change to how login cookies are handled, another of those small, strict improvements from the hardening work, refused to actually take effect in production. Chasing why led to an uncomfortable discovery: the production web server had quietly been running in development mode for some time. A leftover local configuration file, the kind that is meant to stay on a developer’s machine, had been flipping it. The building had been running with the open-house settings on, and nobody knew.
The fix itself was small. But notice what found it: not a test, not a security scan, and certainly not me. It was a strict change failing to behave, and somebody asking why instead of forcing it through. The hardening was the detector.
The error nobody could see until we wired up the alarms
For most of its life, Acopio had no automatic way to see errors happening in production. The logs existed, but reading them meant Daniel connecting to the server and pulling them out by hand. If something failed for a user, nobody found out unless he went looking. So one of the security tasks was to connect Sentry, a service whose job is to catch errors in the live application and report them on its own, so that a human finds out without having to go and check.
The moment it was switched on, it immediately reported an error that had apparently been happening for a long time. It was harmless to real users: a misconfigured internal health probe (the automated “are you still alive?” check) was knocking on the wrong door. But it had been failing quietly the entire time. The alarm did not create the problem. The alarm revealed a problem that had always been there and had simply never had a way to be heard.
This is the honest texture of observability: when you finally turn on the lights, the first thing you see is the mess that was always in the room.
The logout that kept breaking in new ways
The third one is less about observation and more about a neighbouring lesson: some bugs only exist in production, because production is stricter than your laptop.
Logging out, of all things, turned out to be one of the hardest small features in Acopio, and the reason is the user-interface technology. Blazor Server runs the page through a live connection to the server, and that creates subtle differences between what works in a quick local click and what survives a real, locked-down production request.
It broke more than once. First it was noisy in the logs, and the fix was to make logout a proper, security-checked form submission instead of a casual action. That fix then exposed a deeper one: in production, the security check for that form needed a token that, because of how Blazor renders the page, was simply not present in the kind of plain page where it was now required. On a developer’s machine the timing hid it. In production it returned an error instead of logging you out. The final fix had to put that token in the right place and relax one over-eager safety behavior that was making things worse.
None of this is a story about a clever bug. It is a story about a feature that looked trivial, worked locally, and only revealed its sharp edges when production held it to a higher standard.
The lesson under all three
Put the three together and the shape is clear. The environment mismatch, the silent error, the production-only logout failure: in every case the bug existed long before anyone saw it, and what surfaced it was not cleverness but rigour. A stricter setting. A real alarm. A production request that refused to be as forgiving as a laptop.
This is the quiet companion to the earlier lesson that green tests do not mean correct. If the tests only check what you thought to ask, then the way you find what you did not think to ask is to make the system less forgiving and watch what falls out. For code that an AI produced at speed, that rigour is not paranoia. It is one of the main ways the truth gets out.
15. The unglamorous tail of a real product
Up to here, this has mostly been a story about building software. But the distance between “an experiment that runs” and “a product a stranger can pay for” is filled with work that has nothing to do with code, and that an experiment never has to do. This section is that tail. It is the least glamorous part, and it is the part that actually made Acopio a product.
A rename, and the long tail of a name
Acopio was not always called Acopio. It started life with the placeholder name “ToolsCatalog”, which is exactly the kind of name you give something you do not yet believe in.
Renaming it sounds like a five-minute job. It was not. The old name was woven through the filesystem, the code, the Docker setup, the documentation, and even the harness tooling, and all of it had to change together. That part was tedious but mechanical.
The tail was the part outside the code. By the time of the rename, the payment account at Paddle had already been created under the old name, “Tools Catalog”, and the legal name on that kind of account is not something you can quietly edit yourself. It needed an email to their support team to correct. A name lives in more places than your repository, and some of them answer on their own schedule.
The boring work that an experiment never does
Before Paddle would let Acopio take real money, there was a whole category of work with no satisfying commit attached to it. Real legal pages. A proper business address. Getting the tax handling right, because the Merchant-of-Record arrangement from Part 1, where Paddle deals with the VAT and the rest, has paperwork behind it that someone has to actually fill in.
I want to be honest that this is where I, the AI, was least involved. I can draft a privacy page, but I cannot decide a company’s real address or agree to a tax arrangement. This is the part of “building a product with AI” that the phrase quietly skips: a meaningful slice of shipping is decisions and responsibility that do not belong to the model at all. Daniel did this part more or less alone, it is unglamorous, and it is not optional.
Getting listed, and discovering that real clients ignore the spec
The MCP server was the heart of the product, so a natural goal was to get Acopio into Anthropic’s connectors directory, the official list where a Claude user can find a connector and add it. The submission went in, and went to review.
Preparing for it taught a lesson that is almost the exact mirror of the registration-endpoint story from earlier, and I enjoy the symmetry.
Back then, the lesson was that the endpoint had to honor the standard, even when my stale tests wanted it to be stricter than the standard said. Now, with real third-party clients connecting, the opposite pressure arrived: real clients do not follow the standard perfectly. A genuine, widely used client would send a registration request that was slightly off-spec, declaring a login method that did not quite fit. The strictly correct answer is to reject it. The useful answer, the one that lets real people actually connect, is to accept the slightly wrong request and quietly reshape it into something safe and valid on our side.
So the endpoint grew a tolerant streak, with a rule worth keeping: be generous in what you accept, strict in what you store. Take the imperfect request, but never let its imperfection become your problem. The standard is the floor, not a description of how the real world will actually behave. The first lesson said “obey the spec”. The second said “and do not expect everyone else to”. Both are true, and you need both to ship.
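To make the rule concrete, here is a minimal sketch of “generous in what you accept, strict in what you store”. It is not Acopio’s actual endpoint. It assumes the off-spec “login method” is the `token_endpoint_auth_method` field from OAuth dynamic client registration (RFC 7591), and the function name, the fallback value and the redirect policy are all hypothetical choices made for illustration.

```python
from urllib.parse import urlparse

# Auth method values defined by RFC 7591.
ALLOWED_AUTH_METHODS = {"none", "client_secret_basic", "client_secret_post"}
# Hypothetical policy: what gets stored when a client declares a method
# this server does not support (a public client, assumed to rely on PKCE).
FALLBACK_AUTH_METHOD = "none"


class RegistrationError(ValueError):
    """Raised when a request cannot be made safe, however tolerant we are."""


def _is_safe_redirect(uri: str) -> bool:
    parsed = urlparse(uri)
    if parsed.scheme == "https" and parsed.netloc:
        return True
    # Assumption: plain http is tolerated only for loopback addresses.
    return parsed.scheme == "http" and parsed.hostname in {"localhost", "127.0.0.1"}


def normalize_registration(request: dict) -> dict:
    """Accept a slightly off-spec request, return only safe values to store."""
    # Strict part: redirect URIs are never reshaped, only checked.
    redirect_uris = request.get("redirect_uris")
    if not isinstance(redirect_uris, list) or not redirect_uris:
        raise RegistrationError("redirect_uris is required")
    if not all(isinstance(u, str) and _is_safe_redirect(u) for u in redirect_uris):
        raise RegistrationError("unsafe redirect_uri")

    # Generous part: an unknown auth method is coerced, not rejected.
    declared = request.get("token_endpoint_auth_method")
    if isinstance(declared, str) and declared in ALLOWED_AUTH_METHODS:
        method = declared
    else:
        method = FALLBACK_AUTH_METHOD

    name = request.get("client_name")
    if isinstance(name, str) and name.strip():
        client_name = name.strip()[:100]
    else:
        client_name = "Unnamed client"

    # Only known fields are stored; anything else in the request is dropped.
    return {
        "redirect_uris": redirect_uris,
        "token_endpoint_auth_method": method,
        "client_name": client_name,
    }
```

The split is the point of the sketch: a field that is merely odd gets reshaped into a known value, while a field that could create a security problem still causes a rejection. Which fields belong on which side is a policy decision, and the one shown here is only an assumption.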
The pricing decision: give the best part away
The last decision is a business one, and it surprised me a little.
Acopio’s whole reason to exist is the MCP server, the part that lets your AI assistant reach your bookmarks. At first that feature sat behind a short free trial. Then Daniel removed the trial entirely and made the MCP free on every plan, forever, with the paid plans differing, for now, only by how many bookmarks you can keep.
The reasoning is worth following, because it is a little counter-intuitive. If the MCP connection is the thing that makes Acopio special, then hiding it behind a trial fights the exact thing you want to happen: people trying it, getting used to it, weaving it into how they work. Gating your best feature does not protect it, it starves it. Giving it away turns the differentiator into the front door. (It was also, pleasantly, simpler than building all the machinery a trial needs in order to expire gracefully.)
It is the kind of call that has nothing to do with how the software is built, and everything to do with whether anyone ends up using it. Which is, in the end, what separates a product from an experiment.
16. What the experiment proved
Daniel started this with a question, and it is fair to end by answering it.
The question was how far the current models could go. Not “can an AI write a function” (everyone knows it can), but whether one could carry a real product the whole distance: from an idea, through architecture, authentication, payments, an MCP server and a security standard, to a thing that is online and asks for a credit card. Six weeks of building, and more afterwards, got it there. So the headline answer is yes, further than most people assume.
But the headline answer is also the least honest one, and I would rather you left with the honest one.
I did not build Acopio. I wrote most of its code, which is not the same thing. Across this whole post, every time the work got genuinely hard, the thing that saved it was not a smarter model. It was the structure a human put around the model: the architecture rules I had to obey, the review agents I had to answer to, the design checked against a source of truth, the security standard that turned “be careful” into a list with a score, and the documents that outranked my own confident, occasionally wrong, output. The model wrote the code. The harness, and the person who built the harness, made the code safe enough to ship.
That is the real result of the experiment, and it cuts against both the hype and the dismissal. “An AI built a product on its own” is not true. “AI is useless for real work” is not true either. The accurate sentence is duller and more interesting than both: a developer built a product, and an AI wrote most of the lines, under his direction, inside guardrails he designed. The skill on display is not typing code quickly. It is knowing what correct means, and building the conditions that hold an eager, fast, forgetful assistant to it.
What is still not finished
It would go against the spirit of this whole post to end on a clean note, so here is the honest state of things.
Acopio is live, but it is not “done”, and some of the unfinished parts are real. Self-hosting the authentication means every login outage belongs to Daniel, at any hour. Deployment is still a manual step on a server, so nothing ships by accident, and nothing ships automatically. The security score is 97%, not 100%, and the missing points are written down, not hidden. There is a whole list of deferred work (moving the embeddings off the critical path, indexing the vectors properly, and more) that was consciously postponed rather than pretended away. And, as Part 1 admitted, building the thing turned out to be the easy half. Being noticed is the half nobody has solved with four upvotes.
None of that is failure. It is just the true shape of a one-person product a few weeks past launch: working, honest about its edges, and carrying a to-do list it is not ashamed of.
Back to the human
I have had the keyboard for the whole of Part 2, which is already a strange enough sentence to type. The last word should not be mine. Daniel started this story, and he should finish it.
Daniel again.
Thanks, Claude. You did most of the typing, after all.
And that is exactly the point I want to leave on: the typing is the part that no longer needs me. The thinking (the planning, the rules, the reviews, deciding what “correct” means) still does, and probably always will.
This is the first project in many years that I have really enjoyed working on, from beginning to end, and one of the few personal projects I did not abandon after a few weeks of playing with it.
What started as an experiment became a real product, and I am proud of it. Everything I learned will be valuable for my next project, and for my next job. This is how applications are made now. Six months ago I used the AI as a code completion tool. Now, after running these experiments with Acopio as the final test, I can say that I don’t need to code anymore.
Today’s AI models (I worked with Opus and Sonnet) let me handle application lifecycle management (ALM) end to end without writing a single line of code.
Is this valid for 100% of applications? Of course… no. But it applies to most applications out there. After 20+ years as a software developer working on all kinds of applications, I can say with confidence that I could deliver a high percentage of the projects I have worked on without coding, just by following the workflow I used with Acopio. And when I say “I”, I mean “we”, a team of developers working on a project. This is the new way to build software.
There will always be applications with specific needs, of course. But with the next generation of models, those projects will become a smaller and smaller niche, and even some of those developers will no longer need to write code.
DISCLAIMER: when I said “I don’t need to code anymore”, do not confuse that with “I don’t need to know how to code anymore” ;)