I scanned an AI-built invoice app. The customer sets their own price.
I run a security pipeline that tears apart AI-generated web apps. This week I pointed it at an invoice and checkout application built entirely by an AI app builder, the kind of thing that goes from idea to a live, payment-taking URL in an afternoon.
It found something that should stop anyone shipping commerce this way: the customer controls the price.
The amount the app charged was read straight from the browser’s request. Change one number before it reaches the server and you pay one dollar for a thousand-dollar invoice. The server never checked what the invoice was actually worth. It charged whatever the client told it to.
That alone is bad. It got worse the further the scan looked.
The whole money path trusts the client
Three findings stacked on top of each other, and together they describe an application that takes the customer’s word for everything that matters.
The price comes from the client. The charge amount is pulled from the request body instead of from authoritative server-side records. Whatever the browser sends is what gets billed.
Payment is never verified server-side. There is no signed webhook from the payment processor confirming the charge actually went through. Instead the app decides whether you paid by polling an endpoint driven by the browser. In other words, the client tells the server “I paid,” and the server believes it.
Payment events are not deduplicated. The same payment confirmation can be recorded more than once, so a single confirmation can be replayed to record a payment again.
Put those together and the picture is stark. A customer can set their own price, tell the server they have paid without paying, and replay that confirmation. The entire flow that moves money is built on trusting the one party you cannot trust: the browser.
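To make the three findings concrete, here is a minimal sketch of the pattern. It is a hypothetical reconstruction, not the scanned app's code: the handler name, the field names, and the in-memory arrays standing in for database tables are all my assumptions.

```javascript
// Hypothetical reconstruction of the vulnerable pattern; not the scanned app's code.
const payments = []; // stand-in for a payments table

// The server knows invoice INV-1 is worth $1,000.00 (amounts in cents).
const invoices = { "INV-1": { amountDue: 100000 } };

function handleCheckout(body) {
  // Finding 1: the amount comes from the request body, not from `invoices`.
  // Finding 2: "paid" is whatever the client reports.
  // Finding 3: nothing stops the same confirmation being recorded twice.
  payments.push({
    invoiceId: body.invoiceId,
    amount: body.amount,
    paid: body.status === "paid",
  });
  return payments.length;
}

// A tampered request: one dollar for a thousand-dollar invoice, sent twice.
const tampered = { invoiceId: "INV-1", amount: 100, status: "paid" };
handleCheckout(tampered);
handleCheckout(tampered);
```

After those two calls the payments list holds two "paid" records of 100 cents each against an invoice the server knows is worth 100000 cents. Nothing in the handler ever consulted that figure.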
Why a code scanner does not catch this
Here is the part that matters most, and it is the reason I keep doing this research.
This is not the kind of bug a pattern scanner finds. There is no dangerous function call. No eval, no SQL built from string concatenation, no hardcoded key sitting in plain sight. The code is valid. It reads cleanly. A tool like Semgrep, which looks for known-bad shapes, has nothing to grab onto, because nothing here matches a known-bad shape.
The flaw is not in the syntax. It is in the trust boundary. The code takes a number that should have come from the server and accepts it from the client instead. There is no rule for “this value came from the wrong source.” Catching it requires a review that actually reasons about where each value originates and whether the server should believe it.
That is exactly what the reasoning layer in my pipeline is for. The pattern scan runs first and catches the mechanical stuff. The reasoning layer runs on top and catches the logic: who controls this number, is this amount authoritative, does the server verify what the client claims. In AI-generated apps, the second category is where the real damage lives, and it is invisible to the first.
Why AI builders produce this
This is not one bad tool, and it is not bad luck. It is structural, and once you see it, you see it everywhere:
The model optimizes for “checkout works.” On the happy path, the price flows from the page to the charge and the payment succeeds. The feature demos perfectly. Nobody in a demo tries to change the price, because there is no adversary in a demo.
Server-side authority is invisible plumbing. “The price must come from a record the customer cannot edit” is not a feature you can see on screen. It produces no visible difference when it is missing. So it gets left out.
Nothing turns red. The build passes, the page loads, a test charge goes through. Every signal in the normal workflow says success. None of them say “a hostile customer could rewrite this.”
What the fix looks like
The repair is real work, not a one-liner, and that is worth being honest about.
Take the price from the server. The charge amount must come from an authoritative record keyed by something the customer cannot tamper with, never from the request body.
// Illustrative, not the actual code
// Wrong: the browser decides the price
const amountFromClient = req.body.amount;
// Right: the server looks up the real amount from its own records
const invoice = await db.getInvoice(invoiceId);
const amount = invoice.amountDue;
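A slightly fuller version of the same idea, as a runnable sketch. The `invoiceStore` map and the `resolveChargeAmount` helper are hypothetical names standing in for a real database and a real lookup. The point is only that the request contributes an identifier and nothing else.

```javascript
// Hypothetical in-memory store standing in for the database.
const invoiceStore = new Map([
  ["INV-1", { amountDue: 100000, currency: "usd" }], // amounts in cents
]);

function resolveChargeAmount(requestBody) {
  // Only the invoice identifier is taken from the request.
  const invoice = invoiceStore.get(requestBody.invoiceId);
  if (!invoice) {
    throw new Error("Unknown invoice");
  }
  // Any `amount` field the client sent is deliberately ignored.
  return invoice.amountDue;
}

// The tampered request from earlier no longer matters:
const charged = resolveChargeAmount({ invoiceId: "INV-1", amount: 100 });
// charged === 100000
```

A real handler would also need to check that the authenticated customer is allowed to pay that particular invoice. This sketch leaves that out.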
Verify payment with a signed webhook. Trust the payment processor’s signed event that the charge completed, not a status the browser reports. And verify the signature so the event cannot be forged.
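As a rough sketch of what signature verification involves, assume the processor signs the raw request body with HMAC-SHA256 and sends the hex digest in a header. That scheme, the secret, and the function names below are illustrative assumptions. Real processors define their own header format (often including a timestamp), so follow their documentation or use their SDK rather than this.

```javascript
const crypto = require("crypto");

// Hypothetical shared secret; a real one is issued by the payment processor.
const WEBHOOK_SECRET = "example-secret-not-real";

// Compute the hex HMAC-SHA256 of the raw body (what the sender would do).
function sign(rawBody, secret) {
  return crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Recompute the signature server-side and compare in constant time.
function isValidSignature(rawBody, signatureHex, secret) {
  const expected = Buffer.from(sign(rawBody, secret), "hex");
  const received = Buffer.from(String(signatureHex), "hex");
  // timingSafeEqual throws on length mismatch, so check length first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(expected, received);
}

const rawBody = JSON.stringify({ id: "evt_1", type: "payment.succeeded", invoiceId: "INV-1" });
const signature = sign(rawBody, WEBHOOK_SECRET);

const genuine = isValidSignature(rawBody, signature, WEBHOOK_SECRET); // true
const forged = isValidSignature(rawBody.replace("INV-1", "INV-2"), signature, WEBHOOK_SECRET); // false
```

One detail that trips people up: the check has to run against the raw bytes of the request, not a re-serialized copy of the parsed JSON, or the signatures will not match.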
Reject duplicates. Track processed event IDs so the same confirmation cannot be replayed to record a payment twice.
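A minimal sketch of that deduplication, assuming each processor event carries a unique `id`. The `Set` and array here are in-memory stand-ins I made up for illustration. A real app would persist the IDs, typically behind a unique constraint, so the check survives restarts and concurrent requests.

```javascript
// In-memory stand-ins; a real app would persist these.
const processedEventIds = new Set();
const recordedPayments = [];

// Returns true if the event was recorded, false if it was a replay.
function recordPaymentOnce(event) {
  if (processedEventIds.has(event.id)) {
    return false; // duplicate: already handled, do nothing
  }
  processedEventIds.add(event.id);
  recordedPayments.push({ invoiceId: event.invoiceId, eventId: event.id });
  return true;
}

const event = { id: "evt_1", invoiceId: "INV-1" };
const first = recordPaymentOnce(event);  // true
const replay = recordPaymentOnce(event); // false
```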
None of this is conceptually hard. It simply was never done, because the app worked without it.
What this means if you ship AI-built commerce
If you have built anything that takes money with an AI app builder, the honest assumption is that some part of your money path trusts the client. Not because you did anything wrong, but because the tools generate this pattern by default and nothing in the workflow flags it.
The takeaway is not “stop using AI builders.” They are genuinely useful. The takeaway is that “checkout works” and “checkout is safe to expose to paying strangers” are two different statements, and the gap between them is exactly where this lives.
Two things worth doing today. First, trace where your charge amount comes from; if it originates anywhere the customer can edit, that is your hole. Second, confirm that payment success is verified by a signed event from your processor, not by anything the browser tells you.
Scan your app
I built a free scanner that runs these checks on AI-generated apps. Send me an email (traef@wewatchyourwebsite.com) and let’s chat about your app.
How much of your checkout still trusts the browser? That is the question worth sitting with.
