How GPTBot Drained My Vercel Usage
The hidden cost of AI crawlers: When Fast Origin Transfer hits 100%
It started with a notification I didn't expect to see.
"You have used 100% of your Fast Origin Transfer limit."
I host this space on Vercel. Usually, my usage is well within the limits of the Hobby plan. I serve text, some images, and simple pages. It's efficient.
But suddenly, my account was redlining.
I dove into the monitoring dashboard, looking for the culprit. Was it a DDoS attack? A viral post? A recursive loop in my code?
No. It was this:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)
The Culprit: GPTBot
OpenAI's crawler, GPTBot, had decided to index my site. And it didn't just "visit." It seemingly tried to ingest everything, all at once, aggressively.
The result? It maxed out my Fast Origin Transfer.
What is Fast Origin Transfer?
On Vercel, "Fast Origin Transfer" typically refers to the bandwidth used to fetch content from your origin server (or the compute functions generating your pages) to Vercel's Edge Network.
If you serve a lot of Server-Side Rendered (SSR) pages, or if your cache hit ratio drops because a bot is requesting thousands of old, obscure URLs that nobody has visited in months (and therefore aren't in the edge cache), Vercel has to go back to the origin, or execute the function, for every single request.
For a normal user, this is negligible. For a bot crawling thousands of pages at high speed? It adds up instantly.
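To make the mechanism concrete, here is a minimal sketch assuming a Next.js Pages Router site on Vercel (the route and the header values are illustrative, not what I actually run): an SSR page can tell the Edge Network to cache its rendered output, so repeat hits on the same URL become cache hits instead of fresh function runs and more origin transfer.

// pages/posts/[slug].tsx -- illustrative SSR route
import type { GetServerSideProps } from 'next';

type Props = { slug: string };

export const getServerSideProps: GetServerSideProps<Props> = async ({ res, params }) => {
  // Let Vercel's edge cache keep the rendered HTML for a day and serve it
  // stale while revalidating, so repeated requests for this URL don't each
  // invoke the function again (which is what counts as origin transfer).
  res.setHeader('Cache-Control', 'public, s-maxage=86400, stale-while-revalidate=59');

  return { props: { slug: String(params?.slug ?? '') } };
};

export default function Post({ slug }: Props) {
  return <h1>{slug}</h1>;
}

The catch, of course, is exactly the scenario above: a crawler sweeping thousands of long-forgotten URLs is mostly hitting pages that aren't in the cache yet, so every one of those requests still goes back to the origin.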
The Irony of the "Success"
The logs showed a sea of 200 OK responses.
From a technical standpoint, my site performed perfectly. It served every request the bot threw at it. It scaled up, handled the traffic, and delivered the content.
But that "success" meant my infrastructure quota was being burned to fuel a dataset for a model I don't own, for a company worth billions, while I'm here worrying about my personal site's limits.
The Fix: Banning the Bot
If you want to stop your personal project from becoming collateral damage in the AI arms race, you need to update your robots.txt.
Here is how to tell GPTBot to go away:
User-agent: GPTBot
Disallow: /
You can also block other common AI crawlers while you're at it:
User-agent: GPTBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Google-Extended
Disallow: /
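Keep in mind that robots.txt is only a request: well-behaved crawlers honor it, but nothing forces them to. If you want a harder stop on Vercel, one option is to reject these user agents before they ever reach your functions. Here is a minimal sketch, assuming a Next.js project with Edge Middleware; the bot list and the 403 response are my choices, not an official recipe. (Google-Extended is a robots.txt-only token and never appears as a user agent, so it isn't in the list.)

// middleware.ts -- runs on Vercel's Edge, before any function is invoked
import { NextRequest, NextResponse } from 'next/server';

// AI crawlers that identify themselves in the User-Agent header.
const BLOCKED_BOTS = ['GPTBot', 'CCBot'];

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') ?? '';

  // Refuse the request at the edge so it never reaches the origin or a
  // function, which is what actually burns Fast Origin Transfer.
  if (BLOCKED_BOTS.some((bot) => userAgent.includes(bot))) {
    return new NextResponse('Forbidden', { status: 403 });
  }

  return NextResponse.next();
}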
Conclusion
We are currently in a weird phase of the web. We publish content to share with humans, but increasingly, our "readers" are machines vacuuming up data.
If you're self-hosting or on a usage-limited plan with a platform like Vercel, Netlify, or AWS, keep an eye on your logs. Your "viral" traffic spike might just be a hungry model looking for training data, and it will leave you with the bill.
