Vercel AI Gateway: why I stopped calling the Anthropic SDK directly
I switched production calls from @ai-sdk/anthropic to model strings via Vercel AI Gateway. Provider-agnostic fallback, OIDC auth without API keys, $5/day cap. Before/after streamText snippets.
For half a year I called Anthropic via @ai-sdk/anthropic in every production deploy. Standard pattern: npm install @ai-sdk/anthropic, drop ANTHROPIC_API_KEY into Vercel ENV, plug it into streamText({ model: anthropic('claude-sonnet-4-6') }), done. Works. Until the day Anthropic went down for an hour with 529 errors and I had nowhere to fail over. That's when I migrated to Vercel AI Gateway and I'm not going back.
What AI Gateway is and what it solves
Vercel AI Gateway is a proxy for LLM providers that sits between your app and Anthropic/OpenAI/Google/xAI. Instead of calling an SDK directly, you call models via the Gateway endpoint. That gets you four things that are hard to build yourself:
- Provider-agnostic model strings - 'anthropic/claude-sonnet-4-6', 'openai/gpt-5.1', 'google/gemini-3'. No import { anthropic } from '@ai-sdk/anthropic'. Same code, different string.
- Fallback chain - when the primary provider returns 5xx or 429, Gateway switches to a secondary inside the same request. The client only sees the successful stream.
- Centralized observability - every request in one dashboard, breakdown by model, project, route. Without bolting on Datadog.
- OIDC auth on Vercel - Vercel deploy signs the request, Gateway verifies. No API key in ENV.
Before: direct Anthropic SDK
This was my pattern in DokladBot, Maruška, customer support bots.
```ts
import { anthropic } from '@ai-sdk/anthropic';
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: anthropic('claude-sonnet-4-6'),
    system: 'You are a helpful assistant.',
    messages,
    maxOutputTokens: 2048,
  });

  return result.toUIMessageStreamResponse();
}
```

Looks fine. Three problems:
- API key in ENV. When I rotate the key (regularly, for leak prevention), I have to redeploy. Two-minute pause for production.
- Single provider. Anthropic 529 = my endpoint returns 500. No fallback.
- No per-project usage tracking. Anthropic Console gives me total spend on the key, no per-route breakdown.
After: AI Gateway via model string
```ts
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: 'anthropic/claude-sonnet-4-6',
    system: 'You are a helpful assistant.',
    messages,
    maxOutputTokens: 2048,
  });

  return result.toUIMessageStreamResponse();
}
```

That string instead of an import is not cosmetic. AI SDK v6 detects the provider/model format and routes through Gateway. No @ai-sdk/anthropic in package.json. No ANTHROPIC_API_KEY in ENV. When the deploy runs on Vercel and is linked to the AI Gateway integration, the OIDC token attaches automatically.
Local dev works the same via vercel env pull and vercel dev - the Gateway key is for local runtime, OIDC kicks in at deploy time.
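If you want to see exactly where the key comes from (CI, or a runtime outside vercel dev), you can construct the Gateway provider explicitly instead of relying on the default resolution. A minimal sketch, assuming the @ai-sdk/gateway package and its createGateway factory - verify the package name and env var against the AI SDK version you're on:

```ts
import { createGateway } from '@ai-sdk/gateway';
import { streamText } from 'ai';

// Explicit Gateway provider for environments without OIDC (CI, plain `next dev`).
// AI_GATEWAY_API_KEY comes from `vercel env pull` or your CI secret store.
const gateway = createGateway({
  apiKey: process.env.AI_GATEWAY_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    // Same provider/model string, just resolved through the explicit instance.
    model: gateway('anthropic/claude-sonnet-4-6'),
    messages,
  });

  return result.toUIMessageStreamResponse();
}
```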
Fallback chain in 2 lines
This is the main reason I'm not going back. AI SDK v6 supports it directly in options:
```ts
import { streamText } from 'ai';

const result = streamText({
  model: 'anthropic/claude-sonnet-4-6',
  providerOptions: {
    gateway: {
      order: ['anthropic', 'openai'],
    },
  },
  messages,
});
```

order: ['anthropic', 'openai'] means: try Anthropic, on fail jump to the OpenAI equivalent. Gateway internally maps claude-sonnet-4-6 → gpt-5.1 (or whatever you configure as equivalent).
Real story: during a demo deploy of Maruška for a client, Anthropic dropped for 47 minutes with 529. Gateway switched to OpenAI fallback, the demo finished, the client never noticed. Without this I would have had to interrupt the demo.
Cost cap and observability
Vercel dashboard has per-project budget for Gateway. I set $5/day soft cap, $20/day hard cap on every project. Soft cap exceeded → I get a notification. Hard cap = Gateway returns 402, and I know there's a leak somewhere.
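Worth handling the hard-cap case explicitly so users see something better than a broken stream. A minimal sketch, assuming the 402 surfaces as an error with a statusCode you can inspect - the exact error shape depends on the AI SDK version, so treat the check as illustrative:

```ts
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  try {
    const result = streamText({
      model: 'anthropic/claude-sonnet-4-6',
      messages,
      // streamText reports most failures during streaming rather than throwing,
      // so log them here as well.
      onError: ({ error }) => console.error('gateway error', error),
    });

    return result.toUIMessageStreamResponse();
  } catch (error) {
    // Illustrative check: treat a 402 from the Gateway as "daily budget exhausted".
    if ((error as { statusCode?: number }).statusCode === 402) {
      return new Response('AI budget exhausted for today, try again tomorrow.', {
        status: 503,
      });
    }
    throw error;
  }
}
```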
| Before (direct SDK) | After (AI Gateway) |
|---|---|
| API key per provider in ENV | No key on Vercel deploy (OIDC) |
| Single provider per request | Configurable fallback chain |
| Anthropic Console = only usage view | Per-project, per-route breakdown |
| Provider package per model | 'provider/model' string |
| Provider migration = code change + redeploy | Migration = config change |
| Cost cap = roll your own alerting | Built-in soft/hard cap |
Edge cases you might hit
1. Streaming compatibility. AI Gateway fully supports the streamText SSE flow. No extra buffering, < 50ms overhead vs direct SDK call. I measured p95 240ms (gateway) vs 215ms (direct) on a typical 800-token response.
2. Tool calling across providers. anthropic/claude-sonnet-4-6 and openai/gpt-5.1 have different tool schemas, but AI SDK v6 normalizes them. You define tools once, fallback works - see the sketch after this list.
3. Local dev. If you don't use vercel dev, you'll need AI_GATEWAY_API_KEY in .env.local. For CI/CD pipelines that's an extra secret, but only one - instead of per-provider keys.
4. Pricing. Gateway doesn't add markup. You pay provider price plus a minimal observability fee. For me roughly $0.50/month across all projects.
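For edge case 2, the point is that the tool is described once and the SDK translates it into each provider's schema. A minimal sketch, assuming the zod-based tool() helper from the ai package; the tool name and lookup are made up, and the schema property has been renamed across SDK majors (parameters vs inputSchema), so match it to your version:

```ts
import { streamText, tool } from 'ai';
import { z } from 'zod';

// Defined once; the SDK serializes it into each provider's tool format,
// so the fallback chain keeps working without provider-specific schemas.
const tools = {
  getInvoiceStatus: tool({
    description: 'Look up the status of an invoice by its number.',
    // Older SDK majors call this property `parameters`.
    inputSchema: z.object({
      invoiceNumber: z.string().describe('Invoice number, e.g. 2025-0042'),
    }),
    // Hypothetical lookup - replace with your data source.
    execute: async ({ invoiceNumber }) => ({ invoiceNumber, status: 'paid' }),
  }),
};

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: 'anthropic/claude-sonnet-4-6',
    providerOptions: { gateway: { order: ['anthropic', 'openai'] } },
    tools,
    messages,
  });

  return result.toUIMessageStreamResponse();
}
```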
Migrating an existing project
In Maruška the migration took 9 minutes:
```bash
# 1. Drop the provider package
pnpm remove @ai-sdk/anthropic

# 2. Find every anthropic('...') call
grep -rn "anthropic('" src/

# 3. Replace with model string
sed -i "s/anthropic('claude-sonnet-4-6')/'anthropic\/claude-sonnet-4-6'/g" src/**/*.ts

# 4. Remove ANTHROPIC_API_KEY from Vercel ENV
vercel env rm ANTHROPIC_API_KEY production
```

Plus a PR with a diff that looks trivial. Code review was more about whether it makes sense than about correctness.
Lessons
- Model string > SDK import is the right default for AI features on Vercel. A provider package per model is technical debt.
- Fallback chain saves demo presentations. At least once a year your primary provider dies on the day you have an important call.
- OIDC auth = less secret rotation work. No API key in ENV means nothing to compromise.
- Set the cost cap before the first deploy. I once left an app with a buggy retry loop running overnight - without the cap it would have been five figures.
- AI SDK v6 ships a compatible migration path. You can run hybrid (some routes on Gateway, some direct) during migration.
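The hybrid setup from that last lesson is just both styles living side by side while you migrate route by route. A sketch under illustrative file names - one route already on the Gateway string, one still on the direct SDK:

```ts
// app/api/chat/route.ts - already migrated: Gateway model string, no provider package
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = streamText({ model: 'anthropic/claude-sonnet-4-6', messages });
  return result.toUIMessageStreamResponse();
}
```

```ts
// app/api/summarize/route.ts - not migrated yet: still on the direct SDK,
// which keeps ANTHROPIC_API_KEY in ENV until this route moves over.
import { anthropic } from '@ai-sdk/anthropic';
import { generateText } from 'ai';

export async function POST(req: Request) {
  const { text } = await req.json();

  const { text: summary } = await generateText({
    model: anthropic('claude-sonnet-4-6'),
    prompt: `Summarize the following text:\n\n${text}`,
  });

  return Response.json({ summary });
}
```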
What's next
- DokladBot case study → - the project where Gateway dropped operational pain to zero
- Claude Code workflow → - how Claude Code reviews these migrations
- Multi-tenant Postgres → - another "do it right once" pattern
If you're migrating from direct SDK to Gateway on your own project, drop me a line. Most migrations are under 30 minutes.