Caching Strategies for Vibecoded Apps

Most vibecoded apps either cache nothing (slow, expensive) or cache everything (stale, buggy). Here's how to pick the right layer for each kind of data.

Ask an AI assistant to "make the API faster" and you'll often get a cache slapped onto whatever function happened to be slow, with no thought about invalidation. Six requests later, users are seeing yesterday's data. Caching is one of the few areas where doing nothing is often safer than doing it wrong — so it's worth understanding the actual decision tree instead of reaching for the first pattern that compiles.

Start by asking: how often does this data change, and who cares if it's stale?

That single question decides which of the four layers below you need. Most apps only need one or two of them.

Layer 1: HTTP cache headers (free, and the first thing to reach for)

If a response is served over HTTP and doesn't depend on the requesting user, Cache-Control headers let the browser and any CDN in front of your app skip the request entirely on repeat visits.

Cache-Control: public, max-age=3600, stale-while-revalidate=86400
  • max-age=3600: serve straight from cache for an hour, with no request to your server at all.
  • stale-while-revalidate=86400: for up to a day after that hour, serve the stale copy immediately while re-fetching in the background, so users don't wait on revalidation. Once that window has also passed, the next request waits for a fresh response.

This is the right layer for: public marketing pages, blog/guide content (exactly what powers this site's own /blog and /guides), and API responses that are the same for everyone (a public leaderboard, a status page).

It is the wrong layer for: anything gated by auth, anything that must reflect a write the user just made.
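To make the split between "right layer" and "wrong layer" concrete, here is a minimal sketch of a helper that produces the header value for each case. The `CachePolicy` type and `cacheControlHeader` function are hypothetical names invented for this illustration, not part of any framework, and using `private, no-store` for auth-gated responses is an assumption about what such data typically needs.

```typescript
// Hypothetical helper: the type and option names are illustrative only.
type CachePolicy =
  | { kind: "public"; maxAgeSeconds: number; staleWhileRevalidateSeconds?: number }
  | { kind: "private-no-store" };

function cacheControlHeader(policy: CachePolicy): string {
  if (policy.kind === "private-no-store") {
    // Auth-gated or just-written data: keep it out of browser and CDN caches.
    return "private, no-store";
  }
  const parts = ["public", `max-age=${policy.maxAgeSeconds}`];
  if (policy.staleWhileRevalidateSeconds !== undefined) {
    parts.push(`stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`);
  }
  return parts.join(", ");
}

// Public guide content: fresh for an hour, then stale-while-revalidate for a day.
const guideHeader = cacheControlHeader({
  kind: "public",
  maxAgeSeconds: 3600,
  staleWhileRevalidateSeconds: 86400,
});

// Per-user data behind auth.
const profileHeader = cacheControlHeader({ kind: "private-no-store" });

console.log(guideHeader);
console.log(profileHeader);
```

The resulting string goes into the response's Cache-Control header using whatever mechanism your framework provides for setting headers.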

Layer 2: In-memory cache (per-server-process, cheap, resets on deploy)

For data that's expensive to compute but doesn't need to survive a restart — a parsed config, a rate-limit counter, a small lookup table — a plain in-memory Map with a TTL is often all you need. No Redis, no extra infrastructure, no network round-trip.

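// Process-local cache: key -> value plus an absolute expiry time (ms since epoch).
const cache = new Map<string, { value: unknown; expires: number }>();

// Return the cached value if it is still fresh; otherwise compute, store, and return it.
// Expired entries are only overwritten on the next read of the same key, never evicted.
function getCached<T>(key: string, ttlMs: number, compute: () => T): T {
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.value as T;
  const value = compute();
  cache.set(key, { value, expires: Date.now() + ttlMs });
  return value;
}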

The catch: this cache is local to one server process. On a multi-instance deploy, each instance has its own copy, so different users can see different cached values until each instance's TTL expires independently. Fine for rate-limit buckets and computed constants; not fine for anything that needs a single source of truth across instances.
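The divergence described above is easier to see in a small simulation. The sketch below is an illustration, not production code: `createInstanceCache` is a hypothetical factory where each call stands in for one server process, `toolCount` is a made-up source of truth, and the clock is passed in as a `now` argument (an assumption made purely so the timeline is deterministic).

```typescript
type Entry = { value: unknown; expires: number };

// Each call stands in for one server process with its own private Map.
function createInstanceCache() {
  const cache = new Map<string, Entry>();
  return function getCached<T>(key: string, ttlMs: number, now: number, compute: () => T): T {
    const hit = cache.get(key);
    if (hit && hit.expires > now) return hit.value as T;
    const value = compute();
    cache.set(key, { value, expires: now + ttlMs });
    return value;
  };
}

// Hypothetical source of truth that changes between requests.
let toolCount = 10;
const readToolCount = () => toolCount;

const instanceA = createInstanceCache();
const instanceB = createInstanceCache();

// t=0ms: a request lands on instance A, which caches the value for 60 seconds.
const firstOnA = instanceA("toolCount", 60_000, 0, readToolCount);

// The underlying data changes.
toolCount = 11;

// t=1000ms: a request lands on instance B, whose cache is cold, so it computes.
const firstOnB = instanceB("toolCount", 60_000, 1_000, readToolCount);

// t=2000ms: instance A is still inside its TTL and keeps serving its old copy.
const secondOnA = instanceA("toolCount", 60_000, 2_000, readToolCount);

// t=61000ms: A's entry has expired, so it recomputes and catches up.
const thirdOnA = instanceA("toolCount", 60_000, 61_000, readToolCount);

console.log({ firstOnA, firstOnB, secondOnA, thirdOnA });
```

Between the second and fourth calls, two users routed to different instances would see different values for the same key, which is exactly the window the TTL bounds.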

Layer 3: Shared cache (Redis / Upstash / similar)

Once you have more than one server process and need every instance to see the same cached value (session data, a shared rate limiter, an expensive database aggregate), you need a cache that lives outside any single process. This is real infrastructure with its own failure modes: the cache going down shouldn't take your app down with it, so always keep a fallback path to the source of truth. Don't reach for it until the in-memory layer actually breaks down under multi-instance load.
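One way to keep that fallback path is to treat every cache error as a miss. The sketch below assumes a deliberately tiny `SharedCache` interface invented for this example; a real Redis or Upstash client has a different API and would need a thin adapter. `getWithFallback`, `makeFakeCache`, and `brokenCache` are likewise hypothetical names, and the in-memory fakes exist only to exercise the function.

```typescript
// Hypothetical minimal interface; a real Redis/Upstash client would be adapted to it.
interface SharedCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function getWithFallback<T>(
  cache: SharedCache,
  key: string,
  ttlSeconds: number,
  loadFromSource: () => Promise<T>,
): Promise<T> {
  try {
    const cached = await cache.get(key);
    if (cached !== null) return JSON.parse(cached) as T;
  } catch {
    // Cache unreachable or entry unparseable: treat it as a miss, don't fail the request.
  }

  const value = await loadFromSource();

  try {
    await cache.set(key, JSON.stringify(value), ttlSeconds);
  } catch {
    // Failing to populate the cache is not fatal; the caller still gets the value.
  }
  return value;
}

// In-memory stand-in for a healthy shared cache (ignores the TTL for brevity).
function makeFakeCache(): SharedCache {
  const store = new Map<string, string>();
  return {
    async get(key) {
      return store.get(key) ?? null;
    },
    async set(key, value) {
      store.set(key, value);
    },
  };
}

// Stand-in for a cache that is down: every call rejects.
const brokenCache: SharedCache = {
  async get() {
    throw new Error("cache down");
  },
  async set() {
    throw new Error("cache down");
  },
};
```

With `makeFakeCache()`, a second call for the same key is served from the cache without touching the source. With `brokenCache`, every call goes to the source, so the app gets slower during an outage but keeps working.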

Layer 4: Client-side cache (SWR / React Query)

For data fetched from the browser, a client cache with automatic revalidation (SWR, TanStack Query) solves a different problem: it stops your UI from re-fetching the same data on every component mount, and it deduplicates simultaneous requests for the same key. This is a UX optimization, not a server-load optimization — the request still hits your API on a cache miss, it just avoids redundant client-side refetches.

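import useSWR from "swr";
const fetcher = (url: string) => fetch(url).then((res) => res.json());

// Inside a React component or custom hook:
const { data, isLoading } = useSWR("/api/user/profile", fetcher, {
  revalidateOnFocus: false,
  dedupingInterval: 60_000, // treat repeated calls within 60s as one
});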

The invalidation problem — the part that actually matters

Caching is easy. Invalidating a cache correctly when the underlying data changes is where bugs live. Two patterns cover almost every case:

  • TTL-based (time-based expiry): simplest, and the right default for anything where a few minutes of staleness is acceptable (search results, aggregated stats, a homepage tool count).
  • Event-based (invalidate on write): required when staleness is user-visible and unacceptable. If a user edits their profile, the profile cache for that user must be cleared at the moment of the write, not on a timer. Miss this and you'll ship the "I just saved and it still shows the old value" bug.

If you're not sure which pattern a piece of data needs, default to a short TTL (30–60 seconds) rather than no cache at all — it absorbs traffic spikes without creating a "why is this stale for 10 minutes" support ticket.
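The two patterns combine naturally: a TTL as the safety net, plus an explicit delete on write. The following is a minimal sketch under stated assumptions: `profileDb` and `profileCache` are plain Maps standing in for a real database and cache, and `getProfile`, `updateDisplayName`, and the 60-second TTL are hypothetical choices for illustration.

```typescript
type Profile = { id: string; displayName: string };

// Hypothetical stand-ins: one Map as the database, one Map as the cache.
const profileDb = new Map<string, Profile>([["u1", { id: "u1", displayName: "Ada" }]]);
const profileCache = new Map<string, { value: Profile; expires: number }>();

const PROFILE_TTL_MS = 60_000;
const cacheKey = (userId: string) => `profile:${userId}`;

// Read path: TTL-based cache in front of the database.
function getProfile(userId: string): Profile | undefined {
  const hit = profileCache.get(cacheKey(userId));
  if (hit && hit.expires > Date.now()) return hit.value;
  const profile = profileDb.get(userId);
  if (profile) {
    profileCache.set(cacheKey(userId), { value: profile, expires: Date.now() + PROFILE_TTL_MS });
  }
  return profile;
}

// Write path: update the source of truth, then invalidate the cached copy.
function updateDisplayName(userId: string, displayName: string): void {
  const existing = profileDb.get(userId);
  if (!existing) throw new Error(`unknown user ${userId}`);
  profileDb.set(userId, { ...existing, displayName });
  // Without this delete, reads would keep returning the old name until the TTL expires.
  profileCache.delete(cacheKey(userId));
}

getProfile("u1"); // warms the cache with the original name
updateDisplayName("u1", "Ada Lovelace"); // write, then invalidate
const afterWrite = getProfile("u1"); // cache miss, so this re-reads the database

console.log(afterWrite?.displayName);
```

Deleting the key rather than writing the new value into the cache keeps the write path simple: the next read repopulates the cache from the source of truth.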

What to tell your AI assistant

When asking for a caching layer, specify all three of: what triggers invalidation, how long staleness is acceptable, and whether the cache needs to be shared across server instances. Without those constraints, you'll get a cache that works in the demo and misbehaves the first time two users hit it at once.
