Shipped

URL Shortener with Analytics

A link shortener that treats every redirect as an event — sub-10ms redirects on the hot path, click analytics rolled up asynchronously into a React dashboard.

Architecture diagram for URL Shortener with Analytics

Problem

The shortener itself is a solved problem — the interesting part is the tension it creates. A redirect must be as close to instant as possible (every millisecond is user-visible), but the whole reason to self-host one is the analytics: who clicked, when, from where, on what device. Recording analytics synchronously puts a database write on the hottest path in the system. The design goal was simple to state: the redirect must never wait for analytics.

Architecture

Two paths, deliberately unequal.

Hot path (redirect): GET /:code hits the Node service, checks an in-process LRU cache (codes are immutable once created, so cache invalidation is a non-problem), falls back to Postgres on a miss, and issues a 301. The click event — timestamp, code, referrer, coarse user-agent — is pushed onto an in-memory buffer and the response goes out. The buffer is flushed to a click_events table in batches (every second or 500 events, whichever comes first) by a background loop in the same process.
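A minimal sketch of the buffer-and-flush idea described above, not the project's actual code. The `ClickEvent` shape, the `BatchWriter` callback, and the class name are assumptions made for illustration; the writer stands in for whatever performs the batched insert into click_events.

```typescript
// Hypothetical event shape; field names are assumed from the description above.
interface ClickEvent {
  code: string;
  timestamp: number; // epoch milliseconds
  referrer: string | null;
  userAgent: string; // coarse class, e.g. "mobile" or "desktop"
}

// Persists one batch, e.g. a single multi-row INSERT into click_events.
type BatchWriter = (batch: ClickEvent[]) => Promise<void>;

class ClickBuffer {
  private events: ClickEvent[] = [];
  private timer: ReturnType<typeof setInterval> | undefined;
  private write: BatchWriter;
  private maxBatch: number;
  private intervalMs: number;

  constructor(write: BatchWriter, maxBatch = 500, intervalMs = 1000) {
    this.write = write;
    this.maxBatch = maxBatch;
    this.intervalMs = intervalMs;
  }

  // Called on the hot path: synchronous, never awaits the database.
  push(event: ClickEvent): void {
    this.events.push(event);
    if (this.events.length >= this.maxBatch) {
      void this.flush(); // size threshold reached before the timer fired
    }
  }

  // Swap the array out before writing so new clicks land in a fresh buffer.
  async flush(): Promise<void> {
    if (this.events.length === 0) return;
    const batch = this.events;
    this.events = [];
    try {
      await this.write(batch);
    } catch (err) {
      // Analytics are best-effort here: log and drop rather than stall redirects.
      console.error(`dropped ${batch.length} click events`, err);
    }
  }

  // Starts the time-based flush loop (idempotent).
  start(): void {
    if (this.timer === undefined) {
      this.timer = setInterval(() => void this.flush(), this.intervalMs);
    }
  }

  // Stops the loop and drains whatever is still buffered.
  async stop(): Promise<void> {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
    await this.flush();
  }
}
```

In this sketch the redirect handler would call `push` and return immediately, `start` would run once at boot, and `stop` would run on graceful shutdown. A hard crash still loses whatever is in `events`, which is the up-to-one-second loss window.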

Cold path (analytics): a rollup worker aggregates click_events into hourly buckets per link (clicks, unique referrers, device split) and writes them to a link_stats table. The React dashboard reads only from rollups; raw events are pruned after 90 days.

Short codes are 7-character base62 strings derived from a sequence with a random offset, which makes them hard enough to guess for this purpose and, because the sequence never repeats, removes the need for a collision-check loop.
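The hourly rollup on the cold path can be illustrated in application code as below. This is a sketch under assumptions: the real worker may well do this in SQL, and the `ClickEvent` and `HourlyStat` shapes and the `rollupHourly` name are hypothetical rather than the project's schema.

```typescript
// Hypothetical row shapes; field names are assumptions, not the real schema.
interface ClickEvent {
  code: string;
  timestamp: number; // epoch milliseconds
  referrer: string | null;
  userAgent: string; // coarse device class, e.g. "mobile" or "desktop"
}

interface HourlyStat {
  code: string;
  hourStart: number; // epoch milliseconds truncated to the hour (UTC)
  clicks: number;
  uniqueReferrers: number;
  devices: Record<string, number>; // device class -> click count
}

const HOUR_MS = 60 * 60 * 1000;

function rollupHourly(events: ClickEvent[]): HourlyStat[] {
  const buckets = new Map<string, { stat: HourlyStat; referrers: Set<string> }>();

  for (const e of events) {
    const hourStart = Math.floor(e.timestamp / HOUR_MS) * HOUR_MS;
    // Base62 codes never contain "|", so the composite key is unambiguous.
    const key = `${e.code}|${hourStart}`;

    let bucket = buckets.get(key);
    if (bucket === undefined) {
      bucket = {
        stat: { code: e.code, hourStart, clicks: 0, uniqueReferrers: 0, devices: {} },
        referrers: new Set<string>(),
      };
      buckets.set(key, bucket);
    }

    bucket.stat.clicks += 1;
    if (e.referrer !== null) bucket.referrers.add(e.referrer);
    bucket.stat.devices[e.userAgent] = (bucket.stat.devices[e.userAgent] ?? 0) + 1;
  }

  // Resolve the referrer sets into counts; one output row per (code, hour).
  return Array.from(buckets.values()).map(({ stat, referrers }) => ({
    ...stat,
    uniqueReferrers: referrers.size,
  }));
}
```

Each returned row corresponds to one link_stats record in this sketch. Because the dashboard reads only such rows, its queries scale with the number of links and hours rather than with the number of clicks.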

Everything ships as two containers (API + worker) plus Postgres, wired together with Docker Compose; the demo runs the API on a single small instance.

Trade-offs

  • Buffered writes over write-per-click. Batching cut p99 redirect latency from ~28 ms to ~6 ms in load tests, at the cost of an honesty window: a crash can lose up to a second of click data. For analytics — not billing — that's the right trade. I wrote it down in the README so future-me doesn't "fix" it.
  • Postgres over a dedicated OLAP store. ClickHouse would shrug at this volume, but it's another system to run. Hourly rollups keep dashboard queries on indexed, pre-aggregated rows; Postgres comfortably handles millions of raw events before this decision needs revisiting.
  • 301 over 302. Permanent redirects let browsers and CDNs cache the response, which is free speed but means cached repeat clicks never reach the server and go uncounted. I chose speed over complete counts; a per-link flag switches any link to a 302 when counting matters more.
  • In-process cache over Redis. One less moving part. The cost is a cold cache per instance after deploys — acceptable at this scale, and the Postgres fallback is still a single indexed lookup.
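The last two trade-offs can be sketched together: an in-process LRU cache in front of a database lookup, with the redirect status chosen per link. This is an illustrative sketch, not the project's implementation. `LinkRecord`, `countEveryClick` (standing in for the 302 flag), `LruCache`, `resolve`, and the `lookupInDb` callback are all hypothetical names.

```typescript
// Hypothetical record; `countEveryClick` stands in for the per-link 302 flag
// and is assumed to be fixed at creation time in this sketch.
interface LinkRecord {
  targetUrl: string;
  countEveryClick: boolean;
}

// Small LRU built on Map, which iterates keys in insertion order.
class LruCache<V> {
  private entries = new Map<string, V>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

// Stand-in for the single indexed Postgres lookup on a cache miss.
type Lookup = (code: string) => Promise<LinkRecord | undefined>;

interface Redirect {
  status: 301 | 302 | 404;
  location?: string;
}

async function resolve(
  code: string,
  cache: LruCache<LinkRecord>,
  lookupInDb: Lookup,
): Promise<Redirect> {
  let link = cache.get(code);
  if (link === undefined) {
    link = await lookupInDb(code);
    if (link === undefined) return { status: 404 };
    cache.set(code, link);
  }
  // 301 is cacheable downstream (fast, undercounts); 302 reaches the server every time.
  return { status: link.countEveryClick ? 302 : 301, location: link.targetUrl };
}
```

Because each process owns its own `LruCache` in this sketch, a fresh deploy starts cold and pays one database lookup per code until the cache warms up. If the flag could change after creation, the cached record would need to be evicted, which the sketch does not handle.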

What I'd do differently

The in-memory event buffer was the right call until the day I wanted a second API instance — then "flush loop per process" quietly became "N flush loops with N failure modes." Next iteration moves the buffer to a proper queue (which is exactly what pushed me to build the job queue project). I'd also capture coarse geo (country from a local MaxMind lookup) from day one; it's the first thing anyone asks of link analytics and it's painful to backfill from pruned events.