URL Shortener — Part 1: The Whole Thing in 173 Lines of Go

The smallest URL shortener that works end to end — hash-as-code, a map behind a mutex, HTMX instead of a framework — and the exact line where each of those three choices becomes the wrong one.

Part of project: URL Shortener in Go

Paste a long URL into a web page, get a short one back, click it, land on the original. That's the entire feature, and it's 173 lines of Go across three files — no database, no JavaScript framework, no dependencies past a single CDN script tag. It compiles to one ~11.7 MB binary that serves north of 19,000 requests a second on every route.

This is Part 1 of a series that builds a URL shortener and then breaks it on purpose — persistence, multiple instances, collisions, custom codes, analytics, one constraint at a time. Part 1's job is the smallest thing that genuinely works end to end. Three decisions define it, and they're connected: how codes get generated (hash the URL), where they're kept (a map in memory), and how the page updates (HTMX, no framework). Each is the simplest defensible choice. Each is the wrong choice for a version of this app that doesn't exist yet. That tension is the whole post.

The shape of it

There's no router library here. It's the standard library's http.ServeMux, which since Go 1.22 understands HTTP methods and path wildcards directly:

mux := http.NewServeMux()
mux.HandleFunc("GET /{$}", app.indexHandler)
mux.HandleFunc("POST /shorten", app.shortenHandler)
mux.HandleFunc("GET /{code}", app.redirectHandler)
mux.Handle("GET /styles/", http.StripPrefix("/styles/", http.FileServer(http.Dir("styles"))))

Four routes do everything. The {$} anchor on the first one means "exactly /", not "anything starting with /". POST /shorten takes the form submission. GET /{code} is the catch-all redirect: any single path segment that isn't a more specific route lands here. And /styles/ serves the one CSS file.

The matching subtlety is worth naming now, because it bites later: GET /{code} matches any one-segment path. Request /favicon.ico and you get short code not found (404) from the redirect handler, because there's no more specific route to claim it. The mux resolves by specificity, not declaration order, so registration order never matters: POST /shorten and everything under /styles/ are safe from the wildcard, but any other single-segment GET falls through to it. That includes a GET to /shorten, since the specific route only claims POST.

Handlers hang off an App struct so they share the store and the parsed templates without package globals:

type App struct {
	store *Store
	tmpl  *template.Template
}

Templates are parsed exactly once at startup with template.Must(template.ParseGlob("templates/*.html")). Must panics if any template is malformed, so a typo in the HTML crashes the process on boot instead of on the first unlucky request. Fail fast.

Generating codes: hash the URL

The most interesting 17 lines in the project:

const base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

func shorten(url string) string {
	h := fnv.New64a()
	h.Write([]byte(url))
	n := h.Sum64()

	if n == 0 {
		return string(base62[0])
	}
	var code []byte
	for n > 0 {
		code = append(code, base62[n%62])
		n /= 62
	}
	return string(code)
}

The standard move for a URL shortener is an auto-incrementing counter, base62-encoded — that's how you get short, dense codes (1, 2, …, a, b). But a counter needs shared state: every instance has to agree on the next number, which means a database sequence or a coordination service. I didn't want any of that in Part 1.

So instead: hash the URL itself. FNV-1a is a non-cryptographic hash that's in the standard library (hash/fnv), allocates little beyond the hasher and a byte copy of the URL, and is fast. Base62-encode the 64-bit result and you have a code. The property this buys is idempotency: the same URL always produces the same code, with zero stored state to consult. Shorten https://example.com a thousand times and you get the same code a thousand times.

Notice the codes come out long. A 64-bit number base62-encoded is up to 11 digits, and every hash value at or above 62^10 (roughly 95% of the 64-bit range) needs all 11, so most codes are 11 characters: ET5ZgfKuBE6, not abc123. They're also variable-length, and the loop emits the least-significant digit first, so the code is "reversed" relative to positional value. That last part is completely irrelevant, because lookup is an exact-match map read, not arithmetic. It just means you don't get the dense sequential strings a counter would give you.
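
The length claim is just arithmetic on powers of 62, and it can be checked alongside the idempotency property. The sketch below reproduces shorten so it runs on its own; pow62 and fractionWithElevenChars are hypothetical helpers added for the demonstration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
)

const base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// shorten is the function from the post, reproduced so this file is self-contained.
func shorten(url string) string {
	h := fnv.New64a()
	h.Write([]byte(url))
	n := h.Sum64()

	if n == 0 {
		return string(base62[0])
	}
	var code []byte
	for n > 0 {
		code = append(code, base62[n%62])
		n /= 62
	}
	return string(code)
}

// pow62 returns 62^k as a uint64. It is only valid for k <= 10,
// because 62^11 no longer fits in 64 bits.
func pow62(k int) uint64 {
	p := uint64(1)
	for i := 0; i < k; i++ {
		p *= 62
	}
	return p
}

// fractionWithElevenChars is the share of 64-bit values that are >= 62^10,
// i.e. the values whose base62 form needs all 11 digits.
func fractionWithElevenChars() float64 {
	return 1 - float64(pow62(10))/math.Pow(2, 64)
}

func main() {
	a := shorten("https://example.com")
	b := shorten("https://example.com")
	fmt.Println("same code both times:", a == b)
	fmt.Println("code length:", len(a))
	fmt.Println("62^10 =", pow62(10)) // 839299365868340224
	fmt.Printf("share of 11-char codes: %.3f\n", fractionWithElevenChars())
}
```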

Storing them: a map and a mutex

type Store struct {
	mu sync.RWMutex
	m  map[string]string
}

func (s *Store) Save(code, url string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[code] = url
}

func (s *Store) Get(code string) (url string, ok bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	url, ok = s.m[code]
	return
}

The mutex isn't optional decoration. Each request runs in its own goroutine, and an unsynchronized read and write against a plain map is a data race. When the runtime catches it, the process dies with fatal error: concurrent map read and map write. That's a hard crash, not a panic that recover can intercept. Putting the lock as a field right next to the map is deliberate: callers go through Save/Get and can't forget to hold it.
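
A quick way to convince yourself the lock does its job is to hit the store from many goroutines at once. This is a sketch under assumptions: NewStore and hammer are hypothetical names (the post doesn't show how the map is initialized), and Save/Get are copied from above.

```go
package main

import (
	"fmt"
	"sync"
)

type Store struct {
	mu sync.RWMutex
	m  map[string]string
}

// NewStore is a hypothetical constructor. The map has to be initialized
// somewhere before the first Save, because writing to a nil map panics.
func NewStore() *Store {
	return &Store{m: make(map[string]string)}
}

func (s *Store) Save(code, url string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[code] = url
}

func (s *Store) Get(code string) (url string, ok bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	url, ok = s.m[code]
	return
}

// hammer starts n writers and n readers at the same time and waits for all
// of them. Without the mutex this kind of access can crash the process.
func hammer(s *Store, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		code := fmt.Sprintf("code%d", i)
		wg.Add(2)
		go func() {
			defer wg.Done()
			s.Save(code, "https://example.com/"+code)
		}()
		go func() {
			defer wg.Done()
			s.Get(code) // may miss if the writer hasn't run yet; the point is that it's safe
		}()
	}
	wg.Wait()
}

func main() {
	s := NewStore()
	hammer(s, 1000)
	url, ok := s.Get("code42")
	fmt.Println(url, ok) // https://example.com/code42 true
}
```

Running it with go run -race is the stronger check: the race detector stays quiet with the lock in place and complains as soon as the Lock/RLock calls are removed.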

What this store doesn't do: survive a restart, or exist in more than one place. Bounce the process and every short link 404s. That's not an oversight; it's the line where Part 2 begins.

The frontend: HTMX, no framework

The page is one form. No React, no build step, no bundler. The dynamic behavior comes from three HTMX attributes:

<form hx-post="/shorten" hx-target="#result" hx-swap="innerHTML">
  <input
    type="url"
    name="url"
    placeholder="https://example.com/very/long/path"
    required
  />
  <button type="submit">Shorten</button>
</form>

<div id="result"></div>

hx-post="/shorten" submits the form over AJAX instead of a full navigation. hx-target="#result" and hx-swap="innerHTML" say: take whatever HTML the server sends back and drop it inside the #result div. The server doesn't return JSON for the client to render — it returns rendered HTML, a fragment:

<div class="result-card">
  <a class="short-link" href="/{{.Code}}" target="_blank" rel="noopener"
    >{{.ShortURL}}</a
  >
  <div class="original">&rarr; {{.OriginalURL}}</div>
</div>

The handler that produces it does one thing worth calling out — it reconstructs the full short URL, scheme included, so it's copy-paste ready even behind a TLS-terminating proxy:

scheme := "http"
if r.TLS != nil || r.Header.Get("X-Forwarded-Proto") == "https" {
	scheme = "https"
}
shortURL := scheme + "://" + r.Host + "/" + code

There's a smaller, sneakier decision in the error path. When the form is submitted empty, the handler doesn't return a 4xx — it returns the error fragment with a 200:

if url == "" {
	// Render a styled fragment (200) so HTMX swaps it into #result,
	// rather than replacing the page with raw error text.
	a.tmpl.ExecuteTemplate(w, "error.html", "Please enter a URL to shorten.")
	return
}

HTMX, by default, only swaps content from 2xx responses. Return a 400 and the nice error card silently never appears. So "user error" becomes a 200 with error-shaped HTML — the HTMX worldview leaking into your status codes. It's the kind of thing you only learn by watching the swap not happen.

The whole frontend ships with one external dependency: the HTMX script from a CDN, pinned with a Subresource Integrity hash so a compromised CDN can't swap in different code. Theming — light and dark — is one stylesheet using CSS custom properties and prefers-color-scheme, with no JavaScript involved.

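End to end, there are two paths. Shortening: the form posts to /shorten, the handler hashes the URL, saves the code in the store, and returns an HTML fragment that HTMX swaps into #result. Redirecting: GET /{code} looks the code up in the store and either redirects to the original URL or returns a 404.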

Does it hold up?

I pointed ApacheBench at it. Caveat up front: these are loopback numbers — ab hitting localhost on an Apple M1 Pro, single process, in-memory store, 20,000 requests at concurrency 50 per route. There's no network, no TLS, no real-world tail. Treat them as "is the code itself a bottleneck" (it isn't), not "production capacity."

Route                    Req/sec   p50    p95    p99    Failed
GET / (index)            20,412    2 ms   3 ms   6 ms   0
GET /{code} (redirect)   19,615    2 ms   3 ms   4 ms   0
POST /shorten            19,511    2 ms   3 ms   3 ms   0

ab reports percentiles at whole-millisecond resolution, so those p50/p95/p99 columns are coarse buckets. They also measure time under load, not handler time: with 50 requests in flight at roughly 20,000 requests a second, the average request spends about 50 / 20,000 = 2.5 ms in the system, which is what the p50 shows, and most of that is waiting behind other requests. Nothing failed, and the three routes land within ~5% of each other, which makes sense: they all do roughly the same trivial work, a map operation plus a template render or a header write. The hash, the lock, the base62 loop never show up. For Part 1 the takeaway is that the design has no performance problem; the limits we hit later are about correctness and scale, not speed.

Build-side, for completeness: a cold build from an empty cache takes 5.26 s (17.94 s of user time — it's compiling the standard-library dependencies in parallel), a warm incremental rebuild after touching one file is 0.20 s, and the output binary is 11.7 MB with everything statically linked in.

Where these are the wrong choices

This is the honest part. Every decision above is the right call for "smallest thing that works" and a liability for anything bigger.

  • Hash-based codes can collide silently. Two different URLs can produce the same 64-bit FNV value, hence the same code. When that happens, the second Save overwrites the first in the map, and now the first URL's code redirects to the wrong destination, with no error, no log, no detection. With 64 bits the birthday bound puts even odds of an accidental collision at roughly 5 billion distinct URLs, so you won't hit this casually. But "won't hit it casually" is not "can't hit it," FNV-1a makes no promise against someone hunting for collisions on purpose, and there is zero handling.
  • Hash-based codes can't be revoked, customized, or expired. The code is the hash of the URL. You can't hand someone a vanity code, you can't expire a link, and you can't take one down without taking down the mapping for everyone who shortened that same URL.
  • The in-memory store loses everything on restart and can't be replicated. One process, one map. Deploy a second instance behind a load balancer and you have two independent maps: a code created on instance A is a 404 on instance B. This is the single biggest constraint, and the explicit setup for the next part.
  • Server-side validation is basically absent. The only checks are the browser's type="url" and required attributes. A direct curl -d 'url=not-a-url' /shorten sails right through — it'll happily shorten and "redirect" to garbage. The browser is doing validation the server is trusting.
  • The catch-all route is greedy. GET /{code} swallows any unmatched single segment, so anything you forgot to route (/favicon.ico, /robots.txt) becomes a confusing "short code not found."

What's next

Part 2 is about the constraint that hurts first: the store doesn't persist and doesn't scale. Restart the process and your links are gone; run two copies and they don't share state. I'll introduce real storage — which immediately drags the collision question out of the theoretical and into something I have to actually handle. Because once writes are durable and shared, an overwrite isn't a transient bug. It's permanent data corruption.