HSTS and Certificate Management: A Real-World Fix

A few years ago, I helped clean up a messy HTTPS rollout for a mid-sized SaaS app. On paper, the setup looked fine: valid TLS cert, HTTPS redirect, load balancer in front, and a decent-looking security checklist in the wiki. In production, it was shaky.

Users could still hit http:// directly. Some static assets loaded from old subdomains with mismatched certificates. One internal team wanted to turn on HSTS preload immediately because they’d read that “preload means maximum security.” That would have been a great way to lock broken TLS behavior into every browser.

This is the kind of HSTS problem that shows up in real systems: HSTS itself is simple, but certificate management is where teams usually get burned.

The starting point

The app had this behavior:

http://app.example.com redirected to HTTPS
https://app.example.com used a valid certificate
static.example.com had an old cert missing some names
api.example.com terminated TLS on a different stack with a different renewal process
HSTS was set only on the app, and only sometimes
Certificate renewals were partly automated, partly “someone gets a calendar reminder”

That last one is how outages are born.

Here was the original Nginx config for the app tier:

server {
    listen 80;
    server_name app.example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name app.example.com;

    ssl_certificate     /etc/ssl/app/fullchain.pem;
    ssl_certificate_key /etc/ssl/app/privkey.pem;

    add_header X-Frame-Options SAMEORIGIN;
    add_header X-Content-Type-Options nosniff;

    location / {
        proxy_pass http://app_backend;
    }
}

Looks normal. But there were three actual problems:

No HSTS header
No consistency across subdomains
No confidence in certificate coverage or renewal

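A plain HTTPS redirect is not HSTS. Without HSTS, every visit that starts on http:// sends a plaintext request that can be intercepted or downgraded before the redirect is ever seen. With HSTS, a browser that has already received the policy skips that request and goes straight to HTTPS. And if your certificate management is inconsistent, HSTS can turn a recoverable warning into a hard outage.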

Why HSTS makes certificate mistakes more painful

That’s the tradeoff teams underestimate.

Without HSTS, if a cert expires or the hostname doesn’t match, users typically see a browser warning. That’s bad, but some internal users will click through, and your support team gets noisy screenshots.

With HSTS, browsers refuse to bypass certificate errors for that site. That’s the point. Once a browser has seen:

Strict-Transport-Security: max-age=31536000; includeSubDomains

it will enforce HTTPS and reject broken certs hard.

So if you enable HSTS before your cert lifecycle is actually under control, you’re setting a trap for yourself.
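If you want to see what a given header value actually commits a browser to, a tiny parser helps. This is a minimal sketch, not something from the original rollout; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of any tool.

# Print the max-age value in seconds, or nothing if the directive is absent.
hsts_max_age() {
  printf '%s\n' "$1" | tr ';' '\n' | tr -d ' "' \
    | sed -n 's/^[Mm][Aa][Xx]-[Aa][Gg][Ee]=\([0-9][0-9]*\)$/\1/p' | head -n 1
}

# Succeed if the value carries includeSubDomains (directive names are case-insensitive).
hsts_includes_subdomains() {
  printf '%s\n' "$1" | tr ';' '\n' | tr -d ' ' | grep -qix 'includeSubDomains'
}
```

Calling hsts_max_age 'max-age=31536000; includeSubDomains' prints 31536000, which is how long a browser will keep refusing broken certificates after a single good response.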

The audit we did first

Before touching HSTS, we mapped every hostname that users or browsers could hit:

example.com
www.example.com
app.example.com
api.example.com
static.example.com
legacy support hostnames still embedded in templates
marketing redirects handled by a CDN

That matters because includeSubDomains is not a suggestion. If you publish HSTS with includeSubDomains at the parent domain, every subdomain needs working HTTPS and valid certificates.

We also checked what headers were really being returned in production, not what config files claimed. A free scan with HeaderTest helped catch inconsistent responses between nodes.
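A rough way to run the same check by hand is to ask each node directly and compare answers. The sketch below assumes curl is available and that you know the node addresses; the function names and the 192.0.2.x addresses in the usage line are placeholders, not the real setup.

```shell
#!/usr/bin/env bash
# Sketch only: function names and node addresses are hypothetical.

# Read raw response headers on stdin; print the HSTS value, or MISSING.
extract_hsts() {
  local value
  value=$(tr -d '\r' | awk 'tolower($0) ~ /^strict-transport-security:/ {
    sub(/^[^:]*:[ \t]*/, ""); print; exit }')
  printf '%s\n' "${value:-MISSING}"
}

# Ask each node directly for the same hostname and print what it returns.
check_nodes() {
  local host="$1" ip
  shift
  for ip in "$@"; do
    printf '%s\t' "$ip"
    curl -sI --max-time 10 --resolve "${host}:443:${ip}" "https://${host}/" | extract_hsts
  done
}
```

Something like check_nodes app.example.com 192.0.2.10 192.0.2.11 prints one line per node. Any line that says MISSING, or that differs from its neighbors, is a node whose config drifted.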

Then we standardized certificate ownership:

one inventory of all public hostnames
one documented issuance path
automatic renewal everywhere possible
alerting before expiration
post-renewal reload validation

That’s the boring work that makes HSTS safe.
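For the "alerting before expiration" item, the logic is small enough to sketch. This is an illustration under assumptions: the function names and the 21-day threshold are mine, and the live lookup assumes GNU date.

```shell
#!/usr/bin/env bash
# Sketch only: names and the 21-day threshold are hypothetical.

# Whole days between "now" and the expiry time, both given as epoch seconds.
days_left() {
  echo $(( ($1 - $2) / 86400 ))
}

# Succeed if fewer than $3 days remain, meaning an alert should fire.
needs_alert() {
  [ "$(days_left "$1" "$2")" -lt "$3" ]
}

# Expiry of the certificate a host serves, as epoch seconds.
# Assumes GNU date (-d); BSD and macOS date need different flags.
served_expiry_epoch() {
  local end
  end=$(echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2)
  date -d "$end" +%s
}
```

A cron job could then run needs_alert "$(served_expiry_epoch app.example.com)" "$(date +%s)" 21 and page someone when it succeeds. Checking the served certificate rather than the file on disk is deliberate, since that is what browsers see.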

The “before” certificate mess

The static host was the best example of how this goes sideways.

The frontend referenced:

<link rel="stylesheet" href="https://static.example.com/assets/app.css">
<script src="https://static.example.com/assets/app.js"></script>

But the certificate on static.example.com had been replaced months earlier with a cert that only covered cdn.example.com. Browsers tolerated it only because traffic was inconsistent and some users were behind caches.

If we had pushed includeSubDomains at example.com right then, we would have broken asset delivery for a chunk of users immediately.
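A name-coverage check would have flagged this long before HSTS did. Here is a minimal sketch of the matching rule, assuming lower-case input and the usual wildcard behavior (a wildcard covers exactly one left-most label); name_covers is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: name_covers is a hypothetical helper; inputs are assumed lower-case.

# Succeed if host ($1) is covered by at least one certificate name ($2...).
name_covers() {
  local host="$1" name
  shift
  for name in "$@"; do
    case "$name" in
      '*.'*)
        # A wildcard replaces only the first label, so compare the remainders.
        case "$host" in
          *.*) [ "${host#*.}" = "${name#\*.}" ] && return 0 ;;
        esac
        ;;
      *)
        [ "$host" = "$name" ] && return 0
        ;;
    esac
  done
  return 1
}
```

With the names from our broken cert, name_covers static.example.com cdn.example.com fails, which is exactly the mismatch we had. On OpenSSL 1.1.1 or later, openssl x509 -noout -ext subjectAltName should print the names to feed in.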

This is why I’m pretty conservative with HSTS rollouts. Strong policy is good. Blind policy is reckless.

The safer rollout

We fixed certificate management first.

Step 1: make certificate renewal boring

For hosts using Nginx, we moved to automated renewal and a deploy hook that tested config before reload.

Example renewal hook:

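#!/usr/bin/env bash
set -euo pipefail

# Validate the config first; with set -e, a failed test stops the script before the reload.
nginx -t
systemctl reload nginx

# Print subject, issuer, and validity dates of the certificate each host now serves.
for host in app.example.com api.example.com static.example.com; do
  echo | openssl s_client -connect "${host}:443" -servername "${host}" 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates
done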

That’s not fancy, but it catches “renewed cert exists on disk, service never reloaded,” which happens more often than people admit.
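If you would rather have the hook fail than rely on someone reading dates, one option is to compare fingerprints. This is a sketch of that idea, not what we originally ran; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Reduce "sha256 Fingerprint=AB:CD:..." to bare lower-case hex.
normalize_fp() {
  printf '%s\n' "$1" | sed 's/^.*=//' | tr -d ':' | tr 'A-F' 'a-f'
}

# Fingerprint of the first (leaf) certificate in a PEM file.
disk_fp() {
  openssl x509 -noout -fingerprint -sha256 -in "$1"
}

# Fingerprint of the certificate a host serves right now.
served_fp() {
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
    | openssl x509 -noout -fingerprint -sha256
}

# Succeed only if the file ($1) and the live host ($2) present the same certificate.
cert_is_live() {
  local disk served
  disk=$(normalize_fp "$(disk_fp "$1")")
  served=$(normalize_fp "$(served_fp "$2")")
  [ -n "$disk" ] && [ "$disk" = "$served" ]
}
```

Adding cert_is_live /etc/ssl/app/fullchain.pem app.example.com to the end of the hook turns a stale process into a failed deploy step instead of a surprise a few weeks later.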

Step 2: standardize TLS config and HSTS behavior

We updated the app server like this:

server {
    listen 80;
    server_name app.example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name app.example.com;

    ssl_certificate     /etc/ssl/app/fullchain.pem;
    ssl_certificate_key /etc/ssl/app/privkey.pem;

    add_header Strict-Transport-Security "max-age=300" always;
    add_header X-Frame-Options SAMEORIGIN always;
    add_header X-Content-Type-Options nosniff always;

    location / {
        proxy_pass http://app_backend;
    }
}

Two details matter here:

always makes sure the header is sent on error responses too; without it, Nginx adds the header only to a fixed set of success and redirect status codes
max-age=300 is intentionally tiny for the first phase

I almost never start with one year. Five minutes is enough to verify behavior without trapping users in a bad policy.

Step 3: validate every subdomain before `includeSubDomains`

After confirming clean HTTPS behavior on all public hosts, we moved to:

add_header Strict-Transport-Security "max-age=86400; includeSubDomains" always;

One day, not one year.

Then later:

add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

That staged rollout gave us time to catch a forgotten admin subdomain that still served a self-signed cert behind a VPN. If we had gone straight to a year with includeSubDomains, that internal tool would have become a support fire.
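One way to keep the stages honest is to generate the directive from a stage number, so nobody hand-edits max-age on a single host. The sketch below uses the three values from this rollout; the function names and the idea of numbering the stages are my own assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: stage numbering and function names are hypothetical.

# Map a rollout stage to the HSTS value used at that stage.
hsts_value_for_stage() {
  case "$1" in
    1) echo 'max-age=300' ;;
    2) echo 'max-age=86400; includeSubDomains' ;;
    3) echo 'max-age=31536000; includeSubDomains' ;;
    *) echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Render the matching Nginx directive for a stage.
nginx_hsts_line() {
  local value
  value=$(hsts_value_for_stage "$1") || return 1
  printf 'add_header Strict-Transport-Security "%s" always;\n' "$value"
}
```

Writing the output of nginx_hsts_line 2 into a shared include file, then running nginx -t and a reload, moves every host that uses the include to the same stage at once.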

Apache example

Same idea on Apache:

<VirtualHost *:443>
    ServerName app.example.com

    SSLEngine on
    SSLCertificateFile /etc/ssl/app/fullchain.pem
    SSLCertificateKeyFile /etc/ssl/app/privkey.pem

    Header always set Strict-Transport-Security "max-age=300"

    ProxyPass / http://app_backend/
    ProxyPassReverse / http://app_backend/
</VirtualHost>

Then increase max-age only after you trust your cert operations.

What changed after cleanup

Once certificate management was consistent, HSTS became a force multiplier instead of a liability.

After

every public hostname had valid HTTPS
renewals were automated and tested
expiration alerts fired well before the deadline
HSTS was present on all intended responses
subdomains were reviewed before enabling includeSubDomains
preload was explicitly deferred until the team proved they could operate this safely

That last part saved us from making a permanent promise too early.

The preload temptation

A lot of teams want this:

Strict-Transport-Security: max-age=63072000; includeSubDomains; preload

I get it. Preload is clean and strong. But it’s also unforgiving.

If you preload your domain, browsers ship the rule in their built-in preload list, so it applies before a user’s first visit. You’re saying:

every subdomain must support HTTPS
certificates must stay valid
you cannot casually spin up a random HTTP-only host later
rollback is slow and painful

I only recommend preload when the certificate story is boring enough that nobody on the team even debates renewals anymore.

If your org still has spreadsheets, calendar reminders, or “ask ops where that cert lives,” you are not ready.
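For what it’s worth, the header side of preload is the easy part to check. The sketch below tests the three header conditions the preload list asks for (max-age of at least one year, includeSubDomains, and preload); the function name is hypothetical, and passing it says nothing about whether your certificates and subdomains are ready.

```shell
#!/usr/bin/env bash
# Sketch only: preload_header_ok is a hypothetical helper that checks the header, nothing else.

# Succeed if the value has max-age >= 31536000, includeSubDomains, and preload.
preload_header_ok() {
  local value max_age
  # Strip spaces and quotes and lower-case everything so directive matching is simple.
  value=$(printf '%s' "$1" | tr -d ' "' | tr 'A-Z' 'a-z')
  max_age=$(printf '%s\n' "$value" | tr ';' '\n' \
    | sed -n 's/^max-age=\([0-9][0-9]*\)$/\1/p' | head -n 1)
  [ -n "$max_age" ] || return 1
  [ "$max_age" -ge 31536000 ] || return 1
  case ";$value;" in *';includesubdomains;'*) ;; *) return 1 ;; esac
  case ";$value;" in *';preload;'*) ;; *) return 1 ;; esac
  return 0
}
```

preload_header_ok 'max-age=63072000; includeSubDomains; preload' succeeds, while the five-minute and one-day values from the staged rollout fail, which is the point: you should not be able to reach this header by accident.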

The biggest lesson

HSTS is not just a header decision. It’s an operational maturity decision.

The team originally thought the fix was “add one response header.” The real fix was:

clean hostname inventory
unified certificate issuance
automated renewal
tested reloads
staged HSTS rollout
discipline around subdomains

That’s why the before-and-after looked so different. Before, HTTPS was present. After, HTTPS was dependable.

If you want to check where your current setup stands, run a scan with HeaderTest, then compare what you see against your actual certificate inventory and renewal process. Most teams don’t have an HSTS problem. They have a certificate management problem that HSTS will expose immediately.

My rule is simple: get certificates boring first, then make HSTS strict. That order prevents outages.
