A few years ago I watched a team turn on HSTS in production with a one-line config change and a lot of confidence.
By lunch, support had a queue full of users who couldn’t reach a legacy upload app on a forgotten subdomain. By the end of the day, the team had learned the hard way that HSTS is easy to enable and surprisingly hard to roll back once browsers cache it.
That’s the part people skip.
HSTS is one of the best low-effort security headers you can deploy. It tells browsers: “stop trying HTTP for this site, always use HTTPS.” That blocks protocol downgrade attacks and helps kill off accidental insecure requests. But if you deploy it carelessly, you can brick parts of your own estate for weeks or months.
Here’s a real-world-style rollout pattern that works: the mistakes first, the safer version after.
The situation
The company had:
- www.example.com on modern HTTPS
- app.example.com on HTTPS
- api.example.com on HTTPS
- files.example.com pointing to an old system that still served some content over plain HTTP
- a bunch of mystery subdomains nobody had audited in years
The goal was simple: enforce HTTPS everywhere and qualify for stronger browser protections.
The first attempt looked like this.
Before: the risky rollout
The ops team added this to the main Nginx config for example.com:
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
On paper, that looked great:
- 1 year policy
- covers all subdomains
- ready for preload
In reality, it assumed every current and future subdomain was HTTPS-clean, permanently.
It wasn’t.
What broke
Users who had visited example.com got the HSTS policy cached by their browser. Because includeSubDomains was set on the apex domain, the browser also forced HTTPS for files.example.com and every other host under example.com.
But files.example.com didn’t fully support HTTPS. Some requests failed. Some redirected in circles. Some hit certificate errors.
The ugly part: once the browser cached the policy, telling users “just try HTTP” stopped working. The browser refused.
That’s the operational trap with HSTS. A bad redirect can be fixed on the server. A bad HSTS policy lives in the client.
Why this happens
A browser that sees:
Strict-Transport-Security: max-age=31536000; includeSubDomains
stores a rule for the host that sent it for one year. From then on:
- http://www.example.com becomes https://www.example.com before the request is sent
- if includeSubDomains is present on the policy served by the apex example.com, http://anything.example.com also gets upgraded
- if HTTPS is broken on one of those hosts, users are stuck until the policy expires or they manually clear browser state
Preload makes this even more permanent, because browsers ship your domain in a built-in list. That’s great when you’re ready. It’s reckless when you’re not.
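To make the lookup concrete, here is a minimal sketch of how a browser might decide whether to upgrade a request. It is a simplification, not any browser's actual code: the `HstsEntry` type, the `should_upgrade` function, and the in-memory `store` are all names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class HstsEntry:
    expires_at: float          # absolute time (seconds) when the policy lapses
    include_subdomains: bool


def should_upgrade(host: str, store: dict[str, HstsEntry], now: float) -> bool:
    """Return True if a plain-HTTP request to `host` would be rewritten to HTTPS."""
    host = host.lower().rstrip(".")
    labels = host.split(".")
    # Walk from the full host name up through each parent domain.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        entry = store.get(candidate)
        if entry is None or entry.expires_at <= now:
            continue
        # An exact match always applies; a parent only applies with includeSubDomains.
        if i == 0 or entry.include_subdomains:
            return True
    return False


ONE_YEAR = 31536000
# Hypothetical cache state after one visit to the apex domain at time 0.
store = {"example.com": HstsEntry(expires_at=ONE_YEAR, include_subdomains=True)}
```

With that store, `should_upgrade("files.example.com", store, now=0)` is true, which is exactly the situation that broke the upload app. A policy cached only for www.example.com would not match files.example.com, because the walk goes up through parent domains, never sideways.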
The safer rollout
The second attempt was boring, staged, and much better.
Phase 1: audit first
Before sending any HSTS header, the team made an inventory:
- every DNS record under example.com
- every app and CDN endpoint
- every external vendor CNAME
- certificates for all public hosts
- redirect behavior from HTTP to HTTPS
This is the unglamorous part nobody wants to do. Do it anyway.
I usually care about four checks:
- Does the host resolve publicly?
- Does HTTP redirect cleanly to HTTPS?
- Does HTTPS load with a valid cert and full chain?
- Are there mixed-content or callback flows that still assume HTTP?
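The first three checks are easy to script. Below is a rough sketch using only the Python standard library; `audit_host` and `is_clean_https_redirect` are hypothetical helper names, and "clean" here is my own strict assumption (a redirect to HTTPS on the same host). It does not attempt the fourth check, since mixed content and callback flows need application-level testing.

```python
import http.client
import socket
import ssl
from typing import Optional
from urllib.parse import urlsplit


def is_clean_https_redirect(status: int, location: Optional[str], host: str) -> bool:
    """True if an HTTP response redirects straight to HTTPS on the same host."""
    if status not in (301, 302, 307, 308) or not location:
        return False
    target = urlsplit(location)
    return target.scheme == "https" and (target.hostname or "").lower() == host.lower()


def audit_host(host: str, timeout: float = 5.0) -> dict[str, bool]:
    """Run the first three checks against one host. Needs network access."""
    result = {"resolves": False, "http_redirects": False, "https_valid": False}
    try:
        socket.getaddrinfo(host, 443)
        result["resolves"] = True
    except socket.gaierror:
        return result

    try:
        # http.client never follows redirects, so we see the first response as-is.
        conn = http.client.HTTPConnection(host, 80, timeout=timeout)
        conn.request("GET", "/")
        resp = conn.getresponse()
        result["http_redirects"] = is_clean_https_redirect(
            resp.status, resp.getheader("Location"), host
        )
        conn.close()
    except (OSError, http.client.HTTPException):
        pass

    try:
        # The default context verifies the certificate chain and the host name.
        ctx = ssl.create_default_context()
        with socket.create_connection((host, 443), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                result["https_valid"] = True
    except OSError:
        pass
    return result
```

Run `audit_host` over every name in the inventory and treat any host with a `False` in its result as a blocker for includeSubDomains.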
A quick external headers check also helps catch obvious mistakes. If you want a fast sanity check before rollout, run a free security headers scan at HeaderTest.
Phase 2: start with a short max-age
Instead of going straight to one year, they deployed HSTS only on the main site with a tiny duration:
add_header Strict-Transport-Security "max-age=300" always;
That’s 5 minutes.
No includeSubDomains.
No preload.
No heroics.
This gave them a safe test window. If they found a problem, the browser cache would age out quickly.
For Apache, the equivalent looked like this:
Header always set Strict-Transport-Security "max-age=300"
For an Express app behind TLS termination, if the app itself sets headers:
app.use((req, res, next) => {
    res.setHeader('Strict-Transport-Security', 'max-age=300');
    next();
});
Though honestly, I prefer setting HSTS at the edge proxy or load balancer so it’s consistent across apps.
Phase 3: verify real behavior
After deploying the short policy, the team tested:
- fresh browser sessions
- repeat visits after HSTS caching
- login redirects
- SSO callbacks
- mobile apps using embedded webviews
- old bookmarked HTTP URLs
- direct visits to known subdomains
They also checked that HTTP always redirected to HTTPS before app logic ran.
Bad:
server {
    listen 80;
    server_name example.com;
    # No redirect: plain-HTTP requests go straight to the app.
    location / {
        proxy_pass http://app_backend;
    }
}
Better:
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl http2;
    server_name example.com www.example.com;
    ssl_certificate /etc/ssl/fullchain.pem;
    ssl_certificate_key /etc/ssl/private/key.pem;
    add_header Strict-Transport-Security "max-age=300" always;
    location / {
        proxy_pass http://app_backend;
    }
}
That ordering matters. HSTS does not replace redirects. It teaches browsers to stop needing them later.
Phase 4: increase max-age gradually
Once the short rollout was stable, they increased the duration:
First day:
Strict-Transport-Security: max-age=300
Then:
Strict-Transport-Security: max-age=86400
Then:
Strict-Transport-Security: max-age=2592000
Finally:
Strict-Transport-Security: max-age=31536000
This staged approach gave them checkpoints. If something weird surfaced after a week, they hadn’t committed every browser to a year-long policy yet.
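If you want the ramp written down somewhere less forgettable than a wiki page, a small schedule table does the job. This is only a sketch: the max-age values are the four stages above, but the day thresholds and the function names are my own illustrative assumptions, not what the team actually used.

```python
# (minimum days since first rollout, max-age in seconds)
# The day thresholds are illustrative assumptions; pick ones that match
# how long you want to watch each stage.
STAGES = [
    (0, 300),          # 5 minutes
    (1, 86400),        # 1 day
    (8, 2592000),      # 30 days
    (38, 31536000),    # 365 days
]


def max_age_for_day(days_since_start: int) -> int:
    """Pick the max-age of the latest stage whose threshold has been reached."""
    current = STAGES[0][1]
    for threshold, max_age in STAGES:
        if days_since_start >= threshold:
            current = max_age
    return current


def hsts_header(days_since_start: int) -> str:
    """Build the header value for the current stage (no includeSubDomains yet)."""
    return f"max-age={max_age_for_day(days_since_start)}"
```

The point of encoding it is that moving to the next stage becomes a reviewed change to one table, not someone editing a number in a proxy config from memory.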
After: enabling includeSubDomains safely
Only after the subdomain audit was complete did they move to this:
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
That change happened after:
- files.example.com was migrated to valid HTTPS
- old subdomains were removed or firewalled off
- wildcard and SAN cert coverage was cleaned up
- vendor-managed subdomains were verified
This is where a lot of teams get burned. They think includeSubDomains means “the subdomains we care about.” It means all of them.
That includes:
- legacy admin tools
- forgotten staging boxes
- old marketing microsites
- third-party services on CNAMEs
- future subdomains someone creates next month
If your org is sloppy with DNS hygiene, includeSubDomains will expose it immediately.
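A quick way to see the blast radius is to cross the inventory with the policy host. The sketch below assumes you already have a mapping of host name to "does HTTPS work today"; the `inventory` contents, including the staging host, are made up for illustration, as are the function names.

```python
def covered_by(policy_host: str, host: str) -> bool:
    """True if `host` is `policy_host` itself or any subdomain of it."""
    policy_host = policy_host.lower().rstrip(".")
    host = host.lower().rstrip(".")
    return host == policy_host or host.endswith("." + policy_host)


def hosts_that_would_break(policy_host: str, inventory: dict[str, bool]) -> list[str]:
    """List inventory hosts that fall under the policy but are not HTTPS-ready.

    `inventory` maps host name -> whether HTTPS works there today.
    """
    return sorted(
        host
        for host, https_ok in inventory.items()
        if covered_by(policy_host, host) and not https_ok
    )


# Hypothetical inventory, loosely based on the hosts described earlier.
inventory = {
    "www.example.com": True,
    "app.example.com": True,
    "api.example.com": True,
    "files.example.com": False,
    "old-staging.example.com": False,   # made-up example of a forgotten host
    "example.org": False,               # different domain, not covered
}
```

If `hosts_that_would_break("example.com", inventory)` returns anything at all, you are not ready for includeSubDomains on the apex.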
Preload is the last step, not the first
Preload is attractive because it hardens first contact too. Even the very first visit is forced to HTTPS because the browser already knows your domain’s policy.
But preload has a high cost if you make a mistake.
The team waited until they met the usual preload expectations:
- HTTPS on the apex and www
- valid certs
- HTTP redirects cleanly to HTTPS
- HSTS max-age of at least one year
- includeSubDomains
- preload token present
Then they switched to:
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
That was months after the initial rollout, not the same day.
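The header-level part of those expectations is mechanical enough to check in CI. Here is a small sketch that parses a Strict-Transport-Security value and reports what is missing. `parse_hsts` and `preload_problems` are hypothetical names, the parser is deliberately simple, and it says nothing about certificates or redirects, which need a live check.

```python
ONE_YEAR = 31536000


def parse_hsts(value: str) -> dict[str, str]:
    """Split a Strict-Transport-Security value into lower-cased directives."""
    directives = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directive names are case-insensitive; values may be quoted.
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def preload_problems(value: str) -> list[str]:
    """Return the header-level reasons a policy is not preload-ready."""
    d = parse_hsts(value)
    problems = []
    max_age = d.get("max-age", "")
    if not max_age.isdecimal() or int(max_age) < ONE_YEAR:
        problems.append("max-age below one year")
    if "includesubdomains" not in d:
        problems.append("includeSubDomains missing")
    if "preload" not in d:
        problems.append("preload missing")
    return problems
```

An empty list from `preload_problems` only means the header is shaped correctly. Whether you should submit the domain is still a human decision.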
I think that’s the right call for most production systems. Preload should feel a little annoying to approve. That friction is healthy.
A rollback plan that actually works
You can revoke an HSTS policy by serving:
Strict-Transport-Security: max-age=0
That tells browsers to delete the policy.
But there’s a catch: the browser has to successfully reach the site over HTTPS and receive that header. If HTTPS is broken, users may never get the rollback instruction.
That’s why “we’ll just revert it” is not much of a plan.
A real rollback plan includes:
- keeping HTTPS stable during rollback
- preserving valid certificates
- removing includeSubDomains only after affected hosts can serve the new policy
- understanding that preload removal takes time and browser release cycles
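The catch is easier to see in a toy model of the client side. The sketch below is a deliberately simplified browser cache, not real browser behavior in full: `apply_hsts_response` and the `store` dict are names I invented, and the only rule it encodes is that a header is honored only when the HTTPS connection succeeded.

```python
def apply_hsts_response(store: dict[str, int], host: str, max_age: int,
                        https_ok: bool, now: int) -> None:
    """Update a simplified browser HSTS store after visiting `host`.

    `store` maps host -> expiry time in seconds. The header is only
    honoured when the HTTPS connection itself succeeded.
    """
    if not https_ok:
        return                      # broken TLS: the header is never seen
    if max_age == 0:
        store.pop(host, None)       # max-age=0 deletes the cached policy
    else:
        store[host] = now + max_age


store = {"example.com": 31536000}   # policy cached earlier, expires in a year

# Rollback attempt while HTTPS is broken: nothing changes on the client.
apply_hsts_response(store, "example.com", max_age=0, https_ok=False, now=1000)
still_pinned = "example.com" in store

# Rollback attempt with working HTTPS: the policy is removed.
apply_hsts_response(store, "example.com", max_age=0, https_ok=True, now=2000)
cleared = "example.com" not in store
```

Both `still_pinned` and `cleared` end up true, which is the whole argument for keeping HTTPS healthy while you roll back.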
The final production pattern
The final setup for the company looked like this:
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl http2;
    server_name example.com www.example.com;
    ssl_certificate /etc/ssl/fullchain.pem;
    ssl_certificate_key /etc/ssl/private/key.pem;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
And the real win wasn’t the header itself. It was the cleanup work it forced:
- no more half-broken HTTP endpoints
- no more stray public subdomains
- no more guessing which apps were TLS-ready
That’s how HSTS should be deployed: not as a checkbox, but as the final lock after you’ve already shut the doors.
The production checklist I’d use
If I were rolling out HSTS on a real site today, I’d do it in this order:
- Inventory every public subdomain.
- Fix HTTPS everywhere that matters.
- Redirect all HTTP to HTTPS.
- Start with max-age=300.
- Watch logs, support tickets, and auth flows.
- Increase to a day, then a month, then a year.
- Add includeSubDomains only after a full subdomain audit.
- Add preload only when you’re sure you want the commitment.
The mistake is treating HSTS like a simple header change. It’s really a browser-side policy rollout with a long memory.
Do the boring audit work first, and HSTS becomes one of the safest wins in your security baseline. Skip that work, and one line of config can haunt you for months.