Exposing the dashboard & IdP on a domain (AWS)
Serve the dashboard at spawnly.run, IdentityServer at auth.spawnly.run,
and move the docs to docs.spawnly.run (GitHub Pages), all over HTTPS on EKS.
Decisions (chosen): DNS in Route53 (move the zone), edge via the AWS Load Balancer Controller (ALB) + ACM, dashboard kept behind OIDC login with the demo users replaced before exposure.
Topology
registrar NS ──► Route53 zone (spawnly.run) [persistent: deploy/aws/dns] ├─ spawnly.run ALIAS ─► ALB ─► dashboard:8080 (TLS: ACM) ├─ auth.spawnly.run ALIAS ─► ALB ─► identity-server:8080 (TLS: ACM) └─ docs.spawnly.run CNAME ─► <user>.github.io (TLS: GitHub Pages)ACM cert: spawnly.run + *.spawnly.run (DNS-validated) [persistent: deploy/aws/dns]ALB from one Ingress (AWS LB Controller); apex/auth records by external-dns.Issuer model — already supported by the apps’ split-horizon env knobs:
ISSUER_URI = OIDC_AUTHORITY = https://auth.spawnly.run,
DASHBOARD_ORIGIN = https://spawnly.run, IDENTITY_INTERNAL_URL stays in-cluster,
resource servers’ IS_ISSUER = https://auth.spawnly.run (JWKS may stay internal).
Persistence & teardown
Three Terraform roots with different lifecycles:
deploy/aws/dns— persistent (registrar delegates NS here). Applied once (incl. a one-time NS cutover); not destroyed bydown.sh.deploy/aws/ecr— persistent (images survive teardown).deploy/aws/terraform— the cluster (recreated eachup).
down.sh destroys only the cluster; down.sh --all also destroys ECR and DNS
(full teardown — stops all cost, but breaks the NS delegation).
Implementation phases
Phase 1 — Persistent DNS + TLS (DONE)
deploy/aws/dns: Route53 zone, ACM cert (apex + wildcard, DNS-validated),
docs.<domain> CNAME → Pages. One-time NS cutover at the registrar (see that
root’s README). down.sh --all tears it down.
Phase 2 — Cluster edge controllers (cluster root + deploy) — IMPLEMENTED
Provisioned in deploy/aws/terraform/edge.tf (IAM + Pod Identity) and installed
by deploy/aws/install-edge.sh (Helm). up.sh runs the installer only when the
DNS root is applied (public exposure is opt-in).
- AWS Load Balancer Controller: the AWS-published IAM policy (as a
spawnly-lbccustomer-managed policy), an IAM role + Pod Identity association, install via Helm/EKS addon. - external-dns: scoped
route53:ChangeResourceRecordSets/list on the zone, role + Pod Identity association, install via Helm (--domain-filter,--txt-owner-id,--policy=sync).
Phase 3 — Ingress + routing — IMPLEMENTED
deploy/aws/ingress.yaml (applied by deploy.sh when the DNS root is set up; it
injects the ACM certificate-arn). external-dns writes the records from the rule
hosts. Phase 4 (issuer env + cookie/ForwardedHeaders code) is still required for
a working browser login — until then the apps are reachable at the ALB but the
OIDC flow isn’t wired to the public hostnames.
Original phase notes:
One ALB Ingress (target-type: ip, scheme: internet-facing, HTTP→HTTPS
redirect, certificate-arn = the dns root’s ACM ARN):
spawnly.run → dashboard:8080, auth.spawnly.run → identity-server:8080, with
external-dns annotations to write the apex/auth records.
Phase 4 — App config + code touches
- Env: IdP
ISSUER_URI/DASHBOARD_ORIGIN; dashboardOIDC_AUTHORITY(keepIDENTITY_INTERNAL_URLinternal); resource serversIS_ISSUER. - IdP
Program.cs:UseForwardedHeaders(trust the ALBX-Forwarded-Proto) +Secure/SameSitecookies on HTTPS (today only the HTTPSameSite=Laxworkaround exists). Deferred until this phase so it’s tested behind the ALB — bundling untested cookie/proxy changes risks the working kind + aws-stsweb flows. - Dashboard
cmd/dashboard/auth.go:Securecookies whenX-Forwarded-Proto=https.
Phase 5 — Replace demo users
Replace hardcoded TestUsers.cs with real credentials from a Secret before the
IdP is internet-facing.
Phase 6 — Move docs to docs.spawnly.run (do first, to free the apex)
GitHub Pages custom domain docs.spawnly.run (+ CNAME file); Astro site: https://docs.spawnly.run; the CNAME record is in the dns root.
Phase 7 — Scripts & lifecycle
up.sh reads the dns root’s acm_certificate_arn for the ingress and installs
the controllers. down.sh deletes the Ingress first (so the ALB + its DNS
records are cleaned up) before destroying the cluster. (DONE: teardown ordering +
--all.)
Phase 8 — IAM (least-privilege additions)
Terraform principal gains route53:* (or zone-scoped), acm:*, and IAM
create/attach for the spawnly-lbc/spawnly-externaldns roles+policies
(already spawnly*-scoped). eks:* and PassRole→pods.eks already present.
Phase 9 — Verify
dig the three hosts; https://auth.spawnly.run/.well-known/openid-configuration
shows issuer: https://auth.spawnly.run; https://spawnly.run login → token
iss=auth.spawnly.run → spawn + sample-api succeed.
Cost & caveats
- Always-on: ALB ~$16–23/mo + Route53 zone $0.50/mo + the running cluster/NAT. ACM is free. This is a persistent public endpoint, not the ephemeral model.
- Migration sequencing: stand up
docs.spawnly.runon Pages and replicate existing records in Route53 before flipping registrar NS, so docs never goes dark. ACM validation only completes after the NS cutover.