Previously: `SERIAL=$(date +%Y%m%d)01` — same-day re-runs produced the
same serial. HE polled, saw no change, never pulled the update.
Now: scan zones-prepared/ for the highest `YYYYMMDDNN` matching today's
date and increment the NN counter. First run of the day starts at NN=01.
Caps at NN=99 with a clear error message (set SERIAL manually if you
genuinely need >99 changes per day).
`SERIAL=<value> make prep` still overrides the auto-detection, useful
for forcing a specific serial during recovery or for testing.
Verified end-to-end on dell01: prep bumped 2026051601 → 2026051602,
CoreDNS auto-reload picked it up within 30s, all queried zones serve
the new serial. HE will pull on its next refresh poll (SOA refresh
= 3600s, so worst case 1 hour).
Goal was to restrict AXFR to Hurricane Electric's five secondary
nameserver IPs. Tried several CoreDNS Corefile syntaxes:
transfer { to 216.218.130.2 ... 216.66.1.2 } # space-separated
transfer { to 216.218.130.2 \n to 216.218.131.2 } # multi-line
transfer { to 216.218.130.2 } # single IP
transfer { to * 216.218.130.2 ... } # mixed
Every form with a specific IPv4 address silently breaks server-block
startup — the auto plugin still loads zones into memory but the
:53/:443/:853 listeners never bind. Reproducible on coredns/coredns
1.11.3 AND 1.12.2 with the (common) snippet + auto + forward shape.
Only `to *` results in healthy listener startup.
Even if we got CoreDNS-side filtering to work, Docker's default
userland-proxy rewrites source IPs to the bridge gateway, which would
break IP-based filtering anyway short of `network_mode: host`.
Decision: keep `to *` in CoreDNS, push HE-only filtering to the
FortiWiFi firewall (source-IP-restricted VIP/DNAT for WAN:53/tcp).
This is correct-layered defense — the perimeter does the IP work
before packets ever reach dell01.
Goal: serve the public DNS face via Hurricane Electric's free
secondary-DNS service (dns.he.net), with CoreDNS on dell01 acting as
the hidden primary. We edit zones here; HE pulls them via AXFR.
Changes:
- scripts/prepare-zones.sh:
* SOA mname: ns1.vultr.com -> ns1.he.net (so the apex SOA reflects
HE as the primary in published RDATA)
* Strip ns?.vultr.com NS records from each zone and inject the five
HE nameservers (ns1..ns5.he.net) as the authoritative NS set
- Corefile (shared `common` snippet):
* Add `transfer { to * }` to authorize AXFR. Tried specific IPs +
`*` mixed on the same line but CoreDNS silently fails to bind
server blocks with that syntax; bare `to *` is the only form that
actually starts the listeners. Trade-off: NOTIFY targeting is lost
(HE polls per SOA refresh=3600s instead of being pushed). For DNS
data this is fine since each record is publicly queryable anyway.
Verified AXFR end-to-end: `dig @dell01 -p 5353 acrazy.org AXFR +tcp`
returns 41 records with the new HE NS set and HE-rooted SOA.
Still needed (operator action):
- Firewall NAT for TCP/53 -> 172.16.1.15:5353 (so HE can connect in)
- Add each of the 91 zones at dns.he.net as Secondary DNS pointing
at 154.27.180.210
- Update each domain's registrar NS records from Vultr -> HE
The original healthcheck `wget -qO- http://127.0.0.1:8080/health` has
been failing since day one because the CoreDNS image is distroless —
no shell, no HTTP client. The container has been running in
"(unhealthy)" status the whole time without anyone noticing because
nothing depends_on it.
Replace with `/coredns -version`, which is the thinnest honest check
the image can support. For deeper liveness/readiness, scrape
:8081/health from outside the container.
Deployed to dell01.mer.idahomuellers.net with firewall NAT'ing
public requests in to host:5353/tcp+udp.
Port changes baked in as new defaults so future hosts inherit them:
- DNS_PORT: 1053 -> 5353 (dev was 1053 because avahi-daemon owns
5353 on Arch desktops; production hosts typically don't run avahi
and 5353 is the conventional non-privileged DNS port — mDNS uses
multicast 224.0.0.251:5353 which never conflicts with a unicast bind)
- HEALTH_PORT: 8080 -> 8081 (8080 collided with a python3 service
on dell01; 8081 is less commonly contested)
Replaces the self-signed dev cert flow with a real LE prod cert for
dns.l.supported.systems, issued and auto-renewed by a Caddy sidecar
using DNS-01 challenge against the Vultr API.
Components:
- caddy/Dockerfile builds Caddy 2.10.0 with caddy-dns/vultr plugin
via xcaddy. GOTOOLCHAIN=auto so xcaddy can fetch newer Go on demand
when plugin versions advance their minimum Go.
- caddy/Caddyfile uses DNS-01 with explicit public resolvers (1.1.1.1,
9.9.9.9) for the propagation check. Without that, Docker's embedded
DNS leaks the container into the host's split-horizon LAN DNS, which
returns LAN IPs for ns1.vultr.com and the propagation check fails.
- docker-compose: caddy service shares ./caddy-data with coredns via a
read-only subpath mount that excludes /acme (account private key).
- Healthcheck doubles as a symlinker: maintains stable cert.pem /
key.pem names at /data/caddy/ and chmods cert files + their dirs to
be readable by CoreDNS's nonroot user. Flips to "healthy" only once
the symlinks dereference (i.e. cert exists), gating CoreDNS start
via depends_on: service_healthy.
- Corefile unchanged — same /etc/coredns/certs/cert.pem path; only the
bind-mount source switches from ./certs to ./caddy-data/caddy.
- New Makefile target: tls-up orchestrates the bring-up sequence.
Cert is valid until Aug 12 2026. Verified end-to-end:
dig @127.0.0.1 -p 8853 +tls +tls-hostname=dns.l.supported.systems ...
dig @127.0.0.1 -p 8443 +https +tls-hostname=dns.l.supported.systems ...
- New Corefile snippet (common) shared across plain DNS / DoT / DoH so
zone-loading + forward + cache stay DRY across all three transports
- scripts/generate-certs.sh: openssl-only self-signed RSA cert with SANs
for localhost / 127.0.0.1 / ::1 / coredns / dns.local. Idempotent —
skips regeneration if cert is valid >24h ahead; FORCE=1 to rotate.
- Key chmod is 0644 so the CoreDNS container's nonroot user can read it
via the bind mount. Acceptable for local dev; production should mount
real certs with proper UID/GID.
- DOT_PORT=8853, DOH_PORT=8443 (avoids Caddy already-on-443 collision)
- Makefile: `make certs`, `make test-tls`
- All three transports verified end-to-end (dig +tls, dig +https,
curl with raw RFC 8484 wire format)