Decouples git.supported.systems from the legacy host record. Resolves
through git.supportedsystems.net (64.177.112.188) — the new git server
under the .supportedsystems.net infrastructure namespace. The old gitea
box at 66.42.70.188 no longer has a named DNS reference here.
Same pattern as autoconfig/autodiscover/imap/smtp/pop — webmail was
being caught by the wildcard (* 60 IN A 108.61.23.129) and resolving
to the docker host. Explicit CNAME points it at the mail server FQDN
where the webmail UI actually runs.
These 4 mail-discovery hostnames were silently caught by the wildcard
(* 60 IN A 108.61.23.129), resolving to the docker host instead of
the mail server. CNAMEs to mail.supported.systems make their resolution
explicit and follow the mail server's A record automatically.
Mail server migration cutover. Repointing mail.supported.systems flips
inbound mail for all 20 MX-referring zones to the new server.
old-mailu.supported.systems
preserves a name pointing at the old IP (66.42.75.247) during the
migration window for IMAP drain, mailbox sync, and parallel verification.
Decouples the 6 dependent services (dignity.ink:kayla, septic.report:permits,
supported.systems:{docs, *.docs, mcbluetooth, s120}) from the legacy host
record. Services now follow the new-canonical .supportedsystems.net naming
and resolve directly to the new docker host.
Bulk swap of the old docker-2 host IP to the new one across 4 zones.
docker-2.supported.systems intentionally preserved at the old IP — 6
CNAMEs depend on the FQDN; the old box keeps its identity until
decommissioned.
These were leftover from a past cert renewal — timelinize.l isn't an
active service. Their presence made timelinize.l an empty non-terminal
that suppressed *.l wildcard synthesis at HE per RFC 4592 §2.2.3.
Bulk swap of the old docker host IP to the new one across 13 zones.
docker-1.supported.systems intentionally preserved at the old IP — the
hostname stays tied to the old box until decommissioned.
cubeseptic.com, flonhoney.com, hydrushydroponics.com,
idahogreendreams.com, qube-construction.com, qube-septic.com,
qubeseptic.com — all were hosted on 108.61.229.209 (docker-1, old)
and are being decommissioned, not migrated to the replacement host.
Wildcards in DNS only synthesize for names that don't already exist
in the zone tree. A `_acme-challenge.<sub>` TXT record makes <sub>
an "empty non-terminal" — exists in the tree (as a parent node) but
has no records of its own. Per RFC 4592 §2.2.3, wildcards skip these,
so RFC-compliant authoritative servers (HE, BIND) return NODATA for
<sub> even when the zone has `* CNAME @`.
Fix: for each <sub> that's an empty non-terminal in a zone with a
wildcard, add an explicit `<sub> CNAME @` so the resolution outcome
matches what the wildcard would have produced. Zero-knowledge — no
need to identify the specific service IP per name.
30 records added across the 13 zones listed below:
acrazy.org (langfuse.dootie)
context.bet (studio)
copper-springs.online (docs.butler.dev)
demostar.io (cw.cw, doom, meet)
home-inspector.store (api, dashboard, mailpit)
inspect.pics (admin)
log.doctor (app, docs)
malloys.us (cp, cp-sandbox, mary)
nielsen-inspections.com (calendar, cw, files, v2-calendar)
qubeseptic.com (api.dispatch, dispatch, leads, mail.dispatch,
rentcache.dispatch)
ryanmalloy.com (c4ai)
sidejob.pro (api)
upc.llc (catalog, minio.or, or, s3)
CoreDNS (lenient) was returning the wildcard CNAME for these names
anyway; HE (strict RFC-compliant) was returning empty. After this
change, both behave identically.
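A minimal sketch of how the empty non-terminals described above could be
found mechanically, assuming the zone's owner names are already available
as a flat list of FQDNs. The function names and the sample names are
hypothetical, not part of this repo's tooling.

```python
def empty_non_terminals(owner_names, zone):
    """Return names that exist only as parents of other owner names."""
    zone = zone.rstrip(".").lower()
    owners = {n.rstrip(".").lower() for n in owner_names}
    zone_len = len(zone.split("."))
    ents = set()
    for name in owners:
        if name != zone and not name.endswith("." + zone):
            continue  # ignore anything outside this zone
        labels = name.split(".")
        # Walk every ancestor strictly between the owner and the apex.
        for i in range(1, len(labels) - zone_len):
            parent = ".".join(labels[i:])
            if parent not in owners:
                ents.add(parent)
    return sorted(ents)


def fix_records(owner_names, zone):
    """Emit the `<sub> CNAME @` lines that make each ENT explicit."""
    suffix = "." + zone.rstrip(".").lower()
    return [
        f"{ent[:-len(suffix)]} CNAME @"
        for ent in empty_non_terminals(owner_names, zone)
    ]


# Hypothetical owner names for illustration only.
owners = [
    "context.bet",
    "*.context.bet",
    "_acme-challenge.studio.context.bet",
    "www.context.bet",
    "_acme-challenge.www.context.bet",
]
print(fix_records(owners, "context.bet"))
```

Here `www` already has its own record, so only `studio` is reported; a
nested name such as `_acme-challenge.api.dispatch` would yield both
`api.dispatch` and `dispatch`.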
Previously: refresh=3600 retry=1800 minimum=300 (RFC-conformant but
slow). With HE's free secondary service exhibiting puller→anycast
replication lag of up to ~1 hour, we want to give them every signal
to refresh faster.
New: refresh=300 retry=120 minimum=60.
- refresh 300s: slaves poll our SOA every 5 minutes. ~91 zones polled
by HE every 300s is ~0.3 queries/sec (91 / 300) per polling host to
dell01:53, trivial load. If HE honors the master's refresh internally
(some secondary providers do, some don't), this also nudges their
puller→anycast sync.
- retry 120s: kept < refresh per RFC 1912 §2.2.
- minimum 60s: tightens NXDOMAIN negative-cache TTL on public
resolvers from 5 min to 1 min. This is the dominant delay when a
newly added name is briefly NX-cached on Cloudflare/Google/Quad9
before they re-ask HE.
expire stays at 604800 (1 week) — that's "how long HE keeps serving
stale data if we vanish," unrelated to fresh-data propagation.
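The timer arithmetic above can be checked with a few lines. This is a
rough sketch that assumes one SOA query per zone per refresh interval;
the `pollers` parameter is an assumption, since how many HE hosts poll
independently is not known here.

```python
def soa_poll_rate(zones: int, refresh_s: int, pollers: int = 1) -> float:
    """Average SOA queries per second arriving at the primary."""
    return zones * pollers / refresh_s


def negative_cache_ttl(soa_record_ttl: int, soa_minimum: int) -> int:
    """RFC 2308: negative answers are cached for min(SOA TTL, SOA MINIMUM)."""
    return min(soa_record_ttl, soa_minimum)


old_rate = soa_poll_rate(91, 3600)
new_rate = soa_poll_rate(91, 300)
print(f"{old_rate:.3f} -> {new_rate:.3f} queries/sec")
```

Note that the negative-cache TTL is capped by the SOA record's own TTL
as well, so lowering `minimum` only helps if that TTL is not already
the smaller value.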
Hurricane Electric requires asymmetric transfer config:
- AXFR pull from 216.218.133.2 (slave.dns.he.net / ns4.he.net)
- NOTIFY destination 216.218.130.2 (ns1.he.net)
CoreDNS's transfer plugin uses a single bidirectional `to` list for
both, which is fine in principle but hits a bug we could reproduce:
any `to` with more than one specific IPv4 silently kills server-block
listener startup (no error, zones load, but :53 never binds).
Reproduced on 1.11.3 + 1.12.2 even with a minimal fresh `docker run`.
Workaround:
- Corefile keeps `transfer { to * }` (open AXFR; firewall does the
real source-IP filtering on TCP/53)
- scripts/notify-he.py crafts and sends NOTIFY messages directly to
216.218.130.2 (only). Pure-stdlib Python — no dependencies.
- Makefile `prep` target runs notify-he.py after prepare-zones.sh
so every zone-bump fires NOTIFY automatically.
Verified end-to-end: HE acks NOTIFY (rcode=0) for the 10 zones it
hosts as secondaries; remaining 81 return REFUSED (rcode=5) because
HE doesn't have them configured yet. Note: HE's free slave service
acks NOTIFY but only actually re-pulls AXFR on its hourly poll cycle
(observed behavior; they appear to be poll-based by design). NOTIFY is
still useful long-term in case HE changes that behavior, and harmless
either way.
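For reference, a DNS NOTIFY (RFC 1996) is small enough to build by hand
with the standard library. The sketch below is not the contents of
scripts/notify-he.py; it is an illustrative reconstruction, and the
function names are hypothetical. It sends over UDP and reads only the
rcode from the reply.

```python
import secrets
import socket
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_notify(zone: str, msg_id=None) -> bytes:
    """Build a NOTIFY for the zone: opcode 4, AA set, question = zone SOA IN."""
    if msg_id is None:
        msg_id = secrets.randbits(16)
    flags = (4 << 11) | (1 << 10)  # opcode=NOTIFY, AA=1 -> 0x2400
    header = struct.pack("!HHHHHH", msg_id, flags, 1, 0, 0, 0)
    question = encode_name(zone) + struct.pack("!HH", 6, 1)  # QTYPE=SOA, QCLASS=IN
    return header + question


def parse_rcode(response: bytes) -> int:
    """Extract the 4-bit rcode from a DNS response header."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def send_notify(zone: str, server: str, port: int = 53, timeout: float = 3.0) -> int:
    """Send one NOTIFY over UDP and return the rcode (0=NOERROR, 5=REFUSED)."""
    packet = build_notify(zone)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, port))
        response, _ = sock.recvfrom(512)
    return parse_rcode(response)
```

A call such as `send_notify("acrazy.org", "216.218.130.2")` would return
0 for a zone HE hosts and 5 for one it does not, matching the rcodes
noted above. A production version should also check that the response
ID matches the request.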
27 records across 15 zones converted from direct A records pointing at
the Tailscale endpoint (100.79.95.190) to CNAMEs pointing at the
Tailscale-named alias. Now if the underlying Tailscale node's IP
changes, only the rpm-bullet record needs updating instead of
chasing 27 records across 15 zones.
Affected zones (all *.l labels + a handful of dev / dev.mary names):
acrazy.org copper-springs.online demostar.io flonhoney.com
homestar.ink kg7q.cc malloys.us ourjob.site
qubeseptic.com ryanmalloy.com septic.report sidejob.pro
supported.systems warehack.ing zmesh.systems
No CNAME collisions: none of the converted names had other records
(MX/TXT/SRV/CAA/AAAA) at the same exact name. _acme-challenge.<sub>.l
records sit at distinct subdomains and continue to resolve independently
(verified: TXT lookups for known _acme-challenge.l.* names still return
the original values).
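The collision check described above can be expressed as a small
function. This is a sketch that assumes the zone has been parsed into
(owner, type) pairs; the owner names in the example are hypothetical.

```python
def cname_collisions(records, convert):
    """Map each name slated for CNAME conversion to the other record
    types found at that exact name. An empty result means it is safe.

    records: iterable of (owner, rtype) pairs for the whole zone.
    convert: owner names whose A record will be replaced by a CNAME.
    """
    targets = {n.lower().rstrip(".") for n in convert}
    clashes = {}
    for owner, rtype in records:
        key = owner.lower().rstrip(".")
        # The A record being replaced does not count as a collision.
        if key in targets and rtype.upper() != "A":
            clashes.setdefault(key, set()).add(rtype.upper())
    return clashes


# Hypothetical zone contents for illustration only.
records = [
    ("dev.l.example.org", "A"),
    ("_acme-challenge.dev.l.example.org", "TXT"),
    ("mail.l.example.org", "A"),
    ("mail.l.example.org", "MX"),
]
print(cname_collisions(records, {"dev.l.example.org", "mail.l.example.org"}))
```

The `_acme-challenge` TXT record sits at a different owner name, so it
does not count against `dev.l`; the MX at `mail.l` does.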
Also fixed prepare-zones.sh: added `|| true` after the serial-detection
grep so a zero-match (first run of a new day) doesn't trip `set -e`
and abort the whole prep.
Previously: `SERIAL=$(date +%Y%m%d)01` — same-day re-runs produced the
same serial. HE polled, saw no change, never pulled the update.
Now: scan zones-prepared/ for the highest `YYYYMMDDNN` matching today's
date and increment the NN counter. First run of the day starts at NN=01.
Caps at NN=99 with a clear error message (set SERIAL manually if you
genuinely need >99 changes per day).
`SERIAL=<value> make prep` still overrides the auto-detection, useful
for forcing a specific serial during recovery or for testing.
Verified end-to-end on dell01: prep bumped 2026051601 → 2026051602,
CoreDNS auto-reload picked it up within 30s, all queried zones serve
the new serial. HE will pull on its next refresh poll (SOA refresh
= 3600s, so worst case 1 hour).
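The serial-bump rule above, restated as a sketch in Python. The real
logic lives in prepare-zones.sh; this version assumes the existing
serials have already been collected as strings, and `next_serial` is a
hypothetical name.

```python
import re
from datetime import date


def next_serial(existing, today: date) -> str:
    """Return the next YYYYMMDDNN serial for today.

    Only serials carrying today's date are considered; the first run of
    a day starts at NN=01 and the counter is capped at 99.
    """
    prefix = today.strftime("%Y%m%d")
    counters = [
        int(s[8:]) for s in existing if re.fullmatch(prefix + r"\d{2}", s)
    ]
    nn = max(counters, default=0) + 1
    if nn > 99:
        raise ValueError(f"more than 99 serials for {prefix}; set SERIAL manually")
    return f"{prefix}{nn:02d}"


print(next_serial(["2026051501", "2026051601"], date(2026, 5, 16)))
```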
Goal was to restrict AXFR to Hurricane Electric's five secondary
nameserver IPs. Tried several CoreDNS Corefile syntaxes:
transfer { to 216.218.130.2 ... 216.66.1.2 } # space-separated
transfer { to 216.218.130.2 \n to 216.218.131.2 } # multi-line
transfer { to 216.218.130.2 } # single IP
transfer { to * 216.218.130.2 ... } # mixed
Every form with a specific IPv4 address silently breaks server-block
startup — the auto plugin still loads zones into memory but the
:53/:443/:853 listeners never bind. Reproducible on coredns/coredns
1.11.3 AND 1.12.2 with the (common) snippet + auto + forward shape.
Only `to *` results in healthy listener startup.
Even if we got CoreDNS-side filtering to work, Docker's default
userland-proxy rewrites source IPs to the bridge gateway, which would
break IP-based filtering anyway short of `network_mode: host`.
Decision: keep `to *` in CoreDNS, push HE-only filtering to the
FortiWiFi firewall (source-IP-restricted VIP/DNAT for WAN:53/tcp).
This is correctly layered defense: the perimeter does the IP work
before packets ever reach dell01.
Goal: serve the public DNS face via Hurricane Electric's free
secondary-DNS service (dns.he.net), with CoreDNS on dell01 acting as
the hidden primary. We edit zones here; HE pulls them via AXFR.
Changes:
- scripts/prepare-zones.sh:
* SOA mname: ns1.vultr.com -> ns1.he.net (so the apex SOA reflects
HE as the primary in published RDATA)
* Strip ns?.vultr.com NS records from each zone and inject the five
HE nameservers (ns1..ns5.he.net) as the authoritative NS set
- Corefile (shared `common` snippet):
* Add `transfer { to * }` to authorize AXFR. Tried specific IPs +
`*` mixed on the same line but CoreDNS silently fails to bind
server blocks with that syntax; bare `to *` is the only form that
actually starts the listeners. Trade-offs: NOTIFY targeting is lost
(HE polls per SOA refresh=3600s instead of being pushed), and AXFR is
open to any source. The latter is fine for DNS data since each record
is publicly queryable anyway.
Verified AXFR end-to-end: `dig @dell01 -p 5353 acrazy.org AXFR +tcp`
returns 41 records with the new HE NS set and HE-rooted SOA.
Still needed (operator action):
- Firewall NAT for TCP/53 -> 172.16.1.15:5353 (so HE can connect in)
- Add each of the 91 zones at dns.he.net as Secondary DNS pointing
at 154.27.180.210
- Update each domain's registrar NS records from Vultr -> HE
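The SOA and NS rewrite listed under Changes can be sketched as a text
transform. This is not the prepare-zones.sh implementation; it is a
simplified illustration that assumes one record per line, NS records at
the apex, and fully qualified Vultr names with a trailing dot in the SOA.

```python
import re

HE_NS = [f"ns{i}.he.net." for i in range(1, 6)]

# Matches lines like "@ 300 IN NS ns1.vultr.com." (owner, TTL, class optional).
VULTR_NS = re.compile(
    r"^\S*\s+(?:\d+\s+)?(?:IN\s+)?NS\s+ns\d\.vultr\.com\.?\s*$", re.IGNORECASE
)


def swap_nameservers(zone_text: str) -> str:
    """Point the SOA mname at ns1.he.net and replace Vultr NS with the HE set."""
    out = []
    injected = False
    for line in zone_text.splitlines():
        line = re.sub(r"(\bSOA\s+)ns1\.vultr\.com\.", r"\1ns1.he.net.", line)
        if VULTR_NS.match(line):
            if not injected:
                # Inject the HE set once, where the first Vultr NS was.
                out.extend(f"@ IN NS {ns}" for ns in HE_NS)
                injected = True
            continue
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical zone text for illustration only.
zone = (
    "@ 3600 IN SOA ns1.vultr.com. admin.example.com. 2026051601 300 120 604800 60\n"
    "@ 300 IN NS ns1.vultr.com.\n"
    "@ 300 IN NS ns2.vultr.com.\n"
    "www 300 IN A 192.0.2.10\n"
)
print(swap_nameservers(zone))
```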
The original healthcheck `wget -qO- http://127.0.0.1:8080/health` has
been failing since day one because the CoreDNS image is distroless —
no shell, no HTTP client. The container has been running in
"(unhealthy)" status the whole time without anyone noticing because
nothing depends_on it.
Replace with `/coredns -version`, which is the thinnest honest check
the image can support. For deeper liveness/readiness, scrape
:8081/health from outside the container.
Deployed to dell01.mer.idahomuellers.net with firewall NAT'ing
public requests in to host:5353/tcp+udp.
Port changes baked in as new defaults so future hosts inherit them:
- DNS_PORT: 1053 -> 5353 (dev was 1053 because avahi-daemon owns
5353 on Arch desktops; production hosts typically don't run avahi
and 5353 is the conventional non-privileged DNS port. mDNS traffic
goes to multicast 224.0.0.251:5353, so with no mDNS daemon holding
the port a unicast bind on 5353 is unaffected)
- HEALTH_PORT: 8080 -> 8081 (8080 collided with a python3 service
on dell01; 8081 is less commonly contested)
Replaces the self-signed dev cert flow with a real LE prod cert for
dns.l.supported.systems, issued and auto-renewed by a Caddy sidecar
using DNS-01 challenge against the Vultr API.
Components:
- caddy/Dockerfile builds Caddy 2.10.0 with caddy-dns/vultr plugin
via xcaddy. GOTOOLCHAIN=auto so xcaddy can fetch newer Go on demand
when plugin versions advance their minimum Go.
- caddy/Caddyfile uses DNS-01 with explicit public resolvers (1.1.1.1,
9.9.9.9) for the propagation check. Without that, Docker's embedded
DNS leaks the container into the host's split-horizon LAN DNS, which
returns LAN IPs for ns1.vultr.com and the propagation check fails.
- docker-compose: caddy service shares ./caddy-data with coredns via a
read-only subpath mount that excludes /acme (account private key).
- Healthcheck doubles as a symlinker: maintains stable cert.pem /
key.pem names at /data/caddy/ and chmods cert files + their dirs to
be readable by CoreDNS's nonroot user. Flips to "healthy" only once
the symlinks dereference (i.e. cert exists), gating CoreDNS start
via depends_on: service_healthy.
- Corefile unchanged — same /etc/coredns/certs/cert.pem path; only the
bind-mount source switches from ./certs to ./caddy-data/caddy.
- New Makefile target: tls-up orchestrates the bring-up sequence.
Cert is valid until Aug 12 2026. Verified end-to-end:
dig @127.0.0.1 -p 8853 +tls +tls-hostname=dns.l.supported.systems ...
dig @127.0.0.1 -p 8443 +https +tls-hostname=dns.l.supported.systems ...
- New Corefile snippet (common) shared across plain DNS / DoT / DoH so
zone-loading + forward + cache stay DRY across all three transports
- scripts/generate-certs.sh: openssl-only self-signed RSA cert with SANs
for localhost / 127.0.0.1 / ::1 / coredns / dns.local. Idempotent —
skips regeneration if cert is valid >24h ahead; FORCE=1 to rotate.
- Key chmod is 0644 so the CoreDNS container's nonroot user can read it
via the bind mount. Acceptable for local dev; production should mount
real certs with proper UID/GID.
- DOT_PORT=8853, DOH_PORT=8443 (avoids Caddy already-on-443 collision)
- Makefile: `make certs`, `make test-tls`
- All three transports verified end-to-end (dig +tls, dig +https,
curl with raw RFC 8484 wire format)
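For the raw RFC 8484 check, the query is an ordinary DNS message,
base64url-encoded without padding for GET or sent as the body for POST.
A minimal sketch of building one follows; the function names are
hypothetical and the `/dns-query` path on port 8443 is an assumption
about the local endpoint.

```python
import base64
import struct


def build_query(name: str, qtype: int = 1, msg_id: int = 0) -> bytes:
    """Build a one-question DNS query (RD set). RFC 8484 suggests ID 0."""
    header = struct.pack("!HHHHHH", msg_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS=IN


def doh_get_url(base: str, name: str, qtype: int = 1) -> str:
    """Return the GET URL carrying the query in the `dns` parameter."""
    wire = build_query(name, qtype)
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{base}?dns={encoded}"


# Assumed endpoint; adjust host, port, and path to the actual deployment.
print(doh_get_url("https://127.0.0.1:8443/dns-query", "www.example.com"))
```

The encoded value for `www.example.com` type A matches the example in
RFC 8484 section 4.1.1, which makes it a convenient known-good input.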