coredns

Author	SHA1	Message	Date
Ryan Malloy	cc33fcbcc8	caddy: add caddy-dns/rfc2136 + test-rfc2136 site -- self-hosted ACME flow Wires Caddy as the ACME client side of our new self-hosted DNS-01 flow. Proves the design end-to-end: caddy-dns/rfc2136 -> our CoreDNS rfc2136 plugin -> zone file write -> git auto-commit -> HE AXFR -> LE validates -> cert issued. Changes: - caddy/Dockerfile: --with github.com/caddy-dns/rfc2136 added alongside the existing caddy-dns/vultr. - caddy/Caddyfile: new test-rfc2136.supported.systems site that uses the new provider. server coredns:53 (docker internal), key from env, propagation_delay 60s + timeout 600s to accommodate HE pull. - docker-compose.yml: ACME_TSIG_SECRET passed to the caddy container (the same secret CoreDNS verifies on the other side of the loop). First cert issued in production: 2026-05-21 ~13:23 UTC. ~5.5 min end-to-end from Caddy starting to cert in hand. Documented in session notes; the cert sits unused in caddy-data/ until/unless something publishes ports 80/443 for that hostname.	2026-05-21 13:27:05 -06:00
Ryan Malloy	18aa53bdc7	prod-readiness: alpine runtime + uid:gid passthrough + git auto-commit working The final set of fixes to make the rfc2136 plugin truly operational in production: - coredns/Dockerfile: switch runtime stage from gcr.io/distroless to alpine:3.20. Distroless has no package manager and no shell, so `git commit` (called by the plugin's auto-commit code path) had no way to execute. Alpine adds ~10 MB image size but gives us git + a usable shell for debugging. - docker-compose.yml: `user: "${COREDNS_UID:-1003}:${COREDNS_GID:-1004}"`. The container runs as the host's rpm user (uid 1003/gid 1004 on dell01) so zone files the plugin writes are owned by rpm:rpm on the host -- not root. Without this the plugin would write root-owned files we couldn't read or git-edit. Defaults match dell01; override per-host via env if needed. - .env.example: documents COREDNS_IMAGE_TAG (CalVer; bump per build). Add COREDNS_UID/GID if you need to override on a host where rpm has different numeric ids. Combined with the bumped image tag (2026.05.21.2), the full end-to-end flow works: caddy/nsupdate -> TSIG verify -> plugin handler -> atomic file write -> git auto-commit -> auto plugin reload -> query returns new record.	2026-05-21 13:01:36 -06:00
Ryan Malloy	162abedfdd	.env now gitignored; .env.example is the committed template Per standard Docker convention. The active `.env` is per-host (contains the actual TSIG secret + any host-specific port/hostname overrides). The `.env.example` template documents the expected variables with stub values so a fresh checkout knows what to copy. Also: docker-compose.yml now passes ACME_TSIG_SECRET to the coredns container via plain `environment:` directive -- compose auto-reads `.env` for substitution. No --env-file gymnastics needed at the invocation level.	2026-05-21 12:37:23 -06:00
Ryan Malloy	3720cd2885	deploy: enable rfc2136 plugin for all 84 production zones Wires the custom CoreDNS image (built via coredns/Dockerfile, source includes git.supported.systems/rsp2k/coredns-rfc2136) into production: - docker-compose.yml: switch coredns service from upstream image pin to a build target. New `image: coredns-rfc2136:${COREDNS_IMAGE_TAG}` is locally-built; `up -d coredns` triggers the build. - .env: COREDNS_IMAGE_TAG=2026.05.21 (CalVer). Old COREDNS_IMAGE kept as a comment for emergency rollback to upstream 1.11.3. - Corefile: new rfc2136 directive inside (common) snippet enumerating all 84 zones currently in zones/. Plugin is now in the chain for every server block (plain DNS, DoT, DoH). UPDATE opcode lands in the plugin handler; auto-commit on, CalVer SOA serial bumping on, zones-dir /zones matches the existing bind-mount. TSIG key is read from ${ACME_TSIG_SECRET} which lives in .env.local (gitignored). Production deployment needs that file synced to dell01 separately. This commit DOESN'T trigger the deployment by itself -- the image must be built on dell01 and the container recreated to apply.	2026-05-21 12:17:20 -06:00
Ryan Malloy	083e29bd3e	docker-compose: make VULTR_API_KEY optional Caddy needs this only for DNS-01 cert renewal via Vultr's API, which happens within the final 30 days of the cert's 90-day lifetime -- roughly once a quarter. Requiring it to be exported on every `docker compose up` was friction for routine ops (CoreDNS recreations during unrelated config changes). Empty default keeps the stack startable without the key in scope. When renewal is imminent, set the var properly OR (preferred long-term) migrate Caddy to caddy-dns/rfc2136 pointing at our own plugin and retire the Vultr dependency entirely.	2026-05-21 11:17:56 -06:00
Ryan Malloy	6d72d65642	Retire prepare-zones.sh pipeline; zones/ is now the served form Big migration: the source/prepared split is gone. Each zones/*.zone is now an RFC-compliant zone file that CoreDNS reads directly. Editing a record is just edit + bump SOA + commit. CoreDNS auto-reloads within 30s; HE pulls on its own 300s SOA-refresh cycle. Why: groundwork for the coredns-rfc2136 plugin to edit zones in place without juggling a source/prepared transformation step. Also reduces the mental model from "edit source, run prep, push" to just "edit". Changes: - zones/*.zone: 84 files migrated from Vultr-export form to RFC-compliant form (SOA injected, Vultr NS replaced with HE NS, CNAME/MX/NS rdata dot-terminated, apex lines get explicit @ prefix). Diff is mechanical and byte-count is unchanged (~340K) -- pure formatting promotion. - docker-compose.yml: bind ./zones:/zones:ro (was ./zones-prepared) - Makefile: dropped 'prep' target. 'reload' is now a no-op explainer. 'tls-up' no longer depends on prep. 'clean' no longer wipes prepared. - scripts/prepare-zones.sh moved to scripts/archive/ (kept for reference). - .gitignore: updated comment for zones-prepared/ (now legacy). NOT in this commit (follow-ups): - CLAUDE.md updates documenting the new workflow. - scripts/bump-serials.sh helper for manual-edit SOA bumping. - coredns-rfc2136 plugin refactor (Phase 2b in the plan).	2026-05-21 11:14:42 -06:00
Ryan Malloy	b78cfb0b45	coredns: fix silently-broken healthcheck (distroless image has no wget) The original healthcheck `wget -qO- http://127.0.0.1:8080/health` has been failing since day one because the CoreDNS image is distroless — no shell, no HTTP client. The container has been running in "(unhealthy)" status the whole time without anyone noticing because nothing depends_on it. Replace with `/coredns -version`, which is the thinnest honest check the image can support. For deeper liveness/readiness, scrape :8081/health from outside the container.	2026-05-16 14:01:22 -06:00
Ryan Malloy	c1afe77b27	coredns: production Let's Encrypt cert via Caddy sidecar (DNS-01 + Vultr) Replaces the self-signed dev cert flow with a real LE prod cert for dns.l.supported.systems, issued and auto-renewed by a Caddy sidecar using DNS-01 challenge against the Vultr API. Components: - caddy/Dockerfile builds Caddy 2.10.0 with caddy-dns/vultr plugin via xcaddy. GOTOOLCHAIN=auto so xcaddy can fetch newer Go on demand when plugin versions advance their minimum Go. - caddy/Caddyfile uses DNS-01 with explicit public resolvers (1.1.1.1, 9.9.9.9) for the propagation check. Without that, Docker's embedded DNS leaks the container into the host's split-horizon LAN DNS, which returns LAN IPs for ns1.vultr.com and the propagation check fails. - docker-compose: caddy service shares ./caddy-data with coredns via a read-only subpath mount that excludes /acme (account private key). - Healthcheck doubles as a symlinker: maintains stable cert.pem / key.pem names at /data/caddy/ and chmods cert files + their dirs to be readable by CoreDNS's nonroot user. Flips to "healthy" only once the symlinks dereference (i.e. cert exists), gating CoreDNS start via depends_on: service_healthy. - Corefile unchanged — same /etc/coredns/certs/cert.pem path; only the bind-mount source switches from ./certs to ./caddy-data/caddy. - New Makefile target: tls-up orchestrates the bring-up sequence. Cert is valid until Aug 12 2026. Verified end-to-end: dig @127.0.0.1 -p 8853 +tls +tls-hostname=dns.l.supported.systems ... dig @127.0.0.1 -p 8443 +https +tls-hostname=dns.l.supported.systems ...	2026-05-14 01:34:57 -06:00
Ryan Malloy	066ba1892a	coredns: DoT (:853) + DoH (:443) listeners with self-signed cert - New Corefile snippet (common) shared across plain DNS / DoT / DoH so zone-loading + forward + cache stay DRY across all three transports - scripts/generate-certs.sh: openssl-only self-signed RSA cert with SANs for localhost / 127.0.0.1 / ::1 / coredns / dns.local. Idempotent — skips regeneration if cert is valid >24h ahead; FORCE=1 to rotate. - Key chmod is 0644 so the CoreDNS container's nonroot user can read it via the bind mount. Acceptable for local dev; production should mount real certs with proper UID/GID. - DOT_PORT=8853, DOH_PORT=8443 (avoids Caddy already-on-443 collision) - Makefile: `make certs`, `make test-tls` - All three transports verified end-to-end (dig +tls, dig +https, curl with raw RFC 8484 wire format)	2026-05-14 01:12:25 -06:00
Ryan Malloy	10867ee319	coredns: docker compose stack with Vultr zone import - Auto plugin loads zones-prepared/*.zone (regex zone-name extraction) - scripts/prepare-zones.sh transforms raw Vultr exports: * synthesizes SOA (omitted by Vultr; CoreDNS requires it) * prepends @ to leading-TAB apex lines to disambiguate owner inheritance * dot-terminates NS/MX/CNAME rdata so $ORIGIN doesn't double-suffix - DNS_PORT defaults to 1053 (5353=avahi, 53=libvirt dnsmasq on this host) - Forwards non-authoritative queries to 1.1.1.1/1.0.0.1/9.9.9.9 - Makefile targets: prep, up, down, reload, test, logs - 91 zones loaded	2026-05-12 01:51:09 -06:00
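
The zone cleanup described in commit 10867ee319 (and applied permanently in 6d72d65642) can be illustrated with a minimal Python sketch. This is not the repository's prepare-zones.sh. The function names `record_type`, `normalize_line`, and `normalize_zone` are hypothetical, and the sketch assumes that every leading-TAB line is an apex record, that lines carry no trailing comments, and that SOA synthesis is handled elsewhere. It covers only the two mechanical steps: prefixing apex lines with `@` and dot-terminating NS/MX/CNAME rdata.

```python
# Record types whose rdata ends in a hostname that must be fully qualified.
HOSTNAME_TYPES = {"NS", "MX", "CNAME"}


def record_type(fields):
    """Return the record type: the first field after the owner that is
    neither a TTL (all digits) nor the class IN."""
    for field in fields[1:]:
        if field.isdigit() or field.upper() == "IN":
            continue
        return field.upper()
    return None


def normalize_line(line):
    """Apply the two mechanical fixes to one zone-file line (no newline)."""
    stripped = line.strip()
    # Leave blank lines, comments, and directives ($ORIGIN, $TTL) untouched.
    if not stripped or stripped.startswith((";", "$")):
        return line

    # Assumption: a leading TAB marks an apex record, so make the owner explicit.
    if line.startswith("\t"):
        line = "@" + line

    fields = line.split()
    if record_type(fields) in HOSTNAME_TYPES:
        target = fields[-1]
        # Dot-terminate so $ORIGIN is not appended a second time.
        if target != "@" and not target.endswith("."):
            line = line.rstrip() + "."
    return line


def normalize_zone(text):
    """Normalize every line of a zone file given as a single string."""
    return "\n".join(normalize_line(l) for l in text.splitlines()) + "\n"
```

For example, `normalize_zone("\tIN\tNS\tns1.example.net\n")` turns the ownerless line into `@`, `IN`, `NS`, `ns1.example.net.` (tab-separated), while A records and already dot-terminated targets pass through unchanged.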

10 Commits