8 Commits

Author SHA1 Message Date
3720cd2885 deploy: enable rfc2136 plugin for all 84 production zones
Wires the custom CoreDNS image (built via coredns/Dockerfile, source
includes git.supported.systems/rsp2k/coredns-rfc2136) into production:

- docker-compose.yml: switch coredns service from upstream image pin
  to a build target. New `image: coredns-rfc2136:${COREDNS_IMAGE_TAG}`
  is locally-built; `up -d coredns` triggers the build.
- .env: COREDNS_IMAGE_TAG=2026.05.21 (CalVer). Old COREDNS_IMAGE kept
  as a comment for emergency rollback to upstream 1.11.3.
- Corefile: new rfc2136 directive inside (common) snippet enumerating
  all 84 zones currently in zones/. Plugin is now in the chain for
  every server block (plain DNS, DoT, DoH). UPDATE opcode lands in
  the plugin handler; auto-commit on, CalVer SOA serial bumping on,
  zones-dir /zones matches the existing bind-mount.

TSIG key is read from ${ACME_TSIG_SECRET} which lives in .env.local
(gitignored). Production deployment needs that file synced to dell01
separately.

This commit DOESN'T trigger the deployment by itself -- the image
must be built on dell01 and the container recreated to apply.
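
Illustration of the CalVer SOA serial bump mentioned above. This is a
minimal sketch, not the plugin's code: the YYYYMMDDNN layout, the
starting counter of 01, and the name `bump_serial` are assumptions,
since the plugin's actual serial format is not shown in this log.

```python
from datetime import date


def bump_serial(current: int, today: date) -> int:
    """Return the next SOA serial, assuming a YYYYMMDDNN layout."""
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        # First change on this date: start the per-day counter at 01.
        return base + 1
    # Already bumped today (or the serial is ahead of the calendar):
    # keep it strictly increasing so secondaries still see a change.
    return current + 1


if __name__ == "__main__":
    print(bump_serial(2026052003, date(2026, 5, 21)))
    print(bump_serial(2026052101, date(2026, 5, 21)))
```

With this layout the counter rolls into the date digits after 99 bumps
in one day; the serial stays monotonic but no longer matches the date.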
2026-05-21 12:17:20 -06:00
48cddc91cf Phase 0 scaffolding: RFC 2136 plugin groundwork (inactive)
Lays the groundwork for a future CoreDNS rfc2136 plugin that will accept
TSIG-authenticated dynamic DNS updates from Caddy (via caddy-dns/rfc2136),
enabling self-hosted ACME DNS-01 cert automation without depending on
registrar APIs.

Nothing in this commit is active at runtime:
- Corefile additions are commented out
- coredns/Dockerfile references a plugin repo that doesn't exist yet
- scripts/acme-add-domain.sh just appends CNAME glue but has nothing
  to talk to until the plugin is built

Architecture and implementation plan:
  ~/.claude/plans/dood-does-coredns-offer-enumerated-piglet.md

Secret management: TSIG key generated and stored in .env.local
(gitignored). .env.local.example documents the expected shape.
2026-05-20 18:20:43 -06:00
9e345fa488 Corefile: drop explicit cache 30, use plugin default (3600)
The cache 30 directive in the (common) snippet was clamping
authoritative TTLs to 30s max — every record HE pulled showed TTL≈5
because the cache plugin intercepts responses regardless of source
(auto plugin authoritative answers AND forward plugin resolver answers).

Switching to bare 'cache' uses the plugin's 3600s default, which
preserves our source TTLs: most records at 300s, _dmarc/dkim/SRV at
3600s, wildcards at 60s.
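
Illustration of the clamping described above, as a simplified model
rather than the cache plugin's code: the cap is treated as a plain
min() against the source TTL. `served_ttl` is a hypothetical name.
A cached answer is also served with its remaining lifetime, so the
observed TTL can sit anywhere below the cap.

```python
def served_ttl(source_ttl: int, cache_max: int) -> int:
    """Upper bound on the TTL a client sees for a freshly cached answer."""
    return min(source_ttl, cache_max)


# Source TTLs named in this commit message.
SOURCE_TTLS = {"typical record": 300, "_dmarc/dkim/SRV": 3600, "wildcard": 60}

if __name__ == "__main__":
    for name, ttl in SOURCE_TTLS.items():
        # Compare the old `cache 30` cap with the 3600s default cap.
        print(name, served_ttl(ttl, 30), served_ttl(ttl, 3600))
```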
2026-05-20 16:28:50 -06:00
d4a5ce9f82 coredns: script-based NOTIFY to ns1.he.net on every prep
Hurricane Electric requires asymmetric transfer config:
  - AXFR pull from 216.218.133.2 (slave.dns.he.net / ns4.he.net)
  - NOTIFY destination 216.218.130.2 (ns1.he.net)

CoreDNS's transfer plugin uses a single bidirectional `to` list for
both, which is fine in principle but trips a bug we reproduced: any
`to` with more than one specific IPv4 silently kills server-block
listener startup (no error, zones load, but :53 never binds).
Reproduced on 1.11.3 + 1.12.2 even with a minimal fresh `docker run`.

Workaround:
  - Corefile keeps `transfer { to * }` (open AXFR; firewall does the
    real source-IP filtering on TCP/53)
  - scripts/notify-he.py crafts and sends NOTIFY messages directly to
    216.218.130.2 (only). Pure-stdlib Python — no dependencies.
  - Makefile `prep` target runs notify-he.py after prepare-zones.sh
    so every zone-bump fires NOTIFY automatically.

Verified end-to-end: HE acks NOTIFY (rcode=0) for the 10 zones it
hosts as secondaries; remaining 81 return REFUSED (rcode=5) because
HE doesn't have them configured yet. Note: HE's free slave service
acks NOTIFY but only actually re-pulls AXFR on its hourly poll cycle
(observed behavior; they're poll-based by design). NOTIFY is still
useful long-term in case HE changes that behavior, and it is harmless
either way.
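
Sketch of how a NOTIFY can be crafted with only the standard library.
This is not the contents of scripts/notify-he.py; the function names
are hypothetical, and UDP transport with a single attempt is an
assumption. The wire layout follows RFC 1996: opcode 4, AA set, one
question for the zone's SOA.

```python
import secrets
import socket
import struct

OPCODE_NOTIFY = 4
TYPE_SOA = 6
CLASS_IN = 1


def encode_name(zone: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in zone.rstrip(".").split("."):
        if not label:
            continue  # root zone or stray dot: nothing to emit
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_notify(zone: str, msg_id: int) -> bytes:
    """Build a NOTIFY request: 12-byte header plus one SOA question."""
    flags = (OPCODE_NOTIFY << 11) | (1 << 10)  # opcode=NOTIFY, AA bit set
    header = struct.pack("!HHHHHH", msg_id, flags, 1, 0, 0, 0)
    question = encode_name(zone) + struct.pack("!HH", TYPE_SOA, CLASS_IN)
    return header + question


def parse_rcode(reply: bytes) -> int:
    """Extract the 4-bit RCODE from a reply header (0=NOERROR, 5=REFUSED)."""
    (flags,) = struct.unpack("!H", reply[2:4])
    return flags & 0x000F


def send_notify(zone: str, server: str, port: int = 53, timeout: float = 3.0) -> int:
    """Send one NOTIFY over UDP and return the RCODE of the reply."""
    msg_id = secrets.randbelow(1 << 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_notify(zone, msg_id), (server, port))
        reply, _ = sock.recvfrom(512)
    return parse_rcode(reply)
```

Usage would look like `send_notify("acrazy.org", "216.218.130.2")`,
which returns 0 for an ack and 5 for REFUSED.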
2026-05-18 16:57:54 -06:00
57c8366b7f coredns: document why HE-IP restriction lives at firewall, not CoreDNS
Goal was to restrict AXFR to Hurricane Electric's five secondary
nameserver IPs. Tried several CoreDNS Corefile syntaxes:

  transfer { to 216.218.130.2 ... 216.66.1.2 }       # space-separated
  transfer { to 216.218.130.2 \n to 216.218.131.2 }  # multi-line
  transfer { to 216.218.130.2 }                       # single IP
  transfer { to * 216.218.130.2 ... }                 # mixed

Every form with a specific IPv4 address silently breaks server-block
startup — the auto plugin still loads zones into memory but the
:53/:443/:853 listeners never bind. Reproducible on coredns/coredns
1.11.3 AND 1.12.2 with the (common) snippet + auto + forward shape.
Only `to *` results in healthy listener startup.

Even if we got CoreDNS-side filtering to work, Docker's default
userland-proxy rewrites source IPs to the bridge gateway, which would
break IP-based filtering anyway short of `network_mode: host`.

Decision: keep `to *` in CoreDNS, push HE-only filtering to the
FortiWiFi firewall (source-IP-restricted VIP/DNAT for WAN:53/tcp).
This puts the filtering at the right layer: the perimeter does the IP
work before packets ever reach dell01.
2026-05-16 16:04:44 -06:00
1ab88a25f7 coredns: hidden-primary architecture with AXFR for HE secondaries
Goal: serve the public DNS face via Hurricane Electric's free
secondary-DNS service (dns.he.net), with CoreDNS on dell01 acting as
the hidden primary. We edit zones here; HE pulls them via AXFR.

Changes:
- scripts/prepare-zones.sh:
  * SOA mname: ns1.vultr.com -> ns1.he.net (so the apex SOA reflects
    HE as the primary in published RDATA)
  * Strip ns?.vultr.com NS records from each zone and inject the five
    HE nameservers (ns1..ns5.he.net) as the authoritative NS set
- Corefile (shared `common` snippet):
  * Add `transfer { to * }` to authorize AXFR. Tried specific IPs +
    `*` mixed on the same line but CoreDNS silently fails to bind
    server blocks with that syntax; bare `to *` is the only form that
    actually starts the listeners. Trade-off: NOTIFY targeting is lost
    (HE polls per SOA refresh=3600s instead of being pushed). Open AXFR
    is acceptable for this data since every record is publicly
    queryable anyway.
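
The prepare-zones.sh changes listed above can be pictured with this
Python sketch. It is an illustration only, not the shell script's
logic: `rewrite_for_he`, the regexes, and the tab-separated record
layout are assumptions.

```python
import re

# The five HE nameservers named in this commit (ns1..ns5.he.net).
HE_NS = [f"ns{i}.he.net." for i in range(1, 6)]

# A line whose record type is NS and whose target is ns?.vultr.com.
VULTR_NS = re.compile(r"\bNS\s+ns\d\.vultr\.com\.?\s*$", re.IGNORECASE)

# The SOA mname field when it still points at Vultr.
VULTR_MNAME = re.compile(r"(\sSOA\s+)ns1\.vultr\.com\.")


def rewrite_for_he(zone_text: str) -> str:
    """Point the SOA mname at HE and swap the Vultr NS set for HE's."""
    kept = []
    for line in zone_text.splitlines():
        if VULTR_NS.search(line):
            continue  # drop Vultr NS records
        kept.append(VULTR_MNAME.sub(r"\g<1>ns1.he.net.", line))
    injected = [f"@\tIN\tNS\t{ns}" for ns in HE_NS]
    return "\n".join(kept + injected) + "\n"
```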

Verified AXFR end-to-end: `dig @dell01 -p 5353 acrazy.org AXFR +tcp`
returns 41 records with the new HE NS set and HE-rooted SOA.

Still needed (operator action):
- Firewall NAT for TCP/53 -> 172.16.1.15:5353 (so HE can connect in)
- Add each of the 91 zones at dns.he.net as Secondary DNS pointing
  at 154.27.180.210
- Update each domain's registrar NS records from Vultr -> HE
2026-05-16 15:49:42 -06:00
066ba1892a coredns: DoT (:853) + DoH (:443) listeners with self-signed cert
- New Corefile snippet (common) shared across plain DNS / DoT / DoH so
  zone-loading + forward + cache stay DRY across all three transports
- scripts/generate-certs.sh: openssl-only self-signed RSA cert with SANs
  for localhost / 127.0.0.1 / ::1 / coredns / dns.local. Idempotent —
  skips regeneration if cert is valid >24h ahead; FORCE=1 to rotate.
- Key file mode is 0644 so the CoreDNS container's nonroot user can read it
  via the bind mount. Acceptable for local dev; production should mount
  real certs with proper UID/GID.
- DOT_PORT=8853, DOH_PORT=8443 (avoids colliding with Caddy, which already holds 443)
- Makefile: `make certs`, `make test-tls`
- All three transports verified end-to-end (dig +tls, dig +https,
  curl with raw RFC 8484 wire format)
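
For reference, the raw RFC 8484 wire format used in the curl check can
be produced as sketched below. The function names are hypothetical,
and the base URL (localhost, DOH_PORT=8443, path /dns-query) is an
assumption about the local setup. RFC 8484 GET requests carry the DNS
message base64url-encoded without padding in the `dns` parameter.

```python
import base64
import struct


def build_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query (ID 0, RD set, one IN question)."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(base_url: str, query: bytes) -> str:
    """Return a DoH GET URL with the query base64url-encoded, padding stripped."""
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{base_url}?dns={encoded}"


if __name__ == "__main__":
    # Assumed local endpoint; adjust host, port, and path as needed.
    print(doh_get_url("https://localhost:8443/dns-query",
                      build_query("www.example.com")))
```

The printed URL can be fetched with `curl -k` (the cert is
self-signed); the response body is a binary DNS message.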
2026-05-14 01:12:25 -06:00
10867ee319 coredns: docker compose stack with Vultr zone import
- Auto plugin loads zones-prepared/*.zone (regex zone-name extraction)
- scripts/prepare-zones.sh transforms raw Vultr exports:
  * synthesizes SOA (omitted by Vultr; CoreDNS requires it)
  * prepends @ to leading-TAB apex lines to disambiguate owner inheritance
  * dot-terminates NS/MX/CNAME rdata so $ORIGIN doesn't double-suffix
- DNS_PORT defaults to 1053 (5353=avahi, 53=libvirt dnsmasq on this host)
- Forwards non-authoritative queries to 1.1.1.1/1.0.0.1/9.9.9.9
- Makefile targets: prep, up, down, reload, test, logs
- 91 zones loaded
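
The two line-level fixes above (prepend @, dot-terminate rdata) can be
sketched in Python as follows. This is an illustration, not the logic
of scripts/prepare-zones.sh: `fix_line` and the regex are hypothetical,
SOA synthesis is left out, and targets are assumed to be fully
qualified names that merely lack the trailing dot.

```python
import re

# Record types whose rdata ends in a domain name; MX carries a
# numeric preference before the name.
RDATA_NAME = re.compile(r"^(.*\s(?:NS|CNAME|MX\s+\d+)\s+)(\S+?)\.?\s*$")


def fix_line(line: str) -> str:
    """Normalize one zone-file line from a raw export."""
    if line.startswith("\t"):
        # A leading TAB means "same owner as the previous record";
        # make the apex explicit instead.
        line = "@" + line
    if '"' in line:
        return line  # quoted rdata (e.g. TXT): leave untouched
    match = RDATA_NAME.match(line)
    if match:
        # Terminate the target with a dot so $ORIGIN is not appended.
        line = f"{match.group(1)}{match.group(2)}."
    return line
```

The function is idempotent: a target that already ends in a dot is
left with exactly one.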
2026-05-12 01:51:09 -06:00