11 Commits

Author SHA1 Message Date
7367401734 Send DNS NOTIFY to secondaries after every UPDATE
Per RFC 1996, a master that mutates a zone SHOULD notify its
secondaries so they can immediately AXFR rather than wait for their
next SOA-refresh poll. Without this, propagation lag from UPDATE to
public DNS is bounded by the secondary's refresh interval (300s for
us) — which is borderline for ACME validation timing.

New Corefile directive:
    notify <host[:port]> [<host[:port]>...]

Targets accept bare hostnames (port 53 default), host:port, or
[ipv6]:port. The same list applies to every zone in the rfc2136
block.

Implementation: fire-and-forget UDP per target, each in its own
goroutine, capped by a 2s timeout. The UPDATE response to the client
is never held pending NOTIFY acks (RFC 1996 §4 explicitly decouples
them). Failures log at DEBUG only — a briefly-unreachable secondary
is normal and would otherwise spam logs.

Retires the external scripts/notify-secondaries.py workflow for any
deployment that wires the directive: secondaries now hear about
changes within seconds of the UPDATE landing, no cron or manual
invocation needed.

New tests:
- TestSendNotify_DeliversToTarget — packet arrives, opcode + zone correct
- TestSendNotify_NoTargets_NoCrash — empty list short-circuits
- TestSendNotify_BadTarget_LogsButDoesNotBlock — fire-and-forget timing
- TestNotifyOne_AppendsDefaultPort — host vs host:port normalization
2026-05-23 00:54:45 -06:00
8d1477350a M8: per-key UPDATE rate limiting (token bucket)
Hamilton M8: a compromised TSIG key — or a misconfigured client
retrying forever — must not be able to drive unbounded UPDATE traffic.
Each UPDATE costs disk IOPS, a git commit, and a slot in the SOA
serial counter (now 9999/day per zone). Without a cap, a few hours of
runaway traffic could exhaust the SOA serial counter and brick the
zone for the day.

Implementation: per-key token bucket in ratelimit.go. Default 100
tokens / 60 seconds. New keys start full so legitimate clients see no
delay at boot. Refill is continuous, capped at the burst value.

Configurable in Corefile:
  rate-limit off                    # disable entirely
  rate-limit <burst> <period-secs>  # e.g., rate-limit 200 60

Enforcement runs in ServeDNS after TSIG verification — a request that
fails auth doesn't consume a token (and a forged TSIG can't be used to
deny service to a real key holder, since we never reached the rate
check).

100/min is well above ACME's needs: a worst-case full-renewal storm
across our ~84 zones emits maybe 200 UPDATEs total over several
minutes. Anything beyond is suspicious by definition.

New tests covering: first-call allowed, burst exhaustion, refill
behavior, per-key isolation, refill-cap (no idle-accumulation
overflow).
2026-05-22 21:31:17 -06:00
8e421f925e C1/C2/M9: tighten security boundary at handleUpdate
C1 — Document the process-global MsgAcceptFunc mutation:
CoreDNS 1.14.3 doesn't expose per-Config MsgAcceptFunc (server.go:159
hardcodes the dns.Server struct), so the override has to be global. The
init()-level comment now explains the operational consequences in
detail, and setup() emits a loud INFO log calling out the global scope
for operator audit. Upstream support for per-Config MsgAcceptFunc would
let us delete the whole stanza.

C2 — handleUpdate now requires the caller to assert TSIG verification
via an explicit `verified bool` parameter. The security contract is
encoded in the function signature, not in convention. ServeDNS passes
verified=true after checkTSIG succeeds; verified=false produces an
immediate Refused with no state mutation. Future internal callers
(NOTIFY relay, admin RPC, refactor) physically cannot reach the
mutation code without proving the request was authenticated.

M9 — Don't sign TSIG-failure rejection responses. Per Hamilton's
finding, signing a rejection with the named key attests "yes, this
server holds that key" — useful intel for an attacker probing key
existence. Unsigned Refused is the right shape: nsupdate sees "no TSIG
on reply" and treats as auth failure, which is what actually happened.

New test TestUpdate_UnverifiedCaller_Refused proves the C2 contract:
handleUpdate(w, msg, false) refuses, zone file unchanged.
2026-05-22 21:18:47 -06:00
6268e6eafd Sign responses to TSIG-signed UPDATEs (RFC 8945 §5.4.2)
When a request arrives with TSIG, attach a TSIG record to the response
so dns.ResponseWriter computes the MAC at write time using the secret
in TsigSecret. Without this, BIND nsupdate complains "expected a TSIG
or SIG(0)" on every UPDATE, even when the update applies successfully.

Two response paths fixed:
  - handleUpdate success/per-rcode replies (update.go)
  - ServeDNS rejection when TSIG verification fails (plugin.go)

The new helper in tsig.go is a no-op for unsigned requests. Unknown
keys still silently skip signing — we can't authenticate to a peer we
don't share a key with.

Tests verify both branches: signed request → response carries matching
TSIG (key name + algorithm); unsigned request → response stays plain.
2026-05-22 09:24:12 -06:00
1fe95e3f6c Revert diagnostic ServeDNS log line 2026-05-21 11:46:49 -06:00
19e93e39e1 DIAGNOSTIC: log every ServeDNS call (will revert) 2026-05-21 11:35:20 -06:00
0f28127284 Phase 2b: refactor to file-backed storage; UPDATE writes zones/*.zone
Major architectural pivot per the user's "RFC 2136 mechanism for the
existing zonefiles, not a new in-memory thing" framing. The plugin no
longer maintains its own in-memory state OR serves any queries -- both
of those are now the auto plugin's job, reading the same zone files.

The plugin's sole responsibility is now: receive TSIG-authed UPDATE
messages, edit the matching zones/<zone>.zone file, bump the SOA
serial in CalVer (YYYYMMDDNN) form, and optionally auto-commit to git.

What changed:
- DELETED: store.go (in-memory recordStore), store_test.go (12 tests),
  plugin_test.go (10 ServeDNS query tests), old update_test.go.
- NEW: zonefile.go -- file-backed authority for one zone. loadRRs via
  miekg/dns zone parser; mutation helpers (lookupIn/nameExistsIn/
  removeRRsetFrom/removeRRFrom/removeNameFrom/addRRTo) on []dns.RR
  slices; bumpSerial with CalVer semantics + NN exhaustion handling;
  writeAtomic via temp-file rename; commit shells to `git add && git
  commit` with configurable author.
- NEW: zonefile_test.go -- 17 tests covering load/lookup/mutate/bump/
  write paths.
- REWRITTEN: plugin.go -- ServeDNS is now thin: UPDATE → TSIG → handler;
  everything else → Next. No synthetic SOA/NS, no query serving.
- REWRITTEN: update.go -- handleUpdate now opens the zoneFile, loads,
  applies (with prereq checks against the loaded RRs), bumps serial,
  writes, commits. Detects no-op updates to avoid spurious file writes.
- REWRITTEN: setup.go -- new directives: `zones-dir` (required),
  `auto-commit` (default true), `git-author <name> <email>`. Dropped
  `nameserver` and `persist`. Validates each declared zone has a file
  on disk via os.Stat before CoreDNS finishes starting.
- REWRITTEN: setup_test.go -- 17 cases for the new grammar.
- REWRITTEN: update_test.go -- 11 cases using real temp zone files
  via t.TempDir().

Total: 30 tests passing, 0 failures.

Next: Phase 2c (custom CoreDNS image, deploy, smoke test with nsupdate).
2026-05-21 11:26:50 -06:00
1d2d919728 Phase 1.4: UPDATE opcode handler + TSIG verification
Replaces the Phase-1.3 refuseUpdate() stub with a real RFC 2136 handler.
Caddy via caddy-dns/rfc2136 can now inject and remove records.

UPDATE message handling (update.go):
- Zone section validation: must be exactly one SOA-typed record naming
  a zone we're authoritative for. Returns FORMERR/NOTAUTH otherwise.
- Prerequisites (§3.2): name-exists, RRset-exists, name-NOT-exists,
  RRset-NOT-exists semantics implemented. First failure short-circuits
  with the spec's rcode (NXDOMAIN/NXRRSET/YXDOMAIN/YXRRSET).
- Updates (§3.4.2): add RR, delete RRset (CLASS=ANY+RDLEN=0), delete
  all RRsets at name (CLASS=ANY+TYPE=ANY), delete specific RR (CLASS=
  NONE).
- Apex SOA/NS protected: synthetic and cannot be added or removed via
  UPDATE. Apex wipe (TYPE=ANY at apex) also refused.
- Default TTL applied to incoming records with TTL=0.

TSIG (tsig.go + setup.go):
- setup() now populates dnsserver.Config.TsigSecret so the underlying
  dns.Server auto-verifies signatures via miekg/dns.
- checkTSIG() in ServeDNS gates UPDATEs: rejects if no TSIG, unknown
  key name, algorithm-downgrade attempt, or w.TsigStatus() != nil.
- No TSIG keys configured → all UPDATEs refused (safety default).
- Algorithm pinning prevents downgrade attacks (e.g. forced HMAC-MD5).

Tests (update_test.go): 11 new cases covering happy paths and every
error rcode. Total: 35 top-level test passes, 0 failures.

ServeDNS dispatch now calls handleUpdate after auth gate. The
refuseUpdate() stub is gone. UPDATE end-to-end via nsupdate requires
the custom CoreDNS image (Phase 2) to verify TSIG plumbing on the
dns.Server side.
2026-05-21 10:51:18 -06:00
1cca9a5aa7 Phase 1.3: in-memory store + ServeDNS query dispatch
ServeDNS now answers authoritatively for the configured zone(s):
- Apex SOA → synthetic SOA (serial = store generation counter)
- Apex NS  → synthetic NS pointing at p.Nameserver
- In-store lookups for any qtype
- NODATA vs NXDOMAIN correctly distinguished (SOA in authority section)
- UPDATE opcode → REFUSED (Phase 1.4 implements properly)
- Queries outside our zones pass through to Next

Added:
- store.go: recordStore with sync.RWMutex + atomic generation counter.
  Operations: Add (de-dupes), RemoveRRset, RemoveRR, RemoveName, Lookup
  (returns a copy so callers can't corrupt internal state), NameExists.
  All keyed on canonical lowercase + trailing-dot names.
- plugin.go: ServeDNS dispatch, findZone (longest-suffix match),
  syntheticSOA, syntheticNS. New Nameserver field.
- setup.go: nameserver directive. Default Nameserver = first zone apex.
  Store initialised at parse time.
- store_test.go: 12 unit tests covering add/dedupe/remove/lookup/
  generation/case-insensitivity/copy-safety.
- plugin_test.go: 10 dispatch tests covering pass-through, apex
  synthetics, in-store lookups, NXDOMAIN/NODATA semantics, UPDATE
  refusal, findZone longest-suffix-wins and case behavior.
- setup_test.go: 3 new cases for the nameserver directive + store init.

Total: 38 tests passing.

Module: git.supported.systems/rsp2k/coredns-rfc2136
2026-05-21 10:37:48 -06:00
eba6313ec0 Phase 1.2: wire parser → typed config + 13 unit tests
The Corefile parser now fully populates typed fields on RFC2136 instead
of just recognising directives. Validation happens at parse-time so
configuration errors fail loud at CoreDNS startup rather than silent at
request time.

Added:
- config.go: tsigKey type, TSIG algorithm allowlist (rejects HMAC-MD5
  deliberately), base64 secret decoder with 8-byte minimum length check,
  canonical-key-name normalisation (lowercase + trailing dot).
- plugin.go: RFC2136 struct now carries TSIGKeys map, TTL uint32,
  PersistPath string. DefaultTTL=60.
- setup.go: parse() validates and stores tsig-key/ttl/persist directives.
  Duplicate key names rejected. Multiple TSIG keys allowed (for rotation).
  At-least-one-zone is enforced.
- setup_test.go: 13 table-driven cases (5 happy + 8 error paths) using
  caddy.NewTestController. All pass.

ServeDNS still passes through — UPDATE handling lands in Phase 1.4.

Module path: git.supported.systems/rsp2k/coredns-rfc2136
2026-05-21 10:31:22 -06:00
e9d37f483c Initial commit: plugin skeleton, compiles against CoreDNS 1.14.3
Sets up the package layout for a CoreDNS plugin that will accept RFC 2136
dynamic updates with TSIG authentication, primarily targeting self-hosted
ACME DNS-01 cert automation.

What this commit gives us:
- go.mod against coredns/caddy v1.1.4, coredns/coredns v1.14.3, miekg/dns v1.1.72
- plugin.go: RFC2136 struct + Handler interface (ServeDNS is pass-through)
- setup.go: init() registration + Corefile parser (skeleton — recognizes
  tsig-key, ttl, persist directives but doesn't yet wire them)
- README.md, .gitignore

go build ./... clean. No tests yet — those come with Phase 1.2 alongside
the actual UPDATE handler and in-memory store.

Plan: ~/.claude/plans/dood-does-coredns-offer-enumerated-piglet.md
2026-05-20 18:25:36 -06:00