Per RFC 1996, a master that mutates a zone SHOULD notify its
secondaries so they can immediately AXFR rather than wait for their
next SOA-refresh poll. Without this, propagation lag from UPDATE to
public DNS is bounded by the secondary's refresh interval (300s for
us) — which is borderline for ACME validation timing.
New Corefile directive:
notify <host[:port]> [<host[:port]>...]
Targets accept bare hostnames (port 53 default), host:port, or
[ipv6]:port. The same list applies to every zone in the rfc2136
block.
Implementation: fire-and-forget UDP per target, each in its own
goroutine, capped by a 2s timeout. The UPDATE response to the client
is never held pending NOTIFY acks (RFC 1996 §4 explicitly decouples
them). Failures log at DEBUG only — a briefly-unreachable secondary
is normal and would otherwise spam logs.
Retires the external scripts/notify-secondaries.py workflow for any
deployment that wires the directive: secondaries now hear about
changes within seconds of the UPDATE landing, no cron or manual
invocation needed.
New tests:
- TestSendNotify_DeliversToTarget — packet arrives, opcode + zone correct
- TestSendNotify_NoTargets_NoCrash — empty list short-circuits
- TestSendNotify_BadTarget_LogsButDoesNotBlock — fire-and-forget timing
- TestNotifyOne_AppendsDefaultPort — host vs host:port normalization
coredns-rfc2136
A CoreDNS plugin that accepts RFC 2136 dynamic DNS updates (TSIG-authenticated), filling a gap in the official plugin set.
CoreDNS as-shipped has no plugin for accepting dynamic updates — its
plugin model treats authoritative data as read-only (loaded from auto,
file, secondary, etc.). This plugin adds the missing piece.
Primary use case: self-hosted ACME DNS-01
The motivating problem: automate Let's Encrypt cert issuance for many domains without depending on registrar APIs (Vultr/Route53/Cloudflare). The architecture:
_acme-challenge.example.com  CNAME  <uuid>.auth.supported.systems
                                      │
                                      │  delegated NS to your CoreDNS host
                                      ▼
                          CoreDNS + rfc2136 plugin
                                      │
                                      │  accepts TSIG UPDATEs from Caddy
                                      │  (caddy-dns/rfc2136) or any other
                                      │  ACME client
                                      ▼
                          Let's Encrypt validates
One-time per protected domain: add a CNAME glue line in your static
zones. After that, all cert issuance + renewal happens via UPDATE
messages — zero static zone-file churn.
Status
v2026.05.22.2: production-ready. Handles UPDATE messages against
file-backed zones, authenticates them with TSIG, bumps the SOA serial
in CalVer YYMMDD*10000+NNNN form, writes the zone file atomically, and
optionally git-commits each change for an audit trail. Designed to
coexist with CoreDNS's auto plugin (which serves queries from the same
zone files on its reload cycle).
Configuration
rfc2136 <zone> [<zone>...] {
    zones-dir   <path>                             # required
    tsig-key    <name> <algorithm> <base64-secret> # may repeat
    ttl         <seconds>                          # default 60
    auto-commit <true|false>                       # default true
    git-author  <name> <email>                     # optional
    rate-limit  <burst> <period-seconds>           # default 100 / 60s
    rate-limit  off                                # disable rate-limit
    notify      <host[:port]> [<host[:port]>...]   # NOTIFY secondaries on every UPDATE
}
Example:
.:53 auth.example.com {
    rfc2136 auth.example.com {
        zones-dir /var/lib/coredns/zones
        tsig-key acme-key. hmac-sha256 BASE64SECRET==
        ttl 60
        auto-commit true
        git-author "coredns-rfc2136" "rfc2136@coredns.example.com"
    }
    auto {
        directory /var/lib/coredns/zones (.*)\.zone {1}
        reload 30s
    }
    errors
    log
}
Operational constraints
A few behaviors operators should know before relying on this plugin:
Single-process atomicity only
The per-zone mutex serializes UPDATEs within one CoreDNS process. It
does NOT coordinate with external file edits. If you rsync a zone
file from a workstation while the plugin is mid-UPDATE, you get a
race. The plugin defends against this with a snapshot-and-recheck:
loadRRs captures (mtime, size), and immediately before writing back,
we re-stat; if the file changed, the UPDATE is refused with SERVFAIL
and the client (Caddy etc.) retries on a fresh load. The window is
narrow but non-zero.
Recommendation: don't rsync zone files into a directory the plugin is actively writing to. If you must, expect occasional SERVFAILs that resolve on retry.
MsgAcceptFunc is process-global
CoreDNS 1.14.3 doesn't expose a per-Config MsgAcceptFunc, so this
plugin overrides the miekg/dns package-level default at init() time.
Every server block in the process will accept the UPDATE opcode
at the wire layer — but only blocks with rfc2136 in their plugin
chain do anything useful with it (others pass through and return
FormatError). The actual security boundary is TSIG, enforced both in
ServeDNS and as a defense-in-depth check inside handleUpdate.
No-op UPDATEs do not bump the SOA serial
If an UPDATE adds an RR that's already present (deduped per RFC 2136 §3.4.2.2) or deletes one that doesn't exist, the file is not rewritten and the SOA serial is not advanced. We return NOERROR, and downstream secondaries are not asked to AXFR when nothing changed.
If you need to force a serial bump (rare), send a touch-UPDATE: add a throwaway RR then delete it.
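One way to picture the no-op rule is an apply step that reports whether the RR set actually changed, so the caller only rewrites the file and bumps the serial when it did. The sketch below is illustrative only: `applyUpdate` is a hypothetical helper, and RRs are modeled as plain strings rather than real DNS records.

```go
package main

import "fmt"

// applyUpdate adds and deletes RRs in the zone set and reports whether
// anything changed. RRs are modeled as presentation-format strings here,
// which is a simplification for illustration.
func applyUpdate(zone map[string]bool, adds, dels []string) bool {
	changed := false
	for _, rr := range adds {
		if !zone[rr] { // an RR that is already present is a no-op
			zone[rr] = true
			changed = true
		}
	}
	for _, rr := range dels {
		if zone[rr] { // deleting an absent RR is a no-op
			delete(zone, rr)
			changed = true
		}
	}
	return changed
}

func main() {
	zone := map[string]bool{`_acme-challenge.example.com. TXT "token"`: true}
	// Re-adding the same TXT record changes nothing, so a caller would skip
	// the file rewrite and the serial bump and still answer NOERROR.
	changed := applyUpdate(zone, []string{`_acme-challenge.example.com. TXT "token"`}, nil)
	fmt.Println("changed:", changed)
}
```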
SOA invariants are enforced strictly
loadRRs refuses zone files with zero SOAs, multiple SOAs, or an SOA
whose owner doesn't match the zone apex. The check runs both at
startup (via validateZoneFiles) and on every UPDATE, so zone-file
corruption fails loudly at boot rather than mysteriously on first
ACME activity.
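The three SOA checks could look roughly like this. It is a sketch under simplifying assumptions: `record` and `checkSOA` are hypothetical, and real code would inspect parsed DNS records rather than string pairs.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// record is a simplified stand-in for a parsed RR (owner name and type only).
type record struct {
	Owner string
	Type  string
}

// sameName compares two domain names case-insensitively, ignoring a
// trailing dot.
func sameName(a, b string) bool {
	return strings.EqualFold(strings.TrimSuffix(a, "."), strings.TrimSuffix(b, "."))
}

// checkSOA enforces: exactly one SOA, and its owner must be the zone apex.
func checkSOA(apex string, rrs []record) error {
	var soas []record
	for _, rr := range rrs {
		if strings.EqualFold(rr.Type, "SOA") {
			soas = append(soas, rr)
		}
	}
	switch {
	case len(soas) == 0:
		return errors.New("zone has no SOA")
	case len(soas) > 1:
		return fmt.Errorf("zone has %d SOAs, want exactly 1", len(soas))
	case !sameName(soas[0].Owner, apex):
		return fmt.Errorf("SOA owner %q does not match apex %q", soas[0].Owner, apex)
	}
	return nil
}

func main() {
	rrs := []record{
		{Owner: "www.example.com.", Type: "SOA"}, // SOA at the wrong owner
		{Owner: "example.com.", Type: "NS"},
	}
	fmt.Println(checkSOA("example.com.", rrs))
}
```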
Serial counter rolls over at NNNN=9999
Format is YYMMDD*10000 + NNNN. At NNNN=9999, the next bump rolls to
the next encoded day with NNNN=0001. On heavy days the encoded date
drifts ahead of wall time; on quiet days it catches back up. Monotonic
ordering (the only DNS requirement) holds. uint32 won't wrap for ~117
years at full 10000/day burn.
TSIG replay window is miekg/dns's default (currently 300s)
The fudge window enforced by miekg/dns's TsigVerify is what gates
replay. If miekg/dns ever changes its default, this plugin's behavior
changes with it. A future enhancement is tsig-fudge as a Corefile
directive.
Git commit failure is logged at ERROR, not rolled back
If git commit fails after a successful writeAtomic, the zone file
is correct but the audit trail diverges. We log at ERROR with a
recovery hint (git -C <dir> status + manual commit). We do NOT roll
back the file write — the auto plugin may have already noticed the
new mtime, and rolling back creates more races than it solves.
Per-key rate limit
UPDATE traffic is token-bucket capped per TSIG key. Default 100
UPDATEs per 60 seconds. ACME storms are well within this; anything
beyond is suspicious. Tune via rate-limit <burst> <period>.
NOTIFY to secondaries (optional)
After every successful UPDATE, the plugin can fire DNS NOTIFY (RFC 1996) to a list of secondary servers. This collapses propagation lag from "up to the secondary's SOA refresh interval" (often 300s) to "a few seconds" — secondaries that receive NOTIFY do an immediate SOA poll and AXFR if changed.
Configure with the notify directive:
rfc2136 example.com {
    zones-dir /zones
    tsig-key ...
    notify ns2.example.com ns3.example.com 216.218.130.2
}
Semantics:
- Targets may be host, host:port, or [ipv6]:port. Default port is 53.
- Fire-and-forget: each target gets its own goroutine with a 2s timeout. The UPDATE response to the client is not held pending NOTIFY acks (RFC 1996 §4 decouples them).
- Failures log at DEBUG only — a briefly-unreachable secondary is normal and would otherwise spam.
- Missed NOTIFY = no harm; secondary catches up on its own refresh.
- The same target list applies to every zone in the block.
Building
This plugin is consumed by a custom CoreDNS build via plugin.cfg:
# In CoreDNS source's plugin.cfg, BEFORE the `cache` plugin:
rfc2136:git.supported.systems/rsp2k/coredns-rfc2136
Then go get git.supported.systems/rsp2k/coredns-rfc2136 && make.
License
MIT (TODO: add LICENSE file).