coredns-rfc2136

A CoreDNS plugin that accepts RFC 2136 dynamic DNS updates (TSIG-authenticated), filling a gap in the official plugin set.

CoreDNS as-shipped has no plugin for accepting dynamic updates — its plugin model treats authoritative data as read-only (loaded from auto, file, secondary, etc.). This plugin adds the missing piece.

Primary use case: self-hosted ACME DNS-01

The motivating problem: automate Let's Encrypt cert issuance for many domains without depending on registrar APIs (Vultr/Route53/Cloudflare). The architecture:

_acme-challenge.example.com  CNAME  <uuid>.auth.supported.systems
                                      │
                                      │ delegated NS to your CoreDNS host
                                      ▼
                              CoreDNS + rfc2136 plugin
                                      │
                                      │ accepts TSIG UPDATEs from Caddy
                                      │ (caddy-dns/rfc2136) or any other
                                      │ ACME client
                                      ▼
                                  Let's Encrypt validates

One-time setup per protected domain: add the _acme-challenge CNAME record to your static zone. After that, all cert issuance and renewal happens via UPDATE messages, with no further edits to static zone files.

Status

v2026.05.22.2: production-ready. Handles UPDATE messages against file-backed zones, TSIG-authenticates, bumps SOA serial in CalVer YYMMDD*10000+NNNN form, atomically writes the zone file, optionally git-commits each change for an audit trail. Designed to coexist with CoreDNS's auto plugin (which serves queries from the same zone files on its reload cycle).

Configuration

rfc2136 <zone> [<zone>...] {
    zones-dir <path>                              # required
    tsig-key <name> <algorithm> <base64-secret>   # may repeat
    ttl <seconds>                                 # default 60
    auto-commit <true|false>                      # default true
    git-author <name> <email>                     # optional
    rate-limit <burst> <period-seconds>           # default 100 / 60s
    rate-limit off                                # disable rate-limit
    notify <host[:port]> [<host[:port]>...]       # NOTIFY secondaries on every UPDATE
}

Example:

.:53 auth.example.com {
    rfc2136 auth.example.com {
        zones-dir /var/lib/coredns/zones
        tsig-key acme-key. hmac-sha256 BASE64SECRET==
        ttl 60
        auto-commit true
        git-author "coredns-rfc2136" "rfc2136@coredns.example.com"
    }
    auto {
        directory /var/lib/coredns/zones (.*)\.zone {1}
        reload 30s
    }
    errors
    log
}

Operational constraints

A few behaviors operators should know before relying on this plugin:

Single-process atomicity only

The per-zone mutex serializes UPDATEs within one CoreDNS process. It does NOT coordinate with external file edits. If you rsync a zone file from a workstation while the plugin is mid-UPDATE, you get a race. The plugin defends against this with a snapshot-and-recheck: loadRRs captures (mtime, size), and immediately before writing back, we re-stat; if the file changed, the UPDATE is refused with SERVFAIL and the client (Caddy etc.) retries on a fresh load. The window is narrow but non-zero.

Recommendation: don't rsync zone files into a directory the plugin is actively writing to. If you must, expect occasional SERVFAILs that resolve on retry.
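
The snapshot-and-recheck idea can be sketched with the standard library alone. This is a minimal illustration, not the plugin's code: the names snapshot, takeSnapshot, writeIfUnchanged, and errChanged are hypothetical, and the temp-file-plus-rename step is an assumption about what an atomic write looks like here. A caller would map errChanged to SERVFAIL.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// snapshot is the file identity observed when the zone was loaded.
type snapshot struct {
	modTime time.Time
	size    int64
}

// errChanged reports that the file was modified between load and write-back.
var errChanged = errors.New("zone file changed since load")

// takeSnapshot stats path and captures (mtime, size).
func takeSnapshot(path string) (snapshot, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return snapshot{}, err
	}
	return snapshot{modTime: fi.ModTime(), size: fi.Size()}, nil
}

// writeIfUnchanged re-stats path and, only if it still matches snap,
// replaces it via a temp file in the same directory plus rename.
func writeIfUnchanged(path string, snap snapshot, data []byte) error {
	cur, err := takeSnapshot(path)
	if err != nil {
		return err
	}
	if !cur.modTime.Equal(snap.modTime) || cur.size != snap.size {
		return errChanged
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".zone-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // nothing left to remove once the rename succeeds
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo writes once with a fresh snapshot, then again with the stale one.
func demo() (fresh, stale error) {
	dir, err := os.MkdirTemp("", "zones")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "example.com.zone")
	if err := os.WriteFile(path, []byte("v1\n"), 0o644); err != nil {
		return err, err
	}
	snap, err := takeSnapshot(path)
	if err != nil {
		return err, err
	}
	fresh = writeIfUnchanged(path, snap, []byte("version 2\n"))
	stale = writeIfUnchanged(path, snap, []byte("version 3\n")) // size no longer matches snap
	return fresh, stale
}

func main() {
	fresh, stale := demo()
	fmt.Println("fresh snapshot:", fresh)
	fmt.Println("stale snapshot:", stale)
}
```

Even in this sketch a small gap remains between the re-stat and the rename, which is why the window is described as narrow rather than closed.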

MsgAcceptFunc is process-global

CoreDNS 1.14.3 doesn't expose a per-Config MsgAcceptFunc, so this plugin overrides the miekg/dns package-level default at init() time. Every server block in the process will accept the UPDATE opcode at the wire layer — but only blocks with rfc2136 in their plugin chain do anything useful with it (others pass through and return FormatError). The actual security boundary is TSIG, enforced both in ServeDNS and as a defense-in-depth check inside handleUpdate.

No-op UPDATEs do not bump the SOA serial

If an UPDATE adds an RR that's already present (deduped per RFC 2136 §3.4.2.2) or deletes one that doesn't exist, the file is not rewritten and the SOA serial is not advanced. The plugin returns NOERROR, and downstream secondaries are not asked to AXFR when nothing changed.

If you need to force a serial bump (rare), send a touch-UPDATE: add a throwaway RR then delete it.
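
One way to picture the no-op check is a set of records plus a "did anything change" flag. The sketch below is an illustration under simplifying assumptions: applyUpdate is a hypothetical helper, and records are compared as whole presentation-format strings, whereas a real implementation would compare name, class, type, and RDATA.

```go
package main

import "fmt"

// applyUpdate applies adds and deletes to a set of records and reports
// whether the set actually changed. Only a true result would justify
// rewriting the zone file and bumping the SOA serial.
func applyUpdate(zone map[string]bool, adds, dels []string) bool {
	changed := false
	for _, rr := range adds {
		if !zone[rr] { // duplicate adds are silently ignored
			zone[rr] = true
			changed = true
		}
	}
	for _, rr := range dels {
		if zone[rr] { // deleting a missing record is not a change
			delete(zone, rr)
			changed = true
		}
	}
	return changed
}

func main() {
	tokenA := `_acme-challenge.example.com. TXT "token-a"`
	tokenB := `_acme-challenge.example.com. TXT "token-b"`
	zone := map[string]bool{tokenA: true}

	fmt.Println("duplicate add changed zone:", applyUpdate(zone, []string{tokenA}, nil))
	fmt.Println("delete of absent RR changed zone:", applyUpdate(zone, nil, []string{tokenB}))
	fmt.Println("new add changed zone:", applyUpdate(zone, []string{tokenB}, nil))
}
```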

SOA invariants are enforced strictly

loadRRs refuses zone files with zero SOAs, multiple SOAs, or an SOA whose owner doesn't match the zone apex. The check runs both at startup (via validateZoneFiles) and on every UPDATE, so zone-file corruption fails loudly at boot rather than mysteriously on first ACME activity.
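
The three SOA rules are easy to state as a small validation function. The following is a sketch only: the record type, checkSOA, and the error values are hypothetical stand-ins, not the plugin's loadRRs or validateZoneFiles.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// record is a simplified stand-in for a parsed resource record.
type record struct {
	owner  string
	rrtype string
}

var (
	errNoSOA        = errors.New("zone has no SOA record")
	errMultipleSOA  = errors.New("zone has more than one SOA record")
	errSOANotAtApex = errors.New("SOA owner does not match zone apex")
)

// fqdn lowercases a name and ensures exactly one trailing dot.
func fqdn(name string) string {
	return strings.ToLower(strings.TrimSuffix(name, ".")) + "."
}

// checkSOA enforces: exactly one SOA, owned by the zone apex.
func checkSOA(apex string, rrs []record) error {
	var soaOwners []string
	for _, r := range rrs {
		if strings.EqualFold(r.rrtype, "SOA") {
			soaOwners = append(soaOwners, r.owner)
		}
	}
	switch {
	case len(soaOwners) == 0:
		return errNoSOA
	case len(soaOwners) > 1:
		return errMultipleSOA
	case fqdn(soaOwners[0]) != fqdn(apex):
		return errSOANotAtApex
	}
	return nil
}

func main() {
	good := []record{{"auth.example.com.", "SOA"}, {"auth.example.com.", "NS"}}
	misplaced := []record{{"www.auth.example.com.", "SOA"}}

	fmt.Println("good zone:", checkSOA("auth.example.com", good))
	fmt.Println("misplaced SOA:", checkSOA("auth.example.com", misplaced))
	fmt.Println("empty zone:", checkSOA("auth.example.com", nil))
}
```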

Serial counter rolls over at NNNN=9999

Format is YYMMDD*10000 + NNNN. At NNNN=9999, the next bump rolls to the next encoded day with NNNN=0001. On heavy days the encoded date drifts ahead of wall time; on quiet days it catches back up. Monotonic ordering (the only DNS requirement) holds. The value fits in a uint32 (maximum 4294967295) only while the encoded YYMMDD is at most 429496, which covers encoded dates through the end of YY=42.
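
The bump rule can be written as a short pure function. This is a sketch under stated assumptions rather than the plugin's implementation: encodeDay, nextDay, and bump are hypothetical names, the first serial of a new day is assumed to be NNNN=0001, and the encoded day is assumed to be a real calendar date in the 2000s.

```go
package main

import (
	"fmt"
	"time"
)

// encodeDay turns a date into its YYMMDD integer, e.g. 2026-05-22 -> 260522.
func encodeDay(t time.Time) uint32 {
	y, m, d := t.Date()
	return uint32(y%100)*10000 + uint32(m)*100 + uint32(d)
}

// nextDay returns the YYMMDD integer for the calendar day after day.
func nextDay(day uint32) uint32 {
	yy, mm, dd := int(day/10000), int(day/100%100), int(day%100)
	t := time.Date(2000+yy, time.Month(mm), dd, 0, 0, 0, 0, time.UTC)
	return encodeDay(t.AddDate(0, 0, 1))
}

// bump returns the serial that follows cur, given today's YYMMDD value.
func bump(cur, today uint32) uint32 {
	day, n := cur/10000, cur%10000
	switch {
	case day < today:
		return today*10000 + 1 // quiet period: catch up to wall time
	case n < 9999:
		return cur + 1 // ordinary increment, possibly on a drifted day
	default:
		return nextDay(day)*10000 + 1 // counter exhausted: roll forward
	}
}

func main() {
	today := encodeDay(time.Date(2026, time.May, 22, 12, 0, 0, 0, time.UTC))
	fmt.Println(bump(2605210042, today)) // 2605220001
	fmt.Println(bump(2605220003, today)) // 2605220004
	fmt.Println(bump(2605229999, today)) // 2605230001
}
```

Every branch returns a value strictly greater than cur, which is the monotonic property secondaries rely on.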

TSIG replay window is miekg/dns's default (currently 300s)

The fudge window enforced by miekg/dns's TsigVerify is what gates replay. If miekg/dns ever changes its default, this plugin's behavior changes with it. A future enhancement is tsig-fudge as a Corefile directive.

Git commit failure is logged at ERROR, not rolled back

If git commit fails after a successful writeAtomic, the zone file is correct but the audit trail diverges. We log at ERROR with a recovery hint (git -C <dir> status + manual commit). We do NOT roll back the file write — the auto plugin may have already noticed the new mtime, and rolling back creates more races than it solves.

Per-key rate limit

UPDATE traffic is token-bucket capped per TSIG key. Default 100 UPDATEs per 60 seconds. ACME storms are well within this; anything beyond is suspicious. Tune via rate-limit <burst> <period>.
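
A per-key token bucket can be sketched as follows. This illustrates the general technique, not the plugin's limiter: the limiter type and its methods are hypothetical, and it assumes "burst per period" means a bucket of burst tokens that refills continuously at burst tokens per period.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket holds the remaining tokens for one TSIG key.
type bucket struct {
	tokens float64
	last   time.Time
}

// limiter keeps one bucket per key name.
type limiter struct {
	mu      sync.Mutex
	burst   float64
	period  time.Duration
	buckets map[string]*bucket
}

func newLimiter(burst int, period time.Duration) *limiter {
	return &limiter{burst: float64(burst), period: period, buckets: make(map[string]*bucket)}
}

// allow reports whether key may perform one more UPDATE at time now.
func (l *limiter) allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	b, ok := l.buckets[key]
	if !ok {
		b = &bucket{tokens: l.burst, last: now} // new keys start with a full bucket
		l.buckets[key] = b
	}
	if elapsed := now.Sub(b.last); elapsed > 0 {
		b.tokens += l.burst * elapsed.Seconds() / l.period.Seconds()
		if b.tokens > l.burst {
			b.tokens = l.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// demo sends 150 UPDATEs at once, one from another key, then 150 more 30s later.
func demo() (atOnce int, otherKey bool, afterHalfPeriod int) {
	l := newLimiter(100, 60*time.Second)
	t0 := time.Unix(0, 0)
	for i := 0; i < 150; i++ {
		if l.allow("acme-key.", t0) {
			atOnce++
		}
	}
	otherKey = l.allow("other-key.", t0)
	t1 := t0.Add(30 * time.Second)
	for i := 0; i < 150; i++ {
		if l.allow("acme-key.", t1) {
			afterHalfPeriod++
		}
	}
	return atOnce, otherKey, afterHalfPeriod
}

func main() {
	atOnce, otherKey, afterHalf := demo()
	fmt.Println("allowed in initial burst:", atOnce)
	fmt.Println("second key unaffected:", otherKey)
	fmt.Println("allowed after half a period:", afterHalf)
}
```

Passing the clock in as a parameter keeps the sketch deterministic; production code would more likely call time.Now() internally.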

NOTIFY to secondaries (optional)

After every successful UPDATE, the plugin can fire DNS NOTIFY (RFC 1996) to a list of secondary servers. This collapses propagation lag from "up to the secondary's SOA refresh interval" (often 300s) to "a few seconds" — secondaries that receive NOTIFY do an immediate SOA poll and AXFR if changed.
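
For readers unfamiliar with the wire format, a NOTIFY request is an ordinary DNS message with opcode 4, the AA flag set, and a single question naming the zone with type SOA. The sketch below builds one by hand using only the standard library. It is an illustration, not the plugin's code: buildNotify is a hypothetical helper, it does not handle the root zone, and it does not enforce the 255-byte name limit.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"strings"
)

const (
	opcodeNotify = 4       // RFC 1996
	flagAA       = 1 << 10 // authoritative answer bit
	typeSOA      = 6
	classIN      = 1
)

var errBadLabel = errors.New("zone name has an empty or over-long label")

// buildNotify returns the wire form of a NOTIFY request for zone.
func buildNotify(id uint16, zone string) ([]byte, error) {
	msg := make([]byte, 12, 64) // 12-byte header, counts default to zero
	binary.BigEndian.PutUint16(msg[0:], id)
	binary.BigEndian.PutUint16(msg[2:], opcodeNotify<<11|flagAA)
	binary.BigEndian.PutUint16(msg[4:], 1) // QDCOUNT = 1

	// Question name: length-prefixed labels terminated by a zero byte.
	for _, label := range strings.Split(strings.TrimSuffix(zone, "."), ".") {
		if len(label) == 0 || len(label) > 63 {
			return nil, errBadLabel
		}
		msg = append(msg, byte(len(label)))
		msg = append(msg, label...)
	}
	msg = append(msg, 0)
	msg = append(msg, 0, typeSOA, 0, classIN) // QTYPE and QCLASS, big-endian
	return msg, nil
}

func main() {
	msg, err := buildNotify(0x1234, "example.com.")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes: % x\n", len(msg), msg)
}
```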

Configure with the notify directive:

rfc2136 example.com {
    zones-dir /zones
    tsig-key ...
    notify ns2.example.com ns3.example.com 216.218.130.2
}

Semantics:

  • Targets may be host, host:port, or [ipv6]:port. Default port is 53.
  • Fire-and-forget: each target gets its own goroutine with a 2s timeout. The UPDATE response to the client is not held pending NOTIFY acks (RFC 1996 §4 decouples them).
  • Failures log at DEBUG only, because a briefly unreachable secondary is normal and would otherwise spam the logs.
  • A missed NOTIFY does no harm: the secondary catches up on its next SOA refresh.
  • The same target list applies to every zone in the block.
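
The target handling and fire-and-forget behaviour listed above can be sketched like this. The names withDefaultPort, sendOne, and notifyAll are hypothetical, the payload is assumed to be an already-encoded NOTIFY message, and the standard log package stands in for whatever debug logger the plugin uses.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"strings"
	"time"
)

// withDefaultPort appends :53 when target carries no port of its own.
func withDefaultPort(target string) string {
	if _, _, err := net.SplitHostPort(target); err == nil {
		return target // already host:port or [ipv6]:port
	}
	return net.JoinHostPort(strings.Trim(target, "[]"), "53")
}

// sendOne sends payload over UDP and waits up to timeout for a reply.
func sendOne(addr string, payload []byte, timeout time.Duration) error {
	conn, err := net.DialTimeout("udp", addr, timeout)
	if err != nil {
		return err
	}
	defer conn.Close()
	if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	if _, err := conn.Write(payload); err != nil {
		return err
	}
	buf := make([]byte, 512)
	_, err = conn.Read(buf) // the ack content is ignored in this sketch
	return err
}

// notifyAll starts one goroutine per target and returns immediately,
// so the caller's UPDATE response is never delayed by slow secondaries.
func notifyAll(targets []string, payload []byte, timeout time.Duration) {
	for _, t := range targets {
		go func(addr string) {
			if err := sendOne(addr, payload, timeout); err != nil {
				log.Printf("debug: notify %s: %v", addr, err)
			}
		}(withDefaultPort(t))
	}
}

func main() {
	for _, t := range []string{"ns2.example.com", "ns3.example.com:5353", "216.218.130.2", "[2001:db8::1]:53"} {
		fmt.Println(t, "->", withDefaultPort(t))
	}
}
```

The main function only exercises the normalization step, so running the sketch sends no network traffic.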

Building

This plugin is consumed by a custom CoreDNS build via plugin.cfg:

# In CoreDNS source's plugin.cfg, BEFORE the `cache` plugin:
rfc2136:git.supported.systems/rsp2k/coredns-rfc2136

Then run go get git.supported.systems/rsp2k/coredns-rfc2136 && make from the CoreDNS source tree.

License

MIT (TODO: add LICENSE file).