Compare commits
No commits in common. "main" and "fix/wider-serial-counter" have entirely different histories.
main
...
fix/wider-
130
README.md
130
README.md
@ -33,25 +33,17 @@ messages — zero static zone-file churn.
|
||||
|
||||
## Status
|
||||
|
||||
**v2026.05.22.2**: production-ready. Handles UPDATE messages against
|
||||
file-backed zones, TSIG-authenticates, bumps SOA serial in CalVer
|
||||
YYMMDD*10000+NNNN form, atomically writes the zone file, optionally
|
||||
git-commits each change for an audit trail. Designed to coexist with
|
||||
CoreDNS's `auto` plugin (which serves queries from the same zone files
|
||||
on its reload cycle).
|
||||
**Phase 1 (skeleton)**: compiles, registers with CoreDNS, parses the
|
||||
Corefile directive. Does not yet handle UPDATE messages or serve any
|
||||
records. ServeDNS is a pass-through. See `phases.md` for the roadmap.
|
||||
|
||||
## Configuration
|
||||
|
||||
```
|
||||
rfc2136 <zone> [<zone>...] {
|
||||
zones-dir <path> # required
|
||||
tsig-key <name> <algorithm> <base64-secret> # may repeat
|
||||
ttl <seconds> # default 60
|
||||
auto-commit <true|false> # default true
|
||||
git-author <name> <email> # optional
|
||||
rate-limit <burst> <period-seconds> # default 100 / 60s
|
||||
rate-limit off # disable rate-limit
|
||||
notify <host[:port]> [<host[:port]>...] # NOTIFY secondaries on every UPDATE
|
||||
tsig-key <key-name> <algorithm> <base64-secret>
|
||||
ttl <seconds>
|
||||
persist <path>
|
||||
}
|
||||
```
|
||||
|
||||
@ -60,124 +52,14 @@ Example:
|
||||
```
|
||||
.:53 auth.example.com {
|
||||
rfc2136 auth.example.com {
|
||||
zones-dir /var/lib/coredns/zones
|
||||
tsig-key acme-key. hmac-sha256 BASE64SECRET==
|
||||
ttl 60
|
||||
auto-commit true
|
||||
git-author "coredns-rfc2136" "rfc2136@coredns.example.com"
|
||||
}
|
||||
auto {
|
||||
directory /var/lib/coredns/zones (.*)\.zone {1}
|
||||
reload 30s
|
||||
}
|
||||
errors
|
||||
log
|
||||
}
|
||||
```
|
||||
|
||||
## Operational constraints
|
||||
|
||||
A few behaviors operators should know before relying on this plugin:
|
||||
|
||||
### Single-process atomicity only
|
||||
|
||||
The per-zone mutex serializes UPDATEs *within one CoreDNS process*. It
|
||||
does NOT coordinate with external file edits. If you `rsync` a zone
|
||||
file from a workstation while the plugin is mid-UPDATE, you get a
|
||||
race. The plugin defends against this with a snapshot-and-recheck:
|
||||
loadRRs captures (mtime, size), and immediately before writing back,
|
||||
we re-stat; if the file changed, the UPDATE is refused with SERVFAIL
|
||||
and the client (Caddy etc.) retries on a fresh load. The window is
|
||||
narrow but non-zero.
|
||||
|
||||
**Recommendation**: don't rsync zone files into a directory the plugin
|
||||
is actively writing to. If you must, expect occasional SERVFAILs that
|
||||
resolve on retry.
|
||||
|
||||
### MsgAcceptFunc is process-global
|
||||
|
||||
CoreDNS 1.14.3 doesn't expose a per-Config `MsgAcceptFunc`, so this
|
||||
plugin overrides the miekg/dns package-level default at `init()` time.
|
||||
**Every server block** in the process will accept the UPDATE opcode
|
||||
at the wire layer — but only blocks with `rfc2136` in their plugin
|
||||
chain do anything useful with it (others pass through and return
|
||||
FormatError). The actual security boundary is TSIG, enforced both in
|
||||
`ServeDNS` and as a defense-in-depth check inside `handleUpdate`.
|
||||
|
||||
### No-op UPDATEs do not bump the SOA serial
|
||||
|
||||
If an UPDATE adds an RR that's already present (deduped per RFC 2136
|
||||
§3.4.2.2) or deletes one that doesn't exist, the file is not rewritten
|
||||
and the SOA serial is not advanced. We return NOERROR. Downstream
|
||||
secondaries are not asked to AXFR for a no-change.
|
||||
|
||||
If you need to force a serial bump (rare), send a touch-UPDATE: add a
|
||||
throwaway RR then delete it.
|
||||
|
||||
### SOA invariants are enforced strictly
|
||||
|
||||
`loadRRs` refuses zone files with: zero SOAs, multiple SOAs, or an
|
||||
SOA whose owner doesn't match the zone apex. Both at startup (via
|
||||
`validateZoneFiles`) and on every UPDATE. Zone-file corruption fails
|
||||
loud at boot rather than mysteriously on first ACME activity.
|
||||
|
||||
### Serial counter rolls over at NNNN=9999
|
||||
|
||||
Format is `YYMMDD*10000 + NNNN`. At NNNN=9999, the next bump rolls to
|
||||
the next encoded day with NNNN=0001. On heavy days the encoded date
|
||||
drifts ahead of wall time; on quiet days it catches back up. Monotonic
|
||||
ordering (the only DNS requirement) holds. uint32 won't wrap for ~117
|
||||
years at full 10000/day burn.
|
||||
|
||||
### TSIG replay window is miekg/dns's default (currently 300s)
|
||||
|
||||
The fudge window enforced by miekg/dns's TsigVerify is what gates
|
||||
replay. If miekg/dns ever changes its default, this plugin's behavior
|
||||
changes with it. A future enhancement is `tsig-fudge` as a Corefile
|
||||
directive.
|
||||
|
||||
### Git commit failure is logged at ERROR, not rolled back
|
||||
|
||||
If `git commit` fails after a successful `writeAtomic`, the zone file
|
||||
is correct but the audit trail diverges. We log at ERROR with a
|
||||
recovery hint (`git -C <dir> status` + manual commit). We do NOT roll
|
||||
back the file write — the auto plugin may have already noticed the
|
||||
new mtime, and rolling back creates more races than it solves.
|
||||
|
||||
### Per-key rate limit
|
||||
|
||||
UPDATE traffic is token-bucket capped per TSIG key. Default 100
|
||||
UPDATEs per 60 seconds. ACME storms are well within this; anything
|
||||
beyond is suspicious. Tune via `rate-limit <burst> <period>`.
|
||||
|
||||
### NOTIFY to secondaries (optional)
|
||||
|
||||
After every successful UPDATE, the plugin can fire DNS NOTIFY (RFC
|
||||
1996) to a list of secondary servers. This collapses propagation lag
|
||||
from "up to the secondary's SOA refresh interval" (often 300s) to
|
||||
"a few seconds" — secondaries that receive NOTIFY do an immediate SOA
|
||||
poll and AXFR if changed.
|
||||
|
||||
Configure with the `notify` directive:
|
||||
|
||||
```
|
||||
rfc2136 example.com {
|
||||
zones-dir /zones
|
||||
tsig-key ...
|
||||
notify ns2.example.com ns3.example.com 216.218.130.2
|
||||
}
|
||||
```
|
||||
|
||||
Semantics:
|
||||
- Targets may be `host`, `host:port`, or `[ipv6]:port`. Default port is 53.
|
||||
- Fire-and-forget: each target gets its own goroutine with a 2s timeout.
|
||||
The UPDATE response to the client is **not** held pending NOTIFY acks
|
||||
(RFC 1996 §4 decouples them).
|
||||
- Failures log at DEBUG only — a briefly-unreachable secondary is
|
||||
normal and would otherwise spam.
|
||||
- Missed NOTIFY = no harm; secondary catches up on its own refresh.
|
||||
- The same target list applies to every zone in the block.
|
||||
|
||||
## Building
|
||||
|
||||
This plugin is consumed by a custom CoreDNS build via `plugin.cfg`:
|
||||
|
||||
62
notify.go
62
notify.go
@ -1,62 +0,0 @@
|
||||
package rfc2136
|
||||
|
||||
import (
|
||||
"net"
|
||||
"time"
|
||||
|
||||
"github.com/miekg/dns"
|
||||
)
|
||||
|
||||
// notifyTimeout caps how long any single NOTIFY send can block before
|
||||
// we give up. RFC 1996 §4 says the master MUST NOT block UPDATE
|
||||
// acknowledgement on NOTIFY delivery — the secondaries will fall back
|
||||
// to their own SOA refresh polling if NOTIFY is missed. 2s is plenty
|
||||
// for a healthy secondary to ack via UDP; a slow/blackholed target
|
||||
// just times out.
|
||||
const notifyTimeout = 2 * time.Second
|
||||
|
||||
// defaultNotifyPort is appended to any target that doesn't already
|
||||
// specify host:port. NOTIFY is always-over-port-53 in practice.
|
||||
const defaultNotifyPort = "53"
|
||||
|
||||
// sendNotify dispatches fire-and-forget DNS NOTIFY messages (RFC 1996)
|
||||
// to every configured secondary for the given zone. Each target gets
|
||||
// its own goroutine so a slow/blackholed secondary can't slow
|
||||
// propagation to its siblings.
|
||||
//
|
||||
// We do NOT wait for goroutines to finish — the UPDATE response goes
|
||||
// back to the client immediately. Whether secondaries ack or not, the
|
||||
// master's job is done; secondaries that miss the NOTIFY pick up the
|
||||
// new serial on their next refresh poll.
|
||||
//
|
||||
// Failures are logged at Debug level. NOTIFY is best-effort; logging
|
||||
// at Warning would flood the operator on every transient packet drop
|
||||
// for secondaries that are intermittently reachable.
|
||||
func sendNotify(zone string, targets []string) {
|
||||
if len(targets) == 0 {
|
||||
return
|
||||
}
|
||||
for _, t := range targets {
|
||||
go notifyOne(zone, t)
|
||||
}
|
||||
}
|
||||
|
||||
// notifyOne sends one NOTIFY packet to `target` for `zone`. Target
|
||||
// can be "host" (default port 53), "host:port", or "[ipv6]:port".
|
||||
func notifyOne(zone, target string) {
|
||||
addr := target
|
||||
if _, _, err := net.SplitHostPort(addr); err != nil {
|
||||
addr = net.JoinHostPort(addr, defaultNotifyPort)
|
||||
}
|
||||
|
||||
msg := new(dns.Msg)
|
||||
msg.SetNotify(dns.Fqdn(zone))
|
||||
|
||||
c := &dns.Client{Net: "udp", Timeout: notifyTimeout}
|
||||
_, _, err := c.Exchange(msg, addr)
|
||||
if err != nil {
|
||||
log.Debugf("NOTIFY %s → %s failed: %v", zone, addr, err)
|
||||
return
|
||||
}
|
||||
log.Debugf("NOTIFY %s → %s ok", zone, addr)
|
||||
}
|
||||
129
notify_test.go
129
notify_test.go
@ -1,129 +0,0 @@
|
||||
package rfc2136
|
||||
|
||||
import (
|
||||
"net"
|
||||
"sync"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/miekg/dns"
|
||||
)
|
||||
|
||||
// testNotifyListener spins up a UDP DNS-protocol listener on an
|
||||
// ephemeral port that captures any messages it receives. Returns the
|
||||
// host:port string for use as a NOTIFY target, plus a getter for the
|
||||
// last-captured message.
|
||||
func testNotifyListener(t *testing.T) (addr string, getLast func() *dns.Msg) {
|
||||
t.Helper()
|
||||
conn, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
|
||||
if err != nil {
|
||||
t.Fatalf("ListenUDP: %v", err)
|
||||
}
|
||||
|
||||
var mu sync.Mutex
|
||||
var last *dns.Msg
|
||||
done := make(chan struct{})
|
||||
|
||||
go func() {
|
||||
buf := make([]byte, 512)
|
||||
for {
|
||||
conn.SetReadDeadline(time.Now().Add(500 * time.Millisecond))
|
||||
n, _, err := conn.ReadFromUDP(buf)
|
||||
if err != nil {
|
||||
select {
|
||||
case <-done:
|
||||
return
|
||||
default:
|
||||
continue
|
||||
}
|
||||
}
|
||||
msg := new(dns.Msg)
|
||||
if unpackErr := msg.Unpack(buf[:n]); unpackErr == nil {
|
||||
mu.Lock()
|
||||
last = msg
|
||||
mu.Unlock()
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
t.Cleanup(func() {
|
||||
close(done)
|
||||
conn.Close()
|
||||
})
|
||||
|
||||
return conn.LocalAddr().String(), func() *dns.Msg {
|
||||
mu.Lock()
|
||||
defer mu.Unlock()
|
||||
return last
|
||||
}
|
||||
}
|
||||
|
||||
func TestSendNotify_DeliversToTarget(t *testing.T) {
|
||||
addr, getLast := testNotifyListener(t)
|
||||
|
||||
sendNotify("auth.example.com", []string{addr})
|
||||
|
||||
// Wait up to 1s for the packet to arrive (test listener polls on
|
||||
// 500ms deadline). The send goroutine writes immediately; the
|
||||
// listener loop just needs one read cycle to pick it up.
|
||||
deadline := time.Now().Add(1 * time.Second)
|
||||
for time.Now().Before(deadline) {
|
||||
if msg := getLast(); msg != nil {
|
||||
if msg.Opcode != dns.OpcodeNotify {
|
||||
t.Errorf("Opcode = %d, want OpcodeNotify (%d)", msg.Opcode, dns.OpcodeNotify)
|
||||
}
|
||||
if len(msg.Question) != 1 || msg.Question[0].Name != "auth.example.com." {
|
||||
t.Errorf("Question = %+v, want one entry with name auth.example.com.", msg.Question)
|
||||
}
|
||||
if !msg.Authoritative {
|
||||
t.Errorf("AA flag not set on NOTIFY")
|
||||
}
|
||||
return
|
||||
}
|
||||
time.Sleep(20 * time.Millisecond)
|
||||
}
|
||||
t.Fatal("NOTIFY never arrived at target within 1s")
|
||||
}
|
||||
|
||||
func TestSendNotify_NoTargets_NoCrash(t *testing.T) {
|
||||
// Empty target list must short-circuit without launching goroutines
|
||||
// or panicking.
|
||||
sendNotify("auth.example.com", nil)
|
||||
sendNotify("auth.example.com", []string{})
|
||||
// No assertions — survival is the test.
|
||||
}
|
||||
|
||||
func TestSendNotify_BadTarget_LogsButDoesNotBlock(t *testing.T) {
|
||||
// Target a port we know nothing listens on. The fire-and-forget
|
||||
// send must return immediately; the goroutine eventually times out.
|
||||
start := time.Now()
|
||||
sendNotify("auth.example.com", []string{"127.0.0.1:1"})
|
||||
if elapsed := time.Since(start); elapsed > 100*time.Millisecond {
|
||||
t.Errorf("sendNotify blocked %v on unreachable target; expected fire-and-forget", elapsed)
|
||||
}
|
||||
}
|
||||
|
||||
func TestNotifyOne_AppendsDefaultPort(t *testing.T) {
|
||||
// Spin up a listener on 127.0.0.1:<random>, then call notifyOne
|
||||
// with both forms (bare host + host:port) and verify both deliver.
|
||||
addr, getLast := testNotifyListener(t)
|
||||
host, port, err := net.SplitHostPort(addr)
|
||||
if err != nil {
|
||||
t.Fatalf("split: %v", err)
|
||||
}
|
||||
_ = host
|
||||
|
||||
// Form 1: host:port (the normal case).
|
||||
notifyOne("first.example.com", addr)
|
||||
time.Sleep(100 * time.Millisecond)
|
||||
if m := getLast(); m == nil || len(m.Question) == 0 || m.Question[0].Name != "first.example.com." {
|
||||
t.Errorf("host:port form did not deliver: %+v", m)
|
||||
}
|
||||
|
||||
// We can't easily test the bare-host case because port 53 is the
|
||||
// default and we can't bind there without root. Verifying the
|
||||
// defaulting branch directly is sufficient.
|
||||
if port == "" {
|
||||
t.Fatal("unreachable: SplitHostPort returned empty port")
|
||||
}
|
||||
}
|
||||
40
plugin.go
40
plugin.go
@ -19,8 +19,6 @@ package rfc2136
|
||||
|
||||
import (
|
||||
"context"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/coredns/coredns/plugin"
|
||||
"github.com/miekg/dns"
|
||||
@ -64,18 +62,6 @@ type RFC2136 struct {
|
||||
// zones holds per-zone file handlers, keyed by canonical zone name.
|
||||
// Populated in setup; mutexes live inside each zoneFile.
|
||||
zones map[string]*zoneFile
|
||||
|
||||
// rateLimit caps UPDATE traffic per TSIG key (Hamilton M8). nil
|
||||
// disables rate limiting (test mode, or insecure deployments).
|
||||
// Populated in setup() once TSIG keys are known.
|
||||
rateLimit *rateLimiter
|
||||
|
||||
// NotifyTargets is the list of secondary servers (IP[:port]) to
|
||||
// send DNS NOTIFY messages to after every successful UPDATE.
|
||||
// Default port 53. Empty list = no NOTIFY (secondaries rely on
|
||||
// their own SOA-refresh polling). Configured via the `notify`
|
||||
// directive in the Corefile.
|
||||
NotifyTargets []string
|
||||
}
|
||||
|
||||
// Name implements plugin.Handler.
|
||||
@ -94,28 +80,16 @@ func (p *RFC2136) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg
|
||||
log.Warningf("UPDATE rejected: %v", err)
|
||||
resp := new(dns.Msg)
|
||||
resp.SetRcode(r, dns.RcodeRefused)
|
||||
// Do NOT sign rejection responses. Signing a refusal with
|
||||
// the named key would attest the key exists on this server
|
||||
// (Hamilton M9). Unsigned Refused is the right shape — the
|
||||
// client sees "no TSIG on reply" and treats that as auth
|
||||
// failure, which is correct because auth DID fail.
|
||||
// Best-effort: if the request had TSIG and we recognize
|
||||
// the key, the framework will sign the rejection so the
|
||||
// client can authenticate "yes, server rejected this."
|
||||
// Unknown keys silently skip signing — correct, since we
|
||||
// can't prove identity to a peer we don't share a key with.
|
||||
signResponseIfSigned(resp, r)
|
||||
_ = w.WriteMsg(resp)
|
||||
return dns.RcodeRefused, nil
|
||||
}
|
||||
// Hamilton M8: per-key rate limit. TSIG just authenticates the
|
||||
// sender — it doesn't prove the sender's behavior is sane. A
|
||||
// compromised key or a runaway client must not be able to
|
||||
// exhaust disk/git/serial-counter resources.
|
||||
if p.rateLimit != nil {
|
||||
if tsig := r.IsTsig(); tsig != nil && !p.rateLimit.allow(strings.ToLower(tsig.Hdr.Name), time.Now()) {
|
||||
log.Warningf("UPDATE rate-limited for key %q", tsig.Hdr.Name)
|
||||
resp := new(dns.Msg)
|
||||
resp.SetRcode(r, dns.RcodeRefused)
|
||||
_ = w.WriteMsg(resp)
|
||||
return dns.RcodeRefused, nil
|
||||
}
|
||||
}
|
||||
return p.handleUpdate(w, r, true)
|
||||
return p.handleUpdate(w, r)
|
||||
}
|
||||
return plugin.NextOrFailure(p.Name(), p.Next, ctx, w, r)
|
||||
}
|
||||
|
||||
82
ratelimit.go
82
ratelimit.go
@ -1,82 +0,0 @@
|
||||
package rfc2136
|
||||
|
||||
import (
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// Per-key token bucket. Hamilton M8: a compromised TSIG key — or a
|
||||
// misconfigured client retrying forever — must not be able to drive
|
||||
// unbounded UPDATE traffic. Each UPDATE costs disk IOPS, a git commit,
|
||||
// and a slot in the SOA serial counter (9999/day per zone). 100
|
||||
// UPDATEs/minute per key is well above any legitimate ACME workflow
|
||||
// (a full renewal storm across our ~84 zones might emit ~200 UPDATEs
|
||||
// total over several minutes); anything beyond is suspicious.
|
||||
const (
|
||||
defaultRateBurst = 100 // max tokens
|
||||
defaultRatePeriod = time.Minute // refill window
|
||||
)
|
||||
|
||||
// rateLimiter is a goroutine-safe per-key token bucket. The zero value
|
||||
// is unusable; construct via newRateLimiter.
|
||||
type rateLimiter struct {
|
||||
mu sync.Mutex
|
||||
buckets map[string]*bucket
|
||||
burst float64 // max tokens
|
||||
period time.Duration // time to fully refill
|
||||
}
|
||||
|
||||
type bucket struct {
|
||||
tokens float64
|
||||
lastRefill time.Time
|
||||
}
|
||||
|
||||
func newRateLimiter(burst int, period time.Duration) *rateLimiter {
|
||||
if burst <= 0 {
|
||||
burst = defaultRateBurst
|
||||
}
|
||||
if period <= 0 {
|
||||
period = defaultRatePeriod
|
||||
}
|
||||
return &rateLimiter{
|
||||
buckets: make(map[string]*bucket),
|
||||
burst: float64(burst),
|
||||
period: period,
|
||||
}
|
||||
}
|
||||
|
||||
// allow attempts to take one token for `key`. Returns true if a token
|
||||
// was available, false otherwise. New keys start full (burst tokens).
|
||||
//
|
||||
// Refill is continuous: tokens accumulate at burst/period per second.
|
||||
// The bucket caps at burst tokens.
|
||||
func (rl *rateLimiter) allow(key string, now time.Time) bool {
|
||||
rl.mu.Lock()
|
||||
defer rl.mu.Unlock()
|
||||
|
||||
b, ok := rl.buckets[key]
|
||||
if !ok {
|
||||
// First time we see this key — start the bucket full so
|
||||
// legitimate clients don't see refill delays at boot.
|
||||
rl.buckets[key] = &bucket{
|
||||
tokens: rl.burst - 1,
|
||||
lastRefill: now,
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
// Refill: tokens earned since last access.
|
||||
elapsed := now.Sub(b.lastRefill).Seconds()
|
||||
earned := elapsed * (rl.burst / rl.period.Seconds())
|
||||
b.tokens += earned
|
||||
if b.tokens > rl.burst {
|
||||
b.tokens = rl.burst
|
||||
}
|
||||
b.lastRefill = now
|
||||
|
||||
if b.tokens >= 1.0 {
|
||||
b.tokens -= 1.0
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
@ -1,83 +0,0 @@
|
||||
package rfc2136
|
||||
|
||||
import (
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func TestRateLimiter_FirstCallAllowed(t *testing.T) {
|
||||
rl := newRateLimiter(5, time.Minute)
|
||||
now := time.Now()
|
||||
if !rl.allow("key-a", now) {
|
||||
t.Errorf("first call for new key must be allowed")
|
||||
}
|
||||
}
|
||||
|
||||
func TestRateLimiter_BurstExhausts(t *testing.T) {
|
||||
rl := newRateLimiter(3, time.Minute)
|
||||
now := time.Now()
|
||||
// First 3 calls succeed.
|
||||
for i := 0; i < 3; i++ {
|
||||
if !rl.allow("key-a", now) {
|
||||
t.Fatalf("call %d should be allowed (burst=3)", i+1)
|
||||
}
|
||||
}
|
||||
// 4th immediately after burst should be denied (no time elapsed
|
||||
// for refill).
|
||||
if rl.allow("key-a", now) {
|
||||
t.Errorf("4th call exceeded burst; should be denied")
|
||||
}
|
||||
}
|
||||
|
||||
func TestRateLimiter_RefillsOverTime(t *testing.T) {
|
||||
// burst=2, period=1s → refill rate is 2 tokens/sec.
|
||||
rl := newRateLimiter(2, time.Second)
|
||||
t0 := time.Now()
|
||||
if !rl.allow("k", t0) {
|
||||
t.Fatal("call 1")
|
||||
}
|
||||
if !rl.allow("k", t0) {
|
||||
t.Fatal("call 2")
|
||||
}
|
||||
if rl.allow("k", t0) {
|
||||
t.Fatal("call 3 should be denied; bucket empty")
|
||||
}
|
||||
// Advance time by 500ms — should refill ~1 token.
|
||||
if !rl.allow("k", t0.Add(500*time.Millisecond)) {
|
||||
t.Errorf("expected refill after 500ms")
|
||||
}
|
||||
}
|
||||
|
||||
func TestRateLimiter_PerKeyIsolation(t *testing.T) {
|
||||
rl := newRateLimiter(2, time.Minute)
|
||||
now := time.Now()
|
||||
// Exhaust key-a.
|
||||
rl.allow("key-a", now)
|
||||
rl.allow("key-a", now)
|
||||
if rl.allow("key-a", now) {
|
||||
t.Fatal("key-a still has tokens; setup wrong")
|
||||
}
|
||||
// key-b is independent — must still be allowed.
|
||||
if !rl.allow("key-b", now) {
|
||||
t.Errorf("key-b was rate-limited despite no prior use")
|
||||
}
|
||||
}
|
||||
|
||||
// TestRateLimiter_DoesNotOverflow guards against refill math
|
||||
// accumulating beyond burst (which would let an attacker burst more
|
||||
// after a long idle period than the configured cap).
|
||||
func TestRateLimiter_DoesNotOverflow(t *testing.T) {
|
||||
rl := newRateLimiter(5, time.Second)
|
||||
t0 := time.Now()
|
||||
rl.allow("k", t0) // create bucket
|
||||
// Advance time 1 hour. Refill should cap at burst=5.
|
||||
tFuture := t0.Add(time.Hour)
|
||||
for i := 0; i < 5; i++ {
|
||||
if !rl.allow("k", tFuture) {
|
||||
t.Fatalf("post-idle call %d should be allowed (cap=5)", i+1)
|
||||
}
|
||||
}
|
||||
if rl.allow("k", tFuture) {
|
||||
t.Errorf("post-idle call 6 should be denied; cap exceeded")
|
||||
}
|
||||
}
|
||||
108
setup.go
108
setup.go
@ -6,7 +6,6 @@ import (
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"time"
|
||||
|
||||
"github.com/coredns/caddy"
|
||||
"github.com/coredns/coredns/core/dnsserver"
|
||||
@ -21,44 +20,17 @@ var log = clog.NewWithPlugin("rfc2136")
|
||||
func init() {
|
||||
plugin.Register("rfc2136", setup)
|
||||
|
||||
// SECURITY/OPERATIONAL NOTE — process-global mutation:
|
||||
//
|
||||
// miekg/dns's default MsgAcceptFunc rejects UPDATE opcode messages
|
||||
// with NOTIMP at the wire layer, before any plugin sees them
|
||||
// (acceptfunc.go: "Don't allow dynamic updates, because then the
|
||||
// sections can contain a whole bunch of RRs"). CoreDNS 1.14.3
|
||||
// constructs its dns.Server instances without exposing a
|
||||
// per-server MsgAcceptFunc to plugins (see
|
||||
// coredns/core/dnsserver/server.go:159 — the dns.Server struct is
|
||||
// hardcoded), so to accept UPDATE opcodes anywhere we must
|
||||
// override the package-level default. We do that here.
|
||||
// with NOTIMP before they ever reach the plugin chain (see the
|
||||
// comment in miekg/dns/acceptfunc.go: "Don't allow dynamic updates,
|
||||
// because then the sections can contain a whole bunch of RRs").
|
||||
//
|
||||
// Consequences operators should understand:
|
||||
//
|
||||
// 1. The override is PROCESS-WIDE. Every CoreDNS server block in
|
||||
// this binary will accept UPDATE opcodes at the wire layer,
|
||||
// not just the one(s) where `rfc2136` is configured. Other
|
||||
// blocks will pass the UPDATE through their plugin chains;
|
||||
// since no plugin in those chains handles UPDATE, the request
|
||||
// falls off the end of the chain and CoreDNS returns
|
||||
// FormatError. No state is mutated — but the wire-layer
|
||||
// gatekeeping moved into the plugin chain.
|
||||
//
|
||||
// 2. The actual security boundary is TSIG verification, which
|
||||
// happens in this plugin's ServeDNS (checkTSIG) AND inside
|
||||
// handleUpdate (assertAuthenticated). Defense in depth: any
|
||||
// state-mutating path re-verifies, so a future refactor that
|
||||
// adds a new caller cannot accidentally skip auth.
|
||||
//
|
||||
// 3. If you remove `rfc2136` from your Corefile and reload via
|
||||
// SIGUSR1, this global is NOT restored. Restart the process
|
||||
// to fully revert.
|
||||
//
|
||||
// The mitigation matrix is: (a) loud comment here, (b) startup
|
||||
// INFO log listing the zones this plugin owns, (c) TSIG re-check
|
||||
// inside handleUpdate. The architecturally clean fix would be
|
||||
// upstream support in CoreDNS for per-Config MsgAcceptFunc — when
|
||||
// that lands, delete this whole stanza.
|
||||
// CoreDNS constructs its dns.Server instances without setting a
|
||||
// per-server MsgAcceptFunc, so the package-level default is the
|
||||
// one that runs. We override it here at plugin init() time -- well
|
||||
// before any dns.Server starts listening -- to permit UPDATE
|
||||
// through. The plugin itself does proper validation in the
|
||||
// UPDATE handler, so opening this gate doesn't lower security.
|
||||
dns.DefaultMsgAcceptFunc = msgAcceptFunc
|
||||
}
|
||||
|
||||
@ -137,10 +109,6 @@ func setup(c *caddy.Controller) error {
|
||||
|
||||
log.Infof("ready: zones=%v keys=%d ttl=%d dir=%q auto-commit=%t",
|
||||
p.Zones, len(p.TSIGKeys), p.TTL, p.ZonesDir, p.AutoCommit)
|
||||
// Surface the global MsgAcceptFunc override for operator audit —
|
||||
// if a sibling server block on this process doesn't expect UPDATE
|
||||
// opcodes to traverse it, the operator should know they will.
|
||||
log.Infof("dns.DefaultMsgAcceptFunc was overridden process-wide to permit OpcodeUpdate; state mutation requires TSIG verification on this plugin's zones=%v only", p.Zones)
|
||||
return nil
|
||||
}
|
||||
|
||||
@ -165,13 +133,6 @@ func parse(c *caddy.Controller) (*RFC2136, error) {
|
||||
// Per-zone git author overrides. Defaults are applied later.
|
||||
var gitAuthorName, gitAuthorEmail string
|
||||
|
||||
// Rate-limit config (Hamilton M8). Defaults are
|
||||
// defaultRateBurst/defaultRatePeriod from ratelimit.go; an explicit
|
||||
// `rate-limit <burst> <period-seconds>` directive overrides.
|
||||
rateBurst := defaultRateBurst
|
||||
ratePeriod := defaultRatePeriod
|
||||
rateLimitEnabled := true
|
||||
|
||||
for c.Next() {
|
||||
args := c.RemainingArgs()
|
||||
if len(args) < 1 {
|
||||
@ -243,37 +204,6 @@ func parse(c *caddy.Controller) (*RFC2136, error) {
|
||||
gitAuthorName = gArgs[0]
|
||||
gitAuthorEmail = gArgs[1]
|
||||
|
||||
case "notify":
|
||||
nArgs := c.RemainingArgs()
|
||||
if len(nArgs) < 1 {
|
||||
return nil, c.Errf("notify requires at least one secondary (host or host:port)")
|
||||
}
|
||||
p.NotifyTargets = append(p.NotifyTargets, nArgs...)
|
||||
|
||||
case "rate-limit":
|
||||
rArgs := c.RemainingArgs()
|
||||
switch len(rArgs) {
|
||||
case 1:
|
||||
if rArgs[0] == "off" || rArgs[0] == "false" || rArgs[0] == "no" {
|
||||
rateLimitEnabled = false
|
||||
break
|
||||
}
|
||||
return nil, c.Errf("rate-limit single-arg form must be 'off'; for limits use 'rate-limit <burst> <period-seconds>'")
|
||||
case 2:
|
||||
b, err := strconv.ParseUint(rArgs[0], 10, 31)
|
||||
if err != nil || b < 1 {
|
||||
return nil, c.Errf("rate-limit burst must be positive integer, got %q", rArgs[0])
|
||||
}
|
||||
pSec, err := strconv.ParseUint(rArgs[1], 10, 31)
|
||||
if err != nil || pSec < 1 {
|
||||
return nil, c.Errf("rate-limit period must be positive integer seconds, got %q", rArgs[1])
|
||||
}
|
||||
rateBurst = int(b)
|
||||
ratePeriod = time.Duration(pSec) * time.Second
|
||||
default:
|
||||
return nil, c.Errf("rate-limit takes 'off' OR '<burst> <period-seconds>', got %d args", len(rArgs))
|
||||
}
|
||||
|
||||
default:
|
||||
return nil, c.Errf("unknown directive: %s", c.Val())
|
||||
}
|
||||
@ -287,11 +217,6 @@ func parse(c *caddy.Controller) (*RFC2136, error) {
|
||||
return nil, c.Err("zones-dir is required")
|
||||
}
|
||||
|
||||
// Construct rate limiter if enabled.
|
||||
if rateLimitEnabled {
|
||||
p.rateLimit = newRateLimiter(rateBurst, ratePeriod)
|
||||
}
|
||||
|
||||
// Build zoneFile handles for each declared zone.
|
||||
p.zones = make(map[string]*zoneFile, len(p.Zones))
|
||||
for _, z := range p.Zones {
|
||||
@ -316,16 +241,8 @@ func parse(c *caddy.Controller) (*RFC2136, error) {
|
||||
}
|
||||
|
||||
// validateZoneFiles ensures every configured zone has an accessible
|
||||
// AND parseable file on disk at the expected path. Catches both typos
|
||||
// (file missing) and corrupt zone content at CoreDNS startup rather
|
||||
// than on the first UPDATE — the operator gets an immediate signal
|
||||
// instead of discovering the breakage minutes later when ACME fires.
|
||||
//
|
||||
// Hamilton M4: the previous version only stat()'d the file. A zone
|
||||
// with a syntax error sailed through startup, then the first UPDATE
|
||||
// returned SERVFAIL with no startup-time signal. We now run the same
|
||||
// loadRRs + assertSingleApexSOA path the UPDATE handler uses, so any
|
||||
// parse-time or SOA-invariant failure surfaces at startup.
|
||||
// file on disk at the expected path. Catches typos at CoreDNS startup
|
||||
// rather than the first UPDATE.
|
||||
func (p *RFC2136) validateZoneFiles() error {
|
||||
for zone, zf := range p.zones {
|
||||
st, err := os.Stat(zf.Path)
|
||||
@ -335,9 +252,6 @@ func (p *RFC2136) validateZoneFiles() error {
|
||||
if st.IsDir() {
|
||||
return fmt.Errorf("zone %q: %s is a directory, expected a regular file", zone, zf.Path)
|
||||
}
|
||||
if _, _, err := zf.loadRRs(); err != nil {
|
||||
return fmt.Errorf("zone %q at %s: %w", zone, zf.Path, err)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
10
tsig.go
10
tsig.go
@ -86,16 +86,6 @@ func (p *RFC2136) checkTSIG(w dns.ResponseWriter, r *dns.Msg) error {
|
||||
// A nil from TsigStatus means verification succeeded; any non-nil
|
||||
// error means the signature was invalid, the time was outside the
|
||||
// fudge window, or some other auth failure.
|
||||
//
|
||||
// H6 — Replay window: the TSIG `fudge` field (RFC 8945 §5.2) is the
|
||||
// allowed clock skew between client and server. miekg/dns enforces
|
||||
// this in TsigVerify (rejecting with ErrTime if the request's time
|
||||
// is too far from local time). We rely on miekg/dns's default
|
||||
// tolerance (currently 300s); we do not configure it ourselves.
|
||||
// If miekg/dns's default ever changes, our replay-window behavior
|
||||
// changes with it — operators should monitor upstream release notes.
|
||||
// A future enhancement is to make the fudge configurable via the
|
||||
// Corefile (e.g., `tsig-fudge 300`).
|
||||
if status := w.TsigStatus(); status != nil {
|
||||
return fmt.Errorf("TSIG verification failed for key %q: %w", keyName, status)
|
||||
}
|
||||
|
||||
132
update.go
132
update.go
@ -2,7 +2,6 @@ package rfc2136
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
@ -27,35 +26,10 @@ import (
|
||||
// Steps 3-7 happen under the zone-file mutex. If 8 fails we log but
|
||||
// don't roll back (the on-disk state is authoritative; lost commits
|
||||
// can be re-staged via `git add` later).
|
||||
//
|
||||
// SECURITY CONTRACT — the `verified` parameter:
|
||||
//
|
||||
// handleUpdate mutates zone files on disk. The caller MUST set
|
||||
// verified=true only after successfully validating the message's TSIG
|
||||
// signature against a configured key. ServeDNS does this. A
|
||||
// verified=false invocation is treated as an unauthenticated attempt
|
||||
// and refused — preserving the security boundary even if a future
|
||||
// internal caller (NOTIFY relay, admin RPC, refactor) reaches this
|
||||
// function without going through the wire-level TSIG check.
|
||||
//
|
||||
// Tests that exercise post-auth logic pass verified=true. Tests that
|
||||
// exercise auth rejection pass verified=false.
|
||||
//
|
||||
// This is defense-in-depth: ServeDNS already verifies; we re-assert at
|
||||
// the function boundary so the security property survives refactors.
|
||||
func (p *RFC2136) handleUpdate(w dns.ResponseWriter, r *dns.Msg, verified bool) (int, error) {
|
||||
func (p *RFC2136) handleUpdate(w dns.ResponseWriter, r *dns.Msg) (int, error) {
|
||||
resp := new(dns.Msg)
|
||||
resp.SetReply(r)
|
||||
if verified {
|
||||
// Only sign responses we authorize. Signing rejections leaks
|
||||
// attestation that the named key exists on this server (see M9
|
||||
// in the Hamilton review). Unauthorized callers get an
|
||||
// unsigned Refused.
|
||||
signResponseIfSigned(resp, r)
|
||||
} else {
|
||||
log.Warningf("handleUpdate refused: caller did not assert TSIG verification — possible internal bypass attempt")
|
||||
return p.updateResp(w, resp, dns.RcodeRefused)
|
||||
}
|
||||
signResponseIfSigned(resp, r)
|
||||
|
||||
// 1. Validate the Zone section.
|
||||
if len(r.Question) != 1 {
|
||||
@ -81,8 +55,8 @@ func (p *RFC2136) handleUpdate(w dns.ResponseWriter, r *dns.Msg, verified bool)
|
||||
zf.mu.Lock()
|
||||
defer zf.mu.Unlock()
|
||||
|
||||
// 3. Load the current zone contents (with file-identity snapshot).
|
||||
rrs, snap, err := zf.loadRRs()
|
||||
// 3. Load the current zone contents.
|
||||
rrs, err := zf.loadRRs()
|
||||
if err != nil {
|
||||
log.Errorf("UPDATE failed: %v", err)
|
||||
return p.updateResp(w, resp, dns.RcodeServerFailure)
|
||||
@ -115,28 +89,8 @@ func (p *RFC2136) handleUpdate(w dns.ResponseWriter, r *dns.Msg, verified bool)
|
||||
|
||||
if !changed {
|
||||
// UPDATE was a valid no-op (e.g. only contained adds for RRs
|
||||
// that were already present, deduped away per RFC 2136
|
||||
// §3.4.2.2). Return NOERROR without rewriting the file or
|
||||
// bumping the SOA serial.
|
||||
//
|
||||
// H7 — Policy decision documented:
|
||||
//
|
||||
// We DO NOT bump the SOA serial on no-op UPDATEs. Rationale:
|
||||
// - DNS-wise, nothing changed. Forcing downstream secondaries
|
||||
// (HE) to do an AXFR pull just to re-fetch identical content
|
||||
// wastes bandwidth and is not what RFC 2136 implies.
|
||||
// - The wire-visible cert-issuance chain for ACME does not
|
||||
// depend on the second-UPDATE's serial bump — once the first
|
||||
// UPDATE landed, the SOA already advanced and the auto plugin
|
||||
// reloaded; subsequent identical UPDATEs are spurious and
|
||||
// should be silent.
|
||||
// - Caddy's caddy-dns/rfc2136 client treats NOERROR-no-bump as
|
||||
// "yes I have your record" — which is the truthful answer.
|
||||
//
|
||||
// If a caller wants to force a serial bump for some reason, they
|
||||
// can send a touch-UPDATE that adds-then-deletes a throwaway
|
||||
// record. That's an explicit, intentional pattern and is
|
||||
// supported.
|
||||
// that were already present, deduped away). Return NOERROR
|
||||
// without rewriting the file.
|
||||
return p.updateResp(w, resp, dns.RcodeSuccess)
|
||||
}
|
||||
|
||||
@ -147,45 +101,21 @@ func (p *RFC2136) handleUpdate(w dns.ResponseWriter, r *dns.Msg, verified bool)
|
||||
return p.updateResp(w, resp, dns.RcodeServerFailure)
|
||||
}
|
||||
|
||||
// 6b. Concurrent-modification check (Hamilton H1). Before clobbering
|
||||
// the on-disk file, verify nothing changed it out from under us
|
||||
// between loadRRs and now. The per-zone mutex serializes us against
|
||||
// other in-process UPDATEs, but external editors (rsync push,
|
||||
// manual edit, `git checkout`) can race in any time. If the file
|
||||
// changed, refuse with SERVFAIL so Caddy retries on a fresh load.
|
||||
if err := zf.checkUnchanged(snap); err != nil {
|
||||
log.Warningf("UPDATE refused: %v", err)
|
||||
return p.updateResp(w, resp, dns.RcodeServerFailure)
|
||||
}
|
||||
|
||||
// 7. Atomic write.
|
||||
if err := zf.writeAtomic(updated, now); err != nil {
|
||||
log.Errorf("UPDATE write failed: %v", err)
|
||||
return p.updateResp(w, resp, dns.RcodeServerFailure)
|
||||
}
|
||||
|
||||
// 8. Auto-commit. Failure to commit means the file is correct but
|
||||
// the git audit trail diverges (Hamilton H2). We log at ERROR with
|
||||
// structured detail so operators discover the divergence; recovery
|
||||
// is `git -C <zonesDir> status` + `git add` + manual commit. We do
|
||||
// NOT roll back the file write — by the time the commit fails, the
|
||||
// auto plugin may have already noticed the new mtime, and rolling
|
||||
// back creates more races than it solves.
|
||||
// 8. Auto-commit. Failure to commit isn't fatal to the UPDATE —
|
||||
// the on-disk state is authoritative — but we log loudly.
|
||||
msg := summarizeUpdate(zone, r.Ns)
|
||||
if err := zf.commit(msg); err != nil {
|
||||
log.Errorf("git auto-commit failed; zone file is correct but audit trail diverged: zone=%s path=%s err=%v — recover with `git -C %s status` + manual commit",
|
||||
zone, zf.Path, err, filepath.Dir(zf.Path))
|
||||
log.Warningf("git auto-commit failed: %v", err)
|
||||
}
|
||||
|
||||
log.Infof("UPDATE applied: zone=%s prereqs=%d updates=%d msg=%q",
|
||||
zone, len(r.Answer), len(r.Ns), msg)
|
||||
|
||||
// Fire NOTIFY to configured secondaries (RFC 1996). Non-blocking:
|
||||
// each target gets its own goroutine, capped by notifyTimeout. The
|
||||
// UPDATE response to the client is not held on these acks — RFC
|
||||
// 1996 §4 explicitly decouples them.
|
||||
sendNotify(zone, p.NotifyTargets)
|
||||
|
||||
return p.updateResp(w, resp, dns.RcodeSuccess)
|
||||
}
|
||||
|
||||
@ -304,13 +234,8 @@ func applyUpdate(zone string, defaultTTL uint32, rrs []dns.RR, rr dns.RR) ([]dns
|
||||
log.Debugf("apex %s add refused", dns.TypeToString[hdr.Rrtype])
|
||||
return rrs, dns.RcodeRefused, false
|
||||
}
|
||||
// Hamilton L2: don't mutate the caller's RR header. miekg/dns
|
||||
// parses the UPDATE message into RRs the caller still owns;
|
||||
// silently rewriting their TTL is a hygiene smell. Copy first,
|
||||
// then apply our default if needed.
|
||||
if hdr.Ttl == 0 {
|
||||
rr = dns.Copy(rr)
|
||||
rr.Header().Ttl = defaultTTL
|
||||
hdr.Ttl = defaultTTL
|
||||
}
|
||||
before := len(rrs)
|
||||
rrs = addRRTo(rrs, rr)
|
||||
@ -319,43 +244,12 @@ func applyUpdate(zone string, defaultTTL uint32, rrs []dns.RR, rr dns.RR) ([]dns
|
||||
}
|
||||
|
||||
// summarizeUpdate produces a one-line commit message describing the
|
||||
// UPDATE for git history. The output is sanitized — Hamilton M7 — to
|
||||
// prevent attacker-controlled RR names (TSIG just authenticates the
|
||||
// sender; the payload is still attacker-controlled) from injecting
|
||||
// control characters into git log, log aggregators, or any downstream
|
||||
// renderer that interprets ANSI/newlines.
|
||||
// UPDATE for git history.
|
||||
func summarizeUpdate(zone string, updates []dns.RR) string {
|
||||
var msg string
|
||||
if len(updates) == 1 {
|
||||
msg = fmt.Sprintf("rfc2136 %s: %s", zone, oneLineOp(updates[0]))
|
||||
} else {
|
||||
msg = fmt.Sprintf("rfc2136 %s: %d operations", zone, len(updates))
|
||||
return fmt.Sprintf("rfc2136 %s: %s", zone, oneLineOp(updates[0]))
|
||||
}
|
||||
return sanitizeForCommitMessage(msg)
|
||||
}
|
||||
|
||||
// sanitizeForCommitMessage strips control characters from s, replacing
|
||||
// them with their printable escape form. This keeps git log + downstream
|
||||
// renderers safe from attacker-injected newlines, escape sequences, etc.
|
||||
func sanitizeForCommitMessage(s string) string {
|
||||
var b strings.Builder
|
||||
b.Grow(len(s))
|
||||
for _, r := range s {
|
||||
switch {
|
||||
case r == '\n':
|
||||
b.WriteString("\\n")
|
||||
case r == '\r':
|
||||
b.WriteString("\\r")
|
||||
case r == '\t':
|
||||
b.WriteString("\\t")
|
||||
case r < 0x20 || r == 0x7f:
|
||||
// Other C0 controls + DEL: emit \xNN.
|
||||
fmt.Fprintf(&b, "\\x%02x", r)
|
||||
default:
|
||||
b.WriteRune(r)
|
||||
}
|
||||
}
|
||||
return b.String()
|
||||
return fmt.Sprintf("rfc2136 %s: %d operations", zone, len(updates))
|
||||
}
|
||||
|
||||
// oneLineOp returns a short human-readable description of a single
|
||||
|
||||
119
update_test.go
119
update_test.go
@ -12,19 +12,14 @@ import (
|
||||
|
||||
// captureWriter implements dns.ResponseWriter and stashes the message
|
||||
// passed to WriteMsg so tests can inspect it after handleUpdate returns.
|
||||
//
|
||||
// tsigErr, if non-nil, is what TsigStatus() returns — letting tests
|
||||
// simulate TSIG-verification failure (bad MAC, fudge-window violation,
|
||||
// unknown key etc.).
|
||||
type captureWriter struct {
|
||||
msg *dns.Msg
|
||||
tsigErr error
|
||||
msg *dns.Msg
|
||||
}
|
||||
|
||||
func (cw *captureWriter) WriteMsg(m *dns.Msg) error { cw.msg = m; return nil }
|
||||
func (cw *captureWriter) Write([]byte) (int, error) { return 0, nil }
|
||||
func (cw *captureWriter) Close() error { return nil }
|
||||
func (cw *captureWriter) TsigStatus() error { return cw.tsigErr }
|
||||
func (cw *captureWriter) TsigStatus() error { return nil }
|
||||
func (cw *captureWriter) TsigTimersOnly(bool) {}
|
||||
func (cw *captureWriter) Hijack() {}
|
||||
func (cw *captureWriter) LocalAddr() net.Addr { return &net.IPAddr{IP: net.IPv4(127, 0, 0, 1)} }
|
||||
@ -60,14 +55,11 @@ func newUpdate(zone string, updates ...dns.RR) *dns.Msg {
|
||||
}
|
||||
|
||||
// runUpdate sends msg through the handler with TSIG auth bypassed
|
||||
// (calling handleUpdate directly with verified=true instead of
|
||||
// ServeDNS). Tests that exercise post-auth logic use this; tests that
|
||||
// assert auth-failure behavior should call handleUpdate(w, msg, false)
|
||||
// directly.
|
||||
// (calling handleUpdate directly instead of ServeDNS).
|
||||
func runUpdate(t *testing.T, p *RFC2136, msg *dns.Msg) (rcode int) {
|
||||
t.Helper()
|
||||
w := &captureWriter{}
|
||||
rcode, _ = p.handleUpdate(w, msg, true)
|
||||
rcode, _ = p.handleUpdate(w, msg)
|
||||
return rcode
|
||||
}
|
||||
|
||||
@ -75,7 +67,7 @@ func runUpdate(t *testing.T, p *RFC2136, msg *dns.Msg) (rcode int) {
|
||||
func readZoneRecords(t *testing.T, p *RFC2136, zone string) []dns.RR {
|
||||
t.Helper()
|
||||
zf := p.zones[dns.Fqdn(zone)]
|
||||
rrs, _, err := zf.loadRRs()
|
||||
rrs, err := zf.loadRRs()
|
||||
if err != nil {
|
||||
t.Fatalf("loadRRs: %v", err)
|
||||
}
|
||||
@ -160,103 +152,6 @@ func TestUpdate_DeleteRRset_RemovesAllOfType(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestCheckTSIG_BadStatus_Refused covers H6 — the path where miekg/dns
|
||||
// has reported a TSIG verification failure (bad MAC, fudge-window
|
||||
// violation, expired timestamp, etc.) via dns.ResponseWriter.TsigStatus.
|
||||
// checkTSIG must surface this as an error so ServeDNS refuses the
|
||||
// UPDATE. This is the test that would catch a regression in our
|
||||
// reliance on miekg/dns's default fudge window.
|
||||
func TestCheckTSIG_BadStatus_Refused(t *testing.T) {
|
||||
p := newTestPluginWithZone(t, "auth.example.com")
|
||||
// Configure a key so checkTSIG doesn't short-circuit on "no keys".
|
||||
p.TSIGKeys = map[string]tsigKey{
|
||||
"acme-update-key.": {Algorithm: dns.HmacSHA256, Secret: []byte("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")},
|
||||
}
|
||||
upd := newUpdate("auth.example.com",
|
||||
mustRR(t, `foo.auth.example.com. 60 IN TXT "x"`),
|
||||
)
|
||||
upd.SetTsig("acme-update-key.", dns.HmacSHA256, 300, 0)
|
||||
|
||||
w := &captureWriter{
|
||||
// Simulate miekg/dns reporting "TSIG outside fudge window."
|
||||
tsigErr: dns.ErrTime,
|
||||
}
|
||||
if err := p.checkTSIG(w, upd); err == nil {
|
||||
t.Errorf("checkTSIG accepted request despite TsigStatus reporting %v", dns.ErrTime)
|
||||
}
|
||||
}
|
||||
|
||||
// TestUpdate_UnverifiedCaller_Refused proves the C2 defense-in-depth
|
||||
// contract: handleUpdate refuses any call that doesn't assert TSIG
|
||||
// verification, even if the rest of the message is well-formed and the
|
||||
// zone is authoritative. This guards against a future internal caller
|
||||
// that bypasses the wire-level TSIG check in ServeDNS.
|
||||
func TestUpdate_UnverifiedCaller_Refused(t *testing.T) {
|
||||
p := newTestPluginWithZone(t, "auth.example.com")
|
||||
upd := newUpdate("auth.example.com",
|
||||
mustRR(t, `foo.auth.example.com. 60 IN TXT "should-not-apply"`),
|
||||
)
|
||||
w := &captureWriter{}
|
||||
rcode, _ := p.handleUpdate(w, upd, false) // <- the security boundary
|
||||
if rcode != dns.RcodeRefused {
|
||||
t.Errorf("rcode = %d, want REFUSED for unverified caller", rcode)
|
||||
}
|
||||
// Zone file must NOT have been modified.
|
||||
for _, rr := range readZoneRecords(t, p, "auth.example.com") {
|
||||
if txt, ok := rr.(*dns.TXT); ok && txt.Hdr.Name == "foo.auth.example.com." {
|
||||
t.Errorf("unverified UPDATE applied state mutation: %s", rr.String())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TestUpdate_DeleteRR_IgnoresTTL covers M3: an UPDATE's delete-RR
|
||||
// specifies a different TTL than the stored RR, but per RFC 2136
|
||||
// §3.4.2.4 the deletion must match by owner/class/type/rdata only.
|
||||
// The old String()-based comparison would silently fail to match.
|
||||
func TestUpdate_DeleteRR_IgnoresTTL(t *testing.T) {
|
||||
p := newTestPluginWithZone(t, "auth.example.com")
|
||||
// Add a TXT with TTL 60.
|
||||
runUpdate(t, p, newUpdate("auth.example.com",
|
||||
mustRR(t, `foo.auth.example.com. 60 IN TXT "match-me"`),
|
||||
))
|
||||
// Issue a CLASS=NONE delete with a different TTL (per RFC 2136
|
||||
// §2.5.4, the TTL on a CLASS=NONE delete is supposed to be 0; some
|
||||
// clients get this wrong, and even when they get it right, an
|
||||
// implementation must ignore the value).
|
||||
delRR := mustRR(t, `foo.auth.example.com. 0 IN TXT "match-me"`)
|
||||
delRR.Header().Class = dns.ClassNONE
|
||||
if rcode := runUpdate(t, p, newUpdate("auth.example.com", delRR)); rcode != dns.RcodeSuccess {
|
||||
t.Fatalf("rcode = %d, want NOERROR", rcode)
|
||||
}
|
||||
for _, rr := range readZoneRecords(t, p, "auth.example.com") {
|
||||
if txt, ok := rr.(*dns.TXT); ok && txt.Hdr.Name == "foo.auth.example.com." && txt.Txt[0] == "match-me" {
|
||||
t.Errorf("TTL-differing delete did not remove the RR: %s", rr.String())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// TestSanitizeForCommitMessage covers M7: attacker-controlled RR names
|
||||
// (TSIG only authenticates the sender; the payload is hostile by
|
||||
// default) must not inject control characters into commit messages or
|
||||
// downstream log renderers.
|
||||
func TestSanitizeForCommitMessage(t *testing.T) {
|
||||
cases := []struct {
|
||||
in, want string
|
||||
}{
|
||||
{"plain", "plain"},
|
||||
{"with\nnewline", "with\\nnewline"},
|
||||
{"tab\there", "tab\\there"},
|
||||
{"esc\x1b[31mred\x1b[0m", "esc\\x1b[31mred\\x1b[0m"},
|
||||
{"del\x7fchar", "del\\x7fchar"},
|
||||
}
|
||||
for _, tc := range cases {
|
||||
got := sanitizeForCommitMessage(tc.in)
|
||||
if got != tc.want {
|
||||
t.Errorf("sanitize(%q) = %q, want %q", tc.in, got, tc.want)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestUpdate_OutOfZone_Refused(t *testing.T) {
|
||||
p := newTestPluginWithZone(t, "auth.example.com")
|
||||
|
||||
@ -384,7 +279,7 @@ func TestUpdate_SignedRequest_ResponseGetsTSIG(t *testing.T) {
|
||||
upd.SetTsig("acme-update-key.", dns.HmacSHA256, 300, 0)
|
||||
|
||||
w := &captureWriter{}
|
||||
if _, err := p.handleUpdate(w, upd, true); err != nil {
|
||||
if _, err := p.handleUpdate(w, upd); err != nil {
|
||||
t.Fatalf("handleUpdate: %v", err)
|
||||
}
|
||||
if w.msg == nil {
|
||||
@ -414,7 +309,7 @@ func TestUpdate_UnsignedRequest_ResponseStaysUnsigned(t *testing.T) {
|
||||
// No SetTsig — request is plain.
|
||||
|
||||
w := &captureWriter{}
|
||||
if _, err := p.handleUpdate(w, upd, true); err != nil {
|
||||
if _, err := p.handleUpdate(w, upd); err != nil {
|
||||
t.Fatalf("handleUpdate: %v", err)
|
||||
}
|
||||
if w.msg.IsTsig() != nil {
|
||||
|
||||
212
zonefile.go
212
zonefile.go
@ -1,13 +1,10 @@
|
||||
package rfc2136
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"math"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
@ -15,31 +12,6 @@ import (
|
||||
"github.com/miekg/dns"
|
||||
)
|
||||
|
||||
// gitCommandTimeout caps any single git invocation we shell out to. If
|
||||
// the process hangs (NFS stall, gpg-sign prompt, broken pre-commit hook
|
||||
// waiting on stdin, etc.) we must not block the caller — which holds
|
||||
// the per-zone mutex — forever. 10s is generous for a local-disk repo.
|
||||
const gitCommandTimeout = 10 * time.Second
|
||||
|
||||
// fileSnapshot captures the file-identity fingerprint at the moment we
|
||||
// last read the zone file. We compare it to the live stat just before
|
||||
// writing back to detect concurrent modification (rsync push, manual
|
||||
// edit, `git checkout`). If anything changed the file out from under
|
||||
// us, we refuse the UPDATE — Caddy will retry, and the next loadRRs
|
||||
// will see the updated state.
|
||||
type fileSnapshot struct {
|
||||
mtime time.Time
|
||||
size int64
|
||||
}
|
||||
|
||||
// matches returns true if `info` reports the same mtime and size we
|
||||
// captured. Inode comparison would be stricter, but mtime+size is
|
||||
// sufficient for the failure modes we care about (rename-over-original
|
||||
// edits change both).
|
||||
func (s fileSnapshot) matches(info os.FileInfo) bool {
|
||||
return s.mtime.Equal(info.ModTime()) && s.size == info.Size()
|
||||
}
|
||||
|
||||
// zoneFile is a file-backed authority for a single DNS zone. Replaces
|
||||
// the Phase-1.3 in-memory recordStore.
|
||||
//
|
||||
@ -102,32 +74,13 @@ func openZoneFile(path, origin string) *zoneFile {
|
||||
// loadRRs reads the zone file and parses it into an RR slice via
|
||||
// miekg/dns's zone parser. The parser handles $ORIGIN, $TTL, multi-line
|
||||
// SOA, comments, includes, etc.
|
||||
//
|
||||
// Returns (rrs, snapshot, error). The snapshot fingerprints the file
|
||||
// identity at read time so a subsequent writeIfUnchanged can detect
|
||||
// concurrent modification.
|
||||
//
|
||||
// Hamilton H4 — strict-parse validation: a single malformed line could
|
||||
// otherwise produce a partial parse where parser.Err() returns nil but
|
||||
// some records silently went missing. To catch this, we enforce a
|
||||
// post-parse invariant: exactly one SOA RR, and that SOA's name equals
|
||||
// the configured zone origin. A zone file that's been partially eaten
|
||||
// by the parser usually loses its SOA along the way — checking SOA
|
||||
// presence catches both H4 (silent truncation) and H3 (multi-SOA or
|
||||
// wrong-apex SOA) with a single guard.
|
||||
func (z *zoneFile) loadRRs() ([]dns.RR, fileSnapshot, error) {
|
||||
func (z *zoneFile) loadRRs() ([]dns.RR, error) {
|
||||
f, err := os.Open(z.Path)
|
||||
if err != nil {
|
||||
return nil, fileSnapshot{}, fmt.Errorf("open %s: %w", z.Path, err)
|
||||
return nil, fmt.Errorf("open %s: %w", z.Path, err)
|
||||
}
|
||||
defer f.Close()
|
||||
|
||||
info, err := f.Stat()
|
||||
if err != nil {
|
||||
return nil, fileSnapshot{}, fmt.Errorf("stat %s: %w", z.Path, err)
|
||||
}
|
||||
snap := fileSnapshot{mtime: info.ModTime(), size: info.Size()}
|
||||
|
||||
parser := dns.NewZoneParser(f, z.Origin, z.Path)
|
||||
parser.SetDefaultTTL(3600)
|
||||
|
||||
@ -136,65 +89,12 @@ func (z *zoneFile) loadRRs() ([]dns.RR, fileSnapshot, error) {
|
||||
rrs = append(rrs, rr)
|
||||
}
|
||||
if err := parser.Err(); err != nil {
|
||||
return nil, snap, fmt.Errorf("parse %s: %w", z.Path, err)
|
||||
return nil, fmt.Errorf("parse %s: %w", z.Path, err)
|
||||
}
|
||||
if len(rrs) == 0 {
|
||||
return nil, snap, fmt.Errorf("%s: zero RRs parsed", z.Path)
|
||||
return nil, fmt.Errorf("%s: zero RRs parsed", z.Path)
|
||||
}
|
||||
|
||||
// H3/H4 invariant: exactly one SOA, anchored at the zone origin.
|
||||
// Refuse to operate on a zone file whose SOA structure is wrong —
|
||||
// any subsequent bumpSerial or write would compound the damage.
|
||||
if err := assertSingleApexSOA(rrs, z.Origin); err != nil {
|
||||
return nil, snap, fmt.Errorf("zone %s integrity check failed: %w", z.Path, err)
|
||||
}
|
||||
|
||||
return rrs, snap, nil
|
||||
}
|
||||
|
||||
// assertSingleApexSOA enforces that rrs contains exactly one SOA and
|
||||
// that its owner matches the zone origin. Returns an error otherwise.
|
||||
// This is the H3+H4 zone-integrity invariant.
|
||||
func assertSingleApexSOA(rrs []dns.RR, origin string) error {
|
||||
origin = canon(origin)
|
||||
var soas []*dns.SOA
|
||||
for _, rr := range rrs {
|
||||
if s, ok := rr.(*dns.SOA); ok {
|
||||
soas = append(soas, s)
|
||||
}
|
||||
}
|
||||
switch len(soas) {
|
||||
case 0:
|
||||
return fmt.Errorf("no SOA record found (expected one at %q)", origin)
|
||||
case 1:
|
||||
if canon(soas[0].Hdr.Name) != origin {
|
||||
return fmt.Errorf("SOA owner is %q, expected zone apex %q", soas[0].Hdr.Name, origin)
|
||||
}
|
||||
return nil
|
||||
default:
|
||||
names := make([]string, len(soas))
|
||||
for i, s := range soas {
|
||||
names[i] = s.Hdr.Name
|
||||
}
|
||||
return fmt.Errorf("multiple SOA records found (%d): %s", len(soas), strings.Join(names, ", "))
|
||||
}
|
||||
}
|
||||
|
||||
// checkUnchanged returns nil if the on-disk file still matches the
|
||||
// captured snapshot. If the file has been modified (mtime or size
|
||||
// differs), returns an error — the caller should refuse the UPDATE
|
||||
// rather than clobber the external change.
|
||||
func (z *zoneFile) checkUnchanged(snap fileSnapshot) error {
|
||||
info, err := os.Stat(z.Path)
|
||||
if err != nil {
|
||||
return fmt.Errorf("stat %s: %w", z.Path, err)
|
||||
}
|
||||
if !snap.matches(info) {
|
||||
return fmt.Errorf("concurrent modification detected on %s: mtime %s/%s size %d/%d",
|
||||
z.Path, snap.mtime.Format(time.RFC3339Nano), info.ModTime().Format(time.RFC3339Nano),
|
||||
snap.size, info.Size())
|
||||
}
|
||||
return nil
|
||||
return rrs, nil
|
||||
}
|
||||
|
||||
// Lookup returns RRs in `rrs` matching (name, rtype). Both name and
|
||||
@ -252,19 +152,13 @@ func removeNameFrom(rrs []dns.RR, name string) []dns.RR {
|
||||
}
|
||||
|
||||
// removeRRFrom returns rrs minus the single RR matching the given one
|
||||
// by owner + type + rdata.
|
||||
//
|
||||
// Hamilton M3: per RFC 2136 §2.5.4, a delete-by-exact-match UPDATE
|
||||
// carries CLASS=NONE and TTL=0 as protocol flags, not as match
|
||||
// criteria. The target must match a stored RR by owner+type+rdata
|
||||
// alone. We normalize both sides to the same class + TTL before
|
||||
// invoking dns.IsDuplicate so the comparison is correct.
|
||||
// by owner + type + rdata. String() comparison covers rdata exactness.
|
||||
func removeRRFrom(rrs []dns.RR, target dns.RR) []dns.RR {
|
||||
targetN := normalizeForCompare(target)
|
||||
targetStr := target.String()
|
||||
out := rrs[:0:0]
|
||||
matched := false
|
||||
for _, rr := range rrs {
|
||||
if !matched && dns.IsDuplicate(normalizeForCompare(rr), targetN) {
|
||||
if !matched && rr.String() == targetStr {
|
||||
matched = true
|
||||
continue
|
||||
}
|
||||
@ -274,30 +168,17 @@ func removeRRFrom(rrs []dns.RR, target dns.RR) []dns.RR {
|
||||
}
|
||||
|
||||
// addRRTo appends rr to rrs unless an identical RR already exists
|
||||
// (de-dupe semantics per RFC 2136 §3.4.2.2). Same normalization as
|
||||
// removeRRFrom — dedupe is TTL- and class-insensitive in the comparison
|
||||
// (though the stored RR retains its original TTL/class).
|
||||
// (de-dupe semantics per RFC 2136 §3.4.2.2).
|
||||
func addRRTo(rrs []dns.RR, rr dns.RR) []dns.RR {
|
||||
rrN := normalizeForCompare(rr)
|
||||
target := rr.String()
|
||||
for _, existing := range rrs {
|
||||
if dns.IsDuplicate(normalizeForCompare(existing), rrN) {
|
||||
if existing.String() == target {
|
||||
return rrs
|
||||
}
|
||||
}
|
||||
return append(rrs, rr)
|
||||
}
|
||||
|
||||
// normalizeForCompare returns a copy of rr with TTL=0 and class=IN so
|
||||
// dns.IsDuplicate can be used to compare by (owner, type, rdata) alone.
|
||||
// Required by RFC 2136 §2.5.4's "TTL and CLASS are flags, not match
|
||||
// criteria" semantics.
|
||||
func normalizeForCompare(rr dns.RR) dns.RR {
|
||||
n := dns.Copy(rr)
|
||||
n.Header().Ttl = 0
|
||||
n.Header().Class = dns.ClassINET
|
||||
return n
|
||||
}
|
||||
|
||||
// serialCounterMul is the multiplier between the date prefix and the
|
||||
// counter in our SOA-serial encoding. The format is YYMMDD*10000 + NNNN,
|
||||
// giving 10000 bumps/day (NNNN ∈ [0001, 9999]). The 2-digit year keeps
|
||||
@ -347,9 +228,9 @@ func bumpSerial(rrs []dns.RR, now time.Time) error {
|
||||
// serial that happens to look like a valid YYMMDD prefix but is in
|
||||
// the past, which is the same handling: jump to today).
|
||||
if curDate := cur[:6]; isValidYYMMDD(curDate) && curDate >= today {
|
||||
nnnn := int(mustParseUint(cur[6:10]))
|
||||
nnnn := atoi(cur[6:10])
|
||||
if nnnn < 9999 {
|
||||
soa.Serial = uint32(mustParseUint(curDate)*serialCounterMul + uint64(nnnn+1))
|
||||
soa.Serial = uint32(parseUint(curDate)*serialCounterMul + uint64(nnnn+1))
|
||||
return nil
|
||||
}
|
||||
// NNNN=9999: roll to next encoded day, NNNN=0001.
|
||||
@ -358,31 +239,17 @@ func bumpSerial(rrs []dns.RR, now time.Time) error {
|
||||
return fmt.Errorf("serial date %q unparseable: %w", curDate, err)
|
||||
}
|
||||
next := d.AddDate(0, 0, 1).Format("060102")
|
||||
soa.Serial = uint32(mustParseUint(next)*serialCounterMul + 1)
|
||||
soa.Serial = uint32(parseUint(next)*serialCounterMul + 1)
|
||||
return nil
|
||||
}
|
||||
|
||||
// Older or unparseable: jump to today*10000+1. Migration path for
|
||||
// legacy YYYYMMDDNN serials lives here.
|
||||
candidate := uint32(mustParseUint(today)*serialCounterMul + 1)
|
||||
|
||||
// H5 — explicit MaxUint32 guard. Plain `>` comparison is correct in
|
||||
// practice (we'd never wrap during the zone's lifetime: 10000
|
||||
// bumps/day × 365 days × ~117 years = ~427M, well under 2^32). The
|
||||
// real failure mode we must prevent is wrap-to-0: if soa.Serial
|
||||
// somehow reached MaxUint32 (hand-edit, fuzz, or a future code path
|
||||
// we haven't written), `soa.Serial++` would wrap to 0, and
|
||||
// downstream secondaries per RFC 1982 treat 0-after-MaxUint32 as
|
||||
// "older" — they refuse to AXFR, and the zone goes dark. Loud
|
||||
// refusal forces the operator to manually reset the serial,
|
||||
// instead of silently bricking the zone.
|
||||
candidate := uint32(parseUint(today)*serialCounterMul + 1)
|
||||
if candidate <= soa.Serial {
|
||||
if soa.Serial == math.MaxUint32 {
|
||||
return fmt.Errorf("SOA serial at uint32 max (%d) — refusing to wrap to 0; operator must reset zone serial manually (see RFC 1982 §3.2)", soa.Serial)
|
||||
}
|
||||
// Defensive monotonic advance for the unusual "current serial
|
||||
// is already > today's new-format minimum" case (e.g., a
|
||||
// hand-edit set it to a far-future value).
|
||||
// Defensive: don't regress. If something has somehow
|
||||
// provisioned a serial >= today's new-format candidate (e.g.,
|
||||
// far-future serial from a hand-edit), just +1 to advance.
|
||||
soa.Serial++
|
||||
return nil
|
||||
}
|
||||
@ -400,20 +267,23 @@ func isValidYYMMDD(s string) bool {
|
||||
return err == nil
|
||||
}
|
||||
|
||||
// mustParseUint parses an all-digit string into uint64. Panics on
|
||||
// malformed input — the caller is responsible for passing only strings
|
||||
// that were validated as digit-substrings (e.g., a fixed-width slice of
|
||||
// a YYMMDDNNNN-formatted serial). Using strconv via this thin wrapper
|
||||
// keeps the panic behavior explicit while sharing stdlib's robust
|
||||
// parsing (Hamilton L1).
|
||||
func mustParseUint(s string) uint64 {
|
||||
n, err := strconv.ParseUint(s, 10, 64)
|
||||
if err != nil {
|
||||
// Programmer error if we ever hit this — every caller passes
|
||||
// digits-only strings derived from time.Format or a sliced
|
||||
// SOA serial. Panic so the bug surfaces in tests/CI rather
|
||||
// than silently producing a 0 serial.
|
||||
panic(fmt.Sprintf("mustParseUint(%q): %v", s, err))
|
||||
// atoi is a tiny helper that ignores errors — only called on a
|
||||
// substring we already validated is two digits.
|
||||
func atoi(s string) int {
|
||||
n := 0
|
||||
for _, c := range s {
|
||||
n = n*10 + int(c-'0')
|
||||
}
|
||||
return n
|
||||
}
|
||||
|
||||
// parseUint parses an all-digits string into a uint64. Used because
|
||||
// strconv.ParseUint adds error-handling overhead we don't need on
|
||||
// internally-controlled inputs.
|
||||
func parseUint(s string) uint64 {
|
||||
var n uint64
|
||||
for _, c := range s {
|
||||
n = n*10 + uint64(c-'0')
|
||||
}
|
||||
return n
|
||||
}
|
||||
@ -475,12 +345,6 @@ func (z *zoneFile) writeAtomic(rrs []dns.RR, now time.Time) error {
|
||||
// repository directory inferred from the zone file's parent. Returns
|
||||
// nil silently if AutoCommit is false. Returns an error if the commit
|
||||
// fails; the caller decides whether to roll back the file write.
|
||||
//
|
||||
// Both git invocations run under a context with a hard timeout
|
||||
// (gitCommandTimeout). If git hangs (NFS stall, gpg-sign prompt,
|
||||
// pre-commit hook waiting on stdin), we kill it rather than block the
|
||||
// caller's per-zone mutex indefinitely. ACME storms must not be able
|
||||
// to wedge the plugin via git getting stuck.
|
||||
func (z *zoneFile) commit(message string) error {
|
||||
if !z.AutoCommit {
|
||||
return nil
|
||||
@ -488,17 +352,15 @@ func (z *zoneFile) commit(message string) error {
|
||||
// We run git from the directory containing the zone file. git will
|
||||
// walk upward to find the .git dir.
|
||||
dir := filepath.Dir(z.Path)
|
||||
ctx, cancel := context.WithTimeout(context.Background(), gitCommandTimeout)
|
||||
defer cancel()
|
||||
// `git add` first; if file is already in the index, no harm done.
|
||||
add := exec.CommandContext(ctx, "git",
|
||||
add := exec.Command("git",
|
||||
"-C", dir,
|
||||
"add", "--", z.Path,
|
||||
)
|
||||
if out, err := add.CombinedOutput(); err != nil {
|
||||
return fmt.Errorf("git add failed: %w: %s", err, strings.TrimSpace(string(out)))
|
||||
}
|
||||
commit := exec.CommandContext(ctx, "git",
|
||||
commit := exec.Command("git",
|
||||
"-C", dir,
|
||||
"-c", "user.name="+z.GitAuthorName,
|
||||
"-c", "user.email="+z.GitAuthorEmail,
|
||||
|
||||
120
zonefile_test.go
120
zonefile_test.go
@ -1,7 +1,6 @@
|
||||
package rfc2136
|
||||
|
||||
import (
|
||||
"math"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
@ -26,7 +25,7 @@ func TestZoneFile_LoadRRs(t *testing.T) {
|
||||
dir := withTempZonesDir(t, "auth.example.com")
|
||||
zf := openZoneFile(filepath.Join(dir, "auth.example.com.zone"), "auth.example.com.")
|
||||
|
||||
rrs, _, err := zf.loadRRs()
|
||||
rrs, err := zf.loadRRs()
|
||||
if err != nil {
|
||||
t.Fatalf("loadRRs: %v", err)
|
||||
}
|
||||
@ -41,7 +40,7 @@ func TestZoneFile_LoadRRs(t *testing.T) {
|
||||
|
||||
func TestZoneFile_LoadRRs_MissingFile(t *testing.T) {
|
||||
zf := openZoneFile("/nope/missing.zone", "missing.")
|
||||
if _, _, err := zf.loadRRs(); err == nil {
|
||||
if _, err := zf.loadRRs(); err == nil {
|
||||
t.Errorf("expected error loading missing file, got nil")
|
||||
}
|
||||
}
|
||||
@ -221,80 +220,6 @@ func TestBumpSerial_NonCalVerFormat_ResetsToToday(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestBumpSerial_MaxUint32_RefusesWrap covers H5: the defensive
|
||||
// branch must not wrap soa.Serial to 0. Wrap-to-0 makes downstream
|
||||
// secondaries treat the zone as reset per RFC 1982 and refuse to AXFR
|
||||
// the new value, taking the zone dark.
|
||||
func TestBumpSerial_MaxUint32_RefusesWrap(t *testing.T) {
|
||||
rrs := []dns.RR{
|
||||
&dns.SOA{Hdr: dns.RR_Header{Name: "example.com.", Rrtype: dns.TypeSOA},
|
||||
Serial: math.MaxUint32},
|
||||
}
|
||||
now := time.Date(2026, 5, 22, 1, 0, 0, 0, time.UTC)
|
||||
if err := bumpSerial(rrs, now); err == nil {
|
||||
t.Errorf("bumpSerial accepted MaxUint32; expected refusal to prevent wrap-to-0")
|
||||
}
|
||||
if got := rrs[0].(*dns.SOA).Serial; got == 0 {
|
||||
t.Errorf("Serial wrapped to 0 (catastrophic): %d", got)
|
||||
}
|
||||
}
|
||||
|
||||
// TestLoadRRs_NoSOA_Refused covers H4: a zone file that's missing its
|
||||
// SOA (because the parser ate it on a malformed prior line, or because
|
||||
// it was never there) must fail loadRRs rather than be silently treated
|
||||
// as a valid empty-of-SOA zone.
|
||||
func TestLoadRRs_NoSOA_Refused(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
path := filepath.Join(dir, "noSoA.example.com.zone")
|
||||
content := "$ORIGIN noSoA.example.com.\nfoo.noSoA.example.com. 60 IN A 192.0.2.1\n"
|
||||
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||
t.Fatalf("write fixture: %v", err)
|
||||
}
|
||||
zf := openZoneFile(path, "noSoA.example.com.")
|
||||
if _, _, err := zf.loadRRs(); err == nil {
|
||||
t.Errorf("loadRRs accepted SOA-less zone file; expected refusal")
|
||||
}
|
||||
}
|
||||
|
||||
// TestLoadRRs_MultipleSOAs_Refused covers H3: two SOAs at the apex
|
||||
// produce inconsistent zone state visible to AXFR clients. We refuse
|
||||
// to operate on such a file rather than bumping only the first SOA.
|
||||
func TestLoadRRs_MultipleSOAs_Refused(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
path := filepath.Join(dir, "dupe.example.com.zone")
|
||||
content := `$ORIGIN dupe.example.com.
|
||||
dupe.example.com. 3600 IN SOA ns.example. admin.example. 1 60 60 60 60
|
||||
dupe.example.com. 3600 IN SOA ns.example. admin.example. 2 60 60 60 60
|
||||
dupe.example.com. 3600 IN NS ns.example.
|
||||
`
|
||||
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||
t.Fatalf("write fixture: %v", err)
|
||||
}
|
||||
zf := openZoneFile(path, "dupe.example.com.")
|
||||
if _, _, err := zf.loadRRs(); err == nil {
|
||||
t.Errorf("loadRRs accepted dual-SOA zone; expected refusal")
|
||||
}
|
||||
}
|
||||
|
||||
// TestLoadRRs_NonApexSOA_Refused covers H3: an SOA owned by a name
|
||||
// other than the zone apex is a parse-error red flag and must be
|
||||
// refused, not silently bumped.
|
||||
func TestLoadRRs_NonApexSOA_Refused(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
path := filepath.Join(dir, "weird.example.com.zone")
|
||||
content := `$ORIGIN weird.example.com.
|
||||
sub.weird.example.com. 3600 IN SOA ns.example. admin.example. 1 60 60 60 60
|
||||
weird.example.com. 3600 IN NS ns.example.
|
||||
`
|
||||
if err := os.WriteFile(path, []byte(content), 0644); err != nil {
|
||||
t.Fatalf("write fixture: %v", err)
|
||||
}
|
||||
zf := openZoneFile(path, "weird.example.com.")
|
||||
if _, _, err := zf.loadRRs(); err == nil {
|
||||
t.Errorf("loadRRs accepted non-apex SOA; expected refusal")
|
||||
}
|
||||
}
|
||||
|
||||
func TestBumpSerial_NoSOA_ReturnsError(t *testing.T) {
|
||||
rrs := []dns.RR{mustRR(t, `foo.example.com. 60 IN A 192.0.2.1`)}
|
||||
now := time.Now()
|
||||
@ -309,7 +234,7 @@ func TestZoneFile_WriteAtomic_RoundTrip(t *testing.T) {
|
||||
zf := openZoneFile(path, "auth.example.com.")
|
||||
zf.AutoCommit = false // not testing commit here
|
||||
|
||||
rrs, _, err := zf.loadRRs()
|
||||
rrs, err := zf.loadRRs()
|
||||
if err != nil {
|
||||
t.Fatalf("initial load: %v", err)
|
||||
}
|
||||
@ -322,7 +247,7 @@ func TestZoneFile_WriteAtomic_RoundTrip(t *testing.T) {
|
||||
}
|
||||
|
||||
// Re-load and verify.
|
||||
after, _, err := zf.loadRRs()
|
||||
after, err := zf.loadRRs()
|
||||
if err != nil {
|
||||
t.Fatalf("re-load: %v", err)
|
||||
}
|
||||
@ -347,7 +272,7 @@ func TestZoneFile_WriteAtomic_LeavesNoTempFile(t *testing.T) {
|
||||
zf := openZoneFile(path, "auth.example.com.")
|
||||
zf.AutoCommit = false
|
||||
|
||||
rrs, _, _ := zf.loadRRs()
|
||||
rrs, _ := zf.loadRRs()
|
||||
_ = zf.writeAtomic(rrs, time.Now())
|
||||
|
||||
// No .rfc2136-*.zone temp files should remain.
|
||||
@ -357,46 +282,13 @@ func TestZoneFile_WriteAtomic_LeavesNoTempFile(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestZoneFile_CheckUnchanged_DetectsExternalModification covers H1:
|
||||
// between loadRRs and the next writeAtomic, if anything (rsync, manual
|
||||
// edit, git checkout) clobbered the file, checkUnchanged returns an
|
||||
// error so the caller refuses the UPDATE instead of losing the
|
||||
// external write.
|
||||
func TestZoneFile_CheckUnchanged_DetectsExternalModification(t *testing.T) {
|
||||
dir := withTempZonesDir(t, "auth.example.com")
|
||||
path := filepath.Join(dir, "auth.example.com.zone")
|
||||
zf := openZoneFile(path, "auth.example.com.")
|
||||
zf.AutoCommit = false
|
||||
|
||||
_, snap, err := zf.loadRRs()
|
||||
if err != nil {
|
||||
t.Fatalf("loadRRs: %v", err)
|
||||
}
|
||||
|
||||
// Pristine snapshot — checkUnchanged should pass.
|
||||
if err := zf.checkUnchanged(snap); err != nil {
|
||||
t.Errorf("pristine snapshot rejected: %v", err)
|
||||
}
|
||||
|
||||
// Simulate an external editor clobbering the file (rsync-style:
|
||||
// new content, different size, mtime advances).
|
||||
time.Sleep(10 * time.Millisecond) // ensure mtime differs
|
||||
if err := os.WriteFile(path, []byte("; external edit\n$ORIGIN auth.example.com.\nauth.example.com. 3600 IN SOA ns.example. admin.example. 1 60 60 60 60\n"), 0644); err != nil {
|
||||
t.Fatalf("simulated external edit: %v", err)
|
||||
}
|
||||
|
||||
if err := zf.checkUnchanged(snap); err == nil {
|
||||
t.Errorf("checkUnchanged accepted modified file; expected concurrent-modification error")
|
||||
}
|
||||
}
|
||||
|
||||
func TestZoneFile_WriteAtomic_FileEndsWithNewline(t *testing.T) {
|
||||
dir := withTempZonesDir(t, "auth.example.com")
|
||||
path := filepath.Join(dir, "auth.example.com.zone")
|
||||
zf := openZoneFile(path, "auth.example.com.")
|
||||
zf.AutoCommit = false
|
||||
|
||||
rrs, _, _ := zf.loadRRs()
|
||||
rrs, _ := zf.loadRRs()
|
||||
_ = zf.writeAtomic(rrs, time.Now())
|
||||
|
||||
data, _ := os.ReadFile(path)
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user