2 Commits

Author SHA1 Message Date
25a34cd24d Add atomic .part staging + runtime download root tools
Atomic write pattern (tier-3 polish from headless test finding):
- download_to_file now writes to <dest>.part and renames to <dest> only on
  successful stream completion (os.replace is POSIX-atomic). Failed
  downloads leave only the .part file — no misleading 0-byte dest files
  in the user's downloads directory.
- Resume logic reads from <dest>.part instead of <dest>; the user's
  directory only ever contains complete files or clearly-marked .part files.
- New `already_complete` short-circuit: if dest exists and no .part, skip
  the network entirely (still re-verify MD5 if requested). The headless
  Claude test confirmed this avoids redundant CDN load.
- Symlink rejection re-added at the new code path: even though os.replace
  would only replace (not follow) a symlink at dest, predictable refusal
  beats silent symlink removal.

Runtime download root tools (for stdio MCP mode):
- get_download_root(): reports current root, source (env var vs default),
  existence, writability.
- set_download_root(path): change MCARCHIVE_DOWNLOAD_ROOT mid-session.
  Expands ~, creates the dir, refuses system paths
  (/, /etc, /usr, /bin, /sbin, /var, /sys, /proc, /dev, /boot, /root).
  The lazy-resolved root means the change takes effect on the next
  download_file call without restarting the server.

14 new tests (66 total, all green, ruff clean):
- 4 staging tests: failed download leaves no dest, success leaves no .part,
  already_complete short-circuit, MD5 verification on existing files
- 6 root-tools tests: env reporting, default reporting, ~ expansion,
  system-dir refusal (parametrized), set→download takes effect immediately
- 4 existing tests rewritten to use .part as the resume staging file

Headless Claude smoke test verified end-to-end: get_download_root →
set_download_root → search → list → download → second download
short-circuits with already_complete=true and zero network bytes.
2026-04-21 21:11:56 -06:00
6198defeca Resilience: address Hamilton tier-2 findings
H7 — Process-wide shared httpx.AsyncClient via get_shared_client().
Each tool call no longer pays a TCP+TLS handshake; connection pool is
reused across the server's lifetime. Tests inject mock transports
directly via ArchiveClient(transport=...) so the singleton stays clean.

M1 — Retry/backoff on 429/502/503/504 with Retry-After honored
(both delta-seconds and HTTP-date forms). Exponential backoff with
jitter, capped at 30s, max 3 attempts. Applied to both _fetch_json
and stream_file (retry happens BEFORE any bytes are yielded so it
can't corrupt a partial write).

M2 — Per-(identifier, filename) asyncio.Lock in download_file
serializes concurrent downloads of the same file inside one process.
Different files still download in parallel.

M5 — collection field normalized to list[str] in all output paths
(search docs, scrape items, item metadata). LLMs can write
`if 'foo' in doc['collection']` without checking the type first.

M7 — `is_collection: bool` derived from mediatype on every doc /
metadata response, so LLMs can route collection containers vs.
real media items without re-querying.

H1 — Stream-abort errors (httpx.ReadError, RemoteProtocolError,
ConnectError, ReadTimeout) caught and re-raised as ArchiveError
with bytes-written context so the caller knows where the partial
download ended. Bytes already on disk remain valid for resume.

19 new regression tests (52 total, all green, ruff clean):
- 4 tests covering retry/backoff, exhaustion, HTTP-date Retry-After
- 1 test for stream-abort byte-count surfacing
- 6 tests for collection normalization shapes
- 4 tests for is_collection in real tool flow + shared client lifecycle
- 2 tests verifying download lock: same-file serialized, different files parallel
2026-04-21 20:24:21 -06:00