Commit graph

43 commits

Author SHA1 Message Date
Copilot 3d65c0e8e5 feat(scrape): cron opt-in salvage-before-scrape
setup-cron.sh forwards --salvage-before-scrape to documents scrape for
operators recovering from OOM partials on scheduled runs.
2026-06-03 11:35:50 -05:00
Copilot df76389ca8 docs(scrape): fix branch HEAD stamp after plan 080 commit 2026-06-03 11:32:04 -05:00
Copilot 8684138363 docs(scrape): plan 080 PR body sync for plans 070-079
Record branch HEAD and offline gate in merge-readiness before updating
PR #1538 with the compact operator delta block.
2026-06-03 11:31:27 -05:00
Copilot b71c697530 feat(scrape): cron uses documents scrape with --log-file
Monthly cron now runs the unified documents workflow with teed logs
and paired JSON summaries instead of host scrape shell redirect.
2026-06-03 11:27:12 -05:00
Copilot 759e33efe9 feat(scrape): add --log-file tee to documents scrape
Live runs auto-write logs/documents-scrape-UTC.log and pair JSON
summary with the log basename; optional --log-file overrides the path.
2026-06-03 11:21:59 -05:00
Copilot 33faba74d6 docs(scrape): sync smoke inventory to 23 offline tests
Add print-scrape-summary and scrape-summary-json smokes to the setup
table and update merge-readiness gate count from 21/21 to 23/23.
2026-06-03 11:15:17 -05:00
Copilot c8ed19d26b feat(scrape): per-target JSON summaries in multi-target loops
Validation --per-target and multi-target proof now pass --summary-file
per scrape so each target gets its own operator-*-<target>-UTC summary.
2026-06-03 11:08:44 -05:00
Copilot 8c36fdbdda feat(scrape): auto JSON summary on documents scrape runs
Enable DCE_RUN_SUMMARY_JSON by default for live run-documents-scrape
paths with optional --summary-file override; skip on dry-run/salvage-only.
2026-06-03 10:57:32 -05:00
Copilot a929be48e8 feat(scrape): add print-scrape-summary CLI for JSON artifacts
Pretty-print version-1 scrape summary files with totals table, --oom-only
filter, and stdin support for operator validation/proof outputs.
2026-06-03 10:45:55 -05:00
Copilot dbc887d81c feat(scrape): JSON summary export for operator proof runs
Auto-enable DCE_RUN_SUMMARY_* when proof scrapes, support --log-file,
and recover summary JSON from the teed proof log when file write fails.
2026-06-03 10:35:48 -05:00
Copilot 35a7416d8f feat(scrape): recover JSON summary from host compose run log
Reuse shared recover helper before deleting the temp compose log when
DCE_RUN_SUMMARY_FILE is missing after a successful host scrape.
2026-06-03 10:30:14 -05:00
Copilot fcea842fe3 feat(scrape): recover JSON summary from teed validation log
When DCE_RUN_SUMMARY_FILE is missing after operator validation, extract
the last DCE_JSON_SUMMARY line from the log. Refresh KotOR operator docs.
2026-06-03 10:25:23 -05:00
Copilot 5cfb2ed144 feat(scrape): host compose passthrough for JSON summary
Mount logs/ in compose, map DCE_RUN_SUMMARY_FILE to /logs, and auto-enable
JSON summary beside operator-validation log files when scraping.
2026-06-03 10:18:33 -05:00
Copilot 1dda40ae1b feat(scrape): optional JSON run summary for automation
Emit DCE_JSON_SUMMARY log line and/or write DCE_RUN_SUMMARY_FILE
with per-channel actions and totals after scrape completes.
2026-06-03 10:08:44 -05:00
Copilot aa85fe50fa feat(verify): show per-target container_memory in operator checks
Archive verify table adds MEM column; verify-operator-ready lists
config target memory when global DCE_CONTAINER_MEMORY is unset.
2026-06-03 10:00:27 -05:00
Copilot 8ca55f299b feat(scrape): per-target container_memory in scrape config
Single --target runs apply optional container_memory from
scrape-targets.json when global DCE_CONTAINER_MEMORY is unset.
KotOR_discord_msgs defaults to 8g; scrape.env still overrides.
2026-06-03 09:55:33 -05:00
Copilot d280ba86e9 docs: note channel-scoped prove-incremental-append usage 2026-06-03 09:46:33 -05:00
Copilot 3e96514f3e feat(prove): filter incremental snapshots by --channel
Channel-scoped proof runs snapshot and compare only selected archives,
so yes_general-focused validation ignores unrelated KotOR channels.
Smoke covers filtered snapshot-only mode; exclude .dce-temp from find.
2026-06-03 09:44:33 -05:00
Copilot a827e6b9bc feat(scrape): label OOM skips and hint container memory
Classify aborted/OOM export skips as SKIPPED (OOM/aborted) in the run
summary with salvage/memory guidance; verify-operator-ready shows
configured DCE_CONTAINER_MEMORY.
2026-06-03 09:38:45 -05:00
Copilot e9a3fea9d1 docs(scrape): add OOM, lock, and salvage troubleshooting
Document container OOM skips, scrape-lock contention, partial temp
salvage, and DCE_CONTAINER_MEMORY in the troubleshooting guide and
GUI bridge quick-start.
2026-06-03 09:32:31 -05:00
Copilot 69ce1ca539 feat(scrape): optional DCE_CONTAINER_MEMORY compose mem_limit
Operators can raise the scrape container memory cap for large channel
catch-up (e.g. yes_general) via scrape.env without changing default runs.
2026-06-03 09:23:37 -05:00
Copilot 88267c835c docs(scrape): complete offline smoke inventory in setup guide
Align Recurring-Scrape-Setup smoke table with all 21 offline scripts
and note plan 061 shared scrape-lock library in merge-readiness.
2026-06-03 09:14:04 -05:00
Copilot ad5384ecc1 docs(scrape): add salvage and lock operator playbook
Document scrape-lock-status, reclaim-stale, and salvage-before flags in
operator checklist, merge-readiness, and GUI bridge guide.
2026-06-03 07:10:18 -05:00
Copilot ee62078f5b fix(scrape): skip SIGTERM/SIGINT export aborts like OOM
Stopping validation with kill/Ctrl+C returned exit 143/130 and failed
the whole target instead of SKIPPED + preserve partial. Added smoke for
exit 143; gitignore .dce-scrape.lock.
2026-06-03 06:06:15 -05:00
Copilot b9bb4bbe64 fix(host): flock scrape lock prevents concurrent container exports
Overlapping run-operator-validation invocations spawned twin yes_general
exports and repeated OOM skips. Host scrape now holds .dce-scrape.lock;
smokes bypass via DCE_SKIP_SCRAPE_LOCK. Added lock smoke (20/20 pass).
2026-06-03 06:03:47 -05:00
Copilot 928c0ef682 fix(audit): exclude .dce-temp partial exports from JSON audit
Operator validation failed when yes_general OOM left truncated exports
under .dce-temp. Audit and archive verification now skip in-progress temps;
smoke covers the partial-temp case. KotOR audit passes with temps present.
2026-06-03 05:59:54 -05:00
Copilot 8b54b6a498 test(scrape): preserve-partial smoke; fix host token-file precedence
Add offline regression for OOM skip preserving partial export temps.
Host wrapper now prefers DISCORD_TOKEN_FILE over inherited shell tokens
and always writes explicit compose env for auth-retry. All 19 smokes pass.
2026-06-03 05:52:39 -05:00
Copilot 87537eb8b0 fix(scrape): preserve partial temps on OOM; large-file salvage merge
OOM/aborted channel exports no longer delete partial temp downloads.
Salvage uses grep boundary repair with python merge/validate for files
over 64 MiB. Retain stale temps when merge fails instead of discarding.
2026-06-03 05:35:22 -05:00
Copilot 87284816d0 test(scrape): add abort exit 134 skip smoke; plan 041 closure
Extend run-discord-scrape-smoke with skip-abort target so OOM/abort
channel skip from plan 040 has offline regression coverage. Update
merge-readiness for 2026-05-30 and KotOR validation retry in progress.
2026-06-03 00:57:11 -05:00
Copilot 1608e7cfb0 fix(scrape): skip channels on OOM/abort export exit codes
Treat CLI exit 134/137/139 and abort/OOM log patterns as skippable
so KotOR yes_general core dump does not fail the entire target scrape.
2026-06-03 00:44:06 -05:00
Copilot bc1f727907 feat(scrape): complete validation resume (8/9 targets)
Resume per-target validation for five remaining servers; clarify
validation log labels (begin/done/failed). Document 8/9 pass in
merge-readiness; KotOR_discord_msgs fails on yes_general export.
2026-05-29 23:35:35 -05:00
Copilot b089137c52 docs(scrape): record per-target validation outcomes (plan 037)
Document full-validation-latest.log results in merge-readiness:
four targets scrape+audit pass; KotOR_discord_msgs and remainder
documented as pending while long-running validation continues.
2026-05-29 21:56:00 -05:00
Copilot 0b242ddfc4 docs(scrape): stamp merge-ready after host validation
Document offline/live/cron gates; align operator checklist with
run-operator-proof.
2026-05-29 16:37:57 -05:00
Boden a4f080e6d9 docs(scrape): record live operator proof on eod_discord
Host validation passed with podman-compose and GUI token sync; note disk
headroom before large archive merges.
2026-05-29 16:36:02 -05:00
Boden 65c9fb2206 feat(scrape): operator proof script and podman-compose smoke fix
Add run-operator-proof for one-target handoff/scrape/prove flows.
Prefer podman-compose on Podman hosts but honor DCE_DOCKER_BIN overrides
so offline smokes keep using fake compose shims.
2026-05-29 16:20:25 -05:00
Boden 9c22a3efee docs(scrape): track GUI zip bridge doc in source repo
Add docs/gui-zip-recurring-scrape-bridge.md and cross-links so GUI-only
users have a versioned quick-start beside the linux-x64 zip folder.
2026-05-29 16:08:36 -05:00
Boden c0818715a8 feat(scrape): add operator-handoff verification script
Single entrypoint runs disk summary, verify-operator-ready, and
run-documents-scrape --dry-run before cron or full scrapes.
2026-05-29 16:03:22 -05:00
Boden 44eadee634 feat(scrape): disk preflight on host runner for cron jobs
run-discord-scrape-host.sh runs verify --disk-only before preflight/scrape
so setup-cron monthly jobs fail fast when archive roots are low on space.
Harden bootstrap smoke to surface failures when dry-run fails.
2026-05-29 16:00:11 -05:00
Boden 1142e376b5 fix(scrape): disk preflight before compose and skippable disk errors
Fail fast when archive or repo paths lack free space (DCE_MIN_FREE_MB),
treat disk-full export failures as skippable channels, and add an offline
disk-space smoke. Smokes default DCE_MIN_FREE_MB=0 so CI stays portable.
2026-05-29 15:27:39 -05:00
Boden 76b4231d7a feat(scrape): per-target validation with continue-on-error
Run scrape and audit per enabled server independently; log summary
counts. Full host validation started via --per-target --continue-on-error.
2026-05-29 14:20:37 -05:00
Boden 1742a9d41e feat(scrape): add run-operator-validation orchestrator
Sync GUI token, verify readiness, run documents scrape, and audit JSON
with timestamped logs. Live eod_discord validation passed on host.
2026-05-29 14:19:04 -05:00
Boden 00bcbc5b21 feat(scrape): add verify-operator-ready host checks
One command validates compose, auth, config, and seeded archives before
bootstrap or cron. Includes offline smoke test (14 smokes total).
2026-05-29 14:16:10 -05:00
Boden 927d5e9607 docs(scrape): add merge readiness index and doc cross-links
Single reviewer/operator page for the recurring scrape feature with
validation commands; link from root and .docs indexes.
2026-05-29 14:14:44 -05:00