diff --git a/.docs/Recurring-Scrape-Setup.md b/.docs/Recurring-Scrape-Setup.md
index 42f5242a..f5076a98 100644
--- a/.docs/Recurring-Scrape-Setup.md
+++ b/.docs/Recurring-Scrape-Setup.md
@@ -336,7 +336,7 @@ Space requirements:
 
 ## Smoke test validation
 
-Run the full offline suite from the repo root (requires `jq`). **23 offline smokes** run by default; add `--include-container` for a 24th local-only check:
+Run the full offline suite from the repo root (requires `jq`). **24 offline smokes** run by default; add `--include-container` for a 25th local-only check:
 
 ```bash
 ./scripts/run-all-smokes.sh
@@ -366,6 +366,7 @@ With Docker/Podman, include the container smoke:
 | `end-to-end-preflight-smoke.sh` | yes | Preflight wiring |
 | `error-path-smoke.sh` | yes | Failure paths |
 | `gh-approve-pr-runs-smoke.sh` | yes | Fork PR workflow helper |
+| `kotor-yes-general-catchup-smoke.sh` | yes | KotOR yes_general wrapper dry-run |
 | `operator-handoff-smoke.sh` | yes | Operator handoff dry-run |
 | `print-scrape-summary-smoke.sh` | yes | JSON summary pretty-print CLI |
 | `prove-incremental-append-smoke.sh` | yes | Offline prove snapshot/compare |
diff --git a/docs/plans/2026-06-04-083-feat-kotor-yes-general-wrapper-plan.md b/docs/plans/2026-06-04-083-feat-kotor-yes-general-wrapper-plan.md
new file mode 100644
index 00000000..ab56db36
--- /dev/null
+++ b/docs/plans/2026-06-04-083-feat-kotor-yes-general-wrapper-plan.md
@@ -0,0 +1,51 @@
+---
+title: "feat: KotOR yes_general catch-up wrapper"
+type: feat
+status: complete
+date: 2026-06-04
+origin: /lfg — plan 082 deferred live KotOR catch-up; encode operator path as one script
+---
+
+# feat: KotOR yes_general catch-up wrapper
+
+## Summary
+
+Add `scripts/run-kotor-yes-general-catchup.sh` — thin wrapper for channel `221726893064454144` with default log/summary paths, `--salvage-before-scrape`, and subcommands for validation/prove/salvage-only/dry-run.
+
+## Problem Frame
+
+KotOR yes_general catch-up is documented in five places with long flag chains. Operators need one entry point; LFG still cannot run live Discord scrape in CI.
+
+## Requirements
+
+| ID | Requirement |
+|----|-------------|
+| R1 | Default live run: `--salvage-before-scrape` + documents scrape + `--log-file logs/kotor-yes-general.log` |
+| R2 | `--dry-run`, `--salvage-only`, `--validation`, `--prove` modes |
+| R3 | `--log-file` and `--config` overrides |
+| R4 | Prints summary inspect hint after live scrape |
+| R5 | `kotor-yes-general-catchup-smoke.sh` dry-run passes offline |
+| R6 | Docs updated; smoke count **24/24** in setup doc + merge-readiness |
+| R7 | `DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh` → 24/24 |
+
+## Implementation Units
+
+### U1. Wrapper script
+
+**Files:** `scripts/run-kotor-yes-general-catchup.sh`, `scripts/tests/kotor-yes-general-catchup-smoke.sh`
+
+### U2. Docs
+
+**Files:** `docs/recurring-scrape-merge-readiness.md`, `docs/recurring-scrape-operator-checklist.md`, `.docs/Recurring-Scrape-Setup.md`
+
+## Verification
+
+```bash
+DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh
+```
+
+## Scope Boundaries
+
+### Deferred
+
+- Running live KotOR catch-up inside LFG/CI (operator host only)
diff --git a/docs/recurring-scrape-merge-readiness.md b/docs/recurring-scrape-merge-readiness.md
index 303e7a47..bd2d92e9 100644
--- a/docs/recurring-scrape-merge-readiness.md
+++ b/docs/recurring-scrape-merge-readiness.md
@@ -4,7 +4,7 @@
 
 | Gate | Status |
 |------|--------|
-| Offline smokes (`run-all-smokes.sh`) | 23/23 pass |
+| Offline smokes (`run-all-smokes.sh`) | 24/24 pass |
 | Branch HEAD (fork) | `18a22a6` — PR #1538 pruned stale Latest blocks (plan 082) |
 | Live proof (`run-operator-proof.sh --sync-gui --target eod_discord`) | Passed on maintainer host |
 | Monthly cron (`setup-cron.sh`) | Installed (`00 04 1 * *`); dry-run preflight OK for all enabled targets |
@@ -147,6 +147,13 @@ DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh --target KotOR_discord_ms
 
 Large `yes_general` may still skip without a higher container cap; `KotOR_discord_msgs` sets `container_memory: "8g"` in `scrape-targets.json` for single-target runs (override globally with `DCE_CONTAINER_MEMORY` in `scrape.env`):
 
+```bash
+./scripts/run-kotor-yes-general-catchup.sh
+# writes logs/kotor-yes-general.log + .summary.json; --dry-run | --validation | --prove
+```
+
+Manual equivalent:
+
 ```bash
 DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
   --salvage-before-scrape --target KotOR_discord_msgs \
@@ -195,6 +202,8 @@ DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
 
 **Plan 082 (2026-06-04):** PR #1538 pruned 30+ stale `Latest` blocks; single plans 070–081 operator delta remains.
 
+**Plan 083 (2026-06-04):** `run-kotor-yes-general-catchup.sh` — one-command yes_general path (salvage-before, log, summary hint).
+
 **Disk:** ~65 GiB free on `/home` (2026-05-30); large channel merges still need headroom.
 
 ## CI note (fork PRs)
diff --git a/docs/recurring-scrape-operator-checklist.md b/docs/recurring-scrape-operator-checklist.md
index bebf1599..388f3549 100644
--- a/docs/recurring-scrape-operator-checklist.md
+++ b/docs/recurring-scrape-operator-checklist.md
@@ -60,7 +60,14 @@ Salvage then incremental scrape:
 # Live documents scrape auto-tees to logs/documents-scrape-.log (or --log-file); summary at .summary.json
 ```
 
-**KotOR yes_general** (`221726893064454144`): first catch-up after a 2021 archive cursor can take hours and may OOM; salvage preserved partials before retrying. Stop duplicate validation processes (MyBook vs Downloads checkouts share the same lock). `KotOR_discord_msgs` sets `container_memory: "8g"` in `scrape-targets.json` for single-target runs; override globally with `DCE_CONTAINER_MEMORY` in `scrape.env` if needed. Channel-scoped proof:
+**KotOR yes_general** (`221726893064454144`): first catch-up after a 2021 archive cursor can take hours and may OOM; salvage preserved partials before retrying. One-command path:
+
+```bash
+./scripts/run-kotor-yes-general-catchup.sh
+# Or: --dry-run | --salvage-only | --validation | --prove
+```
+
+Manual equivalent:
 
 ```bash
 ./scripts/run-operator-validation.sh --salvage-before-scrape \
diff --git a/scripts/run-kotor-yes-general-catchup.sh b/scripts/run-kotor-yes-general-catchup.sh
new file mode 100755
index 00000000..8fa80a59
--- /dev/null
+++ b/scripts/run-kotor-yes-general-catchup.sh
@@ -0,0 +1,122 @@
+#!/usr/bin/env bash
+
+set -Eeuo pipefail
+
+SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)
+REPO_ROOT="${DCE_REPO_ROOT:-$(cd "$SCRIPT_DIR/.." && pwd -P)}"
+CONFIG_PATH="${DCE_CONFIG_FILE:-$REPO_ROOT/config/scrape-targets.json}"
+LOG_DIR="${DCE_LOG_DIR:-$REPO_ROOT/logs}"
+LOG_FILE="${DCE_KOTOR_LOG_FILE:-$LOG_DIR/kotor-yes-general.log}"
+DOCUMENTS="$REPO_ROOT/scripts/run-documents-scrape.sh"
+VALIDATION="$REPO_ROOT/scripts/run-operator-validation.sh"
+PROVE="$REPO_ROOT/scripts/prove-incremental-append.sh"
+PRINT_SUMMARY="$REPO_ROOT/scripts/print-scrape-summary.sh"
+
+TARGET=KotOR_discord_msgs
+CHANNEL=221726893064454144
+
+usage() {
+  cat <&2
+  exit 1
+}
+
+main() {
+  local dry_run=0 salvage_only=0 validation=0 prove=0
+
+  while (($#)); do
+    case "$1" in
+      --dry-run)
+        dry_run=1
+        shift
+        ;;
+      --salvage-only)
+        salvage_only=1
+        shift
+        ;;
+      --validation)
+        validation=1
+        shift
+        ;;
+      --prove)
+        prove=1
+        shift
+        ;;
+      --log-file)
+        [[ $# -ge 2 ]] || die "Missing value for --log-file."
+        LOG_FILE=$2
+        shift 2
+        ;;
+      --config)
+        [[ $# -ge 2 ]] || die "Missing value for --config."
+        CONFIG_PATH=$2
+        shift 2
+        ;;
+      --help|-h)
+        usage
+        exit 0
+        ;;
+      *)
+        die "Unknown option: $1"
+        ;;
+    esac
+  done
+
+  local modes=0
+  (( dry_run == 1 )) && modes=$((modes + 1))
+  (( salvage_only == 1 )) && modes=$((modes + 1))
+  (( validation == 1 )) && modes=$((modes + 1))
+  (( prove == 1 )) && modes=$((modes + 1))
+  (( modes > 1 )) && die "Use only one of --dry-run, --salvage-only, --validation, or --prove."
+
+  local -a common=(--config "$CONFIG_PATH" --target "$TARGET" --channel "$CHANNEL")
+
+  if (( prove == 1 )); then
+    exec "$PROVE" "${common[@]}"
+  fi
+
+  if (( validation == 1 )); then
+    exec "$VALIDATION" --salvage-before-scrape "${common[@]}" --log-file "$LOG_FILE"
+  fi
+
+  if (( dry_run == 1 )); then
+    exec "$DOCUMENTS" --dry-run "${common[@]}"
+  fi
+
+  if (( salvage_only == 1 )); then
+    exec "$DOCUMENTS" --salvage-only "${common[@]}"
+  fi
+
+  printf 'KotOR yes_general catch-up: target=%s channel=%s\n' "$TARGET" "$CHANNEL"
+  printf 'Log file: %s\n' "$LOG_FILE"
+  local st=0  # set via || below; a bare failing command would trip errexit before the hint prints
+  "$DOCUMENTS" --salvage-before-scrape "${common[@]}" --log-file "$LOG_FILE" || st=$?
+  printf 'Inspect summary: %s %s.summary.json\n' "$PRINT_SUMMARY" "${LOG_FILE%.log}"
+  exit "$st"
+}
+
+main "$@"
diff --git a/scripts/tests/kotor-yes-general-catchup-smoke.sh b/scripts/tests/kotor-yes-general-catchup-smoke.sh
new file mode 100755
index 00000000..62491764
--- /dev/null
+++ b/scripts/tests/kotor-yes-general-catchup-smoke.sh
@@ -0,0 +1,61 @@
+#!/usr/bin/env bash
+
+set -Eeuo pipefail
+
+REPO_ROOT=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd -P)
+RUNNER="$REPO_ROOT/scripts/run-kotor-yes-general-catchup.sh"
+TMP_DIR=$(mktemp -d "${TMPDIR:-/tmp}/dce-kotor-catchup-smoke.XXXXXX")
+ARCHIVE="$TMP_DIR/archive/kotor"
+CONFIG_PATH="$TMP_DIR/config.json"
+
+cleanup() {
+  rm -rf "$TMP_DIR"
+}
+trap cleanup EXIT
+
+mkdir -p "$ARCHIVE"
+printf '{"messages":[{"id":"1"}],"channel":{"id":"221726893064454144"}}\n' \
+  >"$ARCHIVE/Guild - yes_general [221726893064454144].json"
+
+cat >"$CONFIG_PATH" <"$OUT" 2>&1
+
+grep -q 'KotOR_discord_msgs' "$OUT" || {
+  printf 'ERROR: dry-run missing target in plan output\n' >&2
+  cat "$OUT" >&2
+  exit 1
+}
+grep -q '221726893064454144' "$RUNNER" || {
+  printf 'ERROR: wrapper missing yes_general channel id\n' >&2
+  exit 1
+}
+grep -q 'Documents scrape run plan' "$OUT" || {
+  printf 'ERROR: dry-run missing documents scrape plan\n' >&2
+  exit 1
+}
+grep -q 'JSON summary file:' "$OUT" && {
+  printf 'ERROR: dry-run should not enable JSON summary\n' >&2
+  exit 1
+}
+
+HELP=$("$RUNNER" --help 2>&1)
+grep -q 'yes_general' <<<"$HELP" || {
+  printf 'ERROR: help missing yes_general reference\n' >&2
+  exit 1
+}
+
+printf 'kotor-yes-general-catchup-smoke: ok\n'
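
Review note on R4 (summary hint after a live scrape): under `set -Eeuo pipefail`, a command that fails on its own line ends the script immediately, so a hint printed after the scrape is only reached on failure if the status is captured with `||`. The sketch below shows that pattern in isolation, together with the `${LOG_FILE%.log}.summary.json` path derivation. The helper names `summary_path_for` and `run_with_hint` are hypothetical and are not part of this change; `false` stands in for a failing scrape.

```shell
#!/usr/bin/env bash
set -Eeuo pipefail

# Hypothetical helper: map a log path to its summary path by dropping a
# trailing ".log" and appending ".summary.json".
summary_path_for() {
  local log_file=$1
  printf '%s.summary.json\n' "${log_file%.log}"
}

# Hypothetical helper: run a command, always print the hint, and hand the
# command's exit status back to the caller.
run_with_hint() {
  local log_file=$1
  shift
  local st=0
  # "|| st=$?" keeps errexit from aborting before the hint is printed.
  "$@" || st=$?
  printf 'Inspect summary: %s\n' "$(summary_path_for "$log_file")"
  return "$st"
}

# Demo with a failing stand-in command.
run_with_hint logs/kotor-yes-general.log false || printf 'exit status: %s\n' "$?"
# prints:
#   Inspect summary: logs/kotor-yes-general.summary.json
#   exit status: 1
```

The same shape works for any wrapper that has to report something after its main command while still propagating that command's exit code.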