feat(scrape): shared KotOR catch-up hint lib (plan 086)

Extract kotor-catchup-hint.sh for operator-handoff, verify-operator-ready,
and bootstrap; extend smokes and scrape.env.example.
Copilot 2026-06-03 12:37:32 -05:00
parent 79129433c7
commit e1667c6cf0
9 changed files with 142 additions and 15 deletions


@@ -0,0 +1,47 @@
---
title: "feat: shared KotOR catch-up hint lib"
type: feat
status: complete
date: 2026-06-04
origin: /lfg — plan 085 added handoff hint; verify-operator-ready and bootstrap still omit KotOR wrapper
---
# feat: shared KotOR catch-up hint lib
## Summary
Extract `scripts/lib/kotor-catchup-hint.sh` from operator-handoff and call it from `verify-operator-ready.sh` and `bootstrap-recurring-scrape.sh` when `KotOR_discord_msgs` is enabled.
## Requirements
| ID | Requirement |
|----|-------------|
| R1 | Shared lib prints wrapper + summary inspect lines |
| R2 | `operator-handoff.sh` sources lib (no duplicated jq logic) |
| R3 | `verify-operator-ready.sh` prints hint after "Operator ready. Next:" |
| R4 | `bootstrap-recurring-scrape.sh` prints hint after dry-run and live bootstrap complete |
| R5 | `scrape.env.example` cites `run-kotor-yes-general-catchup.sh` for yes_general |
| R6 | Smokes assert hint from verify + bootstrap KotOR fixtures |
| R7 | `DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh` → 24/24 |
## Implementation Units
### U1. Lib + script wiring
**Files:** `scripts/lib/kotor-catchup-hint.sh`, `scripts/operator-handoff.sh`, `scripts/verify-operator-ready.sh`, `scripts/bootstrap-recurring-scrape.sh`, `scrape.env.example`
### U2. Smokes + docs
**Files:** `scripts/tests/verify-operator-ready-smoke.sh`, `scripts/tests/bootstrap-recurring-scrape-smoke.sh`, `docs/recurring-scrape-merge-readiness.md`
## Verification
```bash
DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh
```
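R6 comes down to capturing a script's combined output and grepping it for the wrapper name. A minimal sketch of that assertion pattern follows; `fake_verify` and `assert_output_contains` are hypothetical stand-ins for illustration, not names from this repository.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a script under test (e.g. a verify or bootstrap run).
fake_verify() {
  printf '\nOperator ready. Next:\n'
  printf '  ./scripts/run-documents-scrape.sh\n'
  printf '  ./scripts/run-kotor-yes-general-catchup.sh\n'
}

# assert_output_contains NEEDLE OUTPUT LABEL
# Fails with a diagnostic (and the captured output) when NEEDLE is absent.
assert_output_contains() {
  local needle=$1 output=$2 label=$3
  if ! grep -qF -- "$needle" <<<"$output"; then
    printf 'ERROR: %s missing %s\n' "$label" "$needle" >&2
    printf '%s\n' "$output" >&2
    return 1
  fi
}

# Capture stdout and stderr together, then assert on the hint line.
output=$(fake_verify 2>&1)
assert_output_contains 'run-kotor-yes-general-catchup.sh' "$output" 'fake_verify'
printf 'hint-smoke: ok\n'
```

Printing the captured output on failure keeps a red smoke run debuggable without rerunning the script by hand.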
## Scope Boundaries
### Deferred
- Live KotOR catch-up on operator host


@@ -208,6 +208,8 @@ DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
 **Plan 085 (2026-06-04):** `operator-handoff.sh` prints KotOR catch-up hint when `KotOR_discord_msgs` is enabled.
+**Plan 086 (2026-06-04):** Shared `lib/kotor-catchup-hint.sh`; verify-operator-ready + bootstrap print same hint.
 **Disk:** ~65 GiB free on `/home` (2026-05-30); large channel merges still need headroom.
 ## CI note (fork PRs)


@@ -20,8 +20,8 @@ DCE_GID=1000
 DCE_USERNS_MODE=
 # Optional: raise scrape container memory for multi-year channel catch-up (yes_general, etc.).
+# KotOR yes_general one-command path: ./scripts/run-kotor-yes-general-catchup.sh
 # Examples: 8g, 8192m. Default 0 = no compose memory cap.
-# Optional: raise container memory for large multi-year channel catch-up (compose mem_limit).
 # Per-target: set container_memory on a target in config/scrape-targets.json (single --target runs).
 # Global override (wins over config): uncomment below.
 # DCE_CONTAINER_MEMORY=8g


@@ -10,6 +10,8 @@ COMPOSE_FILE="${DCE_COMPOSE_FILE:-$REPO_ROOT/docker-compose.yml}"
 HOST_RUNNER="$REPO_ROOT/scripts/run-discord-scrape-host.sh"
 VERIFY_SCRIPT="$REPO_ROOT/scripts/verify-documents-archives.sh"
 SETUP_AUTH="$REPO_ROOT/scripts/setup-scrape-auth.sh"
+# shellcheck source=lib/kotor-catchup-hint.sh
+source "$SCRIPT_DIR/lib/kotor-catchup-hint.sh"
 DRY_RUN=0
 SKIP_BUILD=0
@@ -118,6 +120,7 @@ main() {
   if (( DRY_RUN == 1 )); then
     printf 'Dry run complete: archive paths verified under configured output_dir values.\n'
     printf 'Next: cp scrape.env.example scrape.env, set DISCORD_TOKEN, then rerun without --dry-run.\n'
+    print_kotor_catchup_hint "$CONFIG_PATH"
     exit 0
   fi
@@ -161,6 +164,7 @@ main() {
   printf ' Scrape now: %s\n' "$REPO_ROOT/scripts/run-documents-scrape.sh"
   printf ' Install cron: %s --dry-run\n' "$REPO_ROOT/scripts/setup-cron.sh"
+  print_kotor_catchup_hint "$CONFIG_PATH"
 }
 main "$@"


@@ -0,0 +1,19 @@
#!/usr/bin/env bash
# Shared post-check hint when KotOR_discord_msgs is enabled in scrape config.

kotor_catchup_enabled() {
  local config_path=$1
  command -v jq >/dev/null 2>&1 || return 1
  [[ -f "$config_path" ]] || return 1
  jq -e '.targets[] | select(.name == "KotOR_discord_msgs" and (.enabled // true) == true)' \
    "$config_path" >/dev/null 2>&1
}

print_kotor_catchup_hint() {
  local config_path=$1
  kotor_catchup_enabled "$config_path" || return 0
  printf '\nKotOR yes_general catch-up (channel 221726893064454144):\n'
  printf ' ./scripts/run-kotor-yes-general-catchup.sh\n'
  printf ' ./scripts/print-scrape-summary.sh logs/kotor-yes-general.summary.json\n'
}
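
One detail in `kotor_catchup_enabled` is worth checking against intent: jq's `//` operator falls back on `false` as well as `null`, so `(.enabled // true) == true` also holds for a target with `"enabled": false`. If a disabled target is meant to suppress the hint, a stricter check could look like the sketch below. The name `kotor_catchup_enabled_strict` and the fixture files are hypothetical and not part of this commit.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stricter variant: only a literal `false` disables the target;
# a missing or null `enabled` still counts as enabled.
kotor_catchup_enabled_strict() {
  local config_path=$1
  command -v jq >/dev/null 2>&1 || return 1
  [[ -f "$config_path" ]] || return 1
  # Collect matching targets into an array so jq -e exits 0/1 on a boolean.
  jq -e '[.targets[]? | select(.name == "KotOR_discord_msgs" and .enabled != false)] | length > 0' \
    "$config_path" >/dev/null 2>&1
}

# Throwaway fixtures: one explicitly disabled target, one with no `enabled` key.
tmp_dir=$(mktemp -d)
trap 'rm -rf "$tmp_dir"' EXIT
printf '{"targets":[{"name":"KotOR_discord_msgs","enabled":false}]}\n' >"$tmp_dir/disabled.json"
printf '{"targets":[{"name":"KotOR_discord_msgs"}]}\n' >"$tmp_dir/default.json"

for cfg in disabled default; do
  if kotor_catchup_enabled_strict "$tmp_dir/$cfg.json"; then
    printf '%s: hint shown\n' "$cfg"
  else
    printf '%s: hint suppressed\n' "$cfg"
  fi
done
```

Comparing with `!= false` keeps the "enabled by default" behavior for targets that omit the key, which appears to be what the `// true` fallback was aiming for.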


@@ -6,6 +6,8 @@ SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)
 REPO_ROOT="${DCE_REPO_ROOT:-$(cd "$SCRIPT_DIR/.." && pwd -P)}"
 # shellcheck source=lib/scrape-run-plan.sh
 source "$SCRIPT_DIR/lib/scrape-run-plan.sh"
+# shellcheck source=lib/kotor-catchup-hint.sh
+source "$SCRIPT_DIR/lib/kotor-catchup-hint.sh"
 CONFIG_PATH="${DCE_CONFIG_FILE:-$REPO_ROOT/config/scrape-targets.json}"
 VERIFY_READY="$REPO_ROOT/scripts/verify-operator-ready.sh"
 DOCUMENTS_SCRAPE="$REPO_ROOT/scripts/run-documents-scrape.sh"
@@ -60,19 +62,6 @@ require_command() {
   command -v "$1" >/dev/null 2>&1 || die "Required command '$1' is missing."
 }
-kotor_catchup_enabled() {
-  require_command jq
-  jq -e '.targets[] | select(.name == "KotOR_discord_msgs" and (.enabled // true) == true)' \
-    "$CONFIG_PATH" >/dev/null 2>&1
-}
-print_kotor_catchup_hint() {
-  kotor_catchup_enabled || return 0
-  printf '\nKotOR yes_general catch-up (channel 221726893064454144):\n'
-  printf ' ./scripts/run-kotor-yes-general-catchup.sh\n'
-  printf ' ./scripts/print-scrape-summary.sh logs/kotor-yes-general.summary.json\n'
-}
 main() {
   while (($#)); do
     case "$1" in
@@ -154,7 +143,7 @@ main() {
   if (( ! SALVAGE_ONLY )); then
     printf ' ./scripts/setup-cron.sh --dry-run\n'
   fi
-  print_kotor_catchup_hint
+  print_kotor_catchup_hint "$CONFIG_PATH"
 }
 main "$@"


@@ -52,4 +52,38 @@ if [[ "$bootstrap_status" -ne 0 ]] || ! grep -q 'Dry run complete' <<<"$bootstra
   exit 1
 fi
+KOTOR_ARCHIVE="$TMP_DIR/archive/kotor"
+mkdir -p "$KOTOR_ARCHIVE"
+printf '{"messages":[{"id":"1","timestamp":"2020-01-01T00:00:00+00:00"}],"channel":{"id":"221726893064454144"}}\n' \
+  >"$KOTOR_ARCHIVE/Guild - yes_general [221726893064454144].json"
+KOTOR_CONFIG="$TMP_DIR/kotor-config.json"
+cat >"$KOTOR_CONFIG" <<JSON
+{
+  "archive_root": "$TMP_DIR",
+  "targets": [
+    {
+      "name": "KotOR_discord_msgs",
+      "kind": "guild",
+      "output_dir": "$KOTOR_ARCHIVE",
+      "enabled": true
+    }
+  ]
+}
+JSON
+set +e
+kotor_output=$("$BOOTSTRAP" --dry-run --config "$KOTOR_CONFIG" 2>&1)
+kotor_status=$?
+set -e
+if [[ "$kotor_status" -ne 0 ]] || ! grep -q 'Dry run complete' <<<"$kotor_output"; then
+  printf 'bootstrap KotOR dry-run failed (status=%s)\n' "$kotor_status" >&2
+  printf '%s\n' "$kotor_output" >&2
+  exit 1
+fi
+grep -q 'run-kotor-yes-general-catchup.sh' <<<"$kotor_output" || {
+  printf 'bootstrap dry-run missing KotOR catch-up hint\n' >&2
+  exit 1
+}
 printf 'bootstrap-recurring-scrape-smoke: ok\n'


@@ -92,4 +92,33 @@ if DCE_MIN_FREE_MB=0 DCE_REPO_ROOT="$REPO_ROOT" DCE_CONFIG_FILE="$CONFIG_PATH" D
   exit 1
 fi
+KOTOR_ARCHIVE="$TMP_DIR/archive/kotor"
+mkdir -p "$KOTOR_ARCHIVE"
+printf '{"messages":[{"id":"1"}],"channel":{"id":"221726893064454144"}}\n' \
+  >"$KOTOR_ARCHIVE/Guild - yes_general [221726893064454144].json"
+KOTOR_CONFIG="$TMP_DIR/kotor-config.json"
+cat >"$KOTOR_CONFIG" <<JSON
+{
+  "archive_root": "$ARCHIVE_ROOT",
+  "targets": [
+    {
+      "name": "KotOR_discord_msgs",
+      "kind": "guild",
+      "output_dir": "$KOTOR_ARCHIVE",
+      "enabled": true
+    }
+  ]
+}
+JSON
+kotor_output=$(
+  DCE_MIN_FREE_MB=0 DCE_REPO_ROOT="$REPO_ROOT" DCE_CONFIG_FILE="$KOTOR_CONFIG" DCE_ENV_FILE="$ENV_PATH" \
+    "$VERIFY" --config "$KOTOR_CONFIG" 2>&1
+)
+grep -q 'run-kotor-yes-general-catchup.sh' <<<"$kotor_output" || {
+  printf 'ERROR: verify-operator-ready missing KotOR catch-up hint\n' >&2
+  printf '%s\n' "$kotor_output" >&2
+  exit 1
+}
 printf 'verify-operator-ready-smoke: ok\n'


@@ -6,6 +6,8 @@ SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)
 REPO_ROOT="${DCE_REPO_ROOT:-$(cd "$SCRIPT_DIR/.." && pwd -P)}"
 # shellcheck source=lib/scrape-run-plan.sh
 source "$SCRIPT_DIR/lib/scrape-run-plan.sh"
+# shellcheck source=lib/kotor-catchup-hint.sh
+source "$SCRIPT_DIR/lib/kotor-catchup-hint.sh"
 CONFIG_PATH="${DCE_CONFIG_FILE:-$REPO_ROOT/config/scrape-targets.json}"
 ENV_FILE="${DCE_ENV_FILE:-$REPO_ROOT/scrape.env}"
 HOST_RUNNER="$REPO_ROOT/scripts/run-discord-scrape-host.sh"
@@ -197,6 +199,7 @@ main() {
   printf '\nOperator ready. Next:\n'
   printf ' ./scripts/run-documents-scrape.sh\n'
   printf ' ./scripts/setup-cron.sh --dry-run\n'
+  print_kotor_catchup_hint "$CONFIG_PATH"
 }
 main "$@"