diff --git a/docs/plans/2026-06-04-071-feat-tee-log-json-summary-fallback-plan.md b/docs/plans/2026-06-04-071-feat-tee-log-json-summary-fallback-plan.md
new file mode 100644
index 00000000..d2ef54c5
--- /dev/null
+++ b/docs/plans/2026-06-04-071-feat-tee-log-json-summary-fallback-plan.md
@@ -0,0 +1,88 @@
+---
+title: "feat: Recover JSON scrape summary from tee log"
+type: feat
+status: complete
+date: 2026-06-04
+origin: /lfg — plan 070 deferred auto-extract when DCE_RUN_SUMMARY_FILE write fails inside container
+---
+
+# feat: Recover JSON scrape summary from tee log
+
+## Summary
+
+When operator validation enables JSON summary export but the container does not write `DCE_RUN_SUMMARY_FILE`, recover the summary from the last `DCE_JSON_SUMMARY:` line in the teed validation log.
+
+## Problem Frame
+
+Plan 070 mounted `logs/` and mapped summary paths for compose runs. File writes can still fail (permissions, missing mount on ad-hoc runs, partial compose failures). The scrape script always logs `DCE_JSON_SUMMARY:` when `DCE_RUN_SUMMARY_JSON=1`, and operator validation tees all output to `--log-file`. A host-side fallback avoids losing machine-readable totals.
+
+## Requirements
+
+| ID | Requirement |
+|----|-------------|
+| R1 | Shared helper extracts compact JSON after the last `DCE_JSON_SUMMARY:` prefix in a log file |
+| R2 | Helper validates JSON with `jq` and writes pretty-printed output to the destination path |
+| R3 | `run-operator-validation.sh` invokes fallback when JSON export enabled and summary file missing or empty after tee completes |
+| R4 | Recovery success logs `JSON summary recovered from log:` with the file path |
+| R5 | Offline smoke covers extract-from-log without live Discord |
+| R6 | `DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh` → 21/21 |
+
+## Key Technical Decisions
+
+- **Last line wins:** Multiple scrapes in one validation run may emit several summaries; use the last `DCE_JSON_SUMMARY:` line (most recent scrape totals).
+- **No overwrite:** Only recover when destination file is missing or zero-length; do not replace an existing valid file.
+
+## Implementation Units
+
+### U1. Extract helper library
+
+**Goal:** Reusable log-line recovery for JSON summaries.
+
+**Files:** `scripts/lib/scrape-summary-json.sh`, `scripts/tests/scrape-summary-json-smoke.sh`
+
+**Approach:** `extract_json_summary_from_log(source_log dest_file)` greps for `DCE_JSON_SUMMARY:`, takes the last match, strips prefix, `jq .` to dest. Return 0 on success, 1 when no line or invalid JSON.
+
+**Test scenarios:**
+- Happy path: log with `[timestamp] DCE_JSON_SUMMARY: {"version":1,...}` → valid pretty JSON file
+- No marker line → returns 1, dest unchanged
+- Invalid JSON after prefix → returns 1
+
+**Verification:** smoke script passes standalone.
+
+### U2. Wire into operator validation
+
+**Goal:** Auto-recover after teed validation when file write failed.
+
+**Dependencies:** U1
+
+**Files:** `scripts/run-operator-validation.sh`, `scripts/tests/run-operator-validation-smoke.sh`
+
+**Approach:** Source lib after tee; if `export_json_summary` and `DCE_RUN_SUMMARY_FILE` set and file not `-s`, call extract; log recovery on success.
+
+**Test scenarios:**
+- Smoke simulates log-only summary (write fake log + call helper path, or dry-run skip unchanged)
+- Existing dry-run smoke still asserts no JSON summary path logged
+
+**Verification:** operator-validation smoke passes.
+
+### U3. Docs stamp
+
+**Goal:** Record plan 071 in merge-readiness.
+
+**Files:** `docs/recurring-scrape-merge-readiness.md`
+
+**Approach:** Add Plan 071 bullet; refresh stale KotOR block (lines 147–153) to cite per-target `container_memory: "8g"` and channel-scoped validation with `.summary.json`.
+
+## Verification
+
+```bash
+DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh
+```
+
+## Scope Boundaries
+
+### Deferred
+
+- Live KotOR catch-up on host
+- Host runner post-scrape recovery when stdout is not teed to a file
+- Merging multiple per-target summaries into one JSON artifact
diff --git a/docs/recurring-scrape-merge-readiness.md b/docs/recurring-scrape-merge-readiness.md
index 1e0c1195..b6a1b9c4 100644
--- a/docs/recurring-scrape-merge-readiness.md
+++ b/docs/recurring-scrape-merge-readiness.md
@@ -144,12 +144,14 @@ docker compose build # or podman-compose build
 DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh --target KotOR_discord_msgs
 ```
 
-Large `yes_general` may still skip without a higher container cap; set `DCE_CONTAINER_MEMORY=8g` in `scrape.env` and export that channel separately:
+Large `yes_general` may still skip without a higher container cap; `KotOR_discord_msgs` sets `container_memory: "8g"` in `scrape-targets.json` for single-target runs (override globally with `DCE_CONTAINER_MEMORY` in `scrape.env`):
 
 ```bash
-# scrape.env: DCE_CONTAINER_MEMORY=8g
 DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
-  --salvage-before-scrape --target KotOR_discord_msgs --channel 221726893064454144
+  --salvage-before-scrape --target KotOR_discord_msgs \
+  --channel 221726893064454144 \
+  --log-file logs/kotor-yes-general.log
+# Also writes logs/kotor-yes-general.summary.json (or recovers from log if file write fails)
 ```
 
 **Plan 063 (2026-06-04):** Optional `DCE_CONTAINER_MEMORY` compose `mem_limit` for large channel catch-up (default 0 = unlimited).
@@ -168,6 +170,8 @@ DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
 
 **Plan 070 (2026-06-04):** Compose mounts `logs/` at `/logs`; host runner passthrough; operator-validation auto-writes `*.summary.json` beside `--log-file`.
 
+**Plan 071 (2026-06-04):** When summary file write fails, operator validation recovers JSON from the last `DCE_JSON_SUMMARY:` line in the teed log.
+
 **Disk:** ~65 GiB free on `/home` (2026-05-30); large channel merges still need headroom.
 
 ## CI note (fork PRs)
diff --git a/scripts/lib/scrape-summary-json.sh b/scripts/lib/scrape-summary-json.sh
new file mode 100644
index 00000000..6ee82f55
--- /dev/null
+++ b/scripts/lib/scrape-summary-json.sh
@@ -0,0 +1,26 @@
+#!/usr/bin/env bash
+
+# Recover machine-readable scrape summaries from teed operator logs.
+
+extract_json_summary_from_log() {
+  local source_log=$1
+  local dest_file=$2
+  local line json_payload
+
+  [[ -n "$source_log" && -n "$dest_file" ]] || return 1
+  [[ -f "$source_log" && -r "$source_log" ]] || return 1
+  command -v jq >/dev/null 2>&1 || return 1
+
+  line=$(grep 'DCE_JSON_SUMMARY:' "$source_log" | tail -1) || return 1
+  [[ -n "$line" ]] || return 1
+
+  json_payload=${line#*DCE_JSON_SUMMARY: }
+  [[ -n "$json_payload" ]] || return 1
+
+  if ! jq -e . >/dev/null 2>&1 <<<"$json_payload"; then
+    return 1
+  fi
+
+  mkdir -p "$(dirname "$dest_file")"
+  jq . <<<"$json_payload" >"$dest_file"
+}
diff --git a/scripts/run-operator-validation.sh b/scripts/run-operator-validation.sh
index ece1d6fb..3d3789d9 100755
--- a/scripts/run-operator-validation.sh
+++ b/scripts/run-operator-validation.sh
@@ -326,6 +326,16 @@ main() {
   } 2>&1 | tee -a "$LOG_FILE"
   local pipeline_status=${PIPESTATUS[0]}
 
+  if (( export_json_summary )) && [[ -n "${DCE_RUN_SUMMARY_FILE:-}" ]]; then
+    if [[ ! -s "${DCE_RUN_SUMMARY_FILE}" ]]; then
+      # shellcheck source=lib/scrape-summary-json.sh
+      source "$SCRIPT_DIR/lib/scrape-summary-json.sh"
+      if extract_json_summary_from_log "$LOG_FILE" "$DCE_RUN_SUMMARY_FILE"; then
+        printf 'JSON summary recovered from log: %s\n' "$DCE_RUN_SUMMARY_FILE"
+      fi
+    fi
+  fi
+
   printf 'Log: %s\n' "$LOG_FILE"
   exit "$pipeline_status"
 }
diff --git a/scripts/tests/scrape-summary-json-smoke.sh b/scripts/tests/scrape-summary-json-smoke.sh
new file mode 100755
index 00000000..f8221246
--- /dev/null
+++ b/scripts/tests/scrape-summary-json-smoke.sh
@@ -0,0 +1,63 @@
+#!/usr/bin/env bash
+
+set -Eeuo pipefail
+
+REPO_ROOT=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd -P)
+# shellcheck source=../lib/scrape-summary-json.sh
+source "$REPO_ROOT/scripts/lib/scrape-summary-json.sh"
+
+TMP_DIR=$(mktemp -d "${TMPDIR:-/tmp}/dce-summary-json-smoke.XXXXXX")
+trap 'rm -rf "$TMP_DIR"' EXIT
+
+LOG_FILE="$TMP_DIR/scrape.log"
+OUT_FILE="$TMP_DIR/recovered.summary.json"
+
+cat >"$LOG_FILE" <<'LOG'
+[2026-06-04T12:00:00Z] scrape started
+[2026-06-04T12:01:00Z] DCE_JSON_SUMMARY: {"version":1,"totals":{"created":0,"merged":1,"unchanged":2,"skipped":0,"skipped_oom":0,"messages_appended":3}}
+LOG
+
+extract_json_summary_from_log "$LOG_FILE" "$OUT_FILE" || {
+  printf 'ERROR: expected extract to succeed on valid marker line\n' >&2
+  exit 1
+}
+
+[[ -s "$OUT_FILE" ]] || {
+  printf 'ERROR: recovered summary file missing\n' >&2
+  exit 1
+}
+
+jq -e '.totals.merged == 1 and .totals.messages_appended == 3' "$OUT_FILE" >/dev/null || {
+  printf 'ERROR: recovered JSON content mismatch\n' >&2
+  exit 1
+}
+
+printf '[2026-06-04T12:02:00Z] DCE_JSON_SUMMARY: {"version":1,"totals":{"merged":9}}\n' >>"$LOG_FILE"
+extract_json_summary_from_log "$LOG_FILE" "$OUT_FILE" || {
+  printf 'ERROR: expected second extract to succeed\n' >&2
+  exit 1
+}
+
+jq -e '.totals.merged == 9' "$OUT_FILE" >/dev/null || {
+  printf 'ERROR: expected last DCE_JSON_SUMMARY line to win\n' >&2
+  exit 1
+}
+
+if extract_json_summary_from_log "$TMP_DIR/missing.log" "$OUT_FILE" 2>/dev/null; then
+  printf 'ERROR: extract should fail on missing log\n' >&2
+  exit 1
+fi
+
+printf '[2026-06-04T12:03:00Z] no summary here\n' >"$TMP_DIR/empty.log"
+if extract_json_summary_from_log "$TMP_DIR/empty.log" "$OUT_FILE" 2>/dev/null; then
+  printf 'ERROR: extract should fail when marker absent\n' >&2
+  exit 1
+fi
+
+printf '[2026-06-04T12:04:00Z] DCE_JSON_SUMMARY: not-json\n' >"$TMP_DIR/bad.log"
+if extract_json_summary_from_log "$TMP_DIR/bad.log" "$OUT_FILE" 2>/dev/null; then
+  printf 'ERROR: extract should fail on invalid JSON\n' >&2
+  exit 1
+fi
+
+printf 'scrape-summary-json-smoke: ok\n'
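
Reviewer sketch (not part of the patch): the last-line-wins extraction that `extract_json_summary_from_log` performs can be tried in isolation. This is a minimal sketch that deliberately omits the `jq` validation and pretty-printing steps; the throwaway log contents and the `log`, `line`, and `payload` names are illustrative assumptions, not code from this change.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical throwaway log; real logs come from --log-file.
log=$(mktemp "${TMPDIR:-/tmp}/dce-last-line-demo.XXXXXX")
trap 'rm -f "$log"' EXIT

cat >"$log" <<'LOG'
[2026-06-04T12:00:00Z] DCE_JSON_SUMMARY: {"version":1,"totals":{"merged":1}}
[2026-06-04T12:01:00Z] unrelated output
[2026-06-04T12:02:00Z] DCE_JSON_SUMMARY: {"version":1,"totals":{"merged":9}}
LOG

# Keep only the most recent marker line (last line wins).
line=$(grep 'DCE_JSON_SUMMARY:' "$log" | tail -n 1)

# Drop everything up to and including the first "DCE_JSON_SUMMARY: " on that line.
payload=${line#*DCE_JSON_SUMMARY: }

printf '%s\n' "$payload"
```

Because `#` removes the shortest matching prefix, stripping stops at the first occurrence of the marker, so marker text appearing later inside the JSON payload would be left intact.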