From 32d7e92ac894b5e2fa251a176eec2adfe941e13f Mon Sep 17 00:00:00 2001
From: Boden
Date: Fri, 29 May 2026 00:21:10 -0500
Subject: [PATCH] chore(ci): run Documents workflow smokes in recurring-scrape job

Wire documents-scrape and verify-documents-auth smoke tests into CI and
document which scripts run locally versus in GitHub Actions.

Co-authored-by: Cursor
---
 .docs/Recurring-Scrape-Setup.md               |  27 +++++
 .github/workflows/main.yml                    |   2 +
 ...t-recurring-scrape-merge-readiness-plan.md | 104 ++++++++++++++++++
 3 files changed, 133 insertions(+)
 create mode 100644 docs/plans/2026-05-29-010-feat-recurring-scrape-merge-readiness-plan.md

diff --git a/.docs/Recurring-Scrape-Setup.md b/.docs/Recurring-Scrape-Setup.md
index e90ca701..167509b6 100644
--- a/.docs/Recurring-Scrape-Setup.md
+++ b/.docs/Recurring-Scrape-Setup.md
@@ -306,6 +306,33 @@ Space requirements:
 - **Large channels**: 50-100 MB per year
 - **Full guild**: 500 MB - several GB depending on activity
 
+## Smoke test validation
+
+Run the full local suite from the repo root (requires `jq`; `container-smoke.sh` also needs Docker/Podman and a writable `archive_root` from `config/scrape-targets.json`):
+
+```bash
+chmod +x scripts/*.sh scripts/tests/*.sh
+for script in scripts/tests/*.sh; do
+  echo "==> $script"
+  "$script"
+done
+```
+
+| Script | CI (`recurring-scrape-smoke`) | Notes |
+|--------|-------------------------------|-------|
+| `run-discord-scrape-smoke.sh` | yes | Append-only merge coverage |
+| `error-path-smoke.sh` | yes | Failure paths |
+| `cron-idempotency-smoke.sh` | yes | Cron installer idempotency |
+| `end-to-end-preflight-smoke.sh` | yes | Preflight wiring |
+| `setup-cron-smoke.sh` | yes | Cron setup dry-run |
+| `run-discord-scrape-host-smoke.sh` | yes | Host wrapper |
+| `gh-approve-pr-runs-smoke.sh` | yes | Fork PR workflow helper |
+| `documents-scrape-smoke.sh` | yes | Unified Documents workflow |
+| `verify-documents-auth-smoke.sh` | yes | Archive verify + auth bootstrap |
+| `container-smoke.sh` | no (local) | Docker build + `help` / `list-targets` |
+
+GitHub Actions runs the CI-marked scripts on every push/PR via `.github/workflows/main.yml` job `recurring-scrape-smoke`.
+
 ## Next Steps
 
 - [Troubleshooting common issues](Recurring-Scrape-Troubleshooting.md)
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 8e940d1c..010e4531 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -74,6 +74,8 @@ jobs:
           ./scripts/tests/setup-cron-smoke.sh
           ./scripts/tests/run-discord-scrape-host-smoke.sh
           ./scripts/tests/gh-approve-pr-runs-smoke.sh
+          ./scripts/tests/documents-scrape-smoke.sh
+          ./scripts/tests/verify-documents-auth-smoke.sh
 
   test:
     # Tests need access to secrets, so we can't run them against PRs because of limited trust
diff --git a/docs/plans/2026-05-29-010-feat-recurring-scrape-merge-readiness-plan.md b/docs/plans/2026-05-29-010-feat-recurring-scrape-merge-readiness-plan.md
new file mode 100644
index 00000000..78d7fc09
--- /dev/null
+++ b/docs/plans/2026-05-29-010-feat-recurring-scrape-merge-readiness-plan.md
@@ -0,0 +1,104 @@
+---
+title: feat: Recurring scrape merge readiness
+type: feat
+status: completed
+date: 2026-05-29
+origin: LFG — PR #1538 ready; CI smoke suite missing newer Documents workflow tests
+---
+
+# feat: Recurring scrape merge readiness
+
+## Summary
+
+PR #1538 lands the recurring Documents scrape workflow (verify, auth bootstrap, unified operator entrypoints, GUI token discovery). Local smoke coverage exists for the newer scripts, but `.github/workflows/main.yml` still runs only the original six smoke tests. Close that gap so CI exercises the Documents operator path before merge.
+
+---
+
+## Problem Frame
+
+Operators rely on `run-documents-scrape.sh`, `verify-documents-archives.sh`, and `setup-scrape-auth.sh`. Smoke tests exist for each (`documents-scrape-smoke.sh`, `verify-documents-auth-smoke.sh`) but are not wired into CI. A regression in the unified workflow could merge undetected while the legacy scrape smokes stay green.
+
+`container-smoke.sh` requires Docker build and host archive mounts — keep it local-only for now; do not block this pass on container CI unless trivial to add.
+
+---
+
+## Requirements
+
+| ID | Requirement |
+|----|-------------|
+| R1 | CI `recurring-scrape-smoke` job runs `documents-scrape-smoke.sh` and `verify-documents-auth-smoke.sh` |
+| R2 | `.docs/Recurring-Scrape-Setup.md` lists the full smoke suite (local + CI) consistently |
+| R3 | All smoke scripts pass locally after the CI change |
+
+---
+
+## Scope Boundaries
+
+- No changes to core C# exporter or merge semantics.
+- No attempt to unblock upstream fork `action_required` CI (maintainer approval).
+- `container-smoke.sh` stays optional/local unless Docker is already available in the job without new infrastructure.
+
+### Deferred to Follow-Up Work
+
+- Add `container-smoke.sh` to CI with Docker-in-GitHub-Actions setup.
+- Live-token grow-only proof on production archives (operator-run, not committed).
+
+---
+
+## Implementation Units
+
+### U1. Expand CI recurring-scrape-smoke job
+
+**Goal:** Run Documents workflow smoke tests in GitHub Actions.
+
+**Requirements:** R1
+
+**Files:**
+- Modify: `.github/workflows/main.yml`
+
+**Approach:** Append `./scripts/tests/documents-scrape-smoke.sh` and `./scripts/tests/verify-documents-auth-smoke.sh` to the existing `Run recurring scrape smoke tests` step after chmod.
+
+**Test scenarios:**
+- Workflow YAML invokes both new scripts (grep or dry-run review).
+- Local run of both scripts exits 0.
+
+**Verification:** `bash -n` on workflow not needed; local smoke pass + workflow file contains both script paths.
+
+---
+
+### U2. Align operator documentation
+
+**Goal:** Document which smokes run in CI vs locally.
+
+**Requirements:** R2
+
+**Dependencies:** U1
+
+**Files:**
+- Modify: `.docs/Recurring-Scrape-Setup.md`
+
+**Approach:** Add a short "Validation" subsection listing all smoke scripts; mark which run in CI vs local-only (`container-smoke.sh`).
+
+**Test scenarios:**
+- Doc mentions `documents-scrape-smoke.sh` and `verify-documents-auth-smoke.sh` under CI coverage.
+
+**Verification:** Manual read of updated section.
+
+---
+
+### U3. Run full local smoke suite
+
+**Goal:** Confirm no regressions before push.
+
+**Requirements:** R3
+
+**Dependencies:** U1, U2
+
+**Files:** (none — validation only)
+
+**Approach:** Run every `scripts/tests/*.sh` locally; fix any failures in scope.
+
+**Test scenarios:**
+- All ten smoke scripts exit 0.
+
+**Verification:** Single shell loop over `scripts/tests/*.sh`.
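
Note on U3: in a shell without `set -e`, a plain `for` loop over `scripts/tests/*.sh` keeps going after a failing script and exits with the status of the last script only, so an early failure can be missed. The following is a minimal sketch of a loop that records every failure and returns non-zero if any script failed. The function name `run_smokes`, its variable names, and the choice to invoke each script through `bash` are assumptions for illustration and are not part of this patch.

```bash
# Hypothetical helper (not in the repo): run every *.sh in a directory,
# keep going after failures, and report a non-zero status if any failed.
run_smokes() {
  smoke_dir="${1:-scripts/tests}"   # defaults to the suite path used in the docs
  smoke_failed=0
  for smoke_script in "$smoke_dir"/*.sh; do
    [ -e "$smoke_script" ] || continue   # the glob matched nothing
    echo "==> $smoke_script"
    # Assumption: the smoke scripts are bash scripts, so the executable
    # bit is not required when they are invoked this way.
    if ! bash "$smoke_script"; then
      echo "FAILED: $smoke_script" >&2
      smoke_failed=$((smoke_failed + 1))
    fi
  done
  echo "$smoke_failed smoke script(s) failed" >&2
  [ "$smoke_failed" -eq 0 ]   # function status: 0 only when nothing failed
}
```

From the repo root this would be called as `run_smokes scripts/tests`. Collecting failures instead of stopping at the first one shows the whole picture in a single pass, which suits the "all ten smoke scripts exit 0" scenario.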