chore(ci): run Documents workflow smokes in recurring-scrape job

Wire documents-scrape and verify-documents-auth smoke tests into CI and
document which scripts run locally versus in GitHub Actions.

Co-authored-by: Cursor <cursoragent@cursor.com>
Boden 2026-05-29 00:21:10 -05:00
parent 57d472f8e8
commit 32d7e92ac8
3 changed files with 133 additions and 0 deletions


@@ -306,6 +306,33 @@ Space requirements:
- **Large channels**: 50-100 MB per year
- **Full guild**: 500 MB - several GB depending on activity
## Smoke test validation
Run the full local suite from the repo root (requires `jq`; `container-smoke.sh` also needs Docker/Podman and a writable `archive_root` from `config/scrape-targets.json`):
```bash
chmod +x scripts/*.sh scripts/tests/*.sh
for script in scripts/tests/*.sh; do
echo "==> $script"
"$script"
done
```
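
Before running the loop, it can help to confirm that the prerequisites named above are on `PATH`. The following is a minimal sketch and not part of the repository: `missing_tools` and `have_container_runtime` are hypothetical helpers, and treating either `docker` or `podman` as sufficient for `container-smoke.sh` is an assumption drawn only from the sentence above.

```bash
# Hypothetical helper (not in the repo): print each named command that is
# not found on PATH, one per line. Prints nothing when all are present.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical helper: succeed if at least one container runtime is on PATH.
# Assumption: either docker or podman is enough for container-smoke.sh.
have_container_runtime() {
  command -v docker >/dev/null 2>&1 || command -v podman >/dev/null 2>&1
}
```

For example, `missing_tools jq` prints `jq` only when it is absent, and `have_container_runtime || echo "skip container-smoke.sh"` gives an early warning instead of a mid-suite failure. The sketch does not check that `archive_root` is writable.
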
| Script | CI (`recurring-scrape-smoke`) | Notes |
|--------|-------------------------------|-------|
| `run-discord-scrape-smoke.sh` | yes | Append-only merge coverage |
| `error-path-smoke.sh` | yes | Failure paths |
| `cron-idempotency-smoke.sh` | yes | Cron installer idempotency |
| `end-to-end-preflight-smoke.sh` | yes | Preflight wiring |
| `setup-cron-smoke.sh` | yes | Cron setup dry-run |
| `run-discord-scrape-host-smoke.sh` | yes | Host wrapper |
| `gh-approve-pr-runs-smoke.sh` | yes | Fork PR workflow helper |
| `documents-scrape-smoke.sh` | yes | Unified Documents workflow |
| `verify-documents-auth-smoke.sh` | yes | Archive verify + auth bootstrap |
| `container-smoke.sh` | no (local) | Docker build + `help` / `list-targets` |
GitHub Actions runs the CI-marked scripts on every push/PR via `.github/workflows/main.yml` job `recurring-scrape-smoke`.
## Next Steps
- [Troubleshooting common issues](Recurring-Scrape-Troubleshooting.md)


@@ -74,6 +74,8 @@ jobs:
./scripts/tests/setup-cron-smoke.sh
./scripts/tests/run-discord-scrape-host-smoke.sh
./scripts/tests/gh-approve-pr-runs-smoke.sh
./scripts/tests/documents-scrape-smoke.sh
./scripts/tests/verify-documents-auth-smoke.sh
test:
# Tests need access to secrets, so we can't run them against PRs because of limited trust


@@ -0,0 +1,104 @@
---
title: "feat: Recurring scrape merge readiness"
type: feat
status: completed
date: 2026-05-29
origin: "LFG - PR #1538 ready; CI smoke suite missing newer Documents workflow tests"
---
# feat: Recurring scrape merge readiness
## Summary
PR #1538 lands the recurring Documents scrape workflow (verify, auth bootstrap, unified operator entrypoints, GUI token discovery). Local smoke coverage exists for the newer scripts, but `.github/workflows/main.yml` still runs only the seven earlier smoke tests. Close that gap so CI exercises the Documents operator path before merge.
---
## Problem Frame
Operators rely on `run-documents-scrape.sh`, `verify-documents-archives.sh`, and `setup-scrape-auth.sh`. Smoke tests cover all three (`documents-scrape-smoke.sh` for the unified workflow, `verify-documents-auth-smoke.sh` for archive verification and auth bootstrap) but are not wired into CI. A regression in the unified workflow could merge undetected while the legacy scrape smokes stay green.
`container-smoke.sh` requires a Docker build and host archive mounts, so it stays local-only for now. Do not block this pass on container CI unless it is trivial to add.
---
## Requirements
| ID | Requirement |
|----|-------------|
| R1 | CI `recurring-scrape-smoke` job runs `documents-scrape-smoke.sh` and `verify-documents-auth-smoke.sh` |
| R2 | `.docs/Recurring-Scrape-Setup.md` lists the full smoke suite (local + CI) consistently |
| R3 | All smoke scripts pass locally after the CI change |
---
## Scope Boundaries
- No changes to core C# exporter or merge semantics.
- No attempt to unblock upstream fork `action_required` CI (maintainer approval).
- `container-smoke.sh` stays optional/local unless Docker is already available in the job without new infrastructure.
### Deferred to Follow-Up Work
- Add `container-smoke.sh` to CI with Docker-in-GitHub-Actions setup.
- Live-token grow-only proof on production archives (operator-run, not committed).
---
## Implementation Units
### U1. Expand CI recurring-scrape-smoke job
**Goal:** Run Documents workflow smoke tests in GitHub Actions.
**Requirements:** R1
**Files:**
- Modify: `.github/workflows/main.yml`
**Approach:** Append `./scripts/tests/documents-scrape-smoke.sh` and `./scripts/tests/verify-documents-auth-smoke.sh` to the existing `Run recurring scrape smoke tests` step after chmod.
**Test scenarios:**
- Workflow YAML invokes both new scripts (grep or dry-run review).
- Local run of both scripts exits 0.
**Verification:** `bash -n` does not apply here, since it checks shell syntax and the workflow is YAML. Instead, confirm that the local smoke run passes and that the workflow file contains both script paths.
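
The grep check in the test scenarios could look roughly like the sketch below. `workflow_mentions_all` is a hypothetical helper, not something in the repository, and it only does a fixed-string text match: it would also pass if a path appeared in a comment, so it complements a read of the workflow step rather than replacing it.

```bash
# Hypothetical helper (not in the repo): check that a workflow file mentions
# every given script path. Prints each missing path and returns non-zero if
# any path is absent.
workflow_mentions_all() {
  local workflow="$1" missing=0 path
  shift
  for path in "$@"; do
    # -F: fixed string, -q: quiet, --: end of options
    if ! grep -Fq -- "$path" "$workflow"; then
      printf 'missing: %s\n' "$path"
      missing=1
    fi
  done
  return "$missing"
}
```

Run from the repo root, a call such as `workflow_mentions_all .github/workflows/main.yml ./scripts/tests/documents-scrape-smoke.sh ./scripts/tests/verify-documents-auth-smoke.sh` prints nothing and exits 0 when both paths are present.
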
---
### U2. Align operator documentation
**Goal:** Document which smokes run in CI vs locally.
**Requirements:** R2
**Dependencies:** U1
**Files:**
- Modify: `.docs/Recurring-Scrape-Setup.md`
**Approach:** Add a short "Validation" subsection listing all smoke scripts; mark which run in CI vs local-only (`container-smoke.sh`).
**Test scenarios:**
- Doc mentions `documents-scrape-smoke.sh` and `verify-documents-auth-smoke.sh` under CI coverage.
**Verification:** Manual read of updated section.
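
A manual read can be backed by a quick consistency check that every script under `scripts/tests/` is named somewhere in the doc. This is a minimal sketch under stated assumptions: `undocumented_smokes` is a hypothetical helper, and it matches basenames as plain substrings, so a script whose name is contained in another script's name could be reported as documented when it is not.

```bash
# Hypothetical helper (not in the repo): print the basename of every *.sh file
# in a test directory that the given doc never mentions.
undocumented_smokes() {
  local doc="$1" dir="$2" script name
  for script in "$dir"/*.sh; do
    [ -e "$script" ] || continue   # glob matched nothing
    name=$(basename "$script")
    grep -Fq -- "$name" "$doc" || printf '%s\n' "$name"
  done
}
```

From the repo root, `undocumented_smokes .docs/Recurring-Scrape-Setup.md scripts/tests` printing nothing suggests the table names every smoke script. It does not check whether the CI column is accurate.
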
---
### U3. Run full local smoke suite
**Goal:** Confirm no regressions before push.
**Requirements:** R3
**Dependencies:** U1, U2
**Files:** (none — validation only)
**Approach:** Run every `scripts/tests/*.sh` locally; fix any failures in scope.
**Test scenarios:**
- All ten smoke scripts exit 0.
**Verification:** Single shell loop over `scripts/tests/*.sh`.
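
One way to write that loop so a single failure does not hide later ones is sketched below. `run_smoke_suite` is a hypothetical helper, not part of the repository. It assumes the scripts are already executable and are meant to be run from the repo root.

```bash
# Hypothetical helper (not in the repo): run every *.sh in a directory,
# keep going after failures, and return non-zero if any script failed.
run_smoke_suite() {
  local dir="$1" script passed=0 failed=0
  for script in "$dir"/*.sh; do
    [ -e "$script" ] || continue   # glob matched nothing
    echo "==> $script"
    if "$script"; then
      passed=$((passed + 1))
    else
      echo "FAILED: $script" >&2
      failed=$((failed + 1))
    fi
  done
  echo "passed=$passed failed=$failed"
  [ "$failed" -eq 0 ]
}
```

Calling `run_smoke_suite scripts/tests` ends with a one-line summary. With the ten scripts this plan expects, a clean run would report ten passed and none failed.
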