feat: add gh PR run approval helper for fork CI unblock

Adds scripts/gh-approve-pr-runs.sh with GITHUB_TOKEN bootstrap, explicit
admin-rights policy classification, smoke coverage, and CI wiring. Marks
the remaining 2026-05-24 recurring scrape plans completed.

Co-authored-by: Cursor <cursoragent@cursor.com>
Boden 2026-05-28 00:30:49 -05:00
parent df499568d9
commit 7cab7280c4
9 changed files with 301 additions and 3 deletions

@@ -98,6 +98,8 @@ If you authenticate with a **bot token**, do not rely on guild-name or DM discov
The host wrapper (`scripts/run-discord-scrape-host.sh`) classifies Discord auth failures and retries once after reloading `DISCORD_TOKEN_FILE` (if configured). Persistent auth failure still exits non-zero.
For fork PRs blocked on GitHub Actions approval, repository admins can run `./scripts/gh-approve-pr-runs.sh --repo Tyrrrz/DiscordChatExporter RUN_ID` after exporting `GITHUB_TOKEN`. The script bootstraps `gh` auth and surfaces admin-rights policy blockers separately from token failures.
If you run the recurring flow through podman on an SELinux-enabled host, keep the bind mounts relabeled (`:z`). The checked-in `docker-compose.yml` already applies this to the recurring wrapper mounts.
For rootless podman, set `DCE_USERNS_MODE=keep-id` in `scrape.env` so the mounted archive roots stay writable as your host user instead of appearing as `root:root` inside the container. Keep `DCE_UID` and `DCE_GID` matched to your host user as well.

@@ -64,6 +64,15 @@ If any selected target fails that authenticated probe, `setup-cron.sh` stops wit
For recurring runs, `setup-cron.sh` now installs a cron command that executes `scripts/run-discord-scrape-host.sh scrape ...`. The host wrapper retries once when it detects Discord auth failures (`401`/`403`) by reloading `DISCORD_TOKEN_FILE` if configured. This keeps cron non-interactive and fail-closed.
When contributing fork PRs to the upstream repository, GitHub Actions runs may wait on maintainer approval. If you have repository admin rights and a `GITHUB_TOKEN` with sufficient scopes, you can attempt approval with:
```bash
export GITHUB_TOKEN=... # or define it in ~/.bashrc
./scripts/gh-approve-pr-runs.sh --repo Tyrrrz/DiscordChatExporter RUN_ID [RUN_ID...]
```
The helper bootstraps `gh auth login --with-token` when needed. If GitHub responds that admin rights are required, the script exits with an explicit policy-blocker message instead of a generic auth failure.
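If you do not already know the run IDs, one way to collect them is sketched below. It assumes your `gh` version accepts `action_required` as a `--status` filter and that runs waiting on approval are reported under that status. `list_pending_run_ids` is a hypothetical name and is not part of the helper; `GH_BIN` mirrors the helper's override.

```bash
#!/usr/bin/env bash
# Sketch (assumption): print the IDs of workflow runs that appear to be
# waiting on approval, one per line. Not part of gh-approve-pr-runs.sh.
list_pending_run_ids() {
  local repo=$1
  "${GH_BIN:-gh}" run list --repo "$repo" --status action_required \
    --json databaseId --jq '.[].databaseId'
}

# Possible usage (left commented out so the sketch has no side effects):
#   mapfile -t ids < <(list_pending_run_ids Tyrrrz/DiscordChatExporter)
#   ./scripts/gh-approve-pr-runs.sh --repo Tyrrrz/DiscordChatExporter "${ids[@]}"
```

Review the listed runs before approving them, since approval lets fork code run in the upstream CI context.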
If you are running the recurring wrapper through podman on an SELinux-enabled host, keep the bind mounts relabeled (`:z`). The checked-in `docker-compose.yml` already includes that for the recurring config and archive mounts.
For rootless podman, set `DCE_USERNS_MODE=keep-id` in `scrape.env` so the mounted `Documents` archive roots stay writable as your host user during scheduled runs. Keep `DCE_UID` and `DCE_GID` matched to your host user as well.

@@ -73,6 +73,7 @@ jobs:
./scripts/tests/end-to-end-preflight-smoke.sh
./scripts/tests/setup-cron-smoke.sh
./scripts/tests/run-discord-scrape-host-smoke.sh
./scripts/tests/gh-approve-pr-runs-smoke.sh
test:
# Tests need access to secrets, so we can't run them against PRs because of limited trust

@@ -1,7 +1,7 @@
---
title: feat: Add recurring CLI scrape automation
type: feat
-status: active
+status: completed
date: 2026-05-24
---

@@ -3,7 +3,7 @@ date: 2026-05-24
sequence: 001
plan_type: fix
title: Harden GitHub and Discord reauth recovery
-status: active
+status: completed
---
# fix: Harden GitHub and Discord reauth recovery

@@ -1,7 +1,7 @@
---
title: fix: Verify live archive path updates
type: fix
-status: active
+status: completed
date: 2026-05-24
---

@@ -0,0 +1,42 @@
---
title: fix: Complete auth recovery and close recurring scrape plans
type: fix
status: completed
date: 2026-05-28
origin: Active plans 2026-05-24-* with remaining U2 from auth-reauth plan
---
# fix: Complete auth recovery and close recurring scrape plans
## Summary
The recurring scrape feature branch is functionally complete after validation (003) and hardening (004). One implementation unit remains from `docs/plans/2026-05-24-001-fix-auth-reauth-recovery-plan.md` (U2: GitHub approval helper), and several sibling plans should be marked completed to reflect landed work.
## Problem Frame
Cross-repo PRs to `Tyrrrz/DiscordChatExporter` can block on GitHub Actions approval. Operators need an explicit, fail-closed helper that bootstraps `gh` from `GITHUB_TOKEN` and attempts run approval while surfacing admin-rights policy blockers separately from transient auth failures.
## Requirements
| ID | Requirement | Files |
|----|-------------|-------|
| U1 | Add `scripts/gh-approve-pr-runs.sh` with token bootstrap, run approval attempts, and explicit 403 admin-rights classification | new script, smoke test |
| U2 | Document the helper in operator docs | `.docs/Scheduling-Linux.md`, `.docs/Docker.md` |
| U3 | Mark completed 2026-05-24 active plans as `status: completed` | `docs/plans/2026-05-24-*.md` |
## Out of Scope
- Circumventing upstream admin approval policy
- Core C# or archive-path changes (already landed)
## Test Scenarios
- Missing `GITHUB_TOKEN` → clear error, exit non-zero
- Valid token + mock `gh` → approval API invoked for each run ID
- Mock `gh` returning admin-rights 403 → explicit policy blocker message
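
A caller that needs to tell these outcomes apart could key off the stderr markers the helper prints. The following is a minimal sketch, assuming the `POLICY_BLOCKER` and `AUTH_FAILURE` prefixes stay stable; `classify_helper_result` is a hypothetical name and is not part of this change.

```bash
#!/usr/bin/env bash
# Sketch (assumption): run a command and map its outcome to a short label
# based on the markers gh-approve-pr-runs.sh writes to stderr.
classify_helper_result() {
  local stderr_text rc
  rc=0
  # Redirect order matters: stderr goes to the capture, stdout is discarded.
  stderr_text=$("$@" 2>&1 >/dev/null) || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "approved"
    return 0
  fi
  case "$stderr_text" in
    *POLICY_BLOCKER*) echo "policy-blocker" ;;
    *AUTH_FAILURE*)   echo "auth-failure" ;;
    *)                echo "other-failure" ;;
  esac
}

# Possible usage:
#   classify_helper_result ./scripts/gh-approve-pr-runs.sh --repo OWNER/NAME RUN_ID
```

Matching on the message rather than the exit code is deliberate in this sketch: the helper's final `die` exits with status 1 for every failure kind, so the marker is the only distinguishing signal.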
## Success Criteria
- Smoke test passes
- All recurring-scrape plans through 004 marked completed
- PR #1538 documents gh-approve usage for fork CI unblock attempts

scripts/gh-approve-pr-runs.sh Executable file

@@ -0,0 +1,142 @@
#!/usr/bin/env bash
set -Eeuo pipefail
GH_BIN="${GH_BIN:-gh}"
BASHRC="${GH_APPROVE_BASHRC:-${HOME}/.bashrc}"
REPO_SPEC=""
declare -a RUN_IDS=()
usage() {
cat <<EOF
Usage:
$(basename "$0") --repo OWNER/NAME RUN_ID [RUN_ID...]
Attempt to approve GitHub Actions workflow runs (for example, fork PR runs
waiting on maintainer approval). Bootstraps gh auth from GITHUB_TOKEN when needed.
Options:
--repo OWNER/NAME Repository containing the workflow runs (required)
--help Show this help text
Environment:
GITHUB_TOKEN Personal access token with actions:write (or repo admin)
GH_BIN gh executable (default: gh)
GH_APPROVE_BASHRC Shell rc file to source for token bootstrap (default: ~/.bashrc)
EOF
}
die() {
printf 'ERROR: %s\n' "$*" >&2
exit 1
}
require_program() {
command -v "$1" >/dev/null 2>&1 || die "Required command '$1' is missing."
}
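maybe_source_bashrc() {
[[ -f "$BASHRC" ]] || return 0
# Interactive rc files often reference unset variables or run commands that
# return non-zero; relax errexit/nounset while sourcing so they cannot abort us.
set +eu
# shellcheck disable=SC1090
source "$BASHRC" || true
set -eu
}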
ensure_gh_auth() {
[[ -n "${GITHUB_TOKEN:-}" ]] || die "GITHUB_TOKEN is not set. Export a token or define it in $BASHRC."
if ! "$GH_BIN" auth status >/dev/null 2>&1; then
printf 'GitHub CLI is not authenticated; logging in from GITHUB_TOKEN...\n' >&2
printf '%s\n' "$GITHUB_TOKEN" | "$GH_BIN" auth login --with-token \
|| die "gh auth login --with-token failed."
fi
"$GH_BIN" auth status >/dev/null 2>&1 || die "GitHub CLI authentication is still invalid after token login."
}
classify_approve_failure() {
local output=$1
if grep -Eqi 'must have admin rights|admin rights to this repository|Resource not accessible by integration' <<<"$output"; then
printf 'POLICY_BLOCKER: GitHub requires repository admin rights to approve this workflow run. This is an upstream permission/policy limit, not a transient auth failure.\n' >&2
return 2
fi
if grep -Eqi 'Bad credentials|HTTP 401|HTTP 403' <<<"$output"; then
printf 'AUTH_FAILURE: GitHub rejected the approval request. Verify GITHUB_TOKEN scopes and gh auth status.\n' >&2
return 1
fi
return 1
}
approve_run() {
local repo=$1 run_id=$2
local output
if output=$("$GH_BIN" api -X POST "repos/${repo}/actions/runs/${run_id}/approve" 2>&1); then
printf 'Approved workflow run %s on %s.\n' "$run_id" "$repo"
return 0
fi
classify_approve_failure "$output"
local classify_rc=$?
printf '%s\n' "$output" >&2
return "$classify_rc"
}
main() {
while (($#)); do
case "$1" in
--repo)
[[ $# -ge 2 ]] || die "Missing value for --repo."
REPO_SPEC=$2
shift 2
;;
--help|-h)
usage
exit 0
;;
-*)
die "Unknown option: $1"
;;
*)
RUN_IDS+=("$1")
shift
;;
esac
done
[[ -n "$REPO_SPEC" ]] || {
usage
exit 1
}
((${#RUN_IDS[@]} > 0)) || die "Provide at least one workflow RUN_ID."
[[ "$REPO_SPEC" == */* ]] || die "--repo must use OWNER/NAME format."
require_program "$GH_BIN"
maybe_source_bashrc
ensure_gh_auth
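local run_id rc failures=0 policy_blockers=0
for run_id in "${RUN_IDS[@]}"; do
# Capture the status directly: $? after an 'if' whose condition failed
# (and has no else branch) is always 0, which would hide policy blockers.
rc=0
approve_run "$REPO_SPEC" "$run_id" || rc=$?
if (( rc == 0 )); then
continue
fi
failures=$((failures + 1))
if (( rc == 2 )); then
policy_blockers=$((policy_blockers + 1))
fi
done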
if (( policy_blockers > 0 )); then
die "One or more runs could not be approved because the authenticated user lacks required repository admin rights."
fi
if (( failures > 0 )); then
die "Failed to approve ${failures} workflow run(s)."
fi
}
main "$@"

@@ -0,0 +1,102 @@
#!/usr/bin/env bash
set -Eeuo pipefail
REPO_ROOT=$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd -P)
TMP_DIR=$(mktemp -d "${TMPDIR:-/tmp}/dce-gh-approve-smoke.XXXXXX")
FAKE_GH="$TMP_DIR/gh"
GH_LOG="$TMP_DIR/gh.log"
GH_STATE="$TMP_DIR/gh.state"
cleanup() {
rm -rf "$TMP_DIR"
}
trap cleanup EXIT
cat >"$FAKE_GH" <<'EOF'
#!/usr/bin/env bash
set -Eeuo pipefail
log=${FAKE_GH_LOG:?}
mode=${FAKE_GH_MODE:?}
state=${FAKE_GH_STATE:?}
printf '%s\n' "$*" >>"$log"
case "$1" in
auth)
if [[ "${2:-}" == "login" ]]; then
touch "$state"
exit 0
fi
if [[ -f "$state" ]] || [[ "$mode" == "authenticated" ]]; then
exit 0
fi
exit 1
;;
api)
if [[ "$mode" == "policy-blocker" ]]; then
printf 'Must have admin rights to Repository.\n' >&2
exit 1
fi
if [[ "$mode" == "auth-fail" ]]; then
printf 'HTTP 401: Bad credentials\n' >&2
exit 1
fi
exit 0
;;
*)
echo "unexpected gh invocation: $*" >&2
exit 1
;;
esac
EOF
chmod +x "$FAKE_GH"
run_helper() {
local mode=$1
shift
: >"$GH_LOG"
rm -f "$GH_STATE"
GITHUB_TOKEN=test-token \
GH_BIN="$FAKE_GH" \
GH_APPROVE_BASHRC=/dev/null \
FAKE_GH_LOG="$GH_LOG" \
FAKE_GH_MODE="$mode" \
FAKE_GH_STATE="$GH_STATE" \
"$REPO_ROOT/scripts/gh-approve-pr-runs.sh" "$@"
}
if run_helper authenticated --repo Tyrrrz/DiscordChatExporter 12345 67890; then
grep -q 'api -X POST repos/Tyrrrz/DiscordChatExporter/actions/runs/12345/approve' "$GH_LOG" \
|| { echo "expected first run approval API call" >&2; exit 1; }
grep -q 'api -X POST repos/Tyrrrz/DiscordChatExporter/actions/runs/67890/approve' "$GH_LOG" \
|| { echo "expected second run approval API call" >&2; exit 1; }
else
echo "expected successful approval for authenticated mode" >&2
exit 1
fi
if run_helper unauthenticated --repo Tyrrrz/DiscordChatExporter 11111; then
grep -q 'auth login --with-token' "$GH_LOG" || { echo "expected gh auth login from token" >&2; exit 1; }
else
echo "expected bootstrap + approval for unauthenticated gh" >&2
exit 1
fi
# Keep captured stderr under TMP_DIR so the EXIT trap removes it and
# concurrent runs do not collide on fixed /tmp paths.
POLICY_ERR="$TMP_DIR/gh-policy.err"
MISSING_TOKEN_ERR="$TMP_DIR/gh-missing-token.err"
if run_helper policy-blocker --repo Tyrrrz/DiscordChatExporter 22222 2>"$POLICY_ERR"; then
echo "expected policy blocker to fail" >&2
exit 1
fi
grep -q 'POLICY_BLOCKER' "$POLICY_ERR" || { echo "expected policy blocker classification" >&2; exit 1; }
if GITHUB_TOKEN= GH_BIN="$FAKE_GH" GH_APPROVE_BASHRC=/dev/null \
"$REPO_ROOT/scripts/gh-approve-pr-runs.sh" --repo Tyrrrz/DiscordChatExporter 1 2>"$MISSING_TOKEN_ERR"; then
echo "expected missing token failure" >&2
exit 1
fi
grep -q 'GITHUB_TOKEN is not set' "$MISSING_TOKEN_ERR" || { echo "expected missing token message" >&2; exit 1; }
echo "gh-approve-pr-runs smoke test passed"