mirror of
https://github.com/Tyrrrz/DiscordChatExporter.git
synced 2026-06-10 00:02:37 -06:00
Use max message ID for incremental exports, validate custom cron expressions, drop eval from host/preflight paths, restrict reauth to executable repo scripts, and run smoke tests in CI. Co-authored-by: Cursor <cursoragent@cursor.com>
2.8 KiB
2.8 KiB
title: fix: Harden recurring scrape scripts from code review residuals
type: fix
status: completed
date: 2026-05-27
origin: PR #1538 residual review findings
fix: Harden recurring scrape scripts from code review residuals
Summary
Address actionable security and correctness findings deferred from the finalization validation pass on feat/recurring-cli-scrape. Scope is limited to script-layer hardening and CI wiring—no C# exporter changes.
Problem Frame
The recurring scrape automation is validated by smoke tests but still has review-flagged risks: arbitrary command execution via eval/bash -lc, unvalidated custom cron expressions, incorrect incremental cursor selection, and smoke tests not running in CI.
Requirements
| ID | Requirement | Files |
|---|---|---|
| U1 | Incremental exports must use the highest message snowflake ID, not the last array element | scripts/run-discord-scrape.sh, smoke fixture + assertion |
| U2 | --cron expressions must be validated (five fields, safe charset) before crontab install |
scripts/setup-cron.sh, smoke test |
| U3 | Host runner and setup preflight must invoke compose without eval |
scripts/run-discord-scrape-host.sh, scripts/setup-cron.sh |
| U4 | DCE_REAUTH_COMMAND must be an executable script under the repo root (no bash -lc arbitrary strings) |
scripts/run-discord-scrape-host.sh |
| U5 | Recurring scrape smoke suite runs in GitHub Actions on PR/push | .github/workflows/main.yml |
Implementation Units
U1 — max message ID cursor
- Change
last_message_idtomax_by(.id)with empty-array guard. - Add fixture where max ID is not the last message; extend fake CLI to log/assert
--aftervalue.
U2 — cron validation
- Add
validate_cron_expressioncalled whenCRON_EXPRESSIONis set. - Reject invalid field counts and shell metacharacters.
U3 — remove eval
- Refactor
compose_run_commandto populate a bash array via nameref; execute with"${args[@]}". - Refactor
build_target_args/run_preflightsimilarly.
U4 — reauth script allowlist
- Resolve and require absolute path under
$REPO_ROOT, executable file. - Execute directly instead of
bash -lc.
U5 — CI smoke job
- Add
recurring-scrape-smokejob: installjq, run allscripts/tests/*-smoke.shscripts.
Test Scenarios
- Smoke: unordered archive still passes
--afterwith max ID. - Smoke: invalid
--cronexits non-zero before crontab mutation. - CI: new job passes on clean branch.
Out of Scope
- Changing upstream PR approval for fork workflows.
- Replacing
%qcron job line construction (values are operator-controlled paths).
Success Criteria
- All existing smoke scripts pass locally.
- Residual P0/P1 script findings from review are resolved or explicitly superseded by this plan.