mirror of
https://github.com/Tyrrrz/DiscordChatExporter.git
synced 2026-06-10 00:02:37 -06:00
Preflight probes skip forbidden channels when seeded archives exist. Cron installer passes container config path and supports --config override. Compose and docs align with append-only ~/Documents scrape workflow.
68 lines
2.9 KiB
Markdown
---
title: feat: Documents recurring scrape verification and operator closure
type: feat
status: completed
date: 2026-05-29
origin: LFG — Docker/cron append-only Discord scrape for ~/Documents archive folders
---

# feat: Documents recurring scrape verification and operator closure

## Summary

Close the recurring Discord scrape vertical slice: source-built Docker image, compose mounts for `config/scrape-targets.json` and `/home/brunner56/Documents` archives, append-only JSON merge in `scripts/run-discord-scrape.sh`, monthly cron via `scripts/setup-cron.sh`, and runtime proof (preflight + incremental scrape on at least one enabled target).

## Problem Frame

Operators need monthly (configurable) incremental exports into existing `~/Documents/*_discord*` folders without re-downloading full history or overwriting archives when Discord deletes messages server-side. Infrastructure exists on `feat/recurring-cli-scrape`; this pass validates end-to-end behavior and documents the operator path.

## Requirements

| ID | Requirement |
|----|-------------|
| R1 | `Dockerfile` builds `DiscordChatExporter.Cli` from source; compose mounts config, scripts, and `archive_root` |
| R2 | `config/scrape-targets.json` maps user Documents folders; empty `channel_ids` exports all accessible channels per target |
| R3 | `run-discord-scrape.sh` uses `--after` + merge-by-id; rejects shrink merges |
| R4 | `setup-cron.sh` defaults to monthly schedule; supports `--target`, `--guild`, `--channel`, `--interval`, `--cron` |
| R5 | `scrape.env` (gitignored) supplies token for compose; never commit secrets |
| R6 | Preflight and one-target scrape succeed against live Discord API |
| R7 | Smoke tests pass; operator docs list validation commands |
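The merge-by-id and shrink-rejection behavior in R3 can be illustrated with a minimal sketch. This is not the code in `run-discord-scrape.sh`; the function names `merge_by_id` and `reject_shrink`, the `ShrinkMergeError` class, and the sample messages are hypothetical, and the only assumption about the export format is that each message carries an `id` field.

```python
class ShrinkMergeError(Exception):
    """Raised when a candidate archive is missing previously stored ids."""


def merge_by_id(existing, incoming):
    """Append-only merge: archived messages win, unseen ids are added."""
    merged = {str(m["id"]): m for m in existing}
    for msg in incoming:
        # setdefault keeps the archived copy when the id is already known,
        # so a server-side edit or delete never rewrites stored history.
        merged.setdefault(str(msg["id"]), msg)
    # Discord ids are snowflakes, so numeric order is chronological order.
    return sorted(merged.values(), key=lambda m: int(m["id"]))


def reject_shrink(before, after):
    """Refuse to replace an archive with one that lost message ids."""
    lost = {str(m["id"]) for m in before} - {str(m["id"]) for m in after}
    if lost:
        raise ShrinkMergeError(f"{len(lost)} archived message(s) would be dropped")


# Hypothetical sample data: message 100 was deleted server-side,
# so the fresh export no longer contains it.
archived = [{"id": "100", "content": "hello"}, {"id": "200", "content": "world"}]
fresh = [{"id": "200", "content": "world (edited)"}, {"id": "300", "content": "new"}]

merged = merge_by_id(archived, fresh)
reject_shrink(archived, merged)  # passes: nothing was lost
print([m["id"] for m in merged])  # ['100', '200', '300']
```

Writing `fresh` directly over the archive would drop message 100; `reject_shrink(archived, fresh)` raises in that case, which is the failure mode the append-only wrapper is meant to prevent.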

## Scope Boundaries

- No changes to upstream C# merge API (wrapper-only append).
- Do not enable `discord_dms` without a user token.
- Token stays in `scrape.env` only.

## Implementation Units

### U1. Harden bootstrap and compose paths

**Requirements:** R1, R2

**Files:** `scripts/run-discord-scrape.sh`, `docker-compose.yml`, `Dockerfile`

**Test scenarios:** Archive seed files bootstrap the channel map; the compose bind mount resolves the host Documents path.
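One way to picture the seed-file bootstrap is the sketch below. It assumes each seeded archive is a JSON export with a top-level `channel` object containing an `id`; the function name `bootstrap_channel_map` and the shape of the resulting map (channel id to file name) are hypothetical and may differ from what `run-discord-scrape.sh` actually builds.

```python
import json
from pathlib import Path


def bootstrap_channel_map(archive_dir):
    """Map channel id -> seed file name for every readable JSON export."""
    channel_map = {}
    for path in sorted(Path(archive_dir).glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or non-JSON files instead of aborting
        channel = data.get("channel") if isinstance(data, dict) else None
        if isinstance(channel, dict) and "id" in channel:
            # The first file seen for a channel wins; later duplicates are ignored.
            channel_map.setdefault(str(channel["id"]), path.name)
    return channel_map
```

Skipping unreadable files keeps one damaged seed archive from blocking the whole target.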

### U2. Cron installer and docs alignment

**Requirements:** R4, R7

**Files:** `scripts/setup-cron.sh`, `.docs/Recurring-Scrape-Setup.md`, `Readme.md`

**Test scenarios:** `setup-cron.sh --dry-run` emits the monthly block; `--remove` is idempotent.
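The idempotency expected of install and `--remove` can be sketched with a marker-delimited crontab block. The marker strings, the function names, the `0 3 1 * *` schedule (03:00 on the first day of each month), and the command path are all assumptions for illustration; the actual `setup-cron.sh` may mark and schedule its block differently.

```python
# Hypothetical markers delimiting the managed block inside a crontab.
BEGIN = "# BEGIN discord-scrape"
END = "# END discord-scrape"


def remove_block(crontab):
    """Return the crontab text without the managed block (no-op if absent)."""
    kept, skipping = [], False
    for line in crontab.splitlines():
        if line.strip() == BEGIN:
            skipping = True
        elif line.strip() == END:
            skipping = False
        elif not skipping:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


def install_block(crontab, command, schedule="0 3 1 * *"):
    """Replace any existing managed block with a fresh one."""
    # Removing first is what makes repeated installs converge on one block.
    return remove_block(crontab) + f"{BEGIN}\n{schedule} {command}\n{END}\n"


existing = "0 * * * * /usr/bin/true\n"
once = install_block(existing, "/path/to/run-discord-scrape-host.sh")
twice = install_block(once, "/path/to/run-discord-scrape-host.sh")
print(once == twice)                     # True
print(remove_block(once) == existing)    # True
```

Because unrelated entries pass through untouched, running remove on a crontab that never had the block returns it unchanged.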

### U3. Runtime verification

**Requirements:** R5, R6

**Commands:** `docker compose build`, `run-discord-scrape-host.sh preflight`, and a `--target` scrape of the smallest enabled archive.

**Test scenarios:** Message count is non-decreasing after a scrape; logs show `--after` when the archive is non-empty.
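Both U3 test scenarios reduce to small checks, sketched here under assumptions. The source does not say whether the wrapper passes a date or a message id to `--after`; this sketch uses the newest archived message id, and the helper names `after_args` and `count_is_non_decreasing` are hypothetical.

```python
def after_args(messages):
    """Return ["--after", <newest id>] for a non-empty archive, else []."""
    if not messages:
        return []  # empty archive: no cursor, so the first run exports everything
    newest = max(messages, key=lambda m: int(m["id"]))
    return ["--after", str(newest["id"])]


def count_is_non_decreasing(before_count, after_count):
    """True when a scrape left at least as many messages as it found."""
    return after_count >= before_count


# Hypothetical archive contents for illustration.
archive = [{"id": "900"}, {"id": "1000"}]
print(after_args(archive))            # ['--after', '1000']
print(after_args([]))                 # []
print(count_is_non_decreasing(2, 2))  # True
```

Comparing ids as integers matters here: as strings, `"900"` would sort after `"1000"` and the cursor would point at the wrong message.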

## Verification Ladder

1. `bash -n` on changed shell scripts
2. `scripts/tests/setup-cron-smoke.sh`, `run-discord-scrape-smoke.sh`
3. `docker compose build` + preflight + single-target scrape