docs(scrape): add merge readiness index and doc cross-links

Single reviewer/operator page for the recurring scrape feature with
validation commands; link from root and .docs indexes.
This commit is contained in:
Boden 2026-05-29 14:14:44 -05:00
parent 89091d76ef
commit 927d5e9607
5 changed files with 67 additions and 1 deletions

View file

@ -15,6 +15,10 @@
- [Windows](Scheduling-Windows.md)
- [macOS](Scheduling-MacOS.md)
- [Linux](Scheduling-Linux.md)
- Recurring append-only scrape (Docker + cron, this fork):
- [Setup guide](Recurring-Scrape-Setup.md)
- [Troubleshooting](Recurring-Scrape-Troubleshooting.md)
- [Merge readiness](../docs/recurring-scrape-merge-readiness.md)
## Video tutorial

View file

@ -81,7 +81,7 @@ To learn more about the war and how you can help, [click here](https://tyrrrz.me
## See also
- [**Recurring Exports**](.docs/Recurring-Scrape-Setup.md) — automated scheduled exports using cron (Linux/macOS). From the source repo run `./scripts/bootstrap-recurring-scrape.sh` (verify, build, preflight). If you only have the GUI zip (`DiscordChatExporter.linux-x64`), use `./bootstrap-recurring-scrape.sh` or `scripts/scrape-here.sh` in the sibling source repository.
- [**Recurring Exports**](.docs/Recurring-Scrape-Setup.md) — append-only incremental JSON exports via Docker/cron (Linux/macOS). From the source repo: `./scripts/bootstrap-recurring-scrape.sh`, then `./scripts/run-documents-scrape.sh`; validate with `./scripts/run-all-smokes.sh`. GUI zip users: `../DiscordChatExporter.linux-x64/RECURRING-SCRAPE.md`. Maintainer summary: [docs/recurring-scrape-merge-readiness.md](docs/recurring-scrape-merge-readiness.md).
- [**Documented solutions**](docs/solutions/) — searchable learnings (append-only scrape, Docker/cron workflow); YAML frontmatter: `module`, `tags`, `problem_type`
- [**Chat Analytics**](https://github.com/mlomb/chat-analytics) — solution for analyzing chat patterns of Discord users, using exports produced by **DiscordChatExporter**.
- [**DiscordChatExporter-frontend**](https://github.com/slatinsky/DiscordChatExporter-frontend) — convenient viewer for exports produced by **DiscordChatExporter**.

View file

@ -0,0 +1,25 @@
---
title: docs: Merge readiness index and doc cross-links
type: docs
status: complete
date: 2026-05-29
origin: Repeated /lfg — feature stack complete; surface merge/operator entrypoints
---
# docs: Merge readiness index and doc cross-links
## Summary
Recurring scrape automation is implemented and tested. Add a merge-readiness doc for reviewers and wire documentation indexes so operators find setup, troubleshooting, and validation in one hop.
## Requirements
| ID | Requirement |
|----|-------------|
| R1 | `docs/recurring-scrape-merge-readiness.md` summarizes feature, validation commands, operator flow |
| R2 | `.docs/Readme.md` links recurring scrape setup and troubleshooting |
| R3 | Root `Readme.md` See also mentions `run-all-smokes.sh` validation |
## Verification
- `./scripts/run-all-smokes.sh` passes

View file

@ -0,0 +1,35 @@
# Recurring scrape — merge readiness
Fork branch `feat/recurring-cli-scrape` adds append-only, Docker-based incremental exports with optional monthly cron. Intended for personal archive trees under a configurable `archive_root` (for example `~/Documents/*`).
## What ships
- **Config:** `config/scrape-targets.json` — per-server `output_dir`, optional `channel_ids`, `enabled` flags
- **Core:** `scripts/run-discord-scrape.sh` — incremental `--after`, merge-by-id, fail-closed path safety
- **Host:** `scripts/run-discord-scrape-host.sh`, `scripts/run-documents-scrape.sh`, `scripts/bootstrap-recurring-scrape.sh`
- **Auth:** `scrape.env`, `scripts/setup-scrape-auth.sh`, `scripts/sync-token-from-gui.sh`
- **Cron:** `scripts/setup-cron.sh` (`--interval monthly` default)
- **Integrity:** `scripts/audit-archive-json.sh`, `scripts/salvage-truncated-export.sh`, `scripts/prove-incremental-append.sh`
- **CI:** `.github/workflows/main.yml` job `recurring-scrape-smoke` runs `./scripts/run-all-smokes.sh`
## Validate before merge
```bash
./scripts/run-all-smokes.sh
./scripts/run-all-smokes.sh --include-container # optional; needs Docker/Podman
```
## Operator quick path
```bash
cp scrape.env.example scrape.env # or ./scripts/sync-token-from-gui.sh --force
./scripts/bootstrap-recurring-scrape.sh
./scripts/run-documents-scrape.sh
./scripts/setup-cron.sh --dry-run
```
Detail: [.docs/Recurring-Scrape-Setup.md](../.docs/Recurring-Scrape-Setup.md) · [operator checklist](recurring-scrape-operator-checklist.md) · [troubleshooting](../.docs/Recurring-Scrape-Troubleshooting.md)
## CI note (fork PRs)
Upstream workflows may show `action_required` for cross-repo PRs from `th3w1zard1/DiscordChatExporter` until a maintainer approves workflow runs. Local `run-all-smokes.sh` is the authoritative offline gate.

View file

@ -39,4 +39,6 @@ Validate scripts after changes:
./scripts/run-all-smokes.sh
```
Merge / review summary: [recurring-scrape-merge-readiness.md](recurring-scrape-merge-readiness.md)
Full detail: [.docs/Recurring-Scrape-Setup.md](../.docs/Recurring-Scrape-Setup.md)