feat(scrape): sync token from GUI Settings.dat for live exports

Add sync-token-from-gui.sh; bootstrap points to it when channels are
forbidden. Verified live incremental scrape on eod_discord with GUI token.
Boden 2026-05-29 14:05:45 -05:00
parent 8c7ae90f3f
commit a0db7aec52
4 changed files with 108 additions and 2 deletions

@@ -0,0 +1,37 @@
---
title: "feat: GUI token sync and live append proof"
type: feat
status: completed
date: 2026-05-29
origin: LFG — Settings.dat holds user token; prove incremental scrape works beyond bot-token 403 skips
---
# feat: GUI token sync and live append proof
## Summary
Sync `scrape.env` from DiscordChatExporter GUI `Settings.dat`, add `sync-token-from-gui.sh`, wire bootstrap to suggest it, and verify one live append-only scrape with grow-only proof.
## Requirements
| ID | Requirement |
|----|-------------|
| R1 | `sync-token-from-gui.sh` writes `scrape.env` from discovered GUI token |
| R2 | Bootstrap mentions GUI sync when forbidden channels detected |
| R3 | Live preflight succeeds with GUI token on at least one target |
| R4 | `prove-incremental-append.sh` passes on one Documents target |
| R5 | linux-x64 `docker-compose` wrapper delegates to source repo |
## Implementation Units
### U1. GUI token sync script
**Files:** `scripts/sync-token-from-gui.sh`
### U2. Bootstrap + linux-x64 wrapper
**Files:** `scripts/bootstrap-recurring-scrape.sh`, `DiscordChatExporter.linux-x64/docker-compose.sh`
### U3. Live verification (runtime)
**Commands:** sync token, scrape one target, prove append
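R4 and U3 rely on a grow-only proof, but this commit does not show what `prove-incremental-append.sh` checks. One plausible reading is that every message ID present before the scrape is still present afterwards, in the same order, with new IDs only appended. The sketch below illustrates that reading on plain ID lists (one ID per line). The helper `is_grow_only` and the sample IDs are hypothetical and are not taken from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a grow-only check; not the repository's proof script.
# Inputs are two text files with one message ID per line, taken before and
# after a scrape. How the IDs are extracted from the JSON archive is out of scope.

is_grow_only() {
  local before=$1 after=$2
  local n_before n_after
  # Arithmetic expansion strips the padding some wc implementations emit.
  n_before=$(( $(wc -l < "$before") ))
  n_after=$(( $(wc -l < "$after") ))
  # The archive must never shrink.
  (( n_after >= n_before )) || return 1
  # The old ID list must be an exact prefix of the new one.
  head -n "$n_before" "$after" | cmp -s - "$before"
}

before_ids=$(mktemp)
after_ids=$(mktemp)
printf '%s\n' 1001 1002 > "$before_ids"
printf '%s\n' 1001 1002 1003 > "$after_ids"
is_grow_only "$before_ids" "$after_ids" && echo "append-only: ok"
rm -f "$before_ids" "$after_ids"
```

A prefix comparison is stricter than a size comparison: it also catches an archive that kept its length but had earlier messages rewritten or reordered.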

@@ -4,7 +4,7 @@ Use this after cloning or opening the **source** repo (`DiscordChatExporter`, no
 ## One-time setup
-1. `cp scrape.env.example scrape.env` and set `DISCORD_TOKEN` (user token recommended for guild history).
+1. `cp scrape.env.example scrape.env` and set `DISCORD_TOKEN`, or `./scripts/sync-token-from-gui.sh --force` (reads GUI `Settings.dat`).
 2. `./scripts/bootstrap-recurring-scrape.sh --dry-run` — confirm every **enabled** target has seeded JSON under `output_dir`.
 3. `./scripts/bootstrap-recurring-scrape.sh` — verify archives, build image, preflight Discord.
 4. `./scripts/run-documents-scrape.sh` — first incremental append-only scrape.

@@ -149,7 +149,8 @@ main() {
   if grep -q 'inaccessible, but .* seeded archive' "$preflight_log" \
     || grep -qiE 'failed: forbidden|Missing Access' "$preflight_log"; then
     printf '\nToken note: many channels returned forbidden. That usually means a bot token without message-history access.\n'
-    printf ' For live incremental downloads, put a user token in %s (see .docs/Token-and-IDs.md).\n' "$ENV_FILE"
+    printf ' For live incremental downloads, run: %s --force\n' "$REPO_ROOT/scripts/sync-token-from-gui.sh"
+    printf ' Or put a user token in %s (see .docs/Token-and-IDs.md).\n' "$ENV_FILE"
     printf ' Append-only archives are still safe: existing JSON is updated in place and never fully re-downloaded.\n'
   fi
   rm -f "$preflight_log"
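The detection in this hunk can be tried in isolation. The sketch below wraps the same two `grep` calls in a function and runs it against a sample log line. The function name `needs_token_note` and the sample log text are assumptions for illustration; the real preflight output may be worded differently.

```bash
#!/usr/bin/env bash
# Sketch: reuse the hunk's two grep patterns to decide whether the token note
# should be printed. needs_token_note is a hypothetical name, and the sample
# log line is made up rather than captured from DiscordChatExporter.

needs_token_note() {
  local log_file=$1
  # First pattern is case-sensitive; the second is extended and case-insensitive.
  grep -q 'inaccessible, but .* seeded archive' "$log_file" \
    || grep -qiE 'failed: forbidden|Missing Access' "$log_file"
}

sample_log=$(mktemp)
printf '%s\n' 'Channel 123 failed: Forbidden' > "$sample_log"
if needs_token_note "$sample_log"; then
  echo "token note would be printed"
fi
rm -f "$sample_log"
```

Because the second `grep` uses `-i`, `Forbidden`, `forbidden`, and `FORBIDDEN` all trigger the note, while the first pattern only matches the exact lowercase wording.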

scripts/sync-token-from-gui.sh Executable file
@@ -0,0 +1,68 @@
#!/usr/bin/env bash
set -Eeuo pipefail
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)
REPO_ROOT="${DCE_REPO_ROOT:-$(cd "$SCRIPT_DIR/.." && pwd -P)}"
ENV_FILE="${DCE_ENV_FILE:-$REPO_ROOT/scrape.env}"
DISCOVER="$REPO_ROOT/scripts/discover-discord-token.sh"
SETUP_AUTH="$REPO_ROOT/scripts/setup-scrape-auth.sh"
FORCE=0
usage() {
  cat <<EOF
Usage:
$(basename "$0") [--env-file PATH] [--force]
Write scrape.env from the DiscordChatExporter GUI Settings.dat or other
discovered token sources (see scripts/discover-discord-token.sh).
Options:
--env-file PATH Output env file (default: scrape.env)
--force Overwrite existing scrape.env
--help Show this help text
EOF
}
die() {
  printf 'ERROR: %s\n' "$*" >&2
  exit 1
}
main() {
  while (($#)); do
    case "$1" in
      --env-file)
        [[ $# -ge 2 ]] || die "Missing value for --env-file."
        ENV_FILE=$2
        shift 2
        ;;
      --force)
        FORCE=1
        shift
        ;;
      --help|-h)
        usage
        exit 0
        ;;
      *)
        die "Unknown option: $1"
        ;;
    esac
  done
  [[ -x "$DISCOVER" ]] || die "Missing discover script: $DISCOVER"
  if [[ -f "$ENV_FILE" && $FORCE -eq 0 ]]; then
    die "$ENV_FILE already exists. Use --force to replace it from the GUI token."
  fi
  local token
  token=$("$DISCOVER") || die "Could not discover a Discord token. Set DISCORDCHATEXPORTER_SETTINGS_PATH or export DISCORD_TOKEN."
  DISCORD_TOKEN=$token "$SETUP_AUTH" --env-file "$ENV_FILE" --force
  chmod 600 "$ENV_FILE" 2>/dev/null || true
  printf 'Updated %s from discovered GUI/client token.\n' "$ENV_FILE"
}
main "$@"
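The script's two safety behaviours, refusing to overwrite an existing env file without `--force` and tightening permissions on the result, can be sketched without the repository's helper scripts. In the sketch below, `write_env_file` is a hypothetical stand-in for the call to `setup-scrape-auth.sh`, and the `DISCORD_TOKEN=value` file layout is an assumption; the real format is whatever that helper writes.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the overwrite guard and chmod step. write_env_file is
# not part of the repository, and the KEY=value layout of scrape.env is assumed.

write_env_file() {
  local env_file=$1 token=$2 force=${3:-0}
  if [[ -f "$env_file" && $force -eq 0 ]]; then
    printf 'ERROR: %s already exists. Use --force to replace it.\n' "$env_file" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077 && printf 'DISCORD_TOKEN=%s\n' "$token" > "$env_file" )
  # An existing file keeps its old mode on redirect, so tighten it explicitly.
  chmod 600 "$env_file"
}

demo_dir=$(mktemp -d)
write_env_file "$demo_dir/scrape.env" "placeholder-token"
write_env_file "$demo_dir/scrape.env" "other-token" || echo "refused without force"
write_env_file "$demo_dir/scrape.env" "other-token" 1 && echo "replaced with force"
rm -rf "$demo_dir"
```

Setting the umask before the write means a newly created file is never readable by other users, even briefly; the `chmod` covers the case where the file already existed with looser permissions.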