Recurring Discord Scrape Automation - Troubleshooting Guide
This guide covers common issues and their solutions.
Setup Issues
"Required file not found" Error
Symptoms: Setup fails with "Required file not found: /path/to/config.json"
Solutions:
- Verify config file exists:
    ls -la config/scrape-targets.json
- Check file permissions:
    chmod 644 config/scrape-targets.json
- Use absolute path in setup command:
    ./scripts/setup-cron.sh --config $(pwd)/config/scrape-targets.json
"Invalid JSON config" Error
Symptoms: Setup fails with "Invalid JSON config: ..."
Solutions:
- Validate JSON syntax:
    jq empty config/scrape-targets.json
- Common mistakes:
  - Trailing commas in arrays/objects
  - Unquoted keys
  - Missing closing braces
- Use an online JSON validator if needed
"DISCORD_TOKEN must be set" Error
Symptoms: Preflight or scrape fails with token error
Solutions:
- Set token in current session:
    export DISCORD_TOKEN="your-token-here"
    ./scripts/run-discord-scrape.sh preflight
- Or set in scrape.env and source it:
    source scrape.env
    ./scripts/run-discord-scrape.sh preflight
- Or use DISCORD_TOKEN_FILE for file-based tokens:
    export DISCORD_TOKEN_FILE="/path/to/token/file"
    chmod 600 /path/to/token/file
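Before running preflight, it can help to confirm which of the token sources above is actually visible to the shell. The following is a minimal sketch, not part of the project's scripts: `token_source` is a hypothetical helper, and the precedence it applies (DISCORD_TOKEN before DISCORD_TOKEN_FILE) is an assumption rather than documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the project's scripts.
# Prints which token source is configured: "env", "file", or "none".
# Assumption: DISCORD_TOKEN takes precedence over DISCORD_TOKEN_FILE.
token_source() {
  if [ -n "${DISCORD_TOKEN:-}" ]; then
    echo "env"
  elif [ -n "${DISCORD_TOKEN_FILE:-}" ] && [ -r "$DISCORD_TOKEN_FILE" ] && [ -s "$DISCORD_TOKEN_FILE" ]; then
    # The file must be readable and non-empty to count as a source.
    echo "file"
  else
    echo "none"
    return 1
  fi
}

# Report the result without ever printing the token itself.
if src="$(token_source)"; then
  echo "Token source: $src"
else
  echo "No token found: set DISCORD_TOKEN or point DISCORD_TOKEN_FILE at a readable, non-empty file" >&2
fi
```

The helper only reports where a token would come from; it never echoes the secret, so its output is safe to paste into a bug report.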
"Target output_dir is outside archive_root" Error
Symptoms: Setup fails with path validation error
Solution: Update config to ensure output_dir is under archive_root:
{
"archive_root": "/home/user/discord-archives",
"targets": [
{
"output_dir": "/home/user/discord-archives/target1" // ✓ Under archive_root
}
]
}
Not this:
{
"archive_root": "/home/user/discord-archives",
"targets": [
{
"output_dir": "/tmp/exports" // ✗ Outside archive_root
}
]
}
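The containment rule can be checked by hand before running setup. The sketch below is an illustration only: `is_under_root` is a hypothetical helper, and its purely lexical comparison is an assumption, since the guide does not show how the setup script validates paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when $2 is $1 itself or a path below it.
# Purely lexical: symlinks and ".." segments are not resolved.
is_under_root() {
  local root="${1%/}" dir="${2%/}"
  case "$dir/" in
    # Matching against "$root/" avoids treating /a/archives-old as inside /a/archives.
    "$root"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage with the two configurations shown above.
for dir in /home/user/discord-archives/target1 /tmp/exports; do
  if is_under_root /home/user/discord-archives "$dir"; then
    echo "ok: $dir"
  else
    echo "outside archive_root: $dir"
  fi
done
```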
Authentication Issues
"Guild discovery failed" Error
Symptoms: Preflight or scrape fails with guild discovery message
Causes:
- Using a bot token (cannot enumerate guilds)
- Invalid token
- Token lacks required permissions
Solutions:
- For bot tokens: Provide explicit guild and channel IDs:
    {
      "name": "my-target",
      "guild_ids": ["123456789"],
      "channel_ids": ["111222333"]
    }
- For user tokens: Ensure the token is valid:
  - Generate a new token from Discord Developer Portal
  - Test token validity:
      DISCORD_TOKEN=xxx ./scripts/run-discord-scrape.sh list-targets
- Check permissions:
  - Bot needs at least "Read Messages/View Channels" and "Read Message History"
  - User token needs access to the target guilds/channels
"Export ... belongs to channel XXX, expected YYY" Error
Symptoms: Scrape fails when updating an existing archive
Cause: Archive's embedded channel ID doesn't match the configured channel
Solutions:
- Update config to match archive:
  - Check the existing archive file for the correct channel ID
  - Update channel_ids in config
- Or move the archive:
    mv archive/old-location.json archive/target1/
- Or update the channel mapping manually:
    jq '.["111"] = "path/to/archive.json"' archive/.dce-meta/channel-map.json > tmp.json && mv tmp.json archive/.dce-meta/channel-map.json
Cron Schedule Issues
Cron Job Not Running
Symptoms: Cron job installed but exports aren't happening
Diagnostic steps:
- Verify cron is installed:
    crontab -l | grep discord-scrape
- Check if cron daemon is running:
    sudo systemctl status cron
    # or on macOS:
    sudo launchctl list | grep cron
- Check system logs:
    # Linux
    sudo grep CRON /var/log/syslog
    # or
    sudo grep discord-scrape /var/log/cron
    # macOS
    log stream --predicate 'eventMessage contains[c] "cron"'
- Test the script manually:
    source scrape.env
    bash scripts/run-discord-scrape-host.sh scrape
"No such file or directory" in Cron Logs
Symptoms: Cron log shows script not found even though it exists
Causes:
- Path in crontab uses relative paths
- Directory changed since cron was installed
- Script permissions changed
Solutions:
- Re-install cron with absolute paths:
    cd /path/to/DiscordChatExporter
    ./scripts/setup-cron.sh --config $(pwd)/config/scrape-targets.json
- Ensure script is executable:
    chmod +x scripts/run-discord-scrape-host.sh
    chmod +x scripts/run-discord-scrape.sh
    chmod +x scripts/setup-cron.sh
Cron Jobs Running at Wrong Time
Symptoms: Export runs at unexpected times
Solutions:
- Check timezone setting:
    date          # System time
    timedatectl   # System timezone
- Verify crontab schedule:
    crontab -l
- Update schedule:
    ./scripts/setup-cron.sh --interval "daily" --at "2:00"
- Validate cron expression at crontab.guru
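When comparing the installed crontab entry against the intended schedule, it helps to know what a daily HH:MM time looks like in standard five-field cron syntax (minute, hour, day of month, month, day of week). The sketch below is an illustration: `daily_cron_expr` is a hypothetical helper, and the guide does not show the exact line that setup-cron.sh writes, so treat the output as the expected schedule fields only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert "HH:MM" into a daily five-field cron schedule.
daily_cron_expr() {
  local hh="${1%%:*}" mm="${1##*:}"
  # 10# forces base 10 so values like "08" are not parsed as octal.
  printf '%d %d * * *\n' "$((10#$mm))" "$((10#$hh))"
}

daily_cron_expr "2:00"    # prints: 0 2 * * *
daily_cron_expr "14:30"   # prints: 30 14 * * *
```

Cron evaluates these fields in the system's local timezone, which is why the timezone check above comes first.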
Export Issues
Exports Complete but Produce Empty Files
Symptoms: Archive files created but contain minimal/no messages
Solutions:
- Verify channels are accessible:
    export DISCORD_TOKEN="your-token"
    ./scripts/run-discord-scrape.sh preflight
- Check channel permissions:
  - Ensure token has "Read Message History"
  - Verify channel is not archived/deleted
- Manual test export:
    ./scripts/run-discord-scrape.sh scrape --target target-name
"Archive is not valid JSON" Error
Symptoms: Existing archive file becomes corrupted
Solutions:
- Audit all archives for a target:
    ./scripts/audit-archive-json.sh --target target-name
- Validate one file:
    jq empty archive-file.json
- Truncated export (parse error mid-message): salvage drops the incomplete tail and keeps earlier messages. A timestamped .bak.* backup is created first:
    ./scripts/salvage-truncated-export.sh path/to/export.json
- If corrupted beyond salvage, restore from backup (if available)
- If no backup, move the archive aside and re-export:
    mv archive-file.json archive-file.json.bak
    ./scripts/run-discord-scrape.sh scrape --target target-name
Incremental Exports Are Too Slow
Symptoms: Each scheduled export takes several minutes
Solutions:
- Check API rate limiting:
  - Discord limits API calls per user
  - Too many frequent exports can trigger rate limiting
  - Increase interval between exports:
      --interval "weekly"
- Reduce scope:
  - Export only recent messages: configure the after date in the export
  - Split large channels into separate targets
- Check system resources:
  - Disk I/O bottleneck:
      iostat -x 1
  - CPU usage:
      top
  - Memory:
      free -h
Channel Export SKIPPED (OOM / Aborted / Killed)
Symptoms: Log shows SKIPPED for one channel, "Aborted (core dumped)", "Killed", or "out of memory"; other channels in the target may still succeed.
Cause: A large multi-year catch-up (for example KotOR yes_general) builds a large in-memory JSON export inside the container, which can exhaust the memory available to it. Partial progress is kept under output_dir/.dce-temp/ for salvage on the next run.
Solutions:
- Salvage partial temps before re-scraping (avoids re-downloading from the archive cursor):
    ./scripts/scrape-lock-status.sh
    ./scripts/operator-handoff.sh --salvage-only --target KotOR_discord_msgs --channel 221726893064454144
- Raise container memory in scrape.env (default 0 = no compose cap):
    # scrape.env
    DCE_CONTAINER_MEMORY=8g
  Then run a channel-scoped catch-up:
    DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
      --salvage-before-scrape \
      --target KotOR_discord_msgs \
      --channel 221726893064454144 \
      --log-file logs/kotor-yes-general.log
- Ensure only one scrape holds {archive_root}/.dce-scrape.lock (see next section).
- Confirm host disk headroom: merges need temporary space on the archive volume (df -h ~/Documents).
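The disk headroom check can be scripted so that it runs before a long catch-up. This is a rough sketch under stated assumptions: `free_mb` and `has_headroom` are hypothetical helpers, the 1024 MiB threshold is arbitrary, and treating a threshold of 0 as "check disabled" is a guess at what DCE_MIN_FREE_MB=0 means rather than documented behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print free space, in MiB, on the filesystem holding $1.
free_mb() {
  # -P forces the portable one-line-per-filesystem format; -k reports 1 KiB blocks.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed when the path has at least $2 MiB free.
# Assumption: a threshold of 0 disables the check.
has_headroom() {
  local path="$1" min_mb="${2:-0}"
  [ "$min_mb" -eq 0 ] && return 0
  [ "$(free_mb "$path")" -ge "$min_mb" ]
}

# ARCHIVE_ROOT is an illustrative variable name; it defaults to the current directory.
archive_root="${ARCHIVE_ROOT:-.}"
if has_headroom "$archive_root" 1024; then
  echo "Enough space: $(free_mb "$archive_root") MiB free"
else
  echo "Low disk space: $(free_mb "$archive_root") MiB free" >&2
fi
```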
Scrape Lock Already Held
Symptoms: "Scrape lock is held" or "Another scrape is already running" when starting a validation or documents scrape.
Cause: Only one scrape should run per archive_root. A long validation, cron job, or a second checkout (for example Downloads vs MyBook) can hold {archive_root}/.dce-scrape.lock.
Solutions:
- Inspect lock state:
    ./scripts/scrape-lock-status.sh
- Wait for the active scrape to finish if PID is live.
- Reclaim stale lock after a crash (only when status shows stale/free):
    ./scripts/scrape-lock-status.sh --reclaim-stale
- Do not delete the lock while a scrape is still running: twin exports can OOM-loop on the same channel.
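The free, live, and stale states can be told apart with a simple PID liveness test. The sketch below illustrates the idea only: `lock_state` is a hypothetical helper, and it assumes the lock file stores the holder's PID on its first line, which the guide does not confirm. Prefer scrape-lock-status.sh for real decisions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real lock format may differ.
# Assumption: the lock file's first line is the PID of the scrape that holds it.
# Prints "free" (no lock file), "live" (PID is running), or "stale" (PID is gone).
lock_state() {
  local lock="$1" pid
  if [ ! -e "$lock" ]; then
    echo "free"
    return 0
  fi
  # Keep only the digits from the first line.
  pid="$(head -n 1 "$lock" 2>/dev/null | tr -cd '0-9')"
  # kill -0 sends no signal; it only tests whether the PID can be signalled.
  # A process owned by another user also fails this test and is reported as stale.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "live"
  else
    echo "stale"
  fi
}

# ARCHIVE_ROOT is an illustrative variable name; it defaults to the current directory.
lock_state "${ARCHIVE_ROOT:-.}/.dce-scrape.lock"
```

Because a PID can be reused by an unrelated process after a crash, a "live" result is a reason to investigate, not proof that a scrape is running.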
Partial Export Stuck in .dce-temp
Symptoms: Large folder under output_dir/.dce-temp/export.<channel_id>.*; archive cursor not advancing; audit excludes .dce-temp (expected).
Solutions:
- Stop any active export writing that temp (check lock status and running podman/docker processes).
- Salvage quiescent temps (default skips temps modified in the last ~120s):
    ./scripts/run-documents-scrape.sh --salvage-only --target NAME [--channel ID]
- Force salvage of an active temp only after confirming nothing is writing:
    DCE_SALVAGE_ACTIVE_TEMPS=1 ./scripts/run-documents-scrape.sh --salvage-only --target NAME --channel ID
- Truncated JSON in the archive file itself (not .dce-temp):
    ./scripts/salvage-truncated-export.sh path/to/archive.json
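To see which temps would count as quiescent before running a salvage, the modification-time test can be approximated by hand. This is a sketch under assumptions: `is_quiescent` is a hypothetical helper, OUTPUT_DIR is an illustrative variable name, and the salvage script's own check may differ from this one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when $1 was last modified at least $2 seconds ago
# (default 120, matching the approximate window mentioned above).
# For a directory, the mtime reflects only added or removed entries,
# not writes to files inside it.
is_quiescent() {
  local path="$1" min_age="${2:-120}" mtime now
  # GNU stat uses -c %Y; BSD/macOS stat uses -f %m.
  mtime="$(stat -c %Y "$path" 2>/dev/null || stat -f %m "$path" 2>/dev/null)" || return 2
  now="$(date +%s)"
  [ $(( now - mtime )) -ge "$min_age" ]
}

# List temps for one target and label each as quiescent or recently modified.
for tmp in "${OUTPUT_DIR:-.}"/.dce-temp/export.*; do
  [ -e "$tmp" ] || continue
  if is_quiescent "$tmp"; then
    echo "quiescent: $tmp"
  else
    echo "recently modified: $tmp"
  fi
done
```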
"Failed to write archive" or Permission Denied
Symptoms: Export fails with write permission errors
Solutions:
- Check directory permissions:
    ls -la archive/target-name/
    chmod 755 archive/target-name/
    chmod 644 archive/target-name/*.json
- If using Docker/Podman, set user mode:
    # For rootless podman
    export DCE_USERNS_MODE=keep-id
    export DCE_UID=$(id -u)
    export DCE_GID=$(id -g)
- Check SELinux (if enabled):
    getenforce
    # If "Enforcing", add `:z` to mount options:
    # docker-compose.yml should already have this
Docker/Container Issues
"Failed to build image" Error
Symptoms: Docker build fails during setup
Solutions:
- Verify Docker is running:
    docker ps
    docker version
- Check disk space:
    docker system df
- Clean up and retry:
    docker system prune -a
    docker-compose build --no-cache
- If using Podman:
    podman system prune -a
    podman-compose build --no-cache
"Cannot connect to Docker daemon" Error
Symptoms: Setup fails to reach Docker
Solutions:
- For Docker:
    sudo systemctl start docker
    sudo usermod -aG docker $USER
    newgrp docker
- For Podman (rootless):
    systemctl --user start podman
    systemctl --user enable podman
Authorization / Token Refresh Issues
Host Retry Auth Flow Not Working
Symptoms: Export fails with 401/403 errors even with DISCORD_TOKEN_FILE set
Solutions:
- Verify token file is readable (without printing the token to the terminal):
    test -r "$DISCORD_TOKEN_FILE" && echo "readable"
- Ensure proper permissions:
    chmod 600 "$DISCORD_TOKEN_FILE"
- Check token is fresh:
  - Tokens can expire
  - Generate a new token from Discord Developer Portal
  - Update the token file
- Verify host wrapper is being used:
    grep run-discord-scrape-host scripts/run-discord-scrape-host.sh
Getting Help
If you're still stuck:
- Check existing issues: https://github.com/Tyrrrz/DiscordChatExporter/issues
- Run preflight in verbose mode:
    # bash -x traces each command the script runs
    bash -x ./scripts/run-discord-scrape.sh preflight
- Check logs:
    # Docker logs
    docker-compose logs --tail 50
    # Cron logs (on Linux)
    sudo journalctl -u cron --since "1 hour ago"
- Collect error details for reporting issues:
  - Config (sanitize token)
  - Full error message
  - OS/Docker version
  - Steps to reproduce