mirror of
https://github.com/Tyrrrz/DiscordChatExporter.git
synced 2026-06-09 15:52:37 -06:00
OOM/aborted channel exports no longer delete partial temp downloads. Salvage uses grep boundary repair with python merge/validate for files over 64 MiB. Retain stale temps when merge fails instead of discarding.
1.5 KiB
| title | type | status | date | origin |
|---|---|---|---|---|
| fix: Preserve partial exports on OOM skip; large-file salvage | fix | complete | 2026-06-03 | /lfg — yes_general re-downloads because OOM skip deletes partial temp; salvage fails on 500MB+ JSON |
fix: Preserve partial exports on OOM skip; large-file salvage
Problem
- OOM skip discards progress: When export exits 134/137/139, `scrape_target` SKIPs the channel and `rm -rf`s the temp dir, losing partial downloads (514 MB, 1 GB).
- Salvage fails on large files: Python marker salvage + `jq empty` on 500 MB+ truncated JSON fails in container (`mktemp` / memory).
- Re-download loop: Stale temps discarded → incremental starts from 2021 archive cursor → 35+ min re-fetch every run.
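The exit codes in the first bullet follow the shell convention of 128 plus the signal number: 134 is SIGABRT, 137 is SIGKILL (what the kernel OOM killer sends), and 139 is SIGSEGV. A minimal sketch of the keep-or-delete decision, assuming a hypothetical helper `should_keep_temp` that is not part of the actual `scrape_target` script:

```python
# Exit status 128 + N means the process was terminated by signal N.
OOM_LIKE_EXITS = {
    134: "SIGABRT",  # 128 + 6
    137: "SIGKILL",  # 128 + 9, sent by the kernel OOM killer
    139: "SIGSEGV",  # 128 + 11
}


def should_keep_temp(exit_code: int) -> bool:
    """Hypothetical policy: keep the temp dir when the export died abnormally,
    so the next run can try to salvage the partial download."""
    return exit_code in OOM_LIKE_EXITS
```

With this policy, a channel whose export was killed at 1 GB keeps its temp dir, while the handling of other exit codes is left to the existing script.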
Requirements
| ID | Requirement |
|---|---|
| R1 | On SKIPPED export (exit 2), do not delete temp dir — leave for next-run salvage |
| R2 | `salvage_truncated_json` uses `grep`/`head` boundary repair; `mktemp` uses `${TMPDIR:-/tmp}` |
| R3 | Skip full-file `jq empty` on exports > 64 MiB; validate via Python message-count probe |
| R4 | Large merge (> 64 MiB combined) uses Python id-merge instead of `jq` |
| R5 | Smoke tests pass; salvage-stale smoke unchanged |
| R6 | Salvage the current 1 GB yes_general temp, merge into archive, verify `--after` advances |
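R2 through R4 can be illustrated in Python. This is a sketch under stated assumptions, not the script's actual implementation: it swaps the `grep`/`head` pipeline for an equivalent line scan, and it assumes pretty-printed JSON in which every element of a top-level `messages` array closes on a line equal to four spaces plus `}` (optionally followed by a comma). The function names `repair_truncated_export` and `merge_messages_by_id` are hypothetical.

```python
def repair_truncated_export(src_path, dst_path, boundary="    }"):
    """Copy src up to the last complete message, then close the array and object.

    Returns the number of complete messages found (0 means nothing was written).
    Only one line is held in memory at a time, so file size does not matter.
    """
    marker = boundary.encode()
    last_end = None  # byte offset just past the last complete message's "}"
    count = 0
    offset = 0
    with open(src_path, "rb") as src:
        for line in src:
            stripped = line.rstrip(b"\r\n")
            if stripped == marker or stripped == marker + b",":
                last_end = offset + len(marker)
                count += 1
            offset += len(line)
    if last_end is None:
        return 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = last_end
        while remaining > 0:
            chunk = src.read(min(remaining, 1 << 20))  # copy in 1 MiB chunks
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)
        # Assumes "messages" is the last key of the top-level object.
        dst.write(b"\n  ]\n}\n")
    return count


def merge_messages_by_id(archived, fresh):
    """Merge two message lists by "id"; entries from `fresh` win on conflict.

    Discord ids are snowflakes, so sorting numerically gives chronological order.
    """
    by_id = {m["id"]: m for m in archived}
    by_id.update({m["id"]: m for m in fresh})
    return sorted(by_id.values(), key=lambda m: int(m["id"]))
```

The count returned by the repair step could serve as the message-count probe in R3, avoiding a full `jq empty` pass on large files. The merge shown here loads both lists into memory; a production version for multi-gigabyte archives would likely need to stream.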
Verification
./scripts/tests/run-discord-scrape-smoke.sh
DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh
# After the merge, an incremental run should show a recent dateRange.after, not 2021
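The last check can also be scripted. A small sketch, assuming the export JSON carries a top-level `dateRange.after` value as an ISO 8601 string (which begins with the four-digit year); the helper name `after_cursor_year` is hypothetical:

```python
from typing import Optional


def after_cursor_year(export: dict) -> Optional[int]:
    """Return the year of dateRange.after, or None when no cursor is set."""
    after = (export.get("dateRange") or {}).get("after")
    if not after:
        return None
    # ISO 8601 timestamps start with the year, so no full date parsing is needed.
    return int(after[:4])
```

A result of 2021 after the merge would indicate that the archive cursor has not advanced and the re-download loop is still in effect.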