Mirror of https://github.com/Tyrrrz/DiscordChatExporter.git (synced 2026-06-09 15:52:37 -06:00)
feat(scrape): optional DCE_CONTAINER_MEMORY compose mem_limit
Operators can raise the scrape container memory cap for large channel catch-up (e.g. yes_general) via scrape.env without changing default runs.
parent 88267c835c
commit 69ce1ca539
@@ -4,6 +4,8 @@ services:
       context: .
       dockerfile: Dockerfile
     image: discordchatexporter-cron:local
+    # 0 = no cap (default). Set DCE_CONTAINER_MEMORY=8g in scrape.env for large channel catch-up.
+    mem_limit: ${DCE_CONTAINER_MEMORY:-0}
     init: true
     user: "${DCE_UID:-1000}:${DCE_GID:-1000}"
     userns_mode: "${DCE_USERNS_MODE:-}"
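Note on the default: Compose interpolation uses the same `:-` form as the shell, so an unset or empty `DCE_CONTAINER_MEMORY` resolves to `0`. A minimal sketch of that behavior using the equivalent shell expansion; the helper name `resolve_mem_limit` is hypothetical and not part of this commit.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): mirrors the compose default
# ${DCE_CONTAINER_MEMORY:-0} with the equivalent shell parameter expansion.
resolve_mem_limit() {
  printf '%s\n' "${DCE_CONTAINER_MEMORY:-0}"
}

unset DCE_CONTAINER_MEMORY
resolve_mem_limit            # unset: falls back to 0

DCE_CONTAINER_MEMORY=""
resolve_mem_limit            # empty: also falls back to 0

DCE_CONTAINER_MEMORY=8g
resolve_mem_limit            # set: the operator's value is used
```

Because empty and unset behave the same way, a blank `DCE_CONTAINER_MEMORY=` line in `scrape.env` should leave default runs uncapped.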
@@ -0,0 +1,67 @@
+---
+title: "feat: Optional container memory limit for large channel exports"
+type: feat
+status: complete
+date: 2026-06-04
+origin: /lfg — yes_general OOM repeatedly deferred; operators need a documented knob without changing default scrape behavior
+---
+
+# feat: Optional container memory limit for large channel exports
+
+## Summary
+
+Add `DCE_CONTAINER_MEMORY` so operators can raise the scrape container memory cap for multi-year catch-up channels like KotOR `yes_general` without affecting default runs (unlimited / runtime default when unset).
+
+## Problem
+
+`yes_general` (`221726893064454144`) legitimately fetches years of history on first catch-up. The .NET exporter inside the container OOMs on large in-memory JSON builds. Plans 043–051 preserved partial temps and salvage paths, but every full export retry still hits the same memory ceiling unless the operator manually tweaks Podman/Docker.
+
+## Requirements
+
+| ID | Requirement |
+|----|-------------|
+| R1 | `docker-compose.yml` applies `mem_limit` from `DCE_CONTAINER_MEMORY` (0 = no compose cap) |
+| R2 | `run-discord-scrape-host.sh` passes `DCE_CONTAINER_MEMORY` into compose env temp when set in shell or `scrape.env` |
+| R3 | `scrape.env.example` documents `DCE_CONTAINER_MEMORY` with yes_general example (`8g`) |
+| R4 | Operator docs mention the knob for large-channel catch-up |
+| R5 | Host smoke asserts compose env receives `DCE_CONTAINER_MEMORY=8g` when configured |
+| R6 | `DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh` passes (21/21) |
+
+## Implementation Units
+
+### U1. Compose memory limit wiring
+
+**Files:** `docker-compose.yml`, `scripts/run-discord-scrape-host.sh`, `scrape.env.example`
+
+- Service `mem_limit: ${DCE_CONTAINER_MEMORY:-0}` (0 = unlimited for Docker/Podman)
+- `write_compose_env_temp` writes `DCE_CONTAINER_MEMORY` (explicit value or `0`)
+- Host usage text documents env var
+
+### U2. Operator documentation
+
+**Files:** `docs/recurring-scrape-operator-checklist.md`, `docs/recurring-scrape-merge-readiness.md`
+
+- yes_general section: set `DCE_CONTAINER_MEMORY=8g` (or host-appropriate) before channel-scoped validation
+- Merge-readiness plan 063 stamp
+
+### U3. Smoke coverage
+
+**Files:** `scripts/tests/run-discord-scrape-host-smoke.sh`
+
+- Fake compose logs loaded `DCE_CONTAINER_MEMORY` from compose env file
+- Assert `8g` when set in scrape.env fixture
+
+## Verification
+
+```bash
+./scripts/tests/run-discord-scrape-host-smoke.sh
+DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh
+```
+
+## Scope Boundaries
+
+### Deferred
+
+- Live KotOR catch-up execution inside LFG
+- Per-channel memory overrides in `scrape-targets.json`
+- Streaming export to avoid in-memory JSON (upstream DCE feature)
@@ -144,7 +144,15 @@ docker compose build # or podman-compose build
 DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh --target KotOR_discord_msgs
 ```
 
-Large `yes_general` may still skip; export that channel separately with more container memory if needed.
+Large `yes_general` may still skip without a higher container cap; set `DCE_CONTAINER_MEMORY=8g` in `scrape.env` and export that channel separately:
+
+```bash
+# scrape.env: DCE_CONTAINER_MEMORY=8g
+DCE_MIN_FREE_MB=0 ./scripts/run-operator-validation.sh \
+  --salvage-before-scrape --target KotOR_discord_msgs --channel 221726893064454144
+```
+
+**Plan 063 (2026-06-04):** Optional `DCE_CONTAINER_MEMORY` compose `mem_limit` for large channel catch-up (default 0 = unlimited).
 
 **Disk:** ~65 GiB free on `/home` (2026-05-30); large channel merges still need headroom.
 
@@ -58,7 +58,13 @@ Salvage then incremental scrape:
 ./scripts/run-operator-proof.sh --salvage-before-scrape --sync-gui --target NAME
 ```
 
-**KotOR yes_general** (`221726893064454144`): first catch-up after a 2021 archive cursor can take hours and may OOM; salvage preserved partials before retrying. Stop duplicate validation processes (MyBook vs Downloads checkouts share the same lock).
+**KotOR yes_general** (`221726893064454144`): first catch-up after a 2021 archive cursor can take hours and may OOM; salvage preserved partials before retrying. Stop duplicate validation processes (MyBook vs Downloads checkouts share the same lock). For large catch-up, set `DCE_CONTAINER_MEMORY=8g` in `scrape.env` (or export before the run), then:
+
+```bash
+./scripts/run-operator-validation.sh --salvage-before-scrape \
+  --target KotOR_discord_msgs --channel 221726893064454144 \
+  --log-file logs/kotor-yes-general.log
+```
 
 ## GUI zip only
 
@@ -18,3 +18,7 @@ DCE_GID=1000
 # For rootless podman, set this to keep-id so mounted archive roots stay writable.
 # Leave it empty on Docker unless you explicitly need a user namespace mode there.
 DCE_USERNS_MODE=
+
+# Optional: raise scrape container memory for multi-year channel catch-up (yes_general, etc.).
+# Examples: 8g, 8192m. Default 0 = no compose memory cap.
+# DCE_CONTAINER_MEMORY=8g
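Note on accepted values: the example file shows `8g` and `8192m`, with `0` as the uncapped default. A hedged sketch of a pre-flight check an operator could run before a long catch-up. The function `is_valid_mem_limit` is hypothetical and not part of this commit, and its pattern is deliberately conservative: it accepts only `0` or a positive integer with an optional `b`/`k`/`m`/`g` suffix, while the container runtime may accept other spellings.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not in the commit). Accepts 0, or a positive
# integer with an optional lowercase b/k/m/g suffix, e.g. 8g or 8192m.
is_valid_mem_limit() {
  [[ "$1" =~ ^(0|[1-9][0-9]*[bkmg]?)$ ]]
}

# Report each sample value as accepted or rejected by the conservative pattern.
for value in 0 8g 8192m 8gigs ""; do
  if is_valid_mem_limit "$value"; then
    printf 'ok:      %q\n' "$value"
  else
    printf 'invalid: %q\n' "$value"
  fi
done
```

Rejecting a malformed value on the host is cheaper than discovering it when compose refuses to start the container partway through an operator run.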
@@ -42,6 +42,7 @@ Environment:
   DCE_REAUTH_COMMAND    Optional absolute path to an executable reauth script under the repo root.
   DCE_COMPOSE_TTY       When zero, compose run passes -T (no pseudo-TTY). Default omits -T
                         so compose backends allocate a TTY for line-buffered progress logs.
+  DCE_CONTAINER_MEMORY  Optional container memory cap (e.g. 8g, 8192m). Default 0 = unlimited.
 
 Notes:
   When $ENV_FILE is missing, exported DISCORD_TOKEN or DISCORD_TOKEN_FILE is used instead.
@@ -183,6 +184,11 @@ write_compose_env_temp() {
   if [[ -n "${DCE_GID:-}" ]]; then
     printf 'DCE_GID=%s\n' "$DCE_GID" >>"$COMPOSE_ENV_TEMP"
   fi
+  if [[ -n "${DCE_CONTAINER_MEMORY:-}" ]]; then
+    printf 'DCE_CONTAINER_MEMORY=%s\n' "$DCE_CONTAINER_MEMORY" >>"$COMPOSE_ENV_TEMP"
+  else
+    printf 'DCE_CONTAINER_MEMORY=0\n' >>"$COMPOSE_ENV_TEMP"
+  fi
 }
 
 configure_rootless_compose() {
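Note on the new branch: unlike `DCE_GID`, which is written only when set, `DCE_CONTAINER_MEMORY` is always written, with an explicit `0` as the fallback. A minimal standalone sketch of the same behavior, assuming a single `:-` expansion is an acceptable substitute for the if/else; `write_mem_line` and the temp file are hypothetical and not the real `write_compose_env_temp`.

```bash
#!/usr/bin/env bash
# Hypothetical standalone version of the new branch (not the real function):
# appends the operator's value when set and non-empty, otherwise an explicit 0.
write_mem_line() {
  local out_file=$1
  printf 'DCE_CONTAINER_MEMORY=%s\n' "${DCE_CONTAINER_MEMORY:-0}" >>"$out_file"
}

env_temp=$(mktemp)
trap 'rm -f "$env_temp"' EXIT

unset DCE_CONTAINER_MEMORY
write_mem_line "$env_temp"      # appends DCE_CONTAINER_MEMORY=0

DCE_CONTAINER_MEMORY=8g
write_mem_line "$env_temp"      # appends DCE_CONTAINER_MEMORY=8g

cat "$env_temp"
```

Always writing the key means the compose env file carries the same set of keys in every run, which keeps the smoke assertion on `env:DCE_CONTAINER_MEMORY=` straightforward.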
@@ -195,7 +195,28 @@ COMPOSE_TTY_LOG="$TMP_DIR/compose-tty-default.log"
 FAKE_COMPOSE="$TMP_DIR/fake-compose"
 cat >"$FAKE_COMPOSE" <<'EOF'
 #!/usr/bin/env bash
-printf '%s\n' "$*" >>"${FAKE_COMPOSE_ARGS_LOG:?}"
+all_args=( "$@" )
+while (($#)); do
+  case "$1" in
+    --env-file)
+      if [[ $# -ge 2 && -f "$2" ]]; then
+        while IFS='=' read -r env_key env_value || [[ -n "$env_key" ]]; do
+          [[ -z "$env_key" || "$env_key" =~ ^# ]] && continue
+          env_key=${env_key#export }
+          env_key=${env_key%%[[:space:]]*}
+          printf -v "$env_key" '%s' "$env_value"
+          export "$env_key"
+        done <"$2"
+      fi
+      shift 2
+      ;;
+    *)
+      shift
+      ;;
+  esac
+done
+printf 'env:DCE_CONTAINER_MEMORY=%s\n' "${DCE_CONTAINER_MEMORY:-}" >>"${FAKE_COMPOSE_ARGS_LOG:?}"
+printf '%s\n' "${all_args[*]}" >>"${FAKE_COMPOSE_ARGS_LOG:?}"
 printf 'run succeeded\n'
 EOF
 chmod +x "$FAKE_COMPOSE"
@@ -220,4 +241,17 @@ grep -qE '(^|[[:space:]])-T([[:space:]]|$)' "$COMPOSE_NOTTY_LOG" || {
   exit 1
 }
 
+MEM_ENV="$TMP_DIR/mem.env"
+cat >"$MEM_ENV" <<EOF
+DISCORD_TOKEN=dummy
+DCE_CONTAINER_MEMORY=8g
+EOF
+COMPOSE_MEM_LOG="$TMP_DIR/compose-mem.log"
+run_host_compose_capture "$MEM_ENV" "$FAKE_COMPOSE" "$COMPOSE_MEM_LOG" >/dev/null
+grep -q 'env:DCE_CONTAINER_MEMORY=8g' "$COMPOSE_MEM_LOG" || {
+  echo "expected DCE_CONTAINER_MEMORY=8g in compose env file passthrough" >&2
+  cat "$COMPOSE_MEM_LOG" >&2
+  exit 1
+}
+
 echo "run-discord-scrape-host smoke test passed"
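Note on the smoke assertion: the fake compose reads the file passed with `--env-file`, exports each `KEY=value` pair, and logs the resulting `DCE_CONTAINER_MEMORY`. A self-contained sketch of that parsing loop, lifted into a hypothetical function `load_env_file` with a throwaway fixture; neither name exists in the commit.

```bash
#!/usr/bin/env bash
# Hypothetical standalone copy of the fake-compose parsing loop.
load_env_file() {
  local env_file=$1 env_key env_value
  while IFS='=' read -r env_key env_value || [[ -n "$env_key" ]]; do
    # Skip blank lines and comment lines.
    [[ -z "$env_key" || "$env_key" =~ ^# ]] && continue
    env_key=${env_key#export }          # tolerate "export KEY=value"
    env_key=${env_key%%[[:space:]]*}    # drop anything after the key name
    printf -v "$env_key" '%s' "$env_value"
    export "$env_key"
  done <"$env_file"
}

# Throwaway fixture shaped like the smoke test's mem.env.
fixture=$(mktemp)
printf '%s\n' '# fixture' 'DISCORD_TOKEN=dummy' 'DCE_CONTAINER_MEMORY=8g' >"$fixture"
load_env_file "$fixture"
rm -f "$fixture"

# prints env:DCE_CONTAINER_MEMORY=8g
printf 'env:DCE_CONTAINER_MEMORY=%s\n' "${DCE_CONTAINER_MEMORY:-}"
```

The `|| [[ -n "$env_key" ]]` clause keeps the final line from being dropped when the env file lacks a trailing newline.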