DiscordChatExporter/docs/plans/2026-06-04-046-fix-scrape-run-lock-plan.md
Copilot b9bb4bbe64 fix(host): flock scrape lock prevents concurrent container exports
Overlapping run-operator-validation invocations spawned twin yes_general
exports and repeated OOM skips. Host scrape now holds .dce-scrape.lock;
smokes bypass via DCE_SKIP_SCRAPE_LOCK. Added lock smoke (20/20 pass).
2026-06-03 06:03:47 -05:00


title: "fix: Scrape run lock prevents concurrent container exports"
type: fix
status: complete
date: 2026-06-04
origin: "/lfg - duplicate KotOR validation runs left two yes_general exports OOM-looping"

fix: Scrape run lock prevents concurrent container exports

Problem

Two overlapping run-operator-validation.sh --target KotOR_discord_msgs processes each started a full container scrape. Both exported yes_general (221726893064454144) with the same --after cursor, creating twin .dce-temp/export.* dirs (~2934 MiB each) and repeated OOM skips.

Cron runs are serialized with flock, but manual and host validation runs are not, so overlapping invocations are unguarded.

Requirements

ID Requirement
R1 run-discord-scrape-host.sh scrape acquires non-blocking flock on $REPO_ROOT/.dce-scrape.lock
R2 DCE_SKIP_SCRAPE_LOCK=1 bypasses lock (smoke tests)
R3 Clear error when lock held; preflight unaffected
R4 Offline smoke asserts second scrape fails while lock held
R5 run-all-smokes.sh passes (19/19); docs note concurrent-run hazard
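
R1 through R3 could be sketched roughly as below. This is an illustration only, not the code in run-discord-scrape-host.sh: the helper name acquire_scrape_lock, the choice of fd 9, and the error wording are all assumptions. It relies on the util-linux flock utility being on PATH.

```shell
#!/usr/bin/env bash
# Sketch only: acquire_scrape_lock is a hypothetical helper, not the
# actual implementation in run-discord-scrape-host.sh.

acquire_scrape_lock() {
  local repo_root="$1"
  local lock_file="$repo_root/.dce-scrape.lock"

  # R2: smoke tests opt out of locking entirely.
  if [ "${DCE_SKIP_SCRAPE_LOCK:-0}" = "1" ]; then
    return 0
  fi

  # Open the lock file on fd 9 (an arbitrary choice). The descriptor stays
  # open for the life of the shell, so the lock is dropped on exit.
  exec 9>"$lock_file" || return 1

  # R1/R3: -n makes flock fail immediately instead of waiting for the holder.
  if ! flock -n 9; then
    echo "error: another scrape holds $lock_file; refusing to start" >&2
    return 1
  fi
}
```

Under this sketch, only the scrape subcommand would call `acquire_scrape_lock "$REPO_ROOT" || exit 1`, which leaves preflight unaffected as R3 asks. Because the lock is tied to an open descriptor rather than a PID file, a crashed scrape cannot leave a stale lock behind.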

Verification

./scripts/tests/run-discord-scrape-host-lock-smoke.sh
DCE_MIN_FREE_MB=0 ./scripts/run-all-smokes.sh
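
For orientation, the kind of assertion R4 describes might look like the following offline sketch. It does not reproduce run-discord-scrape-host-lock-smoke.sh: fake_scrape is a hypothetical stand-in for the real scrape entry point, and the temp directory replaces $REPO_ROOT so nothing touches a real checkout.

```shell
#!/usr/bin/env bash
# Sketch of an offline lock smoke. fake_scrape stands in for
# `run-discord-scrape-host.sh scrape` and is purely illustrative.
set -u

work_dir="$(mktemp -d)"
lock_file="$work_dir/.dce-scrape.lock"
trap 'rm -rf "$work_dir"' EXIT

fake_scrape() {
  # Run in a subshell so fd 9 (and its lock) is dropped when it returns.
  (
    [ "${DCE_SKIP_SCRAPE_LOCK:-0}" = "1" ] && exit 0
    exec 9>"$lock_file"
    flock -n 9 || exit 1
  )
}

# Hold the lock from this shell on a separate descriptor, as a first run would.
exec 8>"$lock_file"
flock -n 8 || { echo "FAIL: could not take initial lock" >&2; exit 1; }

failures=0
if fake_scrape; then
  echo "FAIL: second scrape started while lock held" >&2
  failures=$((failures + 1))
fi
if ! DCE_SKIP_SCRAPE_LOCK=1 fake_scrape; then
  echo "FAIL: DCE_SKIP_SCRAPE_LOCK=1 did not bypass the lock" >&2
  failures=$((failures + 1))
fi

# Release the lock; a fresh scrape should now be allowed to start.
flock -u 8
if ! fake_scrape; then
  echo "FAIL: scrape blocked after lock release" >&2
  failures=$((failures + 1))
fi

echo "lock smoke: $failures failure(s)"
```

Holding the lock in the test shell itself, rather than in a background process, avoids a timing race between the holder starting and the second scrape being attempted.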

Out of scope

  • Completing yes_general multi-hour catch-up inside LFG
  • Container memory limits / tuning