screentinker/server/lib
ScreenTinker dbac699854 fix(#143): content-ack flood control — per-device rate budget + loop-lag valve
#142's content-ack dedup is insufficient: a device cycling 2-4 content IDs makes
every ack look unique so dedup never fires, while aggregate volume from ~30 devices
saturates the event loop (the #142 reconnect throttle kept the server responsive,
which is how this was even observable).

Folded ONE control on the content-ack path (no competing limiters; reconnect-
throttle.js untouched) in lib/content-ack-limiter.js:
- Step 1 — per-device RATE budget: caps TOTAL non-duplicate acks per device per
  window regardless of differing content_id (the case dedup misses). Over budget =
  DROP silently (the per-ack log+emit is the cost); log ONCE per device per window
  when shedding starts. Keeps the #142 dedup (dedup'd repeats don't consume budget).
  Per-device, in-memory, resets on restart (modeled on lastPlayLogAt; does NOT reuse
  reconnect-throttle's ban-semantics bucket).
  Env (TUNING GUESSES, validate vs Bold's fleet): CONTENT_ACK_MAX_PER_WINDOW=20,
  CONTENT_ACK_RATE_WINDOW_MS=10000 (=2/s, above legit ~<=1/s, below the flood).
- Step 2 — global pressure valve: reuses the #142 loop-lag band (+ its hysteresis,
  no second control loop). Under CRITICAL band, shed content-acks even for an
  in-budget device; reconnects + dashboard/HTTP are ALWAYS processed; a healthy
  device in a non-critical band is never touched by the valve. Valve open/close
  logged once at the band edge in services/loop-lag.js (not per shed message).

Tests (unique ports 3985/3986, not the 3982/3983/3984 set):
- unit: the #143 regression (cycling ids evading dedup IS rate-limited), under/over
  budget, dedup still works + doesn't consume budget, valve sheds in-budget under
  critical while normal is untouched, rate precedence, window reset, per-device
  isolation.
- integration: socket flood is capped to budget with a single shed-start log;
  under-budget passes every ack; valve OPEN sheds content-acks while a reconnect +
  /api/status still succeed.
Full suite green serial AND parallel (208 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 22:21:57 -05:00
..
agency-layouts.js feat: agency zone-grant issuance UI + reactive placement card (#73) 2026-06-14 15:12:55 -05:00
agency-targets.js feat: full-screen-only guardrail for agency designations (#73) 2026-06-14 17:36:30 -05:00
branding.js fix(security): patch quick-win findings from the codebase review 2026-06-08 19:02:19 -05:00
command-queue.js feat(socket): delivery queue for offline-device emits 2026-05-14 13:06:43 -05:00
content-ack-limiter.js fix(#143): content-ack flood control — per-device rate budget + loop-lag valve 2026-06-27 22:21:57 -05:00
content-ingest.js refactor(content): extract the upload ingest into a shared lib (#73) 2026-06-13 22:48:42 -05:00
device-sanitize.js fix(security): patch quick-win findings from the codebase review 2026-06-08 19:02:19 -05:00
image-gen.js feat(ai): generate background + foreground images for signs (#41 Phase 2) 2026-06-09 13:40:14 -05:00
pair-lockout.js fix(api): harden device pairing against brute-force (#87) 2026-06-12 20:16:12 -05:00
permissions.js feat(roles): add cross-org platform_operator staff role (#13) 2026-06-05 10:30:21 -05:00
reconnect-throttle.js fix(#142): load-aware per-device reconnect throttle (the outage fix) 2026-06-27 19:18:00 -05:00
schedule-eval.js feat(scheduling): per-item schedule blocks (#74 dayparting, #75 auto-expire) 2026-06-11 15:46:41 -05:00
schema-check.js fix(db): observable migrations + fail-fast schema verification (#37) 2026-06-09 09:31:52 -05:00
secretbox.js feat(ai): AI content design in the Designer, BYO endpoint (#41 Phase 1) 2026-06-09 12:23:55 -05:00
socket-rooms.js feat(socket): Phase 2.3 workspace-scoped dashboard socket rooms + per-command permission gates. Dashboard namespace was previously a flat broadcast - every connected dashboard received every device's status/screenshot/playback events platform-wide (foreign device names + IPs included). Inbound socket commands gated by a legacy admin/superadmin role check that was dead code post-Phase-1 rename. 2026-05-12 11:34:24 -05:00
tenancy.js feat(roles): add cross-org platform_operator staff role (#13) 2026-06-05 10:30:21 -05:00
tenant-cascade-migration.js fix(db): cascade tenant resources on workspace/org delete (#18 follow-up) 2026-06-08 16:01:52 -05:00
totp-lockout.js feat(server): TOTP primitives - encrypted secret, hashed recovery codes, verify lockout (#100) 2026-06-13 20:48:55 -05:00
totp.js feat(server): TOTP primitives - encrypted secret, hashed recovery codes, verify lockout (#100) 2026-06-13 20:48:55 -05:00
user-deletion.js feat(admin): Delete Organization + Workspace with cascade (#36) 2026-06-09 09:22:21 -05:00
zone-validate.js fix: per-item mute round-trip + multi-zone orphan-zone fallback & warnings 2026-06-22 23:16:29 -05:00