/api/update/check offered the update whenever client !== latest (raw string
inequality, not a semver comparison) and applied no backoff. A device that cannot
APPLY the update (broken OTA client 1.7.12, signing/Fire OS) keeps reporting the same
version and is told update_available=true on every poll; a fast poll loop then
saturates the event loop (prod loop-lag 49s). All requests arrive from one NAT IP, so
keying on IP is useless.
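Illustration of the failure mode (a minimal sketch, not the comparator in
lib/ota-breaker.js). parseVersion and compareVersions are hypothetical helpers, the
version strings are made up for the example, and prerelease ordering is simplified to
the trailing number of the tag rather than full semver precedence:

```javascript
// Hypothetical helpers for illustration only.
// Parses "1.9.1-beta4" into { core: [1, 9, 1], pre: "beta4" }; null if malformed.
function parseVersion(v) {
  const m = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.]+))?$/.exec(String(v));
  if (!m) return null;
  return { core: [Number(m[1]), Number(m[2]), Number(m[3])], pre: m[4] || null };
}

// Returns -1, 0, or 1. Simplified: a prerelease sorts before its own release,
// and two prereleases of the same core are ordered by their trailing number.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a.core[i] !== b.core[i]) return a.core[i] < b.core[i] ? -1 : 1;
  }
  if (a.pre === b.pre) return 0;
  if (a.pre === null) return 1;
  if (b.pre === null) return -1;
  const num = (p) => Number((/(\d+)$/.exec(p) || [0, 0])[1]);
  return Math.sign(num(a.pre) - num(b.pre));
}

// Made-up values: a client already on the final release while "latest" names a beta.
const client = "2.0.0";
const latest = "2.0.0-beta4";
// The old check offers the update because the strings differ.
const offeredByStringCheck = client !== latest;
// An ordered check only offers when the client is strictly older.
const offeredByOrderedCheck =
  compareVersions(parseVersion(client), parseVersion(latest)) < 0;
```

The string check offers here even though the client is ahead of latest; the ordered
check does not. An unparseable version returns null, which a caller can treat as
"unrecognized".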
Fix: a server-only breaker (lib/ota-breaker.js) with two independent axes plus a
no-downgrade rule:
- RATE breaker (primary, immediate): a key checking >THRESHOLD (3) times within
  WINDOW (60s) is looping -> throttle update_available with exponential backoff
  (30s->2m->8m->cap 30m). Healthy devices poll every ~12 min and never approach this,
  so rollout/stragglers are inherently safe -- NO grace-for-flood timer; slow == safe.
- PHANTOM guard (immediate): unrecognized version, or a prerelease of an OLDER core
  (superseded old-minor beta e.g. 1.9.1-beta4), gets no-offer on the first check. A
  RECENT real older version (beta3 vs latest beta4; stable 1.7.12) stays offerable.
- Never offers a downgrade (client >= latest -> no offer).
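The RATE axis above can be sketched as follows. This is a minimal illustration under
stated assumptions, not the contents of lib/ota-breaker.js: createBreaker, backoffMs,
the bucket shape, the reason strings, and the exact escalation policy are hypothetical.
THRESHOLD, WINDOW, and the 30s->2m->8m->30m ladder come from the description above
(the x4 multiplier is inferred from that ladder). The clock is injected so the
behaviour can be exercised without real time.

```javascript
const THRESHOLD = 3;                   // more than this many checks per window trips
const WINDOW_MS = 60 * 1000;           // 60s sliding window
const BACKOFF_BASE_MS = 30 * 1000;     // first throttle: 30s
const BACKOFF_FACTOR = 4;              // assumption inferred from 30s -> 2m -> 8m
const BACKOFF_CAP_MS = 30 * 60 * 1000; // cap: 30m

// Delay for the Nth consecutive trip (level 0 is the first trip).
function backoffMs(level) {
  return Math.min(BACKOFF_BASE_MS * BACKOFF_FACTOR ** level, BACKOFF_CAP_MS);
}

// Hypothetical factory; `now` is injectable for tests.
function createBreaker(now = Date.now) {
  const buckets = new Map(); // key -> { hits, level, blockedUntil, lastSeen }

  function check(key) {
    const t = now();
    let b = buckets.get(key);
    if (!b) {
      b = { hits: [], level: 0, blockedUntil: 0, lastSeen: t };
      buckets.set(key, b);
    }
    b.lastSeen = t;
    // Keep only the checks inside the sliding window, then record this one.
    b.hits = b.hits.filter((h) => t - h < WINDOW_MS);
    b.hits.push(t);

    if (t < b.blockedUntil) {
      const wait = Math.ceil((b.blockedUntil - t) / 1000);
      return { offer: false, reason: "throttled", retry_after_seconds: wait };
    }
    if (b.hits.length > THRESHOLD) {
      // Looping key: start (or escalate) the backoff.
      const delay = backoffMs(b.level);
      b.level += 1;
      b.blockedUntil = t + delay;
      b.hits = [];
      return { offer: false, reason: "rate_limited", retry_after_seconds: delay / 1000 };
    }
    return { offer: true };
  }

  return { check, buckets };
}
```

In this sketch a key polling every 12 minutes only ever has one hit in the window, so
it is never throttled, which matches the "slow == safe" property described above.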
KEYING (#144 option 3): keyed on device_id when present, else reported version.
- server.js:581 accepts + logs ?device_id=, passes it to the breaker.
- UpdateChecker.kt:122 appends &device_id=<config.deviceId> (existing registered id;
  omitted until provisioned). One-line client change.
beta4+ clients get precise per-device throttling; stuck legacy clients that send only
?version= are caught by the version-keyed + rate + phantom logic. The response gains
two additive fields, `reason` and `retry_after_seconds`, which old clients ignore.
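A sketch of the key selection and the additive response fields described above.
breakerKey, buildResponse, the "device:"/"version:" prefixes, and the decision object
are hypothetical; only device_id, version, update_available, reason, and
retry_after_seconds are names taken from this change. The real response presumably
carries other fields that are omitted here.

```javascript
// Prefer the per-device key; fall back to the reported version when no
// (non-blank) device_id was sent, as legacy clients do.
function breakerKey(query) {
  const id = typeof query.device_id === "string" ? query.device_id.trim() : "";
  return id ? `device:${id}` : `version:${String(query.version)}`;
}

// Adds the two new fields only when the offer is withheld, so a response that
// offers the update looks the same as before to old clients.
function buildResponse(decision) {
  const body = { update_available: decision.offer };
  if (!decision.offer) {
    body.reason = decision.reason;
    body.retry_after_seconds = decision.retry_after_seconds;
  }
  return body;
}
```

Prefixing the key keeps a device_id that happens to look like a version string from
colliding with a version-keyed bucket.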
BOUNDED STATE: a periodic sweep (startSweep, wired in server.js) evicts buckets that
have been idle for longer than IDLE_RESET_MS, so the keyed Map cannot grow without
bound as device_ids churn. Buckets are not merely reset on access.
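The sweep can be sketched like this. startSweep is the name used above, but its real
signature is not shown in this message, so the parameters here (idleResetMs,
intervalMs, now), the sweepIdle helper, and the lastSeen bucket field are assumptions
for illustration:

```javascript
// Deletes every bucket whose last activity is older than idleResetMs.
// Returns the number of evicted keys. Deleting during Map iteration is safe in JS.
function sweepIdle(buckets, nowMs, idleResetMs) {
  let evicted = 0;
  for (const [key, bucket] of buckets) {
    if (nowMs - bucket.lastSeen > idleResetMs) {
      buckets.delete(key);
      evicted += 1;
    }
  }
  return evicted;
}

// Runs sweepIdle on a timer and returns a function that stops it.
function startSweep(buckets, { idleResetMs, intervalMs, now = Date.now }) {
  const timer = setInterval(() => sweepIdle(buckets, now(), idleResetMs), intervalMs);
  // In Node, unref() keeps the sweep timer from holding the process open.
  if (typeof timer.unref === "function") timer.unref();
  return () => clearInterval(timer);
}
```

A timer-driven sweep matters because a device_id that polls once and never returns
would otherwise never touch its bucket again, so an on-access reset alone would leave
it in the Map indefinitely.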
SCOPE (deliberate): this targets the FAST flood + phantoms. The slow #144 drip
(stable 1.7.12 polling ~every 12 min, ~20/hr) stays under the >3-per-60s threshold and
is NOT throttled -- catching it needs #144 option-3 "skip-this-version after N cycles",
which is intentionally NOT in this build.
NOTE: carries a CLIENT/APK change -> versionCode must increment at the beta4 bump and
the release keystore is required for the APK. The device_id path only helps devices
that can install beta4+; the stuck legacy fleet is covered by the version-keyed path.
Tests: unit (lib/ota-breaker, injected time) a-f + comparator + escalation + sweep +
slow-drip-scope; HTTP integration (real endpoint, device_id passthrough). Full suite
green serial AND parallel (234). OTA-only delta -- reconnect/reclaim/shed/content-ack/
block untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>