/api/update/check offered the update whenever client !== latest (raw string
inequality, not semver) with no backoff. A device that can't APPLY the update
(broken OTA client 1.7.12, signing/Fire OS) keeps reporting the same version and is
told update_available=true on every poll; a fast poll loop saturates the event loop
(prod loop-lag 49s). All requests share one NAT IP, so IP-keying is useless.
Fix: a server-only breaker (lib/ota-breaker.js) with two independent axes, plus a
no-downgrade rule:
- RATE breaker (primary, immediate): a key checking >THRESHOLD (3) times within
WINDOW (60s) is looping -> throttle update_available with exponential backoff
(30s->2m->8m->cap 30m). Healthy devices poll ~12 min and never approach this, so
rollout/stragglers are inherently safe -- NO grace-for-flood timer; slow == safe.
- PHANTOM guard (immediate): unrecognized version, or a prerelease of an OLDER core
(superseded old-minor beta e.g. 1.9.1-beta4), gets no-offer on the first check. A
RECENT real older version (beta3 vs latest beta4; stable 1.7.12) stays offerable.
- Never offers a downgrade (client >= latest -> no offer).
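The RATE breaker above can be sketched roughly as follows. This is an illustration only, not the contents of lib/ota-breaker.js: createRateBreaker, the bucket shape, and the x4 growth factor (inferred from 30s->2m->8m) are assumptions, and idle reset is left out.

```javascript
// Hypothetical sketch of the RATE axis; names and structure are assumptions.
const THRESHOLD = 3;                 // more than 3 checks ...
const WINDOW_MS = 60_000;            // ... within 60s counts as looping
const BACKOFF_BASE_MS = 30_000;      // 30s -> 2m -> 8m -> cap 30m
const BACKOFF_FACTOR = 4;            // assumed from the 30s/2m/8m sequence
const BACKOFF_CAP_MS = 30 * 60_000;

function createRateBreaker() {
  const buckets = new Map(); // key -> { hits: number[], level, blockedUntil }

  // `now` is injected (ms) so the logic can be tested without real timers.
  return function check(key, now) {
    let b = buckets.get(key);
    if (!b) {
      b = { hits: [], level: 0, blockedUntil: 0 };
      buckets.set(key, b);
    }
    // Keep only the checks that fall inside the sliding window.
    b.hits = b.hits.filter((t) => now - t < WINDOW_MS);
    b.hits.push(now);

    if (now < b.blockedUntil) {
      const retryAfterSeconds = Math.ceil((b.blockedUntil - now) / 1000);
      return { offer: false, reason: 'rate', retryAfterSeconds };
    }
    if (b.hits.length > THRESHOLD) {
      const backoff = Math.min(BACKOFF_BASE_MS * BACKOFF_FACTOR ** b.level, BACKOFF_CAP_MS);
      b.level += 1;
      b.blockedUntil = now + backoff;
      return { offer: false, reason: 'rate', retryAfterSeconds: backoff / 1000 };
    }
    return { offer: true };
  };
}
```

A device polling every ~12 min only ever has one hit in the window, so it never reaches the threshold; only a tight loop escalates.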
KEYING (#144 option 3): keyed on device_id when present, else reported version.
- server.js:581 accepts + logs ?device_id=, passes it to the breaker.
- UpdateChecker.kt:122 appends &device_id=<config.deviceId> (existing registered id;
omitted until provisioned). One-line client change.
beta4+ clients get precise per-device throttling; stuck legacy clients sending only
?version= are caught by the version-keyed + rate + phantom logic. Response gains
additive `reason` + `retry_after_seconds` (old clients ignore).
BOUNDED STATE: a periodic sweep (startSweep, wired in server.js) evicts buckets idle
> IDLE_RESET_MS so the keyed Map can't grow unbounded (churned device_ids); not
reset-on-access only.
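A minimal sketch of such a sweep, assuming each bucket records a lastSeen timestamp (a hypothetical field; the real bucket shape is not shown here). The startSweep signature below is likewise illustrative and is not the one wired into server.js.

```javascript
// Evict buckets that have been idle longer than idleResetMs.
// Deleting entries while iterating a Map is safe in JavaScript.
function sweepIdle(buckets, now, idleResetMs) {
  let evicted = 0;
  for (const [key, bucket] of buckets) {
    if (now - bucket.lastSeen > idleResetMs) {
      buckets.delete(key);
      evicted += 1;
    }
  }
  return evicted;
}

// Hypothetical wiring: run the sweep on an interval without keeping the process alive.
function startSweep(buckets, idleResetMs, intervalMs) {
  const timer = setInterval(() => sweepIdle(buckets, Date.now(), idleResetMs), intervalMs);
  if (typeof timer.unref === 'function') timer.unref();
  return timer;
}
```

Because eviction runs on a timer, a device_id that checks in once and never returns is still removed, which reset-on-access alone would not do.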
SCOPE (deliberate): this targets the FAST flood + phantoms. The slow #144 drip
(stable 1.7.12 polling ~every 12 min, ~20/hr) stays below >3/60s and is NOT
throttled -- catching it needs #144 option-3 "skip-this-version after N cycles",
which is intentionally NOT in this build.
NOTE: carries a CLIENT/APK change -> versionCode must increment at the beta4 bump and
the release keystore is required for the APK. The device_id path only helps devices
that can install beta4+; the stuck legacy fleet is covered by the version-keyed path.
Tests: unit (lib/ota-breaker, injected time) a-f + comparator + escalation + sweep +
slow-drip-scope; HTTP integration (real endpoint, device_id passthrough). Full suite
green serial AND parallel (234). OTA-only delta -- reconnect/reclaim/shed/content-ack/
block untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bold: screens sit on the Connect page showing the server URL. They are paired
server-side but never told, so the app never starts playing.
Flow / gap (Step A):
- CLIENT leaves the Connect page ONLY on the 'device:paired' event — web player
(player/index.html) hides the setup screen; Android ProvisioningActivity.onPaired
launches MainActivity + finish(). That event is the sole signal.
- SERVER pushes 'device:paired' to the device's room from POST /api/provision/pair
(server.js) at pair time — but ONLY reaches a LIVE socket then. The normal
device_id reconnect path emitted device:registered + device:playlist-update but
NOT device:paired. So a screen paired while disconnected, or that reconnects after
pairing (exactly the screens cycling on the Connect page), is paired server-side
(user_id set, receiving playlists) yet never gets device:paired -> stuck on Connect.
Fix (server-only, uses the EXISTING client listener — no client update needed, which
matters because we can't push a client update to stuck screens): on the device_id
reconnect, if the device is paired (user_id set), re-emit 'device:paired'
{device_id, name}. Push-on-pair (server.js) already covers the live-at-pair-time
case; this covers paired-then-reconnect. A paired screen now leaves Connect and
plays on its next reconnect with no client change and no manual re-pair.
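In outline, the reconnect path now behaves like the sketch below. The function name and the device:registered / device:playlist-update payloads are placeholders; only the device:paired payload {device_id, name} is taken from the description above.

```javascript
// Hypothetical outline of the device_id reconnect path after the fix.
function emitReconnectEvents(socket, device, playlist) {
  socket.emit('device:registered', { device_id: device.id }); // payload shape assumed
  socket.emit('device:playlist-update', playlist);            // payload shape assumed
  // New: a paired device (user_id set) is told again on every reconnect,
  // so a screen that missed the pair-time push can leave the Connect page.
  if (device.user_id) {
    socket.emit('device:paired', { device_id: device.id, name: device.name });
  }
}
```

An unpaired device takes the same path but skips the last emit, so it stays on the pairing flow.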
Tests (port 3989, real flow): provision -> pair via /api/provision/pair (socket
closed) -> reconnect RECEIVES device:paired (+name +playlist) — the stuck-screen
repro; an unpaired device gets NO device:paired (stays on the pairing flow); the fix
reuses the existing device:paired event (no new protocol). Full suite green serial
AND parallel (220); dbac699 / 404c330 / e734281 intact.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bold beta1: three devices spam "Fingerprint reclaim rejected ... device active
(status=offline, ~2500s since heartbeat, liveConn=false)" twice/~2s indefinitely —
contradictory: gone by every signal yet treated as active.
Root cause (NOT a missing clear — corrected the hypothesis). The reject condition
was `liveConn || status==='online' || secondsSince < RECLAIM_GRACE_SECONDS(24h)`.
For the observed devices liveConn=false and status=offline, so the ONLY true term
is `secondsSince < 24h` — an effective 24h CALENDAR grace, not a stale flag. Audited
the clears: liveConn (deviceConnections) is removed on the debounced disconnect
(heartbeat.removeConnection) AND the offline_timeout sweep (deviceConnections.delete);
status is set 'offline' on both. liveConn=false + status=offline PROVE the clears
ran — there is nothing stale to clear. The 24h time gate (mislabeled "device active")
blocked a legitimately-gone device from reclaiming for up to 24h, so it retried
every ~2s forever-in-practice. The "twice per ~2s" is two reclaim ATTEMPTS per cycle
(client reconnect + re-pair-on-auth-error), each hitting the single console.warn —
not double-logging in one attempt.
Fix:
- Decide "still alive" from RUNTIME signals: `!!liveConn || secondsSince <
reclaimSettleSeconds`. A device with no live socket and a heartbeat older than the
settle window is gone -> reclaimable. A live (or just-seen) device is still
rejected, so reclaim-abuse protection holds. NOT just ignoring "active" — it fixes
WHY it was stuck (the 24h gate). RECLAIM_SETTLE_SECONDS default 300 (was 24h).
SECURITY TRADEOFF flagged in config: shortens the anti-fingerprint-theft window;
raise to re-tighten. Tuning guess to validate vs Bold.
- Log throttle: the deferral logs at most once per device per
RECLAIM_REJECT_LOG_WINDOW_MS (default 60s), which collapses the double-log and the
per-2s flood (same discipline as the content-ack shed log). Cleared when a reclaim
proceeds.
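The change in the reject condition can be seen side by side in a small sketch. The two expressions are the ones quoted above; the function names and the device object shape are made up for illustration.

```javascript
const RECLAIM_GRACE_SECONDS = 24 * 3600; // old 24h calendar gate
const RECLAIM_SETTLE_SECONDS = 300;      // new default settle window

// Old condition: any one term keeps the device "active" and rejects the reclaim.
function oldRejects({ liveConn, status, secondsSince }) {
  return !!liveConn || status === 'online' || secondsSince < RECLAIM_GRACE_SECONDS;
}

// New condition: only runtime signals count (live socket, or heartbeat within the settle window).
function newRejects({ liveConn, secondsSince }, reclaimSettleSeconds = RECLAIM_SETTLE_SECONDS) {
  return !!liveConn || secondsSince < reclaimSettleSeconds;
}
```

For one of the wedged devices (liveConn=false, status=offline, ~2500s since heartbeat) the old condition is true only because 2500 < 86400, while the new one is false because 2500 >= 300.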
Recovery of the 3 wedged devices (2febcaa9, 1984694c, 139159eb): they SELF-HEAL on
their next reclaim attempt (~2s) once this ships — their heartbeats are ~2500s stale
(>300s settle) and liveConn=false, so the reclaim now succeeds. No operator SQL needed.
Tests (port 3988): gone device reclaims; live device still rejected; clear-on-leave
(disconnect clears liveConn -> stale device reclaims); deferral log <=1 per window.
Full suite green serial+parallel (217). reconnect-throttle.js, the dbac699 content-ack
limiter, and the 404c330 block/auth code untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Highest-priority #143 item (operator finding from Bold): nulling a device's token
did NOT lock it out — device 75c2a08a immediately reconnected and saturated the
loop. Two distinct defects:
1. Auth short-circuit (the cause). device:register used
if (device.device_token && !validateDeviceToken(...)) { reject }
so a NULL/empty STORED token made the guard falsy -> validation SKIPPED, and the
next block even MINTED a fresh token and persisted it. Nulling a token thus
RE-PROVISIONED the device instead of locking it out. Fix: drop the
`device.device_token &&` guard -> `if (!validateDeviceToken(device_id, device_token))`
(validateDeviceToken already returns false for null-stored/missing/mismatch), and
remove the legacy "mint a token for a null-token device" path (the re-provision
vector). An already-provisioned device (every row, incl. 'provisioning', is created
WITH a token) presenting null/empty/invalid is now REJECTED + disconnected.
The first-pairing seam is unaffected: a brand-new device has NO device_id and goes
through the pairing_code branch (which mints id+token) — a different code path.
2. No server-side kill switch. Added a `blocked` column (devices.blocked INTEGER
NOT NULL DEFAULT 0; schema.sql + a database.js migration). The block is the FIRST
gate at the top of device:register — before the fingerprint block, the reconnect
throttle, any DB writes, or playlist build — so a blocked device's socket is
refused immediately (auth-error 'Device blocked' + disconnect, zero further work).
It does NOT rely on null-token (the thing that failed). The row is re-read every
register, so a DIRECT SQLite edit takes effect on the device's NEXT reconnect with
NO server restart. Operator statements (dashboard-down, hand-edit):
block: UPDATE devices SET blocked = 1 WHERE id = '<device_id>';
unblock: UPDATE devices SET blocked = 0 WHERE id = '<device_id>';
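To make the short-circuit concrete, here is a sketch of the old guard next to the new gate order. validateToken is a stand-in that takes the stored token directly (the real validateDeviceToken takes device_id and device_token), and registerDecision and its return values are illustrative only.

```javascript
// Stand-in validator: false for null-stored, missing, or mismatched tokens.
// A real implementation would likely use a constant-time comparison.
function validateToken(storedToken, presentedToken) {
  return Boolean(storedToken) && Boolean(presentedToken) && storedToken === presentedToken;
}

// Old guard: a null stored token makes the left operand falsy, so validation is skipped.
function oldGuardRejects(device, presentedToken) {
  return Boolean(device.device_token && !validateToken(device.device_token, presentedToken));
}

// New order: the blocked flag is the first gate, then unconditional token validation.
function registerDecision(device, presentedToken) {
  if (device.blocked) return 'blocked';
  if (!validateToken(device.device_token, presentedToken)) return 'auth-error';
  return 'ok';
}
```

With a nulled stored token the old guard returns false (no rejection), which is the 75c2a08a behaviour; the new gate returns 'auth-error' for the same input.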
Tests (port 3987): nulled-token provisioned device is REJECTED (75c2a08a repro);
blocked=1 refused at the first gate (no register/playlist); unblock reconnects;
first-pairing still works; normal valid-token device unaffected. Full suite green
serial AND parallel (213); reconnect-throttle.js + the dbac699 content-ack limiter
untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#142's content-ack dedup is insufficient: a device cycling 2-4 content IDs makes
every ack look unique so dedup never fires, while aggregate volume from ~30 devices
saturates the event loop (the #142 reconnect throttle kept the server responsive,
which is how this was even observable).
Folded ONE control on the content-ack path (no competing limiters;
reconnect-throttle.js untouched) in lib/content-ack-limiter.js:
- Step 1 — per-device RATE budget: caps TOTAL non-duplicate acks per device per
window regardless of differing content_id (the case dedup misses). Over budget =
DROP silently (the per-ack log+emit is the cost); log ONCE per device per window
when shedding starts. Keeps the #142 dedup (dedup'd repeats don't consume budget).
Per-device, in-memory, resets on restart (modeled on lastPlayLogAt; does NOT reuse
reconnect-throttle's ban-semantics bucket).
Env (TUNING GUESSES, validate vs Bold's fleet): CONTENT_ACK_MAX_PER_WINDOW=20,
CONTENT_ACK_RATE_WINDOW_MS=10000 (=2/s, above legit ~<=1/s, below the flood).
- Step 2 — global pressure valve: reuses the #142 loop-lag band (+ its hysteresis,
no second control loop). Under CRITICAL band, shed content-acks even for an
in-budget device; reconnects + dashboard/HTTP are ALWAYS processed; a healthy
device in a non-critical band is never touched by the valve. Valve open/close
logged once at the band edge in services/loop-lag.js (not per shed message).
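The two steps can be approximated with a fixed-window counter per device. This is a sketch under assumptions, not lib/content-ack-limiter.js: createAckLimiter, the fixed (rather than sliding) window, and passing the band in as a string are all illustrative choices, and dedup is assumed to have run before this point.

```javascript
// Hypothetical per-device budget plus pressure valve for content-acks.
function createAckLimiter({ maxPerWindow = 20, windowMs = 10_000 } = {}) {
  const state = new Map(); // device_id -> { windowStart, count, shedLogged }

  // Returns true when the ack should be processed, false when it is shed.
  return function allow(deviceId, now, band = 'normal', log = () => {}) {
    let s = state.get(deviceId);
    if (!s || now - s.windowStart >= windowMs) {
      s = { windowStart: now, count: 0, shedLogged: false };
      state.set(deviceId, s);
    }
    // Step 1: the budget counts every ack, whatever its content_id.
    if (s.count >= maxPerWindow) {
      if (!s.shedLogged) {
        s.shedLogged = true; // log once per device per window
        log(`content-ack shed started for ${deviceId}`);
      }
      return false;
    }
    // Step 2: under the critical band, shed even in-budget acks.
    if (band === 'critical') return false;
    s.count += 1;
    return true;
  };
}
```

Because content_id is not part of the key, a device cycling 2-4 ids hits the same budget as one repeating a single id.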
Tests (unique ports 3985/3986, not the 3982/3983/3984 set):
- unit: the #143 regression (cycling ids evading dedup IS rate-limited), under/over
budget, dedup still works + doesn't consume budget, valve sheds in-budget under
critical while normal is untouched, rate precedence, window reset, per-device
isolation.
- integration: socket flood is capped to budget with a single shed-start log;
under-budget passes every ack; valve OPEN sheds content-acks while a reconnect +
/api/status still succeed.
Full suite green serial AND parallel (208 tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Documents the #142 changes and tells operators with an already-bloated
device_status_log to reclaim space with a one-time manual VACUUM in a maintenance
window (retention now bounds further growth). Explains why auto-VACUUM is not
enabled. New doc: docs/maintenance-device-status-log.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings the full #142 stack onto main on top of the 1.9.1 stable cut:
- device_status_log index + de-dupe
- event-loop lag telemetry (bounded)
- load-aware per-device reconnect throttle (the outage fix)
- global device_status_log retention sweep (STATUS_LOG_RETENTION_DAYS)
- content-ack dedup
- provisioning-row cleanup window 365d -> 24h
services/heartbeat.js deleted unclaimed provisioning devices with
created_at < now - (365 * 86400) — a YEAR — while its own comment said "older
than 24 hours". So socket-register pairing junk lingered ~365x longer than
intended. Change the window to 24 * 3600 to match the comment.
Correctness fix only — does NOT touch the pre-auth register path or add a rate
limiter (that pre-auth hardening is a separate security issue, out of this cut).
Extracted the sweep into pruneProvisioningDevices() (still in heartbeat.js, called
from the same interval) so it is unit-testable. Test asserts a >24h unclaimed
provisioning row is swept while a <24h row, an imported row (user_id set), and a
non-provisioning row are kept.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
device:content-ack logged + emitted every message, so a device repeatedly
reporting the same "content <id>: ready" (observed from an older app version)
added avoidable load per message.
- Suppress identical (device_id, content_id, status) reports within
config.contentAckDedupMs (default 10s), modeled on the lastPlayLogAt throttle.
A status change has a different key and passes immediately; a fresh report after
the window passes too. In-memory, resets on restart. The handler does no DB
writes, so this is purely shedding redundant log+emit work.
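A compact sketch of the dedup key and window, with 10s standing in for config.contentAckDedupMs. The helper name and the `|` key separator are assumptions.

```javascript
// Hypothetical dedup on (device_id, content_id, status) within a time window.
function createAckDedup(dedupMs = 10_000) {
  const lastAt = new Map(); // key -> time (ms) of the last report that passed

  return function isDuplicate(deviceId, contentId, status, now) {
    const key = `${deviceId}|${contentId}|${status}`;
    const prev = lastAt.get(key);
    if (prev !== undefined && now - prev < dedupMs) return true; // suppress
    lastAt.set(key, now);
    return false;
  };
}
```

A status change produces a different key, so it is never held back by an earlier report for the same content.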
test: integration over a real authenticated device socket — a burst of identical
"ready" collapses to one log/emit, a "ready" after the window passes, and a status
change is never deduped. Unique PORT (3984).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The per-device insert-time prune (deviceSocket.js) only ever touches a device
that is actively inserting, so it misses two paths: removed/idle devices whose
rows linger forever, and heartbeat.js's offline_timeout insert that bypasses
logDeviceStatus entirely. The reporter's 1.2M-row bloat accumulated UNDER a 7-day
per-device prune for exactly this reason.
- pruneStatusLog() (db/database.js): a GLOBAL time-range sweep across ALL devices,
modeled on the play_logs prune. Run once on startup (recovers a bloated table
right after deploy) and on the heartbeat interval (services/heartbeat.js).
- STATUS_LOG_RETENTION_DAYS env, default 3 (lower than the old hardcoded 7d; the
dashboard only shows a 24h uptime window, so 2-3d is ample for diagnostics).
- Deliberately NO per-device row cap: Step 3's throttle already bounds how fast a
storming device can generate status rows, so a cap would add sweep complexity
for little gain (noted for later if needed).
- NO VACUUM / auto_vacuum here (kept off the hot path); space reclaim is left as a
separate decision (see report).
test: deterministic in-process unit test proves the sweep deletes over-retention
rows across all devices — including a device absent from the devices table and an
offline_timeout row — while keeping recent rows; idempotent on an empty table.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Gates genuine reconnects PER DEVICE before the heavy register work (DB writes +
playlist build) runs, so a single flapping device can no longer saturate the
event loop and take down the server.
- Actuator is per-device, keyed on device_id (modeled on lastPlayLogAt). A device
is flagged only when it exceeds reconnectBaseMax genuine reconnects per window.
Same-socket playlist refreshes (isPlaylistRefresh) are exempt.
- Load-awareness is BANDED (normal/elevated/critical from the step-2 lag signal),
not a continuous controller. The band only MULTIPLIES an already-flagged
device's backoff; global lag never gates a healthy device.
- Hysteresis: escalate immediately while storming (tighten fast); decay one level
per reconnectReleaseMs of calm (release slow).
- HARD CEILING per device, independent of band and warm-up — a slow-ramp attacker
can't train through it.
- COLD START: for reconnectWarmupMs after boot, force the normal band and apply
only the hard ceiling, so a full-fleet reconnect after a deploy doesn't throttle
healthy screens. State is in-memory, resets on restart.
- Observability: every throttle engagement logs device, band, observed vs allowed
rate, and backoff. Throttled device gets device:throttled + a deferred disconnect.
Tests (api.test.js style):
- unit: healthy-never-throttled, storm-throttled-with-growing-backoff, band
multiplies backoff, hard-ceiling-even-in-warmup, warm-up leniency, neighbor
isolation, slow release.
- integration GATE (the required one): full-fleet reconnect right after restart
throttles NO healthy device; a single device storming IS throttled; a neighbor
stays unaffected while another storms.
- also fixes pre-existing test PORT collisions (my new integration files clashed
with totp.test.js:3979 and totp-keyrotation.test.js:3980 -> moved to 3982/3983);
full suite now green serially AND in parallel.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Continuously samples event-loop delay via perf_hooks.monitorEventLoopDelay()
(C++-backed histogram; cheap). Each window persists mean/p50/p99/max to a new
event_loop_lag table and recomputes a coarse load band (normal/elevated/critical)
from the window p99. Standalone value: current lag is exposed on /api/status and
band changes are logged, so site lag is diagnosable independent of throttling.
The band feeds the #142 reconnect throttle (next commit) but ships first as its
own subsystem.
- event_loop_lag is bounded from day one: indexed on sampled_at + scheduled prune
(LAG_TELEMETRY_RETENTION_DAYS, small default) modeled on the play_logs prune.
Deliberately NOT another unbounded-growth table.
- Band transitions are asymmetric: jump up immediately (tighten fast), release one
level at a time after N calm samples below a deadband (release slow, no flap).
Pure nextBand() function, unit-tested deterministically.
- config: LAG_SAMPLE_INTERVAL_MS, LAG_RESOLUTION_MS, LAG_TELEMETRY_RETENTION_DAYS,
LAG_PRUNE_INTERVAL_MS, LAG_ELEVATED_MS, LAG_CRITICAL_MS, LAG_RELEASE_SAMPLES.
- tests: band-transition unit tests; integration proves sampling persists, stays
bounded under the prune, and surfaces on /api/status.
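The asymmetric transition can be written as a pure function along these lines. This is a sketch rather than the shipped nextBand(): the state shape, the cfg field names, and in particular deadbandMs (the config list above names no deadband variable) are assumptions.

```javascript
const BANDS = ['normal', 'elevated', 'critical'];

// state: { band, calm }; p99Ms: the window p99; cfg: thresholds (names assumed).
function nextBand(state, p99Ms, cfg) {
  const target = p99Ms >= cfg.criticalMs ? 2 : p99Ms >= cfg.elevatedMs ? 1 : 0;
  const current = BANDS.indexOf(state.band);

  // Tighten fast: jump straight to the higher band.
  if (target > current) return { band: BANDS[target], calm: 0 };
  if (current === 0) return { band: 'normal', calm: 0 };

  // Release slow: count samples below the current band's threshold minus the deadband.
  const threshold = current === 2 ? cfg.criticalMs : cfg.elevatedMs;
  if (p99Ms < threshold - cfg.deadbandMs) {
    const calm = state.calm + 1;
    if (calm >= cfg.releaseSamples) return { band: BANDS[current - 1], calm: 0 };
    return { band: state.band, calm };
  }
  // A sample inside the deadband (or above) resets the calm counter.
  return { band: state.band, calm: 0 };
}
```

With releaseSamples = 3, one spike reaches critical in a single window but takes at least six calm windows to return to normal.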
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The dashboard uptime query (WHERE device_id=? AND timestamp>?) and the
per-device retention prune (WHERE device_id=? AND timestamp<?) were both full
table scans. At 1M+ rows (the outage report) this was the dashboard-degradation
cause that persisted even after the reconnect storm stopped.
- schema.sql: add idx_device_status_log_device_ts(device_id, timestamp); both
queries now SEARCH ... USING INDEX instead of SCAN (verified via EXPLAIN).
- database.js: same index as a migration for existing DBs (idempotent).
- schema.sql defined device_status_log twice; drop the duplicate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a pre-push fast-forward check: fetch origin/main and abort if it has commits not in local HEAD, BEFORE the annotated tag is created. Prevents the beta9 incident where origin/main had advanced by one commit so 'git push origin main' was rejected, but the tag pushed anyway and fired release.yml from a commit not on main. Best-effort fetch — warns and proceeds when offline (the push stays the backstop).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A mute toggle wrote the draft playlist_items + emitted a live device:mute-changed but only markDraft()'d — it never updated playlists.published_snapshot, the copy the device actually plays. So the device's item.muted stayed 0 and every loop/reload re-applied full volume: dashboard icon red but audio kept playing (Android; web's native <video> loop masked it). emitMuteChanged now surgically patches the matching item's muted (0/1) inside the published_snapshot and re-pushes the playlist, so loops re-apply the correct flag. Surgical patch (not publishPlaylist) so a mute toggle can't prematurely publish other draft edits or flip publish state. Adds a regression test that fails without the patch.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Covers API 24 (7.0) + 25 (7.1.2); all 26+ APIs were already guarded with graceful else branches; no dependency bumps. Validated on API 24 + 25 emulators: install, foreground service, #139 OTA verify on the legacy GET_SIGNATURES path (incl. tampered-refuse), EncryptedSharedPreferences, and playback.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Community contribution from @ChrisChrome (tested on Debian 13 headless). Adds scripts/debian-13-setup.sh — server/player/both modes, systemd units, kiosk autologin, and management scripts (status/update/logs) — modeled on the Raspberry Pi setup. Also fixes Chromium fullscreen by detecting screen resolution at runtime (replacing --start-fullscreen), applied to both the Debian and Pi scripts, plus a README entry.
Maintainer review fix: the kiosk wait-loop now polls /api/status (the server's real readiness endpoint) instead of the non-existent /api/health, which had been silently burning the ~120s timeout on every all-in-one boot (bug inherited from the Pi script, fixed in both).
Follow-up to the cache/backoff loop fix (aa23cf0): make a device that can't
self-install visible to operators, and fix the signature-verify bug that kept the
whole #139 fix from engaging on the actual Fire OS target.
Dashboard surface (Phase 2):
- devices gains ota_status / ota_target_version / ota_attempts / ota_updated_at
via an ALTER TABLE ADD COLUMN migration (non-destructive, default-backfilled,
idempotent on re-run).
- The device reports ota_status (OtaThrottle.statusFor -> none | pending |
manual_update_required) in device_info; the server persists it on register
(the reconnect backstop). devices d.* already surfaces it to the dashboard.
- Dashboard shows a non-blocking amber badge when manual_update_required
("Update available (vX) - install failed N times, manual update required");
i18n key in en.js (non-en inherits via the en fallback). Server suite +1 test.
Event-driven status (Option B):
- New device:ota-status WS message, emitted on STATE TRANSITIONS only
(enter-backoff -> manual_update_required, clear -> none), so the badge updates
promptly without waiting for a reconnect and without per-poll/heartbeat chatter.
Server handler persists the same fields; an unknown/forged device_id is a safe
no-op. The register-path persist stays as the reconnect backstop.
Signature-verify fix (the critical piece):
verifyApkSignature read the downloaded APK's signer via
getPackageArchiveInfo(GET_SIGNING_CERTIFICATES).signingInfo, but that field is
null for ARCHIVE files on API 28/29 (populated only from API 30). On Fire OS 8
(Android 9 / API 28) - the actual deployment target - this returned 0 certs from
a correctly-signed APK, so every OTA was refused as "tampered," the cache was
deleted, and the full APK re-downloaded every check cycle. This was the real
cause of the #139 re-download loop, NOT a silent-install failure: the cache and
backoff added in this branch sit behind this verify gate and never engaged on
the target.
Fix: below API 30, read the archive's signer via the legacy GET_SIGNATURES +
.signatures (its v1/JAR cert, which IS populated on 28/29). Keep
GET_SIGNING_CERTIFICATES + signingInfo for API >= 30 and for the installed-app
read (which works on 28+). The archive's signer is still extracted and compared
to the installed app's signer; a mismatch or zero-cert APK is still rejected.
This reads the cert correctly on old APIs - it does not weaken verification.
Verified on emulators:
- API 28: verify now passes for a legit APK (was: 0 certs, refused). Full backoff
then engages - 8.5MB pulled once, cache-hit on retries, backoff after 3,
manual_update_required emitted once; clears on successful update.
- API 28 negative: a re-signed (different-key) APK is still refused on cert
MISMATCH - no hole opened.
- API 30: unchanged path still passes (no regression).
- server suite 173/173, OtaThrottleTest 7/7.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Devices that download an OTA APK but cannot silently install it (Fire TV: no
device-owner path) re-downloaded the full APK every check cycle indefinitely -
install never completes, version never advances, next check re-triggers.
Client (UpdateChecker.kt, ServerConfig.kt, OtaThrottle.kt):
- Reuse a cached, signature-verified APK instead of re-downloading every cycle;
delete leftover invalid files; keep the verified APK on disk as the
manual-install artifact.
- Persisted per-version attempt budget (EncryptedSharedPreferences) so it
survives the Fire OS app restarts that drive the loop. An attempt is counted
only when an install is launched - a download/verify failure does not consume
the budget, so a transient network problem cannot park a healthy device in
backoff. After 3 failed installs, back off to one retry per 24h.
- Clear OTA state and caches when a check returns update_available=false while
state is pending (app relaunched as the new version).
- Report OTA status to the dashboard via device:log (tag ota) on state
transitions only (enter-backoff, clear) to avoid flooding the channel.
- Extract throttle decision logic into a pure OtaThrottle object (no Android
deps) with JUnit coverage (OtaThrottleTest) for the state transitions.
Server (server.js):
- Reword /download/apk log from "OTA update in progress" to "APK served" and
rate-limit to once per IP / 10 min so a looping device cannot flood the log.
Note: client-cooperative fix - prevents the loop in cohorts running this APK.
Currently-stuck beta4 devices still require a one-time manual update.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Version bump for the beta6 strobe artifact: VERSION, server package + lock, and
the Android client (versionName 1.9.1-beta6, versionCode 26 so OTA sees it as an
upgrade over beta5/25). Also removes two leftover untracked PiP test scratch files
(frontend/alert-overlay.html, frontend/overlay.js) from a prior session.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Web player captureAndSend() now composites each zone from its real rendered
geometry (object-fit honoured, aspect-correct), draws CORS-safe labelled
placeholders for cross-origin/iframe zones, and splits rendering into a
socket-free renderCaptureCanvas(). One full-quality path serves both the
on-demand screenshot and the 1fps stream. Android untouched.
captureAndSend() grabbed a single querySelector('video'|'img') and stretched it
across a fixed 960x540 canvas, so multi-zone Now-Playing screenshots and the 1fps
remote stream showed one zone stretched fullscreen instead of the actual layout.
- Multi-zone layouts now composite each zone from its REAL rendered geometry
(getBoundingClientRect relative to the container, scaled proportionally onto the
canvas), so positions/sizes stay true to the layout.
- Canvas height derives from the container aspect (not a hardcoded 540); media is
drawn honouring its object-fit (cover/contain/fill) instead of being stretched.
- Cross-origin / iframe zones (YouTube, widgets) can't be read back without
CORS-tainting the whole canvas (which makes toDataURL throw and kills the entire
capture). They now get a deliberate, labelled placeholder ("YouTube"/"Widget"/
"Video") so the shot still shows the layout structure with that zone marked,
instead of a transparent hole or a failed capture.
- Split rendering into renderCaptureCanvas() (socket-free, headlessly verifiable)
and captureAndSend() (encode + emit). One full-quality path serves BOTH the
on-demand screenshot and the 1fps stream — the composite is only a few drawImage
calls over already-decoded media, so no separate low-quality stream path.
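Honouring object-fit on a canvas comes down to choosing source and destination rectangles for the nine-argument form of drawImage. The helper below is a hypothetical sketch of that geometry, assuming the default centred object-position; it is not the code in renderCaptureCanvas().

```javascript
// Compute drawImage rectangles for a source of srcW x srcH drawn into a dstW x dstH zone.
// Destination coordinates are relative to the zone; the caller adds the zone's offset.
function fitRect(srcW, srcH, dstW, dstH, fit = 'fill') {
  const full = { sx: 0, sy: 0, sw: srcW, sh: srcH, dx: 0, dy: 0, dw: dstW, dh: dstH };
  const srcIsWider = srcW * dstH > dstW * srcH; // compare aspect ratios without dividing

  if (fit === 'cover') {
    // Crop the source to the zone's aspect ratio, then fill the zone.
    if (srcIsWider) {
      const sw = (srcH * dstW) / dstH;
      return { ...full, sx: (srcW - sw) / 2, sw };
    }
    const sh = (srcW * dstH) / dstW;
    return { ...full, sy: (srcH - sh) / 2, sh };
  }
  if (fit === 'contain') {
    // Draw the whole source, letterboxed inside the zone.
    if (srcIsWider) {
      const dh = (dstW * srcH) / srcW;
      return { ...full, dy: (dstH - dh) / 2, dh };
    }
    const dw = (dstH * srcW) / srcH;
    return { ...full, dx: (dstW - dw) / 2, dw };
  }
  return full; // 'fill' stretches to the zone
}
```

A caller would then do something like ctx.drawImage(el, r.sx, r.sy, r.sw, r.sh, zoneX + r.dx, zoneY + r.dy, r.dw, r.dh), where zoneX and zoneY are the zone's scaled offsets on the canvas.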
Web player only; Android (view.draw already composites correctly) untouched.
Verified headlessly on a 3-zone device: red/green image zones render in their
correct positions, the YouTube zone shows a labelled placeholder, and the capture
succeeds with no CORS taint.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings in the mute pair (GET /api/devices/:id muted column + Android YouTube mute),
the zone-orphan fallback (web + Android player recovery, lib/zone-validate single
source, assign-time clearing, dashboard warnings + i18n), and 5 server regression
tests guarding the data contracts. 172/172 pass.
Two independent multi-zone bugs, plus operator-facing warnings, i18n, and
regression tests guarding the data contracts.
Bug 1 — per-item mute was a no-op end to end:
- GET /api/devices/:id dropped the `muted` column from its assignments SELECT,
so the dashboard toggle never reflected state (the muted=false case in
particular). Column restored to the device payload.
- Android player now honours the per-item mute flag for YouTube (initial state
+ live via the IFrame JS API).
Bug 2 — items whose zone_id belongs to a different layout were silently dropped:
- Player fallback (web + Android): an orphaned zone_id is recovered into the
largest zone instead of vanishing, with telemetry.
- server/lib/zone-validate.js is the single source of truth for the orphan rule
(zone not in the device's active layout); used by the device payload
(per-item `orphan` flag + `active_layout_zones`) and the device list
(`orphan_count`).
- Assign-time hardening: a stale zone_id (not in the device's active layout) is
cleared to null on POST/PUT rather than persisted as a new orphan.
- scripts/find-orphan-zone-items.js: read-only sweep for existing orphans.
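The orphan rule itself is small. The sketch below is not server/lib/zone-validate.js; the function name and item shape are assumptions, as is the choice to treat a null zone_id as not orphaned (consistent with assign-time clearing setting stale ids to null).

```javascript
// Flag items whose zone_id is not among the device's active layout zones.
function annotateOrphans(items, activeLayoutZoneIds) {
  const active = new Set(activeLayoutZoneIds);
  const annotated = items.map((item) => ({
    ...item,
    // Assumption: an item with no zone_id is unassigned, not orphaned.
    orphan: item.zone_id != null && !active.has(item.zone_id),
  }));
  const orphan_count = annotated.filter((item) => item.orphan).length;
  return { items: annotated, orphan_count };
}
```

Keeping the rule in one function is what lets the per-item flag and the device-list count agree.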
Dashboard warnings (operator-facing, never on the live player):
- Per-item badge + reassign affordance, device-list glance, preview banner.
- Graceful degradation: the zone selector falls back to /api/layouts/:id so it
can't vanish on a stale payload.
i18n: orphan-zone strings added to en/es/fr/de/pt/it (hi falls back by design;
count strings interpolate through tn()).
Tests: server/test/device-zone-contract.test.js adds 5 regression tests for the
data contracts above (muted true/false round-trip, active_layout_zones, orphan
flag + count, orphan-clears-on-reassign, assign-time clearing). 172/172 pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two device-REPORTING fixes from the #134 investigation (the PiP rendering itself
was #135).
1) "Device reconnects every ~45s" was a logging artifact, not instability. The
player re-emits a full device:register on the SAME socket every ~45-60s
(requestPlaylistRefresh) to pull a fresh playlist; the server logged
"Device reconnected" for every register of a known device. The attached 4-day
log showed 1415 "reconnected" vs 30 real socket connects and 0 heartbeat
timeouts — the socket never dropped, so #134's "PiP lost between reconnects"
was a misdiagnosis. Fix: only log a genuine reconnect (new socket); a
same-socket re-register is a refresh (currentDeviceId === device_id) and stays
quiet. The playlist still refreshes.
2) Device reported 720p while the monitor showed a 1080 signal. DeviceInfo
reported getRealMetrics() — the UI RENDER SURFACE — but TV boxes render the UI
at 720p and upscale to a 1080p HDMI signal. Now report BOTH: screen_width/height
= the output mode (Display.Mode.physicalWidth/Height), render_width/height =
the render surface (getRealMetrics). Two new nullable devices columns, stored on
pairing INSERT + reconnect UPDATE, exposed via the device API, shown on the
dashboard as "1920x1080 (UI 1280x720)" when they differ.
Backward compatible (required + verified on emulator): a device that omits
render_* — or sends no device_info at all — still registers, with render_* = null,
on both the INSERT and UPDATE paths. New columns nullable; stores use
`?? null` / `|| null`. All 167 server tests pass.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(#109): render Android PiP overlay above the YouTube WebView video plane
The PiP overlay (#109) returned sent:1 and showed its title in `uiautomator
dump`, but nothing painted on screen while YouTube was playing. By elimination
(YouTube-specific, landscape so no off-screen transform, real on-screen bounds
in the dump) the cause is surface occlusion: pipLayout sat as the last child of
rootLayout — the SAME compositing band as R.id.youtubeWebView — so the playing
video surface drew over it.
Fix (task option 1a): reparent pipLayout out of rootLayout to the window
content (android.R.id.content) as a top-level sibling drawn after rootLayout, so
it composites above the WebView. MainActivity.mirrorTransformToPip() copies
rootView's orientation/wall transform onto it so corner positions still track
the rotated content (web/Tizen parity). show() also calls bringToFront(),
requestLayout(), and invalidate() on attach (covers the cause-3 measure/visibility
path). Remote-view screenshots now capture the content root so the PiP is still
included.
Instrumentation (Phase 1, default OFF): PipOverlay.pipDebug paints a solid
magenta box + border with media on top (box paints even if media never loads)
and logs box/pipLayout/rootView/youtubeWebView geometry over device:log tag
"pip"; loadImageInto also logs on success. Toggled via device:command
{type:"pip_debug"} (routed through MainActivity.onCommand).
Server: POST /api/pip and the clear handler log one concise [pip] dispatch line
(target + sent/offline) so journalctl shows PiP activity.
Validated end-to-end on an emulator (pixel10/API34) paired to an isolated local
server with YouTube playing: no crash, the PiP box composites above the live
video frame (center + top-right), clear removes it, and the portrait transform
mirror rotates the overlay with the stage (no off-screen). The Fire TV
hardware-overlay punch-through still needs real hardware (emulator composites
video inline); pipDebug + docs/109-android-pip-visibility.md cover that.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(#109): image PiPs never painted — set slot token before decode
Emulator e2e of an image PiP (a QR PNG) found the image area always blank (box
background + title only). Pre-existing defect, also on main, independent of the
occlusion reparent.
Root cause in PipOverlay.show(): teardown() clears `current` to null, then
loadImageInto() captured `token = current` (null) as its drop-if-replaced guard,
but `current` was set to the new pip_id AFTER the media was built. The image
decode finishes on a background thread and posts back after show() returns, so
`token != current` (null != pip_id) was always true and every decoded bitmap was
dropped. Web PiPs and the box/title were unaffected, which masked it.
Fix: set `current = pip_id` before building media so loadImageInto's token
matches. Verified on emulator — a QR image PiP now renders over both a static
image and live YouTube (hardware screencap + the app's software view.draw
capture both show it).
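The ordering bug can be modeled in a few lines of JavaScript. This is a model only: the real code is Kotlin in PipOverlay.show(), and makeOverlay, pending, and flush are hypothetical stand-ins for the overlay, the background decode thread, and the post back to the main thread.

```javascript
// Decodes are queued and completed later by flush(), standing in for the
// background thread that posts back after show() has returned.
function makeOverlay(setTokenBeforeBuild) {
  const state = { current: null, painted: [], pending: [] };

  function loadImageInto(pipId) {
    const token = state.current;           // captured drop-if-replaced guard
    state.pending.push(() => {
      if (token !== state.current) return; // stale or replaced: drop the bitmap
      state.painted.push(pipId);
    });
  }

  state.show = (pipId) => {
    state.current = null;                             // teardown()
    if (setTokenBeforeBuild) state.current = pipId;   // the fix
    loadImageInto(pipId);
    if (!setTokenBeforeBuild) state.current = pipId;  // the bug: set after media is built
  };
  state.flush = () => { state.pending.splice(0).forEach((cb) => cb()); };
  return state;
}

const buggy = makeOverlay(false);
buggy.show('qr');
buggy.flush();
const fixed = makeOverlay(true);
fixed.show('qr');
fixed.flush();
console.log(buggy.painted.length, fixed.painted.length); // 0 1
```

With the token set first, the guard still does its original job: a decode for an overlay that has since been replaced is dropped.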
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(#109): record web PiP (HTML+JS) verification on emulator
The web PiP type loads its WebView and executes JS (a page stamping "JS OK · <time>"
rendered over live YouTube). No code change: web PiPs don't use the image path
that had the token bug. Completes the image/web/box content-type verification.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(#109): implement PiP close_button on Android (was a documented no-op)
The server forwarded close_button (routes/pip.js) and it's in openapi.yaml, but
no player rendered it — Tizen deferred "close-button focus" as non-MVP, the web
player has none, and Android's PipOverlay never read the flag. So the documented
field did nothing on any device.
Implement it on Android: when close_button:true, a tappable ✕ floats at the box's
top-right in a FrameLayout wrapper that is a SIBLING of the box — so it isn't
clipped by the box outline or dimmed by the overlay opacity. Tapping it clears
THIS overlay (id-matched via the captured token). Only the ✕ is clickable; the
rest of the full-screen pipLayout stays touch-transparent, so taps elsewhere
fall through to the playing content (no input regression).
Verified on the emulator over live YouTube: the ✕ renders at the corner, and
tapping it removes the overlay while the video keeps playing.
Parity note: web/Tizen players still don't implement close_button; D-pad focus
of the ✕ on non-touch TV hardware is intentionally not wired (MVP = touch/pointer,
matching the Tizen focus deferral).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Add PIP-Weather-Radar example (TV-style live radar overlay)
A "cut to radar" PiP recipe: a Leaflet map (vendored locally for the
CSP) with a CARTO dark basemap, an animated RainViewer radar loop, and
live NWS warning polygons drawn and color-coded (tornado/severe-tstorm/
flash-flood/flood) with a pulsing "LIVE RADAR" HUD, count chips, and a
legend. Auto-frames the view to the active warning polygon(s).
Two modes: "always" (radar always up) and "on_warning" (default) which
shows the radar only while a qualifying warning covers the configured
point and clears it when the warnings expire — like a station breaking
in during severe weather.
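A sketch of the mode gate under stated assumptions: warnings are simplified to { event, expires, ring } objects, the event strings and the QUALIFYING list are placeholders, and the point test is a standard ray-casting check. None of these names are taken from the example's actual source.

```javascript
// Ray-casting point-in-polygon; ring is an array of [lon, lat] pairs.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses = (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Placeholder event names; the real qualifying set is configuration.
const QUALIFYING = ['tornado', 'severe-tstorm', 'flash-flood', 'flood'];

// "always": radar is always up. "on_warning": only while an unexpired,
// qualifying warning polygon covers the configured point.
function shouldShowRadar(mode, warnings, point, now) {
  if (mode === 'always') return true;
  return warnings.some((w) =>
    QUALIFYING.includes(w.event) && w.expires > now && pointInRing(point, w.ring));
}

const square = [[0, 0], [2, 0], [2, 2], [0, 2]];
const active = [{ event: 'tornado', expires: 2000, ring: square }];
console.log(shouldShowRadar('on_warning', active, [1, 1], 1000)); // true
console.log(shouldShowRadar('on_warning', active, [1, 1], 3000)); // false (expired)
```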
100% keyless / open data: RainViewer radar, CARTO/OSM basemap, NWS
alerts. Zero Node deps; Leaflet is vendored client-side via
vendor-leaflet.sh (gitignored). Offline test covers the warning gate,
color map, RainViewer tile-URL builder, and overlay-URI round-trip.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(radar): note Leaflet is vendored locally, not committed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Self-contained examples for the PiP overlay API (POST /api/pip), each
with a CSP-safe query-param overlay (external JS), config.example.json,
zero runtime deps, an offline test, and a README:
- PIP-Announce-Broadcast: manual one-shot message to a screen/group
- PIP-Weather-Widget: Open-Meteo current conditions (keyless)
- PIP-Air-Quality: Open-Meteo US AQI widget (keyless)
- PIP-Crypto-Ticker: CoinGecko price strip (keyless)
- PIP-News-Ticker: scrolling RSS/Atom headlines
- PIP-Room-Status-Calendar: ICS-driven Available/Busy room sign
- PIP-Event-Countdown: client-side countdown, auto-clears at zero
- PIP-Welcome-Board: rotating welcome/birthday cards from CSV
- PIP-Fundraiser-Thermometer: goal-progress bar from local/URL JSON
- PIP-QR-Rotator: rotating QR codes, encoded client-side
- PIP-Incident-Webhook: event-driven: red on firing, clear on resolved
Also includes the CAP-AU (NSW RFS) and US NWS/NOAA emergency-alert
monitors that push expiry-aware PiP overlays.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The PiP endpoints and the per-item mute field shipped without OpenAPI coverage.
- openapi.yaml: add POST /pip (show), DELETE /pip + POST /pip/clear (clear), all
x-required-scope: full; add the `muted` boolean to PUT /assignments/{id}; add a `pip` tag.
- openapi-contract.test.js: the scope heuristic only treated `command` paths as full-scope,
so a full-scope non-command route (/pip) would fail it — extend it to recognize /pip.
Docs-only as far as the running build goes (no route/behavior change). Lands on main; not
in the frozen v1.9.1-beta4 tag — ships in the next tag.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Android version fields live separately from the root VERSION / server package,
so the beta4 release commit didn't touch them and the APK reported beta3. Bump them so
the client reports beta4 and OTA sees it as newer. The v1.9.1-beta4 tag is intentionally
NOT moved to this commit.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(server): proxy remote YouTube thumbnails instead of ENOENT on a local path
YouTube content stores thumbnail_path as a REMOTE URL
(https://img.youtube.com/vi/<id>/hqdefault.jpg), but the thumbnail-serving route
path.resolve'd it into contentDir -> a local file that never existed -> ENOENT logged
a few times a minute (the tester-log spam). Recreating content didn't help (new rows
store the same remote URL).
- GET /api/content/:id/thumbnail now proxies a remote http(s) thumbnail_path
server-side (same-origin, so dashboard CSP img-src is unaffected) via a non-throwing
helper: upstream 404 -> 404, other failure/timeout -> 502, image/* only (modest SSRF
hardening; the URL is server-set at ingest). Local thumbnails keep the sendFile path;
the playlist/widget/workspace access gating is unchanged for both branches.
- routes/widgets.js inlineUserContent skips the disk read for a remote thumbnail and
leaves the /api/content/:id/thumbnail reference in place (the proxy serves it).
- routes/content.js ingest unchanged; a comment notes the future download-at-ingest +
backfill option for CDN independence.
- New test/thumbnail-proxy.test.js: local sendFile still works; a remote thumbnail is
proxied (mock upstream, no local read, no ENOENT); upstream 404 -> clean 404. Full
server suite 164/164.
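The branch selection and status mapping described above can be sketched as two pure functions. isRemoteThumbnail and proxyStatus are hypothetical names, and returning 502 for a non-image upstream response is an assumption (the text only says "image/* only").

```javascript
// Is thumbnail_path a remote http(s) URL rather than a local file under contentDir?
function isRemoteThumbnail(thumbnailPath) {
  return /^https?:\/\//i.test(thumbnailPath ?? '');
}

// Map an upstream fetch outcome to the status the route returns.
// outcome is { status, contentType } for a response, or { error } on failure/timeout.
function proxyStatus(outcome) {
  if (outcome.error) return 502;                    // network failure or timeout
  if (outcome.status === 404) return 404;           // upstream 404 stays a clean 404
  if (outcome.status < 200 || outcome.status >= 300) return 502;
  if (!/^image\//i.test(outcome.contentType ?? '')) return 502; // assumed: non-image is refused
  return 200;
}

console.log(isRemoteThumbnail('https://img.youtube.com/vi/abc/hqdefault.jpg')); // true
console.log(proxyStatus({ status: 404 })); // 404
```

Keeping the mapping non-throwing means the route never falls through to the local sendFile path for a remote URL, which is what produced the ENOENT spam.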
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(server): boot banner shows the real version, not a hardcoded v1.2.0
The startup ASCII banner printed "ScreenTinker Server v1.2.0". Use the already-imported
VERSION (require('./version'), the single source of truth that reads the root VERSION
file) in a fixed-width field (VERSION.padEnd(22).slice(0, 22) — the same padEnd
discipline the port line uses) so the fixed-width box border stays aligned for any
version length. No other behavior changes.
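For illustration, the fixed-width field behaves like this. The box art in bannerLine is a made-up stand-in for the real banner; only the padEnd(22).slice(0, 22) expression comes from the change itself.

```javascript
// Pad short versions and truncate long ones so the field is always 22 characters.
function versionField(version) {
  return version.padEnd(22).slice(0, 22);
}

// Illustrative banner row only; the real ASCII box differs.
function bannerLine(version) {
  return `| ScreenTinker Server v${versionField(version)} |`;
}

console.log(bannerLine('1.2.0'));
console.log(bannerLine('1.9.1-beta4'));
```

Both rows come out the same width, so the right-hand border lines up regardless of version length.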
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(server): persist + ship + real-time per-item mute (#129)
The dashboard mute toggle was a no-op end to end. The active model is playlist_items
(the device payload is its published_snapshot); the legacy `assignments` table the bug
report cited is unused for devices. Three breaks:
- PUT /api/assignments/:id silently dropped `muted` (only read sort_order/duration_sec/
zone_id). It now accepts muted (coerced 0/1) and ITEM_SELECT returns it, so the toggle
persists and its on/off state sticks.
- playlist_items had no `muted` column — added (schema + idempotent migration).
- buildSnapshotItems didn't select muted, so it never reached the published_snapshot /
device payload — now included.
Real-time: on a mute change, emit device:mute-changed { content_id, widget_id, muted } to
every device on that playlist so the player toggles the matching item's volume live,
decoupled from publish (the value is also in the next snapshot, so it persists). Adds a
[mute] log line (the report noted zero mute log entries).
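A rough sketch of the persistence and event-payload pieces. The helper names are hypothetical, the exact set of truthy inputs accepted by the coercion is an assumption, and whether the emitted muted value is a boolean or 0/1 is not stated above (a boolean is assumed here).

```javascript
// Coerce a request-body value to 0/1 for the integer column (accepted inputs assumed).
function coerceMuted(value) {
  return value === true || value === 1 || value === '1' || value === 'true' ? 1 : 0;
}

// Apply only the fields present in the body, so a non-mute update leaves muted alone.
function applyItemUpdate(existing, body) {
  const next = { ...existing };
  for (const key of ['sort_order', 'duration_sec', 'zone_id']) {
    if (body[key] !== undefined) next[key] = body[key];
  }
  if (body.muted !== undefined) next.muted = coerceMuted(body.muted);
  return next;
}

// Payload for device:mute-changed; an item carries either content_id or widget_id.
function muteChangedEvent(item) {
  return {
    content_id: item.content_id ?? null,
    widget_id: item.widget_id ?? null,
    muted: item.muted === 1,
  };
}

const item = applyItemUpdate({ content_id: 'c1', sort_order: 0, muted: 0 }, { muted: true });
console.log(muteChangedEvent(item)); // { content_id: 'c1', widget_id: null, muted: true }
```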
Test: test/mute.test.js — PUT persists + returns muted, it reaches the published
snapshot, and a non-mute update doesn't reset it. Server suite 164/164.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(player): apply per-item mute live on Android + web (#129)
Honor the new per-item mute from the server, both in real time and on reload.
Android:
- WebSocketService: onMuteChanged callback + main-thread device:mute-changed handler.
- MediaPlayerManager.setVideoMuted(): flips the live ExoPlayer volume on the current
video (YouTube autoplays muted; images/widgets are silent).
- MainActivity: on device:mute-changed, apply immediately if the toggled item is the
one playing now.
- PlaylistController.sig(): include muted so a published mute change re-renders/persists
instead of being de-duped.
Web player (server/player/index.html):
- device:mute-changed handler toggles the current <video>; the video mount now also
honors item.muted so a published mute sticks across reloads.
Tizen intentionally not included: its player mutes ALL video for autoplay, so per-item
unmute isn't achievable there.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* PiP overlay MVP: push image/web overlays to a device or group (#109)
Implements the #109 MVP from the docs proposal: a floating overlay PUSHED to a device or
group in real time, rendered above the playlist without disturbing it. Scope is the
MVP only — video/RTSP, MQTT, offline-queue, and the priority/stacking system are
deferred to follow-up PRs as the proposal specifies.
Protocol (/device socket, player-agnostic):
- device:pip-show { pip_id, type:image|web, uri, position, width, height, duration,
title?, title_color?, background_color?, opacity?, border_radius?, close_button? }
- device:pip-clear { pip_id? }
The player fetches uri itself (same trust model as remote_url content; server never
proxies). type:web is full-trust by design, hence the 'full' token scope.
Server (server/routes/pip.js, new; mounted in config/api-surface.js PUBLIC_ROUTERS):
- POST /api/pip and POST /api/pip/clear + DELETE /api/pip, all requireScope('full').
- Resolves device_id to a device OR a group, expands a group to members, and emits
per-device — reusing the group command route's room-size online check and
{device_id, name, status: sent|offline} result shape. Generates pip_id.
- Validates type/position allowlists, uri http(s), numeric bounds on
width/height/duration/opacity/border_radius, colors via the existing VALID_COLOR
(#RRGGBB; transparency is the separate opacity field).
- Workspace-isolated: every target query is scoped to req.workspaceId, so a token
bound to workspace A can't address workspace B (404). Offline devices are reported,
never queued (PiP is ephemeral).
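The validation rules above might be sketched like this. The position names and every numeric bound are placeholders, not the server's real limits; the #RRGGBB pattern and the http(s) restriction follow the description.

```javascript
const TYPES = ['image', 'web'];
// Assumed position names; the text only says positions are allowlisted.
const POSITIONS = ['top-left', 'top-right', 'bottom-left', 'bottom-right', 'center'];
const VALID_COLOR = /^#[0-9a-fA-F]{6}$/; // #RRGGBB; transparency is the opacity field

function inRange(v, min, max) {
  return typeof v === 'number' && Number.isFinite(v) && v >= min && v <= max;
}

// Returns the list of invalid field names; an empty list means the payload is accepted.
function validatePip(body) {
  const errors = [];
  if (!TYPES.includes(body.type)) errors.push('type');
  if (!POSITIONS.includes(body.position)) errors.push('position');
  let url = null;
  try { url = new URL(body.uri); } catch { url = null; }
  if (!url || !['http:', 'https:'].includes(url.protocol)) errors.push('uri');
  // Placeholder bounds, for illustration only.
  if (body.opacity !== undefined && !inRange(body.opacity, 0, 1)) errors.push('opacity');
  if (body.duration !== undefined && !inRange(body.duration, 0, 86400)) errors.push('duration');
  for (const key of ['title_color', 'background_color']) {
    if (body[key] !== undefined && !VALID_COLOR.test(body[key])) errors.push(key);
  }
  return errors;
}

console.log(validatePip({ type: 'image', position: 'center', uri: 'https://example.com/a.png' })); // []
console.log(validatePip({ type: 'video', position: 'center', uri: 'ftp://example.com/a' })); // [ 'type', 'uri' ]
```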
Player overlay layer (Tizen; tizen/js/pip-overlay.js, new):
- A #pip sibling ABOVE #stage that PlaylistPlayer/ZoneRenderer never touch.
- applyOrientation now applies the SAME transform to #pip as #stage, so corner
positions track the visible CONTENT in all four orientations.
- image -> <img>, web -> <iframe> (muted by default: empty allow= denies autoplay),
sized/positioned/styled per payload, optional title bar.
- Single overlay slot, last-show-wins; duration timer (0 = until cleared); pip-clear
(id-aware) or timer tears down; teardown wrapped so a malformed payload can't wedge
the layer. Reports show/clear over device:log (tag 'pip').
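The single-slot semantics can be sketched without any DOM. createPipSlot is a hypothetical name, the timers are injected so the behaviour can be exercised deterministically, and treating duration as seconds is an assumption.

```javascript
// Single overlay slot: last-show-wins, optional duration timer, id-aware clear.
function createPipSlot({ setTimer = setTimeout, clearTimer = clearTimeout } = {}) {
  let current = null; // { pip_id, timer } or null

  function teardown() {
    if (!current) return;
    if (current.timer !== null) clearTimer(current.timer);
    current = null;
  }

  // clear() with no id clears whatever is showing; with an id it must match.
  function clear(pipId) {
    if (current && pipId !== undefined && pipId !== current.pip_id) return false;
    const had = current !== null;
    teardown();
    return had;
  }

  function show(payload) {
    teardown(); // last-show-wins
    const timer = payload.duration > 0
      ? setTimer(() => clear(payload.pip_id), payload.duration * 1000) // seconds assumed
      : null; // 0 = until cleared
    current = { pip_id: payload.pip_id, timer };
  }

  return { show, clear, currentId: () => (current ? current.pip_id : null) };
}

const demoSlot = createPipSlot();
demoSlot.show({ pip_id: 'a', duration: 0 });
demoSlot.show({ pip_id: 'b', duration: 0 });
console.log(demoSlot.currentId()); // b
```

Because the timer callback clears by id, a late timer from a replaced overlay cannot tear down the overlay that replaced it.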
Dashboard: a minimal "Send overlay" / "Clear overlay" tester on the device-detail
controls (device/group via the open device, type, uri, position, duration), calling
POST /api/pip through the api helper.
Tests (server suite green, 161/161):
- api.test.js: PiP tier — authz (read/write 403, full passes), workspace isolation
(wsA token -> wsB device 404), payload validation, device + group targeting, clear;
plus the PUBLIC_ROUTERS snapshot-firewall updated for /api/pip.
- pip-overlay.test.js: loads the real player.js + pip-overlay.js in a vm with a DOM
shim; proves the overlay shows, auto-dismisses on the duration timer, and never
changes the playlist signature / touches #stage; web->iframe, last-show-wins,
id-aware clear, malformed-payload safety.
Not in this PR (intentional):
- Android player overlay — fast-follow. Protocol + server are player-agnostic; the
Android layer (an overlay View above the player, orientation-matched to MainActivity's
rootView rotation) is the same shape and lands next.
- OpenAPI docs for POST /api/pip — the contract test's scope heuristic only treats
'command' paths as full-scope, so documenting a full-scope non-command route there
needs that heuristic extended first; deferred with the docs item (proposal §8.6).
- video/rtsp types, MQTT, offline queue-on-reconnect, priority/stacking, arbitrary
(x,y)/selector positioning (proposal §6).
Refs #109
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* PiP overlay: add Android + web players (#109)
Extends the #109 PiP MVP to the other two players so the protocol (device:pip-show /
device:pip-clear) is honored fleet-wide, not just on Tizen. No server/protocol changes —
the route and socket messages are player-agnostic; these are the two missing surfaces.
Web player (server/player/index.html):
- New #pipContainer layer above #playerContainer, pointer-transparent, that the playlist
render never touches. The same orientation transform is applied to it as to
#playerContainer (extended to also reset width/height on landscape so a
portrait->landscape switch realigns), so corner positions track the visible content.
- Inline PiP logic mirroring tizen/js/pip-overlay.js: image -> <img>, web -> <iframe>
(muted by default via empty allow=), position/size/bg/opacity/radius/title, single slot
last-show-wins, duration timer (0 = until cleared), id-aware clear, wrapped teardown.
- device:pip-show/clear handlers; reports show/clear over device:log (tag "pip").
Android player:
- activity_main.xml: a pipLayout FrameLayout as the LAST child of rootLayout — it draws
above the content AND inherits rootView's orientation rotation/translation, so corner
positioning is orientation-matched for free.
- PipOverlay.kt (new): builds the overlay box into pipLayout. image -> ImageView (decoded
off-thread via ImageLoader, dropped if torn down mid-decode); web -> WebView with
mediaPlaybackRequiresUserGesture=true (mute-by-default). Gravity-based corner/center
placement with a 4% inset, GradientDrawable bg + corner radius, alpha=opacity, optional
title bar. Single slot last-show-wins; duration timer; id-aware clear; teardown wrapped
and also run on activity destroy (WebView cleanup).
- WebSocketService: onPipShow/onPipClear callbacks + safeOn handlers posted to the main
thread (they build Views) + a sendLog(tag, level, message) emitter for device:log.
- MainActivity: instantiate PipOverlay (log -> wsService.sendLog("pip", ...)), wire the
callbacks, tear down on destroy.
Verified: Android assembleDebug builds clean; web player inline JS parses; server suite
still 161/161 (no server changes this commit). Not yet validated on real hardware —
four-orientation corner positioning mirrors the player container/rootView transform but
should be eyeballed on a panel.
Refs #109
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
#123 already shipped a placeholder device:command handler (#121/#122): screen_off
was a black overlay, reboot/shutdown a toast, update a "re-install" toast. This
replaces that with the real control surface from #125, reconciled into the single
handler #123 introduced (rather than landing a second, competing handler).
- NEW tizen/js/device-control.js: window.STDeviceControl = { run, capabilities,
backend }. Feature-detects webapis.systemcontrol.* (Tizen 6.5/7, sync/throws) then
b2bapis.b2bcontrol.* (SSSP/Tizen 4, async), normalises both to Promises, re-probes
each call. run() never rejects; resolves { ok, supported, action, note, reload }.
Panel power: setPanelMute (mute ON = backlight OFF) -> setDisplayPanel/setPanelStatus
fallback. reboot -> rebootDevice(); shutdown mutes the panel and notes SSSP has no
true power-off; update/reload -> reload:true.
- tizen/js/app.js: device:command now calls STDeviceControl.run and reports the
outcome via reportCmd (device:log tag=command -> dashboard:device-log, plus a
structured device:command-result), reloading ~1.2s later on result.reload. screen_off
falls back to the existing black overlay (showScreenOff) when no B2B surface exists;
screen_on/launch still clear the overlay + keepAwake. Dropped the now-dead
tryPowerControl. reportCapabilities() runs on device:registered so the dashboard sees
the backend ("none" on web/URL-Launcher/consumer TV).
- tizen/config.xml: partner-level b2bcontrol + systemcontrol privileges (ignored, not
fatal, on unsigned/URL-Launcher/web/consumer builds).
- tizen/index.html: load $WEBAPIS/webapis.js + $B2BAPIS/b2bapis.js before the app
scripts (404 harmlessly off-hardware) and device-control.js before app.js.
- tizen/README.md: rewrote the remote-control table for real B2B control + a
partner-signing caveat; added device-control.js to the file list.
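The "run() never rejects" contract from the device-control.js bullet can be sketched as below. This does not call any real Samsung API: the backends map and its functions are hypothetical stand-ins, and only the result shape { ok, supported, action, note, reload } is taken from the description.

```javascript
// Normalise a sync API that may throw into a Promise (a throw becomes a rejection).
function callSync(fn) {
  return new Promise((resolve) => resolve(fn()));
}

// Normalise an async success/error-callback API into a Promise.
function callAsync(fn) {
  return new Promise((resolve, reject) => fn(resolve, reject));
}

// backends: hypothetical map of action -> () => Promise. run() always resolves.
async function run(action, backends) {
  const impl = backends[action];
  if (!impl) {
    return { ok: false, supported: false, action, note: 'no control backend', reload: false };
  }
  try {
    await impl();
    return { ok: true, supported: true, action, note: '', reload: false };
  } catch (err) {
    const note = String((err && err.message) || err);
    return { ok: false, supported: true, action, note, reload: false };
  }
}

// Stand-in backends: one sync call that throws, one async call that succeeds.
const demoBackends = {
  reboot: () => callSync(() => { throw new Error('denied'); }),
  screen_off: () => callAsync((onSuccess) => onSuccess()),
};
run('reboot', demoBackends).then((r) => console.log(r.ok, r.note)); // false denied
```

The caller can then report every outcome the same way, whether the backend is missing, throws, or succeeds.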
Supersedes PR #126 (feat/tizen-device-command-125), which targeted main unaware that
this branch already had a device:command handler.
Verified: node --check on both JS files; config.xml well-formed (xmllint). Not yet
validated on a real SSSP panel — the control surface only takes effect on a
partner-signed .wgt (backend reports "none" on the dev/URL-Launcher build).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brings the Tizen TV player to parity with the other players: closes the five
Tizen issues Bold Media Group filed (#118-#122) and adds the two larger renderer
features it was still missing.
Fixes (#118-#122)
- #118 Sticky "Not authenticated" banner. On TV sleep/wake the socket reconnects
and a heartbeat could land on the fresh, not-yet-registered socket; the server
rejected it and the old handler painted a permanent banner AND dropped the saved
credentials, forcing a re-pair. Heartbeats are now gated on a per-connection
authenticated flag (true only between device:registered and disconnect/auth-error),
the heartbeat stops on connect/disconnect/auth-error, the banner clears on
device:registered, and the auth-error toast is non-sticky.
- #119 app_version stuck at 1.0.0. Resolved at runtime from config.xml via the Tizen
application API, with a fallback constant that build-wgt.sh stamps from config.xml.
- #120 Dashboard preview. Added device:screenshot-request + remote-start/remote-stop.
Images capture; video/YouTube fall back to a status card (TV hardware video plane
and cross-origin iframes can't be read into a canvas).
- #121 Remote commands. Added a device:command handler (refresh/launch/screen_on/
screen_off; honest no-op toasts for update/reboot/shutdown, which need B2B/MDM
privileges a sideloaded app lacks). Removed the dead device:reload listener.
- #122 Updates/boot. Documented the real paths (re-sideload or URL Launcher/MDM
refresh; display-level kiosk/boot settings) since a sideloaded .wgt has no in-app
OTA or config.xml autostart.
Multi-zone layouts (Android parity)
- New ZoneRenderer ports the Android ZoneManager: zones positioned by percent
geometry with z_index/fit_mode/background, assignments grouped by zone_id
(unassigned content goes to the first zone), each zone rotating independently with
the same per-item schedule gating (#74/#75). app.js selects the renderer from
payload.layout; single-zone playback is unchanged.
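Zone grouping and percent geometry might look roughly like this. The field names (x_percent, width_percent, and so on) are assumptions, as is sending items with an unknown zone_id to the first zone alongside the unassigned ones.

```javascript
// Group assignments by zone_id; unassigned (or unknown-zone) items go to the first zone.
function groupByZone(zones, assignments) {
  const groups = new Map(zones.map((z) => [z.id, []]));
  const firstId = zones.length ? zones[0].id : null;
  for (const a of assignments) {
    const key = groups.has(a.zone_id) ? a.zone_id : firstId;
    if (key !== null) groups.get(key).push(a);
  }
  return groups;
}

// Percent geometry to CSS-style properties (field names are hypothetical).
function zoneStyle(zone) {
  return {
    left: `${zone.x_percent}%`,
    top: `${zone.y_percent}%`,
    width: `${zone.width_percent}%`,
    height: `${zone.height_percent}%`,
    zIndex: zone.z_index ?? 0,
  };
}

const zones = [
  { id: 'main', x_percent: 0, y_percent: 0, width_percent: 75, height_percent: 100, z_index: 0 },
  { id: 'side', x_percent: 75, y_percent: 0, width_percent: 25, height_percent: 100, z_index: 1 },
];
const grouped = groupByZone(zones, [
  { id: 1, zone_id: 'side' },
  { id: 2, zone_id: null },
  { id: 3, zone_id: 'main' },
]);
console.log(grouped.get('main').map((a) => a.id)); // [ 2, 3 ]
```

Each group would then rotate on its own timer, which is what lets zones advance independently.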
Video walls (web-player parity; Android has none)
- New WallController mirrors the web player: when payload.wall_config is present the
stage is positioned (vw/vh) as this screen's slice of the wall. The leader plays
normally and broadcasts wall:sync at 4Hz; followers hold the leader's item, align
index, and lock their video to the leader's clock with a latency-compensated drift
controller (hard-seek past 0.3s, gentle +/-3% playbackRate nudge past 0.05s), and
request an immediate position on (re)connect via wall:sync-request. Per-tile
rotation is not applied yet (matches the web player). Wall emits are gated on
auth + connection so a pre-register tick can't trip device:auth-error.
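The follower's drift decision can be sketched as a pure function. The thresholds (0.3s, 0.05s, 3%) are from the description above; the function name, the sign convention, and adding a one-way latency estimate to the leader position are assumptions about how the compensation is applied.

```javascript
const HARD_SEEK_S = 0.3; // beyond this, jump straight to the target
const NUDGE_S = 0.05;    // beyond this, nudge playbackRate instead of seeking
const NUDGE_RATE = 0.03; // +/-3%

// leaderPos: position from the wall:sync message (s)
// latencyS:  estimated one-way delay of that message (s)
// localPos:  this follower's current position (s)
function driftAction(leaderPos, latencyS, localPos) {
  const target = leaderPos + latencyS; // where the leader should be by now
  const drift = localPos - target;     // > 0 means the follower is ahead
  const abs = Math.abs(drift);
  if (abs > HARD_SEEK_S) return { seekTo: target, playbackRate: 1 };
  if (abs > NUDGE_S) {
    return { seekTo: null, playbackRate: drift > 0 ? 1 - NUDGE_RATE : 1 + NUDGE_RATE };
  }
  return { seekTo: null, playbackRate: 1 }; // within tolerance: play at normal speed
}

console.log(driftAction(10, 0, 10.5)); // { seekTo: 10, playbackRate: 1 }
console.log(driftAction(10, 0, 9.9).playbackRate > 1); // true (behind: speed up)
```

The small rate nudge avoids visible jumps for minor drift, while the hard seek bounds how long a badly out-of-sync follower stays wrong.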
Not ported: video-wall per-tile rotation, plus the minor Android-only reporting
events (device:playback-state, device:log) and the N/A offline-cache events
(device:content-ack/content-delete). None affect on-screen playback.
Verified: JS syntax + headless unit tests of zone grouping/geometry and wall
leader/follower + drift logic. NOT yet validated on Tizen hardware - multi-screen
video sync in particular needs a real wall to tune.
Ports the wall:sync protocol the web and Tizen players already ship to native
Kotlin/ExoPlayer, so the Android player can join a video wall.
- WallController (new): 4Hz leader broadcast; follower latency-compensated drift
controller (hard-seek past 0.3s, gentle +/-3% playbackRate nudge past 0.05s);
role handling with immediate align on entry and on wall:sync-request. Per-tile
rotation intentionally not applied (web/Tizen parity; left as a TODO).
- MediaPlayerManager: expose position/duration/seekExact/setSpeed for the drift
controller; RESIZE_MODE_FILL / ImageView FIT_XY in wall mode (object-fit:fill
parity), restored to fit/fitCenter on exit. Follower mute (setWallMute) persists
across leader-driven item switches, and followers loop (REPEAT_MODE_ONE) so they
never freeze on the last frame if the leader's next index is late.
- PlaylistController: wallFollower flag suppresses auto-advance (leader drives the
index); getIndex/gotoIndex for follower tracking; itemStartedAtMs for non-video
sync position.
- WebSocketService: onWallSync/onWallSyncRequest handlers (posted to the main
thread since they drive ExoPlayer) + emitWallSync/emitWallSyncRequest senders
guarded on socket.connected() like sendPlaybackState.
- MainActivity: parse wall_config in onPlaylistUpdate and branch before the
orientation + multi-zone paths; size/translate rootView to this screen's slice;
exit() restores full screen.
Compiles clean (./gradlew :app:assembleDebug). NOT yet validated on a device or a
real wall — the ExoPlayer seek/speed sync and the slice transform need on-device
tuning before this is trusted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>