Completes P2 user-management. Adds the full admin surface for managing
workspace membership: invite modal, role change, member remove, cancel
pending invite. All admin-gated client-side via can_admin from /me,
server-gated via canAdminWorkspace.
Component additions:
- NEW workspace-members-invite-modal.js (~115 LOC). Mirrors
workspace-rename-modal.js pattern (imperative open + listeners + close
+ esc/click-outside/enter). Two key differences: onSuccess callback
instead of window.location.reload (allows targeted re-render of
pending-invites section), and mapError callback so the parent's
mapMutationError is the single regex-to-i18n source of truth (instead
of duplicating in the modal).
- workspace-members.js: header invite button (can_admin gated), per-row
affordances (role select + remove on direct members, cancel on invited
rows, none on via_org rows), exported mapMutationError mapper,
re-render on both success AND error for role-select to resync state
when the server rejects.
- 4 api.js helpers (inviteWorkspaceMember, cancelWorkspaceInvite,
updateWorkspaceMemberRole, removeWorkspaceMember).
- 24 i18n keys under members.modal.*, members.button.*,
members.confirm.*, members.error.*, members.success.*
- CSS for .member-actions family (action buttons + role select + hover
states).
UX decisions:
- Direct-member rows: role <select> replaces role text in same column;
remove button right of detail
- via_org rows: no actions cell (server would 403; UI respects boundary)
- Invited rows: cancel button only (handoff rule was over-broad -
cancel-invite IS a valid mutation on invited rows, refined during 2B
survey)
- Role select fires on change, no Save button (matches teams.js pattern;
mitigations for accidental clicks noted in handoff if reports come in)
- Mutations re-fetch + re-render rather than optimistic updates -
simpler, no state-drift bugs, endpoints respond fast
- /invites endpoint skipped entirely when !can_admin (saves a request;
server still enforces)
Verification: 21/21 Playwright assertions PASS across 6 cases (invite
happy path, invite collision, role change, remove member, last-admin
block, cancel invite). Test infrastructure stashed at
~/Documents/screentinker-2b-playwright-2026-05.py.
Closes P2 (user-management feature). Slice 1+3 backend landed c4fbd2b,
2A read-only view landed 8db171d, 2C accept-invite handler landed
399af54, 2B mutation UI landed here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice 2C: hash route #/accept-invite/{id} with full flow support across
all six auth entry points (login/register/Google/Microsoft/support/setup)
via app-boot consumer pattern rather than per-handler hooks. Stash
mechanism uses localStorage with timestamp + staleness check
(INVITE_EXPIRY_DAYS_FRONTEND = 7, mirrors backend default). On success:
switch workspace, reload, show toast post-reload via scoped
pending_invite_toast key. On error: showToast directly, no reload.
Non-reentrant guard prevents double-consume across the synthetic
hashchange that fires before reload completes.
Two bugs surfaced during Playwright-driven verification (slice 1 left
two latent issues that only manifested when the full accept-invite
flow ran end-to-end):
1. Email URL path: workspaces.js constructed
${publicBase}/#/accept-invite/X which lands on the marketing landing
page (the SPA is at /app). Fixed to use
${publicBase}/app#/accept-invite/X. Any invite email sent before
this fix would have produced an unfollowable link.
2. Synchronous hashchange race: location.hash = '#/' followed by
reload() fires hashchange BEFORE the reload unloads the page. The
intermediate route() call would consume the toast key against a DOM
about to be destroyed, so the post-reload page had no toast. Fixed
with history.replaceState which mutates hash without firing
hashchange.
Files:
- server/routes/workspaces.js (+4/-1, /app path fix + comment)
- frontend/js/api.js (+3 LOC, acceptInvite helper)
- frontend/js/app.js (+154 LOC, accept-invite plumbing)
- frontend/js/i18n/en.js (+9 LOC, accept.* keys)
Browser verification: 11/11 assertions PASS via Playwright suite
covering all 5 D-cases (unauthed flow, authed direct, wrong account,
stale stash, already-member). Script stashed at
~/Documents/screentinker-2c-playwright-2026-05.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the workspace members page at #/workspace/:id/members.
Read-only listing only - mutations land in slice 2B,
accept-invite URL handler lands in slice 2C.
Three sections render based on access path:
- Members: direct workspace_members rows with role + join date
- Organization access: org_owner/org_admin who reach this
workspace via org-level access (via_org=true). 75% opacity
+ italic "via organization" label to distinguish from direct
membership. Section hidden if empty.
- Pending invites: workspace_invites rows (admin-only -
section silently absent for non-admins via 403-suppress)
Switcher dropdown adds a "members" icon next to the rename
pencil, gated on can_admin (same predicate). Icon visible on
hover, mirrors the existing pencil pattern.
24 i18n keys added under members.* (read-only set; mutation
keys land in 2B).
Backend coverage from c4fbd2b unchanged; pre-flight curl
verification (13/13 cases) confirmed all 7 endpoints work as
documented before slice 2 first-exercised the four previously
untested ones (GET /invites, DELETE /invites/:id, PUT
/members/:userId, DELETE /members/:userId).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice 1 + 3 of the user-management feature from the May 12 plan.
Backend-only - no UI yet (slice 2 ships separately). Backend +
accept-handler together so the email accept link is functional
from day one without a half-state.
Endpoints added:
- GET /api/workspaces/:id/members (any member; via_org=true
for org-level entries,
read-only from ws context)
- GET /api/workspaces/:id/invites (workspace_admin)
- POST /api/workspaces/:id/invites (workspace_admin)
- DELETE /api/workspaces/:id/invites/:inviteId (workspace_admin)
- PUT /api/workspaces/:id/members/:userId (workspace_admin)
- DELETE /api/workspaces/:id/members/:userId (workspace_admin)
- POST /api/auth/accept-invite/:inviteId (requireAuth +
case-insensitive
email match)
Permission gating:
- canAdminWorkspace (existing) for admin-gated endpoints
- canAccessWorkspace (new helper in lib/permissions.js) for the
members read endpoint - mirrors canAdminWorkspace shape but
admits any workspace_members role plus org/platform paths
Security additions vs the original plan:
- Transaction-bounded collision check on POST /invites closes the
TOCTOU race between simultaneous duplicate POSTs (no UNIQUE
constraint on workspace_invites(workspace_id, email))
- Per-(inviter, workspace), hour-window rate limit on POST /invites
to prevent abuse / cost runaway. Env-configurable via
INVITE_RATE_LIMIT_PER_HOUR with conservative 50/hour default.
429 response is generic - does not echo the configured value.
- Invite expiry env-configurable via INVITE_EXPIRY_DAYS (default 7)
- PUBLIC_URL env var (optional) pins the accept-URL origin in prod;
falls back to request-derived for local dev
Rollback rule on email send: only graph_error (real send attempt
failed at Graph) deletes the row and returns 502. not_configured
and dev_restricted are intentional non-sends - keep the row, count
against rate limit, allow local accept-invite testing to proceed.
Other safety blocks:
- Cannot demote/remove the last workspace_admin (409)
- Cannot remove the parent-org's org_owner via workspace path (403)
- Accept-invite is idempotent if user already a member
- Expired invites delete-on-read and return 410
- Wrong-account accept returns 403 without touching the invite
Expired-invite cleanup added to services/heartbeat.js mirroring
the team_invites sweep pattern.
Verification: 9-case curl-driven E2E against the dev DB fixture
(switcher-test + invitee-existing + invitee-new mid-flow register).
All 9 pass: create / collision-409 / second-create / rate-limit-429 /
existing-user-accept / register-then-accept / wrong-account-403 /
expired-410 / viewer-cannot-invite-403.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The non-admin branch of /me's accessible_workspaces query drove
from workspace_members, so users with org_owner or org_admin on
an organization but no direct workspace_members row were missing
those workspaces from their /me response - and therefore from the
switcher dropdown. Mirrors the access logic in
accessibleWorkspaceIds() (lib/tenancy.js) while keeping the
full-row SELECT shape /me needs.
Verified end-to-end with switcher-test@local.test acting as
org_owner of Acme Studios with no workspace_members row on
Studio B - Studio B now appears in /me's accessible_workspaces
with workspace_role: null, can_admin: true.
Also updates the stale TODO comment in tenancy.js that flagged
this exact gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the player-side fix in 1aee4f2 - skips the polling->WS
upgrade dance that was causing the dashboard socket to flicker
when Apply burst its fetch traffic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to 19f434d. The new player_debug_logs sink collects four
data categories not previously enumerated in the privacy policy:
browser user-agent, error/stack-trace data, recent player log entries
(which can include filenames of content being played), and screen/
viewport dimensions. New section 2.5 documents what's collected, why,
and the rolling-buffer retention model (10k entries, oldest pruned
on insert).
Section 5 (Self-Hosted Deployments) clarifies that the telemetry is
collected by the self-hoster's own server, not transmitted to us, and
points at the PLAYER_DEBUG_REPORTING=off kill switch for self-hosters
who prefer no collection at all.
Section 11 retention list gains a row for the rolling-buffer model.
"Last updated" bumped to May 15, 2026.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smart TVs (Tizen, WebOS, Fire TV, Bravia) have no accessible browser
devtools, so when the player misbehaves on those platforms we previously
had zero visibility. This adds two paths to fix that:
- Visible debug overlay rendered on the TV screen for phone-photo capture
- Automatic server-side telemetry sink for hands-off error reporting
Client side (server/player/):
- Inline ES5 error trap as first script in index.html captures errors
even from parse-time failures in later scripts. Captures into
window.__debugLog with 200-entry cap.
- debug-overlay.js renders a fixed-position overlay covering the top 40%
of the screen. Activates via ?debug=1, d-e-b-u-g key sequence, Samsung
red button (keyCode 403), or smart-TV UA + ?autodebug=1. Freeze toggle
(F key or Samsung green) with visible FROZEN badge for phone capture.
pointer-events: none so touches pass through to the player underneath.
- Reporter machinery posts captured errors to /api/player-debug with
5-second debounce batching, sendBeacon on unload (with payload size
capping to stay under 64KB), 5-minute backoff after 429 responses.
UA-gated: smart-TV allow-list first (handles Tizen-with-Chrome/108),
modern-desktop deny-list second, default-report for unknown UAs.
- Two-pass djb2 fingerprint (16 hex chars) per error for future grouping.
- Absolute script src (/player/debug-overlay.js) so the script loads
regardless of trailing-slash on the player URL.
Server side:
- New player_debug_logs table (10000-row FIFO cap, indexed on
fingerprint + created_at). Schema in schema.sql, idempotent via
CREATE TABLE IF NOT EXISTS.
- POST /api/player-debug unauthenticated (so unpaired players can also
report), rate-limited 10/min/IP, per-field length caps to prevent abuse.
- Dynamic /player HTML route injects window.__playerConfig.debugReporting
based on PLAYER_DEBUG_REPORTING env var (defaults on; =off suppresses
all client telemetry traffic). Other player assets still served static.
- Admin routes (requireAuth + requireSuperAdmin):
GET /api/player-debug/list with pagination and filters
GET /api/player-debug/summary for UA family counts
DELETE /api/player-debug/older-than for manual purge
Admin view (#/admin/player-debug):
- UA family summary at top (Tizen/WebOS/Fire TV/Bravia/Edge/Chrome/etc)
- Filter row: UA contains, date range, has-error checkbox
- Paginated table with expand-row JSON viewer for error_data and context
- device_id labeled (self-reported) since field is unauthenticated input
- Manual delete-older-than button with confirmation dialog
Verified end-to-end with Playwright + Chromium (17/17 checks pass) plus
manual real-browser verification including UA-spoofed Tizen flow landing
rows in the admin view.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to 73f41c3 (server-side zone_id wiring). With this commit
the zone feature is verified working end-to-end: dashboard zone
picker renders correctly, zone_id saves and persists, the per-row
zone dropdown reflects the saved zone after reload, and a live
player run with computed-style inspection confirmed zone divs and
video elements size correctly within their geometry.
Frontend (device-detail.js, en.js):
- Add-content modal: zone picker slot now renders in all four states
(has_zones / no_layout / fetch_failed / empty_layout) instead of
silently vanishing when zones.length === 0. Informational rows
match form-group styling and tell the user which control to use
next. Closes the gate-4 symptom where 38-of-42 devices (no layout
assigned) silently dropped zone_id on every assignment.
- Both /api/layouts/:id fetches (add modal, edit-path) now have
!res.ok throw guards and surface failures via console.warn instead
of swallowing them. The add modal additionally exposes the failure
state to the user via the fetch_failed info row.
- Edit-path zone dropdown: replaced brittle DOM-scraping (reading
the i18n label text and matching z.id.slice(0,8) against rendered
meta HTML) with a data-current-zone-id attribute stashed at row
render from a.zone_id. Removes the i18n-format coupling and gives
exact UUID match.
- 3 new i18n keys in en.js (other locales fall back).
Server (devices.js):
- The GET /api/devices/:id assignments query had its own ad-hoc
SELECT projection that was missed during the 73f41c3 site survey.
Without pi.zone_id in this projection, loadDevice() got assignments
without zone_id and the edit-path dropdown displayed "No zone"
after every save+reload even though the DB had the correct value.
One-line fix: add pi.zone_id, mirroring the ITEM_SELECT change in
routes/assignments.js. Listed as the 8th site that 73f41c3's
original survey missed; this commit closes it.
Verification:
- JS parse + en.js ESM load + server module load all clean.
- Live SQL probe: GET /api/devices/:id projection now returns zone_id
for the test rows (id=31 zone_id=z-sh-1, id=54 zone_id=z-sh-2).
- Browser test by hand: zone picker renders per state, zone_id
persists, reload shows saved zone, computed styles on rendered
.zone divs match expected geometry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 (assignments -> playlist_items) dropped zone_id during the
conversion: migrateAssignmentsToPlaylists INSERTed only (playlist_id,
content_id, widget_id, sort_order, duration_sec), and the new
playlist_items DDL omitted the zone_id column entirely. Every write
path on top of playlist_items inherited that omission - the
multi-zone layout assignment feature stopped working.
Frontend always sent zone_id correctly (device-detail.js:1015,1072
POST and PUT both include it; api.addAssignment and api.updateAssignment
forward the body verbatim). Server silently dropped it. The
assignments.js PUT route was the most direct evidence: it destructured
zone_id from req.body but never added it to the updates array.
Schema:
- schema.sql: add zone_id TEXT REFERENCES layout_zones(id) ON DELETE
SET NULL to fresh-install DDL.
- database.js migrations[]: add idempotent ALTER TABLE for existing
installs (the surrounding try/catch loop handles duplicate-column).
Backfill (new gated migration phase2_zone_id_backfill):
- Pre-migration snapshot copied to db/remote_display.pre-zone-id-
backfill-<ts>.db (one-off for this migration; the general
every-migration-snapshot framework is a separate concern, not built
here).
- Best-effort UPDATE playlist_items.zone_id from surviving
assignments rows via device.playlist_id + content_id/widget_id
match, LIMIT 1 for the multi-match edge case.
- Regenerates published_snapshot for every published playlist so the
JSON the player consumes carries zone_id going forward. Even with
zero rows backfilled (the common case post-Phase-2 cleanup) this
closes the snapshot-staleness gap.
- Stamps schema_migrations regardless so it won't re-run on next boot.
- On the live local DB: 0 playlist_items backfilled, 18
published_snapshots regenerated. On the April 13 prod fixture
(sandboxed copy): 0 backfilled, 7 regenerated. Expected and matches
our pre-flight finding that assignments was effectively scrubbed of
zone_id everywhere.
Route wiring (7 sites + 1 shared constant):
- assignments.js ITEM_SELECT: project pi.zone_id (read path so the
frontend display at device-detail.js:500 surfaces the value).
- assignments.js POST INSERT: include zone_id column + value.
- assignments.js PUT: actually use the already-destructured zone_id
in the updates allow-list. Treats undefined as "no change" so a PUT
that omits zone_id leaves the existing value intact; any explicit
value (including null) is written.
- assignments.js copy-to INSERT: preserve a.zone_id during
device-to-device playlist copy.
- playlists.js buildSnapshotItems: project pi.zone_id so the snapshot
JSON carries it. This is what the player's renderZones loop reads
(player/index.html:1338 matches a.zone_id === zone.id).
- playlists.js discard-revert INSERT: restore zone_id from snapshot
item on revert.
Out of scope (verified safe by SQL semantics + UI inspection):
- playlists.js POST item-add and PUT item-update in the playlist-detail
surface: the UI there doesn't expose zone editing, and their SQL
leaves zone_id NULL on insert / untouched on update. No regression.
- Other playlists.js SELECT projections (lines 141, 190, 240, 265, 334,
379, 419) all use SELECT pi.* and auto-pick zone_id once the column
exists.
- Kiosk-page assign at device-detail.js:1027 doesn't send zone_id;
separate pre-existing gap, not part of this regression.
Tests (all local, no push, no prod deploy):
- Migration boot on live local DB: clean, idempotent (second boot
skips the gated function).
- Migration boot on April 13 prod fixture (sandboxed copy at
/tmp/zone-fix-fixtures/test-run.db): cleanly runs the full migration
stack (multi-tenancy + 5 other phases the fixture predated) then
the new zone_id backfill. Live local DB untouched.
- 8 SQL-level route behavior tests pass: INSERT stores zone_id, PUT
changes/clears zone_id, ITEM_SELECT and buildSnapshotItems
projections include zone_id, copy-to preserves, discard-revert
restores from snapshot JSON, undefined zone_id in PUT leaves
existing value intact.
Not verified: end-to-end multi-zone playback on a real device. The
SQL + snapshot JSON layer is correct (player consumes
playlist.find(a => a.zone_id === zone.id) and now gets the right
zone_id back from the snapshot); confirming render-to-correct-zone
on actual hardware is the next step before prod deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a 5-second countdown on the player's Connect button when the
device is unpaired. Auto-clicks at 0 unless the user interacts with
the serverUrl field. Useful for headless kiosks where clicking is
painful. Includes a follow-up race-condition fix so the timer can't
double-fire if Connect is clicked manually during the countdown.
The "Custom" tier on the public pricing page was misrendering as a
better-than-Free tier: headline "Custom", price "Free", "Unlimited
devices/storage", "Get Started" button. Root cause is in DB data,
not markup - the 'enterprise' plan row has price_monthly=0 and
max_devices/storage=-1, and the dynamic render in landing.html maps
those to "Free" + "Unlimited" with the wrong CTA.
Fix: filter the 'enterprise' plan out of the public landing render
(client-side, in landing.html only) and replace it with a hardcoded
Enterprise / Custom marketing card whose Contact Us button opens a
new lead-capture modal.
The DB row itself stays - it is actively used elsewhere:
- auth.js: first user in SELF_HOSTED=true mode is assigned to it
- settings.js: white-label feature is gated on enterprise plan
- 1 user (the dev account) is currently assigned to it
- /api/subscription/plans is also consumed by billing.js, settings.js,
admin.js (logged-in surfaces); they keep getting the full plan list.
The filter is scoped to landing.html's render only.
The in-app billing page renders the same plan with the same cosmetic
bug; that's a logged-in admin surface, out of scope for this commit.
Other 4 cards (Free, Starter, Pro, Business) unchanged.
Frontend (landing.html):
- Filter 'enterprise' from public render
- Hardcoded Enterprise / Custom card. Uses .price class with "Let's
talk" + empty .yearly spacer to match Free card's vertical baseline
so the feature list aligns with the paid cards' baselines.
- Modal markup, CSS (mirrored from frontend/css/main.css conventions
since landing.html doesn't import main.css), and inline JS for
open/close/submit/escape/background-click.
- Honeypot field: hidden 'fax_number' input (off-screen + aria-hidden
+ tabindex=-1). Picked over the obvious 'website' name to catch
mid-tier bots that explicitly skip the well-known honeypot names.
Backend (new server/routes/contact.js):
- POST /api/contact/enterprise, public (unauthenticated)
- Rate limited 5/min/IP+path via the existing rateLimit middleware
- Honeypot check: populated fax_number returns 200 silently, no email
- Server-side validation: required fields, email format, screens
1-100000, multi_tenant in {single,multi}, hosting in {hosted,self,
unsure}. Length caps prevent textarea-bomb abuse.
- Sends via existing services/email.js (Microsoft Graph) to
dan@bytetinker.net from the support@screentinker.com Graph sender.
- Log lines: "[contact] enterprise inquiry from EMAIL (COMPANY)
delivered" or "[contact] honeypot triggered from IP; dropping".
Wired in server.js alongside other public routes (before requireAuth).
Build-time tests passed locally:
- Module loads, server boots clean
- Validation: missing fields, bad email, bad multi_tenant, bad
hosting, screens out of range - all return 400 with the right
error message
- Honeypot: populated fax_number returns 200 success, no email sent,
log line confirms drop
- Rate limit: kicks in at 6th request within a minute as expected
- Real end-to-end send: one test submission delivered to
dan@bytetinker.net via Graph (subject "[ScreenTinker] Enterprise
inquiry: ScreenTinker Build Verification", body formatted with all
fields). GRAPH_DEV_RESTRICT_TO was temporarily widened to include
the recipient for the test and restored to dw5304@gmail.com
immediately after.
- Card render order verified against live API: Free (outline,
Get Started) | Starter | Pro (featured, Most Popular badge) |
Business | Enterprise / Custom (Contact Us -> modal).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two dashboard-accuracy improvements for issue #3.
Disconnect debounce (5s):
- Brief transient flaps (Engine.IO ping miss, eviction-then-reconnect,
Wi-Fi blip) no longer immediately flip the device to offline in the
dashboard. Disconnect handler now defers the offline transition;
register handlers cancel the pending timer if reconnect lands in
window.
- Existing stale-disconnect guard kept as fast-path for the eviction
case (no timer scheduled at all when the active heartbeat conn is
already a different socket).
- Re-check at timer fire compares socketIds: aborts only if a
GENUINELY DIFFERENT socket reclaimed the device. Just the closing
socket's own (not-yet-cleaned-up) entry is treated as stale and
proceeds with offline transition.
- Server-restart mid-grace is handled by the heartbeat checker safety
net (existing component): any 'online' row with last_heartbeat
older than heartbeatTimeout gets marked offline on next sweep.
Truthful single-device command feedback:
- dashboard:device-command handler now checks deviceNs.adapter.rooms
for an active socket before emitting (matches the group-command
route's pattern).
- If room is empty, falls through to commandQueue.queueCommand (lazy
require - if commit C is reverted, MODULE_NOT_FOUND is cached and
every subsequent call gets consistent queued=false behavior).
- Returns three-state ack to caller: { delivered, queued, reason }.
- Server log line was misleading - now logs 'Command delivered to
device X' vs 'Command for offline device X (queued=true/false)'.
Frontend:
- sendCommand() takes optional callback. Without one, fires-and-forgets
(no behavior change for non-wired callers). With one, uses Socket.IO
.timeout(5000).emit so the callback always fires (ack or no_ack).
- Six device-detail command buttons wired to three-state toasts:
reboot, shutdown, screen_off, screen_on, launch, update.
- delivered: green/success toast (existing localized message)
- queued: amber/warning toast (new generic message)
- no_ack: red/error toast
- fallback: red/error toast
- Two callers intentionally left fire-and-forget:
- window._sendCmd (generic remote-overlay keypress/touch helper)
- enable_system_capture (has its own visual state machine; out of
scope for this commit)
Three new i18n keys (en.js only; other locales follow later):
- device.toast.command_queued
- device.toast.command_undeliverable
- device.toast.command_no_ack
Refs #3
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Short-lived per-device queue covers the TV-flap window (issue #3):
when a device is mid-reconnect, prior code emitted to an empty room
and the event vanished. Now playlist-updates and commands targeting
an offline device are queued and flushed in order on the next
device:register for that device_id.
server/lib/command-queue.js (new):
- pendingPlaylistUpdate: per-device marker (rebuild via builder on
flush -> always fresh DB state, no stale snapshots)
- pendingCommands: per-device Map<type, payload> with last-of-type
dedup (most recent screen_off wins)
- TTL via COMMAND_QUEUE_TTL_MS env (default 30000)
- Active sweep every 30s prunes expired entries
Memory bounds: ~6 entries per device worst case (1 playlist marker
+ 5 command types), unref'd sweep timer.
Wired emit sites (8 total; the four direct socket.emit calls in
deviceSocket register handlers are intentionally NOT queued because
the socket is alive by definition at those points):
- server/routes/video-walls.js (pushWallPayloadToDevice)
- server/routes/device-groups.js (pushPlaylistToDevice)
- server/routes/content.js (content-delete fan-out)
- server/routes/playlists.js (pushToDevices + assign)
- server/services/scheduler.js (scheduled rotations)
- server/ws/deviceSocket.js x2 (wall leader reclaim/reassign)
server/ws/deviceSocket.js register paths now call flushQueue after
heartbeat.registerConnection + socket.join. Existing
socket.emit('device:playlist-update', ...) lines kept - they send
the initial state on register; the flush replays any queued events.
Player's handlePlaylistUpdate fingerprint check dedupes the
overlap.
Refs #3
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make HEARTBEAT_INTERVAL and HEARTBEAT_TIMEOUT env-tunable so
self-hosters with slow/jittery networks don't have to edit
config.js (issue #3 reporter did exactly this to confirm the
diagnosis). Defaults unchanged at 10000ms / 45000ms so existing
deployments keep current behavior.
Same parseInt(env) || default pattern as PORT/HTTPS_PORT/PING_*.
README env table extended.
Refs #3
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Connection-stability layer for issue #3. LG webOS WebKit (and other
TV-grade clients) miss Engine.IO pongs under decode load with the
Socket.IO defaults of 25s ping / 20s timeout, causing spurious
transport drops and a connect/reconnect/evict/disconnect loop on
the device. Default polling-first transport adds another fragility
layer via the polling->WebSocket upgrade dance.
- pingInterval / pingTimeout default to 30000 / 30000 (worst-case
dead-socket detection 60s, up from ~45s). Both env-configurable
via PING_INTERVAL / PING_TIMEOUT.
- Player Socket.IO client: transports: ['websocket', 'polling'].
Tries WebSocket first; falls back to polling on the same connect
attempt if WebSocket fails. Polling fallback preserved for
firewall-restricted networks.
App-level heartbeat checker is unchanged and remains the safety net
for clients that miss the transport-level ping/pong window.
Tradeoffs documented in inline comments. README env table extended
with PING_INTERVAL and PING_TIMEOUT rows.
Refs #3
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Old invite replaced with current permanent invite across README,
landing page, and anywhere else it appeared.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deleting a content asset that was actively displayed on screens
caused affected players to go black and never recover; deleting an
actively-playing video also failed to stop playback (audio kept
going). Root cause: handlePlaylistUpdate never tore down the current
media element and could drive currentIndex to NaN when a late
onended fired during the playlist swap.
- Add teardownCurrentMedia() - pause, clear src, .load() to actually
release the decoder and kill audio; null event handlers to prevent
late onended races
- handlePlaylistUpdate: preserve continuity - if the playing item
survives the update keep it playing, otherwise walk forward from
the old position to the next surviving item; empty playlist tears
down to waiting state
- Guard playCurrentItem against empty playlist / non-finite index
- Remove dead device:content-delete socket handler (never emitted)
Resolves#4
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Self-hosters running internal-only deployments don't need the
marketing homepage. With DISABLE_HOMEPAGE=true, requests to /
302-redirect to /app instead of serving the landing page.
Unset/false preserves current behavior.
Requested via discord feedback.
The repo has been shipping multiple features ahead of the README (12+
commits today alone). This is a catch-up pass to bring the docs current.
Key additions / updates:
- Multi-tenancy architecture (orgs > workspaces > members + roles)
- Auto-migration on boot
- Teams currently consolidated into workspace_members
- Tech stack reference (Node 20.6+, msal-node, etc.)
- Deployment env vars (full reference table)
- Local dev setup with .env approach
- Contribution/Discord/issue reporting
No code changes - docs only.
Previously sendEmail() only logged on error/suppression paths; success
was silent. After prod deploy of c71c401 it was unclear whether the
first alert tick had actually delivered email or not - the answer was
yes but had to be derived from 'no error log + recipient query showed
matching device'. Add a log line on success so future observability
doesn't require detective work.
Replaces the stub EMAIL_WEBHOOK_URL row with the real 5-variable
GRAPH_* config table, Azure AD app registration steps (single-tenant
+ Mail.Send application permission + admin consent), the local-dev
stdout-fallback behavior when unconfigured, the optional
GRAPH_DEV_RESTRICT_TO allow-list for safe development against fresh
prod DB clones, and a brief enumeration of the alert spam protections
(2h dedup, 24h long-offline cutoff, sequential send pattern, per-user
email_alerts opt-out).
Pairs with c71c401 which shipped the implementation.
Replaces the unused EMAIL_WEBHOOK_URL stub with a real Microsoft Graph
Mail.Send pipeline via @azure/msal-node client-credentials flow. Prior
state on prod: every alert email was logged to journalctl and never
sent (21 fallback log lines per hour for the chronic-offline devices).
Four coordinated changes shipped as one commit since they're all part
of making email delivery actually work responsibly:
1. services/email.js (NEW): Graph send via plain HTTPS (no SDK), in-memory
MSAL token cache (refresh 60s pre-expiry), graceful stdout fallback
when GRAPH_* env vars absent. Drop-in replacement for the old webhook.
2. services/alerts.js refactored: sequential await around sendEmail (was
parallel fire-and-forget; first run hit Graph's MailboxConcurrency 429
ApplicationThrottled on a 30-device backlog). Sequential at ~250ms per
send takes 5-8s for the full backlog, well within the 60s tick. Also:
24h long-offline cutoff to stop nagging about chronic-offline devices
(the 20,000+ minute ones); 2-hour dedup window (was 1h) via a generic
shouldSendAlert(type, id, windowMs) helper that future alert types
(payment_failed, plan_limit_hit, etc.) can reuse.
3. Preferences UI: single checkbox in settings.js Account section bound
to users.email_alerts. Saved via the existing Save Profile button. PUT
/api/auth/me extended to accept email_alerts. requireAuth middleware
SELECT now includes email_alerts so it propagates via req.user.
4. Dev safety net: GRAPH_DEV_RESTRICT_TO env var as an allow-list. When
set, only listed recipients reach Graph; everyone else is suppressed
with a log line. Prevents local dev (which often runs against fresh
prod DB copies) from accidentally emailing real prod users. UNSET on
prod systemd unit so production fans out normally.
Also: package.json scripts use --env-file-if-exists=.env so local dev
picks up .env automatically (Node 20.6+ built-in, no dotenv dep). Prod
runs via systemd ExecStart and is unaffected. server/.gitignore added
to keep .env out of git.
Smoke verified end-to-end:
- Sequential send pattern verified (a prior parallel-send tick had hit
Graph's MailboxConcurrency 429 on 30 simultaneous sends; sequential
at ~250ms each completes the same backlog without throttling)
- 24h cutoff silenced 20/21 prod devices on the next tick
- Dev restrict suppressed the 1 within-24h send
- User-preference toggle flipped via UI -> DB -> alert path silently
continued before reaching even the suppression log
Visual polish to match the new device-count info in ce332ea. The
sidebar-constrained 188px dropdown was too narrow once a second
info chunk ('. N devices') joined the org name on the muted subtitle
line - long names like 'BRASA SALA\'s organization . 2 devices'
wrapped, doubling row height for half the dropdown.
Width: was left:0 + right:0 (= sidebar content width 188px). Now
left:0 + min-width:280px + max-width:360px. Detaches from the
sidebar (which is z-indexed) and extends into the main content area;
the max bound prevents indefinite sprawl on pathological org names.
Row height: padding 10px 12px -> 8px 12px; ws-org margin-top 2px -> 1px.
~58px per row -> ~46px. Less density-heavy at the platform_admin scale
(37 rows visible).
Menu padding: 4px 0 added on the panel so the first/last rows don't
sit flush against the panel border (fixes the 'first row clipped'
visual the tighter rows would otherwise still show).
Max-height: 320px -> 360px. Modest bump now that rows are shorter -
shows ~7 rows at once vs ~5 before.
.ws-org gains white-space:nowrap + overflow:hidden + text-overflow:ellipsis
so the org+count line truncates instead of wrapping. The 360px max-width
sets the truncation threshold.
/me's accessible_workspaces query gains a device_count field via a
correlated subquery on workspaces.id - WHERE workspace_id = w.id
strictly excludes the unclaimed pair-pool (workspace_id IS NULL fails
equality). Added to both query branches (platform_admin LEFT JOIN and
regular INNER JOIN); microseconds per row at current scale (~37 rows
worst case), not optimizing.
Frontend appends the count to the muted org-name line with a middle-dot
separator: 'Acme Studios . 2 devices'. Singular/plural respected via the
existing tn() helper convention; 'No devices' for empty workspaces. New
formatResourceCount(n, keyBase, zeroKey) helper is generic so the same
shape can wire users/playlists/schedules counts later without refactor.
New i18n keys: switcher.devices_count_one, switcher.devices_count_other,
switcher.no_devices. Added to en.js only; other locales fall back to en
via the existing lookup chain (verified in i18n.js:19).
API smoke verified: switcher-test sees Studio A=2, Field Crew=2;
dw5304 (platform_admin) sees all 37 workspaces with their device counts
varying 0-4; single-workspace zero-device user (geoff.case) sees 0.
Teams in its pre-Workspaces form is being paused while the feature is
redesigned as a user-grouping primitive within the new Workspaces
architecture. The original Teams data model had no workspace-awareness
and was effectively non-functional after Phase 2.2 (every route migrated
away from team_id), but the UI remained reachable and allowed users to
accumulate orphan data while believing they were configuring access
control.
Hide the Teams sidebar nav entry to prevent new entries to the UI.
/api/teams now returns 503 Service Unavailable with a 'feature
redesign in progress' message. Existing teams/team_members/team_invites
table data is preserved indefinitely for forward migration to the
future teams design.
Bonus: requireAuth middleware fires before the catch-all so unauthenticated
callers see the standard 401 instead of the 503 redesign message - avoids
exposing the 'feature being redesigned' signal to unauthenticated probes
or fingerprint scanners.
The previous comment claimed defParamCharset:'utf8' fixed multipart
filename header decoding. It doesn't - that option only fires for the
RFC 5987 encoded filename*=utf-8''... form, which clients rarely send.
The actual UTF-8 recovery happens in the storage.filename callback
(added in d679ca8) via Buffer.from(name,'latin1').toString('utf8').
The option is kept set for the rare RFC 5987 case but the comment no
longer overclaims what it does.
busboy reads the Content-Disposition filename="..." header value as
latin1 by default - even with defParamCharset:'utf8' set, that option
only applies to RFC 5987 encoded filename*=... params, which most
clients (browsers, curl, programmatic HTTP) don't send. Modern clients
send raw UTF-8 bytes for non-ASCII filenames; busboy interprets those
bytes one-byte-per-char as latin1, producing a JS string like 'A-tilde
+ quarter-mark' for 'u-umlaut'. JS then re-encodes that string as UTF-8
on the way to SQLite, yielding 4 bytes (c3 83 c2 bc) for what should be
2 bytes (c3 bc). Classic double-encoding mojibake - shows up in the UI
as 'BegrA-tilde...' instead of 'Begru-umlaut...'.
Fix: in the multer filename callback, re-decode file.originalname from
latin1 to utf8 to recover the original byte sequence. Mutating
originalname here propagates to every route handler reading
req.file.originalname (POST /, PUT /:id/replace, and any future upload
route using the same middleware).
This is the actual visible-mojibake bug semetra22 reported. The prior
commit b677752 (NFC normalize in safeFilename) handles a separate but
related case (macOS NFD clients sending decomposed forms); both fixes
compose correctly - latin1->utf8 first restores the byte sequence,
then NFC normalize collapses NFD into composed form.
Smoke verified by sending raw UTF-8 multipart from a Node https client
(no shell escaping). NFC input 'Begru-umlaut-essungsscreens.jpg' with
bytes c3bc c39f arrives clean (was c383c2bc c383c29f before). NFD input
'u + combining diaeresis' arrives as composed NFC c3bc after both fixes.
Single line change to safeFilename() in routes/content.js: add
.normalize('NFC') before sanitizeString. Covers all 4 user-facing
filename storage sites (POST /, POST /remote, POST /embed, PUT /:id
rename) since they all flow through safeFilename.
Fixes macOS NFD vs Linux NFC mismatch on filename storage that mangled
umlauts (ae/oe/ue/ss) in displayed filenames. macOS clients send
NFD-decomposed names (e.g. 'u' + combining diaeresis U+0308 instead of
the precomposed U+00FC); Linux + most renderers expect NFC. Without
this, names like 'Begruessungsscreens.jpg' arrive with the combining
char floating and display as mojibake.
Reported by semetra22 in Discord with extraordinarily good debugging
narrowing (rename works, upload doesn't = bug is in upload path).
Single-point fix at the convergence of all user-facing filename flows.
Existing NFD-mangled rows in DB not backfilled; users can re-upload or
rename to repair. Optional one-time UPDATE backfill captured as follow-up
in handoff doc.
Smoke verified by invoking safeFilename directly on NFD + NFC inputs of
'Begruessungsscreens.jpg' - both produce identical NFC-normalized bytes
(42656772c3bcc39f756e677373637265656e732e6a7067).
2 line substitutions in frontend/js/views/playlists.js: switches
/uploads/thumbnails/{filename} -> /api/content/{id}/thumbnail at both
the playlist editor render (line 293) and the Add-to-playlist content
picker (line 543). Brings playlist view inline with widgets.js,
content-library.js, and device-detail.js which already use the API
path.
Side benefit: thumbnails now go through the workspace-aware permission
check in content.js's /api/content/:id/thumbnail handler (checkContentRead)
instead of unauthenticated static file serve at /uploads/thumbnails/.
Reported by semetra22 in Discord ('All images retrieved via the API
display correctly, but in the playlists, the images are fetched
directly from /uploads/thumbnails/filename and do not display properly').
Fix: at connect, enumerate the user's accessible workspace_ids (direct workspace_members + org_owner/admin paths + platform_admin 'all') via new accessibleWorkspaceIds() helper in lib/tenancy.js; socket.join one room per workspace. All 12 dashboardNs.emit sites across deviceSocket / heartbeat / server.js / devices route / video-walls route now route via dashboardNs.to(workspaceRoom(...)).emit() with the workspace looked up from the relevant device or wall. New lib/socket-rooms.js holds the helpers and breaks a circular dependency (dashboardSocket already requires heartbeat, so heartbeat can't require dashboardSocket).
Inbound 6 commands rewired to canActOnDevice(socket, deviceId, tier): request-screenshot is read tier (workspace_viewer+); remote-touch/key/start/stop and device-command are write tier (workspace_editor+). Platform_admin and org_owner/admin always pass via actingAs. Legacy admin/superadmin branch dropped.
Lifecycle note: workspace-switch already calls window.location.reload (Phase 3 switcher), which forces a fresh socket with updated memberships - no per-emit re-evaluation needed.
Smoke tested with 3 simultaneous socket.io-client connections (switcher-test, swninja, dw5304 platform_admin) + direct canActOnDevice invocation for 6 user/device/tier combinations. All 9 outbound isolation cells and all 6 permission gates pass. Fixture mutation: switcher-test's Field Crew membership flipped from workspace_editor to workspace_viewer to exercise the read/write tier split in one login.
KNOWN REGRESSION (Phase 3 fix): platform_admin / superadmin no longer has cross-workspace 'see everything' view. Every route migrated tonight (2.2a-2.2m) deliberately removed the role-based bypass per design doc - cross-workspace visibility will come via dedicated admin endpoints in Phase 3, not magic role bypasses. Until Phase 3 ships, platform admins must switch-workspace to see other workspaces' data.