mirror of
https://github.com/screentinker/screentinker.git
synced 2026-06-29 09:23:16 -06:00
Compare commits
22 commits
v1.9.1-bet
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| | d9fb914b9e | | |
| | ce78d0dde4 | | |
| | f206537fed | | |
| | 139d7d09fa | | |
| | 852219cb45 | | |
| | 15448d1c5d | | |
| | 29a8896aa8 | | |
| | 101f086204 | | |
| | ed3cf72b82 | | |
| | d90cfb3986 | | |
| | f96b65576f | | |
| | ed164647b8 | | |
| | ae018b8eea | | |
| | 071d7cc9c3 | | |
| | 1e1ed7e29a | | |
| | 36c4bf523f | | |
| | 16c381254b | | |
| | 01e5b10f53 | | |
| | 9c990ff91f | | |
| | a6fe849c67 | | |
| | 0c0a8dd68a | | |
| | aa23cf02dd | | |
31
CHANGELOG.md
@@ -1,5 +1,36 @@
 # Changelog
 
+## 1.9.2-beta1 — unreleased
+
+### Fixed — server resilience (#142)
+- **A single flapping device can no longer saturate the event loop.** A new
+  load-aware, per-device reconnect throttle (`lib/reconnect-throttle.js`) gates
+  genuine reconnects *before* the heavy register work (DB writes + playlist build).
+  The verdict is per-device; global event-loop lag only multiplies an
+  already-flagged device's backoff and never throttles a healthy one. Hard ceiling
+  + cold-start warm-up so a full-fleet reconnect after a deploy is never throttled.
+- **`device_status_log` growth is bounded.** Added
+  `idx_device_status_log_device_ts`, a global retention sweep (`pruneStatusLog`,
+  `STATUS_LOG_RETENTION_DAYS` default 3) covering removed/idle devices and the
+  `offline_timeout` path, and de-duplicated the table's `CREATE TABLE`.
+- **`content-ack` spam de-duplicated.** Repeated identical
+  `(device_id, content_id, status)` reports are suppressed within
+  `CONTENT_ACK_DEDUP_MS` (default 10s).
+- **Provisioning cleanup window corrected.** Unclaimed provisioning devices are now
+  swept after 24h (the code used `365 * 86400` — a year — contradicting its own
+  comment).
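
The `content-ack` de-duplication described in the list above can be pictured roughly as follows. This is a minimal sketch, not the project's implementation: `createAckDeduper`, `shouldProcess`, and the in-memory `Map` are hypothetical names chosen for illustration. Only the `(device_id, content_id, status)` key and the 10 s default of `CONTENT_ACK_DEDUP_MS` come from the changelog.

```javascript
// Hypothetical sketch: suppress repeated identical content-ack reports.
// Only the key fields and the 10 s default are taken from the changelog.
const CONTENT_ACK_DEDUP_MS = 10_000;

function createAckDeduper(windowMs = CONTENT_ACK_DEDUP_MS) {
  // key -> timestamp (ms) of the last report that was let through
  const lastAccepted = new Map();

  return function shouldProcess(deviceId, contentId, status, now = Date.now()) {
    const key = JSON.stringify([deviceId, contentId, status]);
    const previous = lastAccepted.get(key);
    if (previous !== undefined && now - previous < windowMs) {
      return false; // identical report inside the window: drop it
    }
    lastAccepted.set(key, now);
    return true;
  };
}
```

A changed `status` yields a different key, so a genuine state change passes immediately. The sketch never evicts old keys; a long-running server would presumably need to prune the map.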
+
+### Added — observability (#142)
+- **Event-loop lag telemetry** via `perf_hooks.monitorEventLoopDelay()`. Sampled to
+  a bounded `event_loop_lag` table (indexed + pruned, `LAG_TELEMETRY_RETENTION_DAYS`)
+  and surfaced on `/api/status` as `loop_lag` (mean/p50/p99/max + band).
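
For readers unfamiliar with `perf_hooks.monitorEventLoopDelay()`, the sampling step might look something like the sketch below. It is an illustration under stated assumptions: `sampleLoopLag`, `bandFor`, the band names, and the millisecond thresholds are all hypothetical, since the changelog names the fields (mean/p50/p99/max + band) but not how the band is derived. The histogram API itself is Node's, and it reports nanoseconds.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Node's event-loop delay histogram; values are reported in nanoseconds.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

// Hypothetical band thresholds (ms); the real cut-offs are not in the changelog.
function bandFor(p99Ms) {
  if (p99Ms < 50) return 'ok';
  if (p99Ms < 200) return 'elevated';
  return 'critical';
}

// Take one sample in milliseconds, then reset so the next sample covers a fresh window.
function sampleLoopLag() {
  const toMs = (ns) => ns / 1e6;
  const p99 = toMs(histogram.percentile(99));
  const sample = {
    mean: toMs(histogram.mean),
    p50: toMs(histogram.percentile(50)),
    p99,
    max: toMs(histogram.max),
    band: bandFor(p99),
  };
  histogram.reset();
  return sample;
}
```

Calling `sampleLoopLag()` on a timer and writing each result to a pruned table would match the shape the entry describes. Note that `mean` is `NaN` until the histogram has recorded at least one value.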
+
+### Maintenance
+- Operators whose `device_status_log` is already bloated from a pre-1.9.2 deployment
+  should reclaim disk with a **one-time manual `VACUUM`** in a maintenance window;
+  retention now bounds further growth. Auto-VACUUM is intentionally not enabled.
+  See [`docs/maintenance-device-status-log.md`](docs/maintenance-device-status-log.md).
+
 ## 1.9.1-beta3 — unreleased
 
 ### Fixed — Tizen player
@@ -426,6 +426,7 @@ keytool -genkey -v -keystore android/release-key.jks -keyalg RSA -keysize 2048 -
 3. Install the ScreenTinker app on your device:
    - **Android TV / tablets**: Download the APK from your instance (`/download/apk`) or build it from source (see above)
    - **Raspberry Pi**: `curl -sSL https://your-instance/scripts/raspberry-pi-setup.sh | bash`
+   - **Debian 13 (headless)**: `curl -sSL https://your-instance/scripts/debian-13-setup.sh | sudo bash`
    - **Windows**: Run the setup script from `scripts/windows-setup.bat`
    - **Samsung Tizen TV / signage**: point the TV's URL Launcher (or browser) at `https://your-instance/player` - no signing needed. For an installed native app, see [tizen/README.md](tizen/README.md)
    - **Any browser**: Open `https://your-instance/player` in kiosk/fullscreen mode
25
SECURITY.md
@@ -95,3 +95,28 @@ by name in release notes and (when applicable) in the GitHub advisory
 itself. Let me know in your report whether you'd like credit and how
 you'd like to be named. Anonymous reports are also welcome — no credit
 is required.
+
+## Uploaded content access model
+
+Uploaded content (images, videos) served under /uploads/content is
+**public by unguessable URL**, not access-controlled:
+
+- Filenames are UUIDv4 (122 bits of randomness), so URLs are not enumerable
+  or guessable.
+- There is no per-request authentication on content bytes, and CORS is open
+  (Access-Control-Allow-Origin: *) because the web player's canvas-based
+  screenshot capture requires cross-origin access.
+- Anyone who obtains a content URL can read that file, cross-tenant, with no
+  expiry (immutable 30-day cache) and no revocation short of deleting the file.
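
To make the "unguessable URL" model in the list above concrete, here is a small sketch of how such a path could be generated. It is an assumption-laden illustration, not the server's code: `contentUrlFor` is a hypothetical helper, and keeping the original file extension is a guess. `randomUUID()` is Node's built-in version 4 UUID generator; of the 128 bits in a UUID, 4 encode the version and 2 the variant, which leaves the 122 random bits mentioned above.

```javascript
const { randomUUID } = require('node:crypto');
const path = require('node:path');

// Hypothetical helper: build the public URL path for a newly uploaded file.
// The stored name is a random UUIDv4; the uploader's filename is discarded
// except for its extension (keeping the extension is an assumption).
function contentUrlFor(originalName) {
  const ext = path.extname(originalName).toLowerCase();
  return `/uploads/content/${randomUUID()}${ext}`;
}
```

Because the path is the only secret, anything that records it (logs, browser history, a shared link) effectively grants read access, which is the trade-off this section spells out.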
+
+This is an intentional design choice for digital signage, where content is
+destined for public display. It is **security-through-unguessability, not
+access control.**
+
+**Do not upload content you require to remain confidential** - including
+material that is destined for a screen but not yet public (e.g. a scheduled
+promotion before its reveal, or an internal board containing names or other
+sensitive details). Such content is world-readable from the moment of upload.
+If pre-launch or tenant-private confidentiality is a requirement for your
+deployment, open an issue - signed/expiring URLs are tracked but not yet
+implemented.
@@ -9,10 +9,10 @@ android {
 
     defaultConfig {
         applicationId = "com.remotedisplay.player"
-        minSdk = 26
+        minSdk = 24
         targetSdk = 34
-        versionCode = 26
-        versionName = "1.9.1-beta6"
+        versionCode = 31
+        versionName = "1.9.2-beta1"
     }
 
     signingConfigs {
@@ -240,6 +240,12 @@ class MainActivity : AppCompatActivity() {
 
         // Start auto-update checker
         updateChecker = UpdateChecker(this)
+        // #139: surface OTA status (applying / backing off / manual-update-required) to the
+        // dashboard. wsService is read lazily — it binds after this runs.
+        updateChecker.otaLogReporter = { level, msg -> wsService?.sendLog("ota", level, msg) }
+        // #139 Phase 2 (Option B): announce OTA status transitions (clear / enter-backoff) so the
+        // dashboard badge clears/lights up promptly without waiting for a reconnect.
+        updateChecker.otaStatusReporter = { wsService?.sendOtaStatus() }
         updateChecker.startPeriodicCheck()
 
     }
@@ -71,4 +71,37 @@ class ServerConfig(context: Context) {
     fun clearPlaylistCache() {
         prefs.edit().remove("cached_playlist").apply()
     }
+
+    // #139 OTA attempt state. Persisted (not in-memory) on purpose: the OTA loop is driven
+    // by Fire OS restarting the app, which re-fires the update check; an in-memory counter
+    // would reset on every restart and never back off. `otaTargetVersion` is the version we
+    // are currently trying to install; `otaAttempts` counts install attempts for it;
+    // `otaLastAttemptAt` gates the post-cap retry backoff.
+    var otaTargetVersion: String
+        get() = prefs.getString("ota_target_version", "") ?: ""
+        set(value) = prefs.edit().putString("ota_target_version", value).apply()
+
+    var otaAttempts: Int
+        get() = prefs.getInt("ota_attempts", 0)
+        set(value) = prefs.edit().putInt("ota_attempts", value).apply()
+
+    var otaLastAttemptAt: Long
+        get() = prefs.getLong("ota_last_attempt_at", 0L)
+        set(value) = prefs.edit().putLong("ota_last_attempt_at", value).apply()
+
+    // #139: true once the "entering backoff" status has been reported for the current target,
+    // so the dashboard line fires on the transition only — not on every backed-off poll (Fire OS
+    // restarts re-fire the check constantly). Reset on a new target / on clear.
+    var otaBackoffReported: Boolean
+        get() = prefs.getBoolean("ota_backoff_reported", false)
+        set(value) = prefs.edit().putBoolean("ota_backoff_reported", value).apply()
+
+    fun clearOtaState() {
+        prefs.edit()
+            .remove("ota_target_version")
+            .remove("ota_attempts")
+            .remove("ota_last_attempt_at")
+            .remove("ota_backoff_reported")
+            .apply()
+    }
 }
@@ -0,0 +1,74 @@
+package com.remotedisplay.player.service
+
+/**
+ * #139: pure OTA throttle decision logic — no Android dependencies, so it's unit-testable
+ * (see OtaThrottleTest). UpdateChecker is the imperative shell: it reads/writes the persisted
+ * fields (ServerConfig / EncryptedSharedPreferences) and performs the actual download + install;
+ * this object owns the stateful RULES so they have coverage beyond a compile:
+ *
+ *  - a new target version resets the attempt budget,
+ *  - a check NEVER consumes the budget — only a launched install does (so a transient
+ *    download/network failure can't park a healthy device in backoff),
+ *  - after MAX_INSTALL_ATTEMPTS failed installs, back off to one retry per BACKOFF_MS,
+ *  - the "entering backoff" signal fires on the crossing only (report-on-transition).
+ */
+object OtaThrottle {
+    const val MAX_INSTALL_ATTEMPTS = 3
+    const val BACKOFF_MS = 24L * 60 * 60 * 1000
+
+    /** Persisted OTA state for the version we are currently trying to install. */
+    data class State(
+        val targetVersion: String = "",
+        val attempts: Int = 0,
+        val lastAttemptAt: Long = 0L,
+        val backoffReported: Boolean = false
+    )
+
+    enum class Action { ATTEMPT, BACKOFF }
+
+    /** True when [latestVersion] differs from the persisted target — caller drops stale APKs. */
+    fun isNewTarget(state: State, latestVersion: String): Boolean = state.targetVersion != latestVersion
+
+    /**
+     * A check found [latestVersion] available. Returns the state to persist (reset on a new
+     * target) and whether to attempt now. Does NOT count an attempt: the budget is consumed
+     * only once an install is actually launched (see [onInstallLaunched]).
+     */
+    fun onUpdateAvailable(state: State, latestVersion: String, now: Long): Pair<State, Action> {
+        val s = if (isNewTarget(state, latestVersion)) State(targetVersion = latestVersion) else state
+        if (s.attempts >= MAX_INSTALL_ATTEMPTS && now - s.lastAttemptAt < BACKOFF_MS) {
+            return s to Action.BACKOFF
+        }
+        return s to Action.ATTEMPT
+    }
+
+    /**
+     * An install was actually launched (a verified APK was in hand). Consumes one attempt and
+     * returns the new state plus whether this attempt is the FIRST to cross the cap into backoff
+     * (true => caller reports "manual update required" once; false on all later polls).
+     */
+    fun onInstallLaunched(state: State, now: Long): Pair<State, Boolean> {
+        val attempts = state.attempts + 1
+        var s = state.copy(attempts = attempts, lastAttemptAt = now)
+        val enteredBackoff = attempts >= MAX_INSTALL_ATTEMPTS && !s.backoffReported
+        if (enteredBackoff) s = s.copy(backoffReported = true)
+        return s to enteredBackoff
+    }
+
+    /** A check found us already on the latest. True if there was pending OTA state to clear. */
+    fun shouldClearOnUpToDate(state: State): Boolean = state.targetVersion.isNotEmpty()
+
+    /**
+     * #139 Phase 2: operator-facing status for the dashboard.
+     *  - "none"                   : no update pending.
+     *  - "manual_update_required" : capped AND still inside the backoff window — this device
+     *                               can't self-install; a human needs to update it.
+     *  - "pending"                : an update is in progress / will retry (under the cap, or the
+     *                               window has elapsed so a retry is due).
+     */
+    fun statusFor(state: State, now: Long): String = when {
+        state.targetVersion.isEmpty() -> "none"
+        state.attempts >= MAX_INSTALL_ATTEMPTS && now - state.lastAttemptAt < BACKOFF_MS -> "manual_update_required"
+        else -> "pending"
+    }
+}
@@ -39,6 +39,25 @@ class UpdateChecker(private val context: Context) {
 
     private var installReceiverRegistered = false
 
+    // #139: report OTA status to the dashboard (device:log, tag "ota"). Wired by MainActivity
+    // to WebSocketService.sendLog; null until then. Read lazily so binding order doesn't matter.
+    // The throttle thresholds + decision rules live in OtaThrottle (pure, unit-tested); this
+    // class is the imperative shell that persists state and does the download/install.
+    var otaLogReporter: ((level: String, message: String) -> Unit)? = null
+
+    private fun report(level: String, message: String) {
+        when (level) { "error" -> Log.e(TAG, message); "warn" -> Log.w(TAG, message); else -> Log.i(TAG, message) }
+        try { otaLogReporter?.invoke(level, message) } catch (_: Throwable) {}
+    }
+
+    // #139 Phase 2 (Option B): announce an OTA status TRANSITION to the server (wired by
+    // MainActivity to WebSocketService.sendOtaStatus, which reads the just-persisted state).
+    // Fired ONLY at the two transitions — clear and enter-backoff — so the dashboard badge
+    // updates promptly without waiting for a reconnect, with no per-poll/heartbeat chatter.
+    // Lazy/null-safe so binding order doesn't matter, same as otaLogReporter.
+    var otaStatusReporter: (() -> Unit)? = null
+    private fun announceOtaStatus() { try { otaStatusReporter?.invoke() } catch (_: Throwable) {} }
+
     // The PackageInstaller session reports its status (incl. STATUS_PENDING_USER_ACTION,
     // which Android 13+ returns for non-device-owner installers) via this broadcast.
     // Without handling it the committed session just stalls and the update never
@@ -59,6 +78,8 @@ class UpdateChecker(private val context: Context) {
                         catch (e: Exception) { Log.e(TAG, "Confirm launch failed: ${e.message}") }
                     }
                 }
+                // Logcat only — NOT report(): these fire per attempt, and #139 keeps the
+                // device:log/dashboard channel to state transitions (enter-backoff, clear).
                 android.content.pm.PackageInstaller.STATUS_SUCCESS -> Log.i(TAG, "Update installed successfully")
                 else -> Log.w(TAG, "Install status: ${intent.getStringExtra(android.content.pm.PackageInstaller.EXTRA_STATUS_MESSAGE)}")
             }
@@ -116,9 +137,17 @@ class UpdateChecker(private val context: Context) {
 
                 Log.i(TAG, "Current: $currentVersion, Latest: $latestVersion, Update: $updateAvailable")
 
-                if (updateAvailable && downloadUrl.isNotEmpty()) {
-                    Log.i(TAG, "Update available! Downloading...")
-                    downloadAndInstall("${config.serverUrl}$downloadUrl", latestVersion)
+                if (!updateAvailable) {
+                    // #139: on the latest version now. If OTA state was pending, the install
+                    // landed (the app relaunched as the new version) — clear state + caches once.
+                    if (OtaThrottle.shouldClearOnUpToDate(otaState())) {
+                        report("info", "OTA complete: now on $currentVersion — clearing update state")
+                        config.clearOtaState()
+                        cleanupApks(null)
+                        announceOtaStatus() // transition -> emits 'none' so the badge clears promptly
+                    }
+                } else if (downloadUrl.isNotEmpty()) {
+                    maybeUpdate(latestVersion, "${config.serverUrl}$downloadUrl")
                 }
             } catch (e: Exception) {
                 Log.e(TAG, "Update check error: ${e.message}")
@@ -126,20 +155,89 @@ class UpdateChecker(private val context: Context) {
         }.start()
     }
 
-    private fun downloadAndInstall(url: String, version: String) {
+    private fun otaState() = OtaThrottle.State(
+        config.otaTargetVersion, config.otaAttempts, config.otaLastAttemptAt, config.otaBackoffReported)
+
+    private fun persistOta(s: OtaThrottle.State) {
+        config.otaTargetVersion = s.targetVersion
+        config.otaAttempts = s.attempts
+        config.otaLastAttemptAt = s.lastAttemptAt
+        config.otaBackoffReported = s.backoffReported
+    }
+
+    // #139 imperative shell over OtaThrottle (the pure, unit-tested decision logic). A device
+    // that can't silently install (Fire TV: no device-owner) stops re-pulling the full APK every
+    // cycle. Only a COMMITTED install consumes the attempt budget — a transient download/verify
+    // failure on a HEALTHY device must never park it in backoff.
+    private fun maybeUpdate(latestVersion: String, downloadUrl: String) {
+        val now = System.currentTimeMillis()
+        val cur = otaState()
+        if (OtaThrottle.isNewTarget(cur, latestVersion)) cleanupApks(latestVersion)
+
+        val (afterCheck, action) = OtaThrottle.onUpdateAvailable(cur, latestVersion, now)
+        persistOta(afterCheck)
+        // Capped + still inside the window: do nothing AND stay silent. Fire OS restarts re-fire
+        // this check constantly; reporting here would just move the flood onto the WS channel.
+        // The enter-backoff line was already sent once on the crossing (below).
+        if (action == OtaThrottle.Action.BACKOFF) return
+
+        // download/verify failure → retry on the normal cadence; do NOT count it as an attempt.
+        if (!downloadAndInstall(downloadUrl, latestVersion)) {
+            Log.w(TAG, "Update $latestVersion: download/verify failed — retry next check (no attempt consumed)")
+            return
+        }
+
+        val (afterLaunch, enteredBackoff) = OtaThrottle.onInstallLaunched(afterCheck, now)
+        persistOta(afterLaunch)
+        Log.i(TAG, "Install launched for $latestVersion (attempt ${afterLaunch.attempts}/${OtaThrottle.MAX_INSTALL_ATTEMPTS})")
+        if (enteredBackoff) {
+            report("warn", "Update $latestVersion available but not installing after ${afterLaunch.attempts} attempts — manual update required (backing off to one retry per ${OtaThrottle.BACKOFF_MS / 3_600_000L}h)")
+            announceOtaStatus() // transition -> emits 'manual_update_required'
+        }
+    }
+
+    // #139: remove cached OTA APKs other than `keep` (null = remove all). Keeps the external
+    // files dir from accumulating one stale APK per superseded version.
+    private fun cleanupApks(keep: String?) {
         try {
+            val dir = context.getExternalFilesDir(Environment.DIRECTORY_DOWNLOADS) ?: return
+            val keepName = keep?.let { "ScreenTinker-$it.apk" }
+            dir.listFiles { f ->
+                f.name.startsWith("ScreenTinker-") && f.name.endsWith(".apk") && f.name != keepName
+            }?.forEach { it.delete() }
+        } catch (e: Exception) {
+            Log.w(TAG, "APK cleanup failed: ${e.message}")
+        }
+    }
+
+    // Returns TRUE only when a verified APK is in hand and an install has been launched (the
+    // caller may then count an attempt); FALSE on any download/verify failure — the caller must
+    // NOT count those, so a transient network problem can't burn a healthy device's budget. #139
+    private fun downloadAndInstall(url: String, version: String): Boolean {
+        try {
+            val apkFile = File(context.getExternalFilesDir(Environment.DIRECTORY_DOWNLOADS),
+                "ScreenTinker-$version.apk")
+
+            // #139: reuse a previously-downloaded, verified APK for this version instead of
+            // re-pulling ~8.7 MB every cycle. The file also stays on disk as the artifact for a
+            // manual install when silent install isn't possible.
+            if (apkFile.exists() && verifyApkSignature(apkFile)) {
+                Log.i(TAG, "Reusing cached verified APK: ${apkFile.absolutePath} (${apkFile.length()} bytes)")
+                handler.post { installApk(apkFile) }
+                return true
+            }
+            // A leftover but invalid file (partial/corrupt/tampered) must never be reused.
+            if (apkFile.exists()) apkFile.delete()
+
             // Download to a temp file
             val request = Request.Builder().url(url).build()
             val response = client.newCall(request).execute()
 
             if (!response.isSuccessful) {
                 Log.e(TAG, "Download failed: ${response.code}")
-                return
+                return false
             }
 
-            val apkFile = File(context.getExternalFilesDir(Environment.DIRECTORY_DOWNLOADS),
-                "ScreenTinker-$version.apk")
-
             response.body?.byteStream()?.use { input ->
                 apkFile.outputStream().use { output ->
                     input.copyTo(output)
@@ -158,7 +256,7 @@ class UpdateChecker(private val context: Context) {
             if (!verifyApkSignature(apkFile)) {
                 Log.e(TAG, "Refusing update: APK signature/package verification failed (tampered or MITM'd APK)")
                 apkFile.delete()
-                return
+                return false
             }
             Log.i(TAG, "APK signature verified against installed app - proceeding to install")
 
@@ -166,8 +264,10 @@ class UpdateChecker(private val context: Context) {
             handler.post {
                 installApk(apkFile)
             }
+            return true
         } catch (e: Exception) {
             Log.e(TAG, "Download/install error: ${e.message}")
+            return false
         }
     }
 
@@ -245,9 +345,18 @@ class UpdateChecker(private val context: Context) {
     private fun verifyApkSignature(apkFile: File): Boolean {
         return try {
             val pm = context.packageManager
-            val flags = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.P)
+            // #139: getPackageArchiveInfo(GET_SIGNING_CERTIFICATES).signingInfo is NULL for
+            // ARCHIVE files on API 28/29 (it's only populated from API 30) — so the modern flag
+            // reads 0 certs from a downloaded APK and we'd wrongly REFUSE a legitimate update,
+            // which is the real Fire OS 8 / Android 9 OTA-loop cause. Below API 30, read the
+            // archive's signer via the legacy GET_SIGNATURES + .signatures (its v1/JAR cert,
+            // which IS populated on 28/29). This reads the cert CORRECTLY — it does not weaken
+            // verification: the archive's signer is still extracted and compared to the installed
+            // app's signer below, and a mismatch / zero-cert APK is still rejected.
+            val archiveUsesSigningInfo = Build.VERSION.SDK_INT >= Build.VERSION_CODES.R // API 30
+            val archiveFlags = if (archiveUsesSigningInfo)
                 PackageManager.GET_SIGNING_CERTIFICATES else @Suppress("DEPRECATION") PackageManager.GET_SIGNATURES
-            val downloaded = pm.getPackageArchiveInfo(apkFile.absolutePath, flags)
+            val downloaded = pm.getPackageArchiveInfo(apkFile.absolutePath, archiveFlags)
             if (downloaded == null) {
                 Log.e(TAG, "Could not parse downloaded APK")
                 return false
@@ -256,14 +365,20 @@ class UpdateChecker(private val context: Context) {
                 Log.e(TAG, "APK package mismatch: ${downloaded.packageName} != ${context.packageName}")
                 return false
             }
-            val installed = pm.getPackageInfo(context.packageName, flags)
-            val downloadedSigs = signingCertHashes(downloaded)
-            val installedSigs = signingCertHashes(installed)
+            // INSTALLED-app read: signingInfo IS populated for installed packages on API 28+,
+            // so keep the modern flag there (this side already worked).
+            val installedUsesSigningInfo = Build.VERSION.SDK_INT >= Build.VERSION_CODES.P // API 28
+            val installedFlags = if (installedUsesSigningInfo)
+                PackageManager.GET_SIGNING_CERTIFICATES else @Suppress("DEPRECATION") PackageManager.GET_SIGNATURES
+            val installed = pm.getPackageInfo(context.packageName, installedFlags)
+            val downloadedSigs = signingCertHashes(downloaded, archiveUsesSigningInfo)
+            val installedSigs = signingCertHashes(installed, installedUsesSigningInfo)
             if (downloadedSigs.isEmpty() || installedSigs.isEmpty()) {
                 Log.e(TAG, "Missing signing certificates (downloaded=${downloadedSigs.size}, installed=${installedSigs.size})")
                 return false
             }
-            // Share at least one current signing certificate.
+            // Require a non-empty overlap of signer certs (handles multi-signer / cert-rotation
+            // the same way the API>=30 path does: compare the full current signer sets).
             val match = downloadedSigs.any { it in installedSigs }
             if (!match) Log.e(TAG, "APK signing certificate does not match installed app")
             match
@@ -273,8 +388,13 @@ class UpdateChecker(private val context: Context) {
         }
     }
 
-    private fun signingCertHashes(info: PackageInfo): Set<String> {
-        val sigs: Array<Signature>? = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.P) {
+    // Read the signer-cert SHA-256 set from a PackageInfo. `useSigningInfo` must match the flag
+    // it was fetched with: GET_SIGNING_CERTIFICATES -> signingInfo.apkContentsSigners (modern;
+    // multi-signer + rotation aware), GET_SIGNATURES -> legacy .signatures (the only field
+    // populated for ARCHIVE reads on API 28/29). Both yield the same cert for a normally-signed
+    // APK; the caller compares as sets so an overlapping signer still verifies.
+    private fun signingCertHashes(info: PackageInfo, useSigningInfo: Boolean): Set<String> {
+        val sigs: Array<Signature>? = if (useSigningInfo) {
             info.signingInfo?.apkContentsSigners
         } else {
             @Suppress("DEPRECATION") info.signatures
@@ -560,6 +560,22 @@ class WebSocketService : Service() {
         } catch (e: Throwable) { Log.w("WebSocketService", "sendLog: ${e.message}") }
     }
 
+    // #139 Phase 2 (Option B): announce an OTA status transition to the server so the dashboard
+    // badge updates promptly (not only on reconnect). Reads the just-persisted throttle state —
+    // the emit always reflects the stored truth. Called by UpdateChecker at clear / enter-backoff.
+    fun sendOtaStatus() {
+        if (socket?.connected() != true) return
+        try {
+            val s = OtaThrottle.State(config.otaTargetVersion, config.otaAttempts, config.otaLastAttemptAt, config.otaBackoffReported)
+            socket?.emit("device:ota-status", JSONObject().apply {
+                put("device_id", config.deviceId)
+                put("ota_status", OtaThrottle.statusFor(s, System.currentTimeMillis()))
+                put("ota_target_version", config.otaTargetVersion)
+                put("ota_attempts", config.otaAttempts)
+            })
+        } catch (e: Throwable) { Log.w("WebSocketService", "sendOtaStatus: ${e.message}") }
+    }
+
     fun sendPlaybackState(contentId: String, positionSec: Float) {
         if (socket?.connected() != true) return
         try {
@@ -13,6 +13,8 @@ import android.os.SystemClock
 import android.provider.Settings
 import android.util.DisplayMetrics
 import android.view.WindowManager
+import com.remotedisplay.player.data.ServerConfig
+import com.remotedisplay.player.service.OtaThrottle
 import java.security.MessageDigest
 import org.json.JSONObject
 
@ -49,6 +51,13 @@ class DeviceInfo(private val context: Context) {
            put("screen_height", outH)
            put("render_width", renW)
            put("render_height", renH)
            // #139 Phase 2: report OTA backoff state (alongside app_version) so the dashboard can
            // flag screens stuck in manual-update-required. Read from the persisted throttle state.
            val cfg = ServerConfig(context)
            val ota = OtaThrottle.State(cfg.otaTargetVersion, cfg.otaAttempts, cfg.otaLastAttemptAt, cfg.otaBackoffReported)
            put("ota_status", OtaThrottle.statusFor(ota, System.currentTimeMillis()))
            put("ota_target_version", cfg.otaTargetVersion)
            put("ota_attempts", cfg.otaAttempts)
        }
    }
@ -0,0 +1,97 @@
package com.remotedisplay.player.service

import org.junit.Assert.assertEquals
import org.junit.Assert.assertFalse
import org.junit.Assert.assertTrue
import org.junit.Test

/**
 * #139: coverage for the OTA throttle state machine (the stateful core that the OTA
 * re-download-loop fix depends on), independent of Android. UpdateChecker is just the shell.
 */
class OtaThrottleTest {

    private val V = "1.9.1-beta6"
    private val MAX = OtaThrottle.MAX_INSTALL_ATTEMPTS
    private val WINDOW = OtaThrottle.BACKOFF_MS

    // Launch `n` installs from `start`, returning the resulting state.
    private fun launch(start: OtaThrottle.State, n: Int, now: Long = 1000L): OtaThrottle.State {
        var s = start
        repeat(n) { s = OtaThrottle.onInstallLaunched(s, now + it).first }
        return s
    }

    @Test fun newTargetResetsBudget() {
        val stale = OtaThrottle.State(targetVersion = "1.9.1-beta5", attempts = 2, lastAttemptAt = 1000, backoffReported = true)
        assertTrue(OtaThrottle.isNewTarget(stale, V))
        val (s, action) = OtaThrottle.onUpdateAvailable(stale, V, now = 5000)
        assertEquals(V, s.targetVersion)
        assertEquals(0, s.attempts)
        assertEquals(0L, s.lastAttemptAt)
        assertFalse(s.backoffReported)
        assertEquals(OtaThrottle.Action.ATTEMPT, action)
    }

    @Test fun aCheckNeverConsumesBudget_onlyInstallLaunchedDoes() {
        var s = OtaThrottle.State(targetVersion = V, attempts = 0)
        // Repeated checks (e.g. each followed by a failed download) must not advance the counter.
        repeat(5) {
            val (ns, action) = OtaThrottle.onUpdateAvailable(s, V, now = 100)
            assertEquals(OtaThrottle.Action.ATTEMPT, action)
            assertEquals(0, ns.attempts)
            s = ns
        }
        // Only a launched install increments.
        assertEquals(1, OtaThrottle.onInstallLaunched(s, now = 200).first.attempts)
    }

    @Test fun capThenBackoffWithinWindow() {
        val s = launch(OtaThrottle.State(targetVersion = V), MAX, now = 1000L)
        assertEquals(MAX, s.attempts)
        assertTrue(s.backoffReported)
        // A check inside the window → BACKOFF, no further attempt, state unchanged.
        val (ns, action) = OtaThrottle.onUpdateAvailable(s, V, now = 1000L + WINDOW - 1)
        assertEquals(OtaThrottle.Action.BACKOFF, action)
        assertEquals(MAX, ns.attempts)
    }

    @Test fun enterBackoffSignalsExactlyOnce() {
        var s = OtaThrottle.State(targetVersion = V)
        var crossings = 0
        repeat(MAX + 3) { i ->
            val (ns, entered) = OtaThrottle.onInstallLaunched(s, now = i.toLong())
            if (entered) crossings++
            s = ns
        }
        assertEquals("enter-backoff fires only on the crossing", 1, crossings)
    }

    @Test fun retryAfterWindowElapsedDoesNotReReport() {
        val capped = OtaThrottle.State(targetVersion = V, attempts = MAX, lastAttemptAt = 0L, backoffReported = true)
        val (afterCheck, action) = OtaThrottle.onUpdateAvailable(capped, V, now = WINDOW + 1)
        assertEquals(OtaThrottle.Action.ATTEMPT, action) // window elapsed → one retry allowed
        val (_, entered) = OtaThrottle.onInstallLaunched(afterCheck, now = WINDOW + 2)
        assertFalse("already reported entering backoff — must not report again", entered)
    }

    @Test fun clearsOnSuccessOnlyWhenPending() {
        assertTrue(OtaThrottle.shouldClearOnUpToDate(OtaThrottle.State(targetVersion = V, attempts = 2)))
        assertFalse(OtaThrottle.shouldClearOnUpToDate(OtaThrottle.State())) // nothing pending
    }

    @Test fun statusForReflectsBackoffWindow() {
        val now = 10_000L
        // no target → none
        assertEquals("none", OtaThrottle.statusFor(OtaThrottle.State(), now))
        // under the cap → pending
        assertEquals("pending", OtaThrottle.statusFor(
            OtaThrottle.State(targetVersion = V, attempts = 1, lastAttemptAt = now), now))
        // capped AND inside the window → manual update required
        assertEquals("manual_update_required", OtaThrottle.statusFor(
            OtaThrottle.State(targetVersion = V, attempts = MAX, lastAttemptAt = now), now + WINDOW - 1))
        // capped but window elapsed (a retry is due) → pending, not stuck
        assertEquals("pending", OtaThrottle.statusFor(
            OtaThrottle.State(targetVersion = V, attempts = MAX, lastAttemptAt = now), now + WINDOW + 1))
    }
}
44
docs/maintenance-device-status-log.md
Normal file
@ -0,0 +1,44 @@
# Maintenance: `device_status_log` growth & space reclaim (#142)

## What changed in 1.9.2-beta1

`device_status_log` previously grew without an effective bound (the per-device
insert-time prune missed removed/idle devices and the heartbeat `offline_timeout`
insert). In one deployment it reached ~1.2M rows / ~119 MB over ~23 days and
degraded dashboard performance.

1.9.2-beta1 bounds further growth:

- **Index** `idx_device_status_log_device_ts(device_id, timestamp)` — the dashboard
  uptime query and the prunes now use an index instead of a full scan.
- **Global retention sweep** (`pruneStatusLog()`), run on startup and on the
  heartbeat interval, deletes rows older than **`STATUS_LOG_RETENTION_DAYS`**
  (default **3**) across *all* devices — including removed/idle devices and the
  `offline_timeout` rows the per-device prune never revisited.
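
To gauge how many rows a retention sweep like this would touch on a given database, a read-only count is a reasonable first check. The sketch below is an illustration, not ScreenTinker's `pruneStatusLog()`: it assumes the `timestamp` column holds Unix epoch seconds (confirm the unit against your schema first), and `retention_cutoff` and `count_prunable_rows` are hypothetical helper names.

```sh
#!/bin/sh
# Hypothetical helpers; not part of ScreenTinker.

# Print the cutoff (epoch seconds) for a retention period given in days.
# Usage: retention_cutoff NOW_EPOCH_SECONDS DAYS
retention_cutoff() {
  echo $(( $1 - $2 * 86400 ))
}

# Count rows older than the cutoff without deleting anything.
# ASSUMPTION: `timestamp` is stored as Unix epoch seconds.
# Usage: count_prunable_rows DB_PATH [DAYS]
count_prunable_rows() {
  db=$1
  days=${2:-3}   # 3 mirrors the documented STATUS_LOG_RETENTION_DAYS default
  cutoff=$(retention_cutoff "$(date +%s)" "$days")
  sqlite3 -readonly "$db" \
    "SELECT COUNT(*) FROM device_status_log WHERE timestamp < $cutoff;"
}
```

For example, `count_prunable_rows /opt/screentinker/server/db/remote_display.db 3` would print the number of rows older than three days, under the epoch-seconds assumption.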

## Reclaiming space on an already-bloated database

> **Operator action — only needed once, only if your `device_status_log` is already
> bloated from a pre-1.9.2 deployment.**

Retention bounds *future* growth, but SQLite does **not** return freed pages to the
filesystem on `DELETE` — the file stays at its high-water mark until a `VACUUM`.
After upgrading (which prunes the old rows), reclaim the disk with a **one-time
manual `VACUUM` in a maintenance window**:

```sh
# stop the server (or do this during a low-traffic window — VACUUM takes a global
# write lock and rewrites the whole DB file; the app cannot write during it)
sqlite3 /opt/screentinker/server/db/remote_display.db 'VACUUM;'
```

In the reference incident this took the DB from **119 MB → 39 MB**.
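
Before scheduling a maintenance window, it may help to estimate how much a `VACUUM` is likely to free. A rough figure is the number of pages on SQLite's freelist multiplied by the page size, both available through standard pragmas. The helper names below are hypothetical, and the result is only an estimate, since `VACUUM` also repacks partially filled pages.

```sh
#!/bin/sh
# Hypothetical helpers; not part of ScreenTinker.

# Bytes held by free pages: page size times freelist length.
# Usage: reclaimable_bytes PAGE_SIZE FREELIST_COUNT
reclaimable_bytes() {
  echo $(( $1 * $2 ))
}

# Query both values read-only and print the estimate.
# Usage: estimate_reclaim DB_PATH
estimate_reclaim() {
  db=$1
  page_size=$(sqlite3 -readonly "$db" 'PRAGMA page_size;')
  free_pages=$(sqlite3 -readonly "$db" 'PRAGMA freelist_count;')
  echo "reclaimable: $(reclaimable_bytes "$page_size" "$free_pages") bytes"
}
```

Running `estimate_reclaim /opt/screentinker/server/db/remote_display.db` after the retention sweep has deleted old rows gives a sense of whether the `VACUUM` is worth the downtime.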

### Why VACUUM is not automatic

`VACUUM` locks the database and rewrites the entire file — unacceptable on the hot
path. `PRAGMA auto_vacuum=INCREMENTAL` is **not** enabled either: it only takes
effect on a freshly-created database (set before the first table) or after a
one-time full `VACUUM` to convert an existing DB, so enabling it would be a no-op on
existing installs and a silent behavior change on new ones. Space reclaim is left as
a deliberate operator decision; ongoing growth is already bounded by retention.
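
To confirm which mode a given database is actually in, `PRAGMA auto_vacuum;` returns 0 (NONE), 1 (FULL), or 2 (INCREMENTAL). A small sketch, with hypothetical helper names, that maps the number to a label:

```sh
#!/bin/sh
# Hypothetical helpers; not part of ScreenTinker.

# Map the integer returned by `PRAGMA auto_vacuum;` to its mode name.
# Usage: auto_vacuum_mode_name VALUE
auto_vacuum_mode_name() {
  case "$1" in
    0) echo "NONE" ;;
    1) echo "FULL" ;;
    2) echo "INCREMENTAL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Read the mode from a database file without modifying it.
# Usage: show_auto_vacuum DB_PATH
show_auto_vacuum() {
  auto_vacuum_mode_name "$(sqlite3 -readonly "$1" 'PRAGMA auto_vacuum;')"
}
```

On an install matching the description above, `show_auto_vacuum` would be expected to report `NONE`.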
@ -6,6 +6,8 @@ export default {
  'device.pl_item.orphan_zone_tip': "This item's zone isn't part of the device's current layout. It still plays (recovered into the largest zone), but reassign it to a zone in this layout.",
  'dashboard.device_orphan_tip_one': "{n} item assigned to a zone that isn't in this device's layout — open the device to reassign",
  'dashboard.device_orphan_tip_other': "{n} items assigned to a zone that isn't in this device's layout — open the device to reassign",
  // #139: device stuck in OTA backoff (can't self-install — e.g. Fire TV) — needs a manual update.
  'dashboard.device_ota_stuck': 'Update available (v{version}) — install failed {n}×, manual update required',
  // Nav (sidebar)
  'nav.displays': 'Displays',
  'nav.content': 'Content',
@ -117,6 +117,9 @@ function renderDeviceCard(device) {
        <div class="device-card-name">${esc(device.name)}${device.orphan_count > 0 ? `
          <span class="device-orphan-badge" title="${tn('dashboard.device_orphan_tip', device.orphan_count)}" style="margin-left:6px;display:inline-flex;align-items:center;gap:3px;font-size:11px;color:var(--danger);vertical-align:middle">
            <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M10.29 3.86L1.82 18a2 2 0 0 0 1.71 3h16.94a2 2 0 0 0 1.71-3L13.71 3.86a2 2 0 0 0-3.42 0z"/><line x1="12" y1="9" x2="12" y2="13"/><line x1="12" y1="17" x2="12.01" y2="17"/></svg>${device.orphan_count}
          </span>` : ''}${device.ota_status === 'manual_update_required' ? `
          <span class="device-ota-badge" title="${esc(t('dashboard.device_ota_stuck', { version: device.ota_target_version || '?', n: device.ota_attempts || 0 }))}" style="margin-left:6px;display:inline-flex;align-items:center;gap:3px;font-size:11px;color:var(--warning);vertical-align:middle">
            <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/><polyline points="7 10 12 15 17 10"/><line x1="12" y1="15" x2="12" y2="3"/></svg>update
          </span>` : ''}</div>
        ${device.owner_name || device.owner_email ? `<div style="font-size:11px;color:var(--text-muted);margin-bottom:4px">
          <svg width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" style="vertical-align:-1px">
@ -17,6 +17,25 @@ if [ -n "$(git status --porcelain)" ]; then
    exit 1
fi

# Pre-push fast-forward guard. This script creates an annotated tag locally; if
# origin/main has advanced past the commit we're bumping from, `git push origin main`
# is rejected as a non-fast-forward - and if the tag gets pushed anyway it fires the
# release workflow from a commit that isn't even on main (the beta9 divergence
# incident). Catch the divergence HERE, before the tag exists, so nothing can fire.
# Best-effort: when the fetch can't run (offline), warn and proceed rather than block
# a local bump - the push itself is still the backstop.
if git fetch --quiet origin main 2>/dev/null; then
    if ! git merge-base --is-ancestor FETCH_HEAD HEAD; then
        echo "ERROR: origin/main ($(git rev-parse --short FETCH_HEAD)) has commits not in your" >&2
        echo " HEAD ($(git rev-parse --short HEAD)) - 'git push origin main' would be rejected." >&2
        echo " Merge origin/main into your branch first, then re-run the bump." >&2
        exit 1
    fi
else
    echo "WARNING: could not fetch origin/main - skipping the fast-forward check (offline?)." >&2
    echo " Confirm 'git push origin main' will fast-forward before pushing the tag." >&2
fi

CURRENT="$(cat VERSION)"
IFS=. read -r MAJ MIN PAT <<< "$CURRENT"
549
scripts/debian-13-setup.sh
Executable file
@ -0,0 +1,549 @@
#!/bin/bash
# ScreenTinker - Debian 13 Setup Script
#
# Modes:
#   - Server + Player (both)
#   - Server only
#   - Player only
#
# Usage:
#   curl -sSL https://screentinker.com/scripts/debian-13-setup.sh | sudo bash
#   curl -sSL https://screentinker.com/scripts/debian-13-setup.sh | sudo bash -s -- --server-only
#   curl -sSL https://screentinker.com/scripts/debian-13-setup.sh | sudo bash -s -- --player-only https://screentinker.com

set -euo pipefail

# -- Configuration --
SCREENTINKER_DIR="/opt/screentinker"
SCREENTINKER_PORT=3001
NODE_MAJOR=20
LOG_FILE="/var/log/screentinker-debian-setup.log"

# -- Colors --
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'

log() { echo -e "${GREEN}[ScreenTinker]${NC} $1"; }
warn() { echo -e "${YELLOW}[WARNING]${NC} $1"; }
err() { echo -e "${RED}[ERROR]${NC} $1"; exit 1; }

MODE="both"
MODE_SET=false
SERVER_URL=""

while [[ $# -gt 0 ]]; do
    case "$1" in
        --server-only)
            MODE="server"
            MODE_SET=true
            shift
            ;;
        --player-only)
            MODE="player"
            MODE_SET=true
            shift
            if [[ $# -gt 0 && "$1" == http* ]]; then
                SERVER_URL="$1"
                shift
            fi
            ;;
        --both)
            MODE="both"
            MODE_SET=true
            shift
            ;;
        --help|-h)
            echo "Usage: sudo ./debian-13-setup.sh [OPTIONS] [SERVER_URL]"
            echo ""
            echo "Options:"
            echo " --server-only Install only the server"
            echo " --player-only [URL] Install only the player (URL required)"
            echo " --both Install both server and player (default)"
            echo " --help Show this help"
            echo ""
            echo "Examples:"
            echo " sudo ./debian-13-setup.sh"
            echo " sudo ./debian-13-setup.sh --server-only"
            echo " sudo ./debian-13-setup.sh --player-only https://screentinker.com"
            exit 0
            ;;
        http*)
            SERVER_URL="$1"
            shift
            ;;
        *)
            shift
            ;;
    esac
done

if [ "$(id -u)" -ne 0 ]; then
    err "This script must be run as root. Try: sudo bash debian-13-setup.sh"
fi

if [ -r /etc/os-release ]; then
    . /etc/os-release
    if [ "${ID:-}" != "debian" ] || [ "${VERSION_ID:-}" != "13" ]; then
        warn "Detected ${PRETTY_NAME:-unknown}. This script targets Debian 13."
        read -p "Continue anyway? (y/N) " -n 1 -r; echo
        [[ ! $REPLY =~ ^[Yy]$ ]] && exit 1
    else
        log "Detected Debian 13"
    fi
fi

if [ "$MODE" = "player" ] && [ -z "$SERVER_URL" ]; then
    echo ""
    echo -e "${BLUE}======================================${NC}"
    echo -e "${BLUE} ScreenTinker Debian 13 Setup${NC}"
    echo -e "${BLUE}======================================${NC}"
    echo ""
    read -p "Server URL (e.g., https://screentinker.com): " SERVER_URL
elif [ "$MODE" = "both" ] && [ "$MODE_SET" = false ] && [ -z "$SERVER_URL" ]; then
    echo ""
    echo -e "${BLUE}======================================${NC}"
    echo -e "${BLUE} ScreenTinker Debian 13 Setup${NC}"
    echo -e "${BLUE}======================================${NC}"
    echo ""
    echo " 1) Server + Player (recommended for single-screen host)"
    echo " 2) Server Only"
    echo " 3) Player Only"
    echo ""
    read -p "Choose [1/2/3]: " MODE_CHOICE
    case "$MODE_CHOICE" in
        2)
            MODE="server"
            ;;
        3)
            MODE="player"
            read -p "Server URL (e.g., https://screentinker.com): " SERVER_URL
            ;;
        *)
            MODE="both"
            ;;
    esac
fi

SERVER_URL="${SERVER_URL%/}"

NEED_SERVER=false
NEED_PLAYER=false

case "$MODE" in
    server)
        NEED_SERVER=true
        ;;
    player)
        NEED_PLAYER=true
        ;;
    both)
        NEED_SERVER=true
        NEED_PLAYER=true
        ;;
    *)
        err "Unknown mode: $MODE"
        ;;
esac

if [ "$NEED_PLAYER" = true ] && [ "$MODE" = "player" ] && [ -z "$SERVER_URL" ]; then
    err "Player-only mode requires a server URL"
fi

if [ "$NEED_PLAYER" = true ]; then
    if [ "$MODE" = "player" ]; then
        KIOSK_URL="${SERVER_URL}/player"
    else
        KIOSK_URL="http://localhost:${SCREENTINKER_PORT}/player"
    fi
fi

echo ""
log "Setup log: $LOG_FILE"
exec > >(tee -a "$LOG_FILE") 2>&1

log "Updating system packages..."
apt-get update -qq
apt-get upgrade -y -qq

log "Installing base dependencies..."
apt-get install -y -qq \
    git curl wget unzip htop \
    avahi-daemon \
    fonts-liberation fonts-noto-color-emoji \
    >> "$LOG_FILE" 2>&1

RUNTIME_USER="${SUDO_USER:-$(logname 2>/dev/null || echo root)}"
if ! id "$RUNTIME_USER" &>/dev/null; then
    warn "Could not resolve invoking user; defaulting to root"
    RUNTIME_USER="root"
fi
RUNTIME_HOME=$(eval echo "~$RUNTIME_USER")

if [ "$NEED_SERVER" = true ]; then
    NEED_NODE=true
    if command -v node &>/dev/null; then
        CUR=$(node -v | cut -d'v' -f2 | cut -d'.' -f1)
        if [ "$CUR" -ge "$NODE_MAJOR" ]; then
            log "Node.js $(node -v) already installed"
            NEED_NODE=false
        fi
    fi

    if [ "$NEED_NODE" = true ]; then
        log "Installing Node.js ${NODE_MAJOR}.x..."
        curl -fsSL "https://deb.nodesource.com/setup_${NODE_MAJOR}.x" | bash - >> "$LOG_FILE" 2>&1
        apt-get install -y -qq nodejs >> "$LOG_FILE" 2>&1
        log "Node.js $(node -v) installed"
    fi

    if [ -d "$SCREENTINKER_DIR/.git" ]; then
        log "Repo exists at $SCREENTINKER_DIR, pulling latest..."
        cd "$SCREENTINKER_DIR" && git pull origin main >> "$LOG_FILE" 2>&1
    else
        log "Cloning ScreenTinker..."
        git clone https://github.com/screentinker/screentinker.git "$SCREENTINKER_DIR" >> "$LOG_FILE" 2>&1
    fi

    log "Installing server dependencies..."
    cd "$SCREENTINKER_DIR/server"
    npm install --production >> "$LOG_FILE" 2>&1

    mkdir -p "$SCREENTINKER_DIR/server/db"
    mkdir -p "$SCREENTINKER_DIR/server/uploads"
    chown -R "$RUNTIME_USER":"$RUNTIME_USER" "$SCREENTINKER_DIR"

    log "Creating screentinker-server service..."
    cat > /etc/systemd/system/screentinker-server.service << SERVICEEOF
[Unit]
Description=ScreenTinker Digital Signage Server
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=${RUNTIME_USER}
WorkingDirectory=${SCREENTINKER_DIR}/server
ExecStart=/usr/bin/node server.js
Restart=always
RestartSec=5
StartLimitBurst=5
StartLimitIntervalSec=60

Environment=NODE_ENV=production
Environment=PORT=${SCREENTINKER_PORT}
Environment=SELF_HOSTED=true
Environment=HOST=0.0.0.0

StandardOutput=journal
StandardError=journal
SyslogIdentifier=screentinker-server

[Install]
WantedBy=multi-user.target
SERVICEEOF

    systemctl daemon-reload
    systemctl enable screentinker-server.service
    log "Server service enabled"
fi

if [ "$NEED_PLAYER" = true ]; then
    log "Installing player packages..."
    apt-get install -y -qq \
        xserver-xorg xserver-xorg-legacy x11-xserver-utils xinit \
        chromium unclutter xdotool \
        >> "$LOG_FILE" 2>&1 || {
        warn "Failed to install chromium package, trying chromium-browser..."
        apt-get install -y -qq xserver-xorg xserver-xorg-legacy x11-xserver-utils xinit chromium-browser unclutter xdotool >> "$LOG_FILE" 2>&1
    }

    CHROMIUM_BIN=$(command -v chromium 2>/dev/null || command -v chromium-browser 2>/dev/null || echo "/usr/bin/chromium")

    log "Allowing non-root X server startup..."
    mkdir -p /etc/X11
    cat > /etc/X11/Xwrapper.config << 'XWRAPEOF'
allowed_users=anybody
needs_root_rights=yes
XWRAPEOF

    log "Creating kiosk launcher..."
    cat > "$RUNTIME_HOME/screentinker-kiosk.sh" << KIOSKEOF
#!/bin/bash
KIOSK_URL="${KIOSK_URL}"

sleep 2

# Disable screen blanking and power management
xset s off
xset s noblank
xset -dpms
xset s 0 0

# Hide cursor after 3 seconds of inactivity
unclutter -idle 3 -root &

# Clean Chromium crash flags (prevents restore session dialogs)
CDIR="\$HOME/.config/chromium/Default"
mkdir -p "\$CDIR"
if [ -f "\$CDIR/Preferences" ]; then
    sed -i 's/"exited_cleanly":false/"exited_cleanly":true/' "\$CDIR/Preferences" 2>/dev/null || true
    sed -i 's/"exit_type":"Crashed"/"exit_type":"Normal"/' "\$CDIR/Preferences" 2>/dev/null || true
fi

# Wait for local server if running all-in-one
if echo "\$KIOSK_URL" | grep -q "localhost"; then
    echo "Waiting for ScreenTinker server..."
    for i in \$(seq 1 60); do
        if curl -sf "http://localhost:${SCREENTINKER_PORT}/api/status" >/dev/null 2>&1; then
            echo "Server ready after \${i}x2s"
            break
        fi
        sleep 2
    done
fi

# Detect screen resolution so Chromium fills the display on minimal X11 (no WM)
SCREEN_RES=\$(xrandr 2>/dev/null | grep ' connected' | grep -oE '[0-9]+x[0-9]+' | head -1)
SCREEN_W=\${SCREEN_RES%%x*}
SCREEN_H=\${SCREEN_RES##*x}
if [ -z "\$SCREEN_W" ] || [ -z "\$SCREEN_H" ]; then
    SCREEN_W=1920
    SCREEN_H=1080
fi

exec ${CHROMIUM_BIN} \\
    --kiosk \\
    --window-position=0,0 \\
    --window-size=\${SCREEN_W},\${SCREEN_H} \\
    --noerrdialogs \\
    --disable-infobars \\
    --disable-session-crashed-bubble \\
    --disable-features=TranslateUI \\
    --disable-component-update \\
    --check-for-update-interval=31536000 \\
    --autoplay-policy=no-user-gesture-required \\
    --no-first-run \\
    --disable-pinch \\
    --overscroll-history-navigation=0 \\
    --disable-translate \\
    --disable-sync \\
    --disable-background-networking \\
    --disable-default-apps \\
    --disable-extensions \\
    --disable-hang-monitor \\
    --disable-popup-blocking \\
    --disable-prompt-on-repost \\
    --metrics-recording-only \\
    --safebrowsing-disable-auto-update \\
    --ignore-certificate-errors \\
    "\$KIOSK_URL"
KIOSKEOF

    chmod +x "$RUNTIME_HOME/screentinker-kiosk.sh"
    chown "$RUNTIME_USER":"$RUNTIME_USER" "$RUNTIME_HOME/screentinker-kiosk.sh"

    cat > "$RUNTIME_HOME/.xinitrc" << 'XINITEOF'
#!/bin/bash
exec ~/screentinker-kiosk.sh
XINITEOF
    chmod +x "$RUNTIME_HOME/.xinitrc"
    chown "$RUNTIME_USER":"$RUNTIME_USER" "$RUNTIME_HOME/.xinitrc"

    if [ "$NEED_SERVER" = true ]; then
        KIOSK_AFTER="After=screentinker-server.service"
        KIOSK_REQ="Requires=screentinker-server.service"
    else
        KIOSK_AFTER="After=network-online.target"
        KIOSK_REQ="Wants=network-online.target"
    fi

    log "Creating kiosk service..."
    cat > /etc/systemd/system/screentinker-kiosk.service << SERVICEEOF
[Unit]
Description=ScreenTinker Kiosk Display
${KIOSK_AFTER}
${KIOSK_REQ}
# Prevent conflicts with getty on tty1
Conflicts=getty@tty1.service
After=getty@tty1.service

[Service]
Type=simple
User=${RUNTIME_USER}
Environment=DISPLAY=:0
Environment=XAUTHORITY=${RUNTIME_HOME}/.Xauthority
# Remove stale X lock files from previous crashes before starting
ExecStartPre=/bin/bash -c 'rm -f /tmp/.X0-lock /tmp/.X11-unix/X0'
ExecStartPre=/bin/sleep 3
ExecStart=/usr/bin/startx ${RUNTIME_HOME}/.xinitrc -- :0 -nolisten tcp vt1
Restart=on-failure
RestartSec=10
StartLimitBurst=5
StartLimitIntervalSec=120

TTYPath=/dev/tty1
StandardInput=tty
StandardOutput=journal
StandardError=journal
SyslogIdentifier=screentinker-kiosk

[Install]
WantedBy=multi-user.target
SERVICEEOF

    systemctl daemon-reload
    systemctl enable screentinker-kiosk.service
    log "Kiosk service enabled"

    log "Configuring auto-login on tty1..."
    mkdir -p /etc/systemd/system/getty@tty1.service.d
    cat > /etc/systemd/system/getty@tty1.service.d/autologin.conf << AUTOLOGINEOF
[Service]
ExecStart=
ExecStart=-/sbin/agetty --autologin ${RUNTIME_USER} --noclear %I \$TERM
AUTOLOGINEOF

    # Disable getty on tty1 so it doesn't conflict with the kiosk service
    systemctl disable getty@tty1.service 2>/dev/null || true
    systemctl mask getty@tty1.service 2>/dev/null || true
fi
|
||||||
|
|
||||||
|
if [ "$NEED_SERVER" = true ]; then
|
||||||
|
log "Creating management scripts..."
|
||||||
|
|
||||||
|
cat > /usr/local/bin/screentinker-update << 'UPDATEEOF'
|
||||||
|
#!/bin/bash
|
||||||
|
echo "Stopping services..."
|
||||||
|
sudo systemctl stop screentinker-kiosk.service 2>/dev/null || true
|
||||||
|
sudo systemctl stop screentinker-server.service 2>/dev/null || true
|
||||||
|
|
||||||
|
echo "Pulling latest..."
|
||||||
|
cd /opt/screentinker && git pull origin main
|
||||||
|
|
||||||
|
echo "Installing dependencies..."
|
||||||
|
cd server && npm install --production
|
||||||
|
|
||||||
|
echo "Starting services..."
|
||||||
|
sudo systemctl start screentinker-server.service
|
||||||
|
if systemctl list-unit-files | grep -q '^screentinker-kiosk.service'; then
|
||||||
|
sleep 3
|
||||||
|
sudo systemctl start screentinker-kiosk.service
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo ""
|
||||||
|
echo "Done! Server: $(systemctl is-active screentinker-server.service)"
|
||||||
|
if systemctl list-unit-files | grep -q '^screentinker-kiosk.service'; then
|
||||||
|
echo " Kiosk: $(systemctl is-active screentinker-kiosk.service)"
|
||||||
|
fi
|
||||||
|
UPDATEEOF
|
||||||
|
chmod +x /usr/local/bin/screentinker-update
|
||||||
|
|
+cat > /usr/local/bin/screentinker-status << 'STATUSEOF'
+#!/bin/bash
+echo ""
+echo "=== ScreenTinker Status ==="
+echo ""
+IP=$(hostname -I | awk '{print $1}')
+
+if systemctl is-active screentinker-server.service &>/dev/null; then
+  echo "Server: RUNNING (PID $(systemctl show screentinker-server.service -p MainPID --value))"
+else
+  echo "Server: STOPPED"
+fi
+
+if systemctl list-unit-files | grep -q '^screentinker-kiosk.service'; then
+  if systemctl is-active screentinker-kiosk.service &>/dev/null; then
+    echo "Kiosk: RUNNING"
+  else
+    echo "Kiosk: STOPPED"
+  fi
+fi
+
+echo ""
+echo "Uptime: $(uptime -p)"
+echo "Disk: $(df -h /opt/screentinker 2>/dev/null | tail -1 | awk '{print $3 "/" $2 " (" $5 " used)"}')"
+echo "Memory: $(free -h | awk '/Mem:/ {print $3 " / " $2}')"
+echo ""
+echo "Dashboard: http://${IP}:3001"
+echo "Player: http://${IP}:3001/player"
+echo "mDNS: http://$(hostname).local:3001"
+echo ""
+STATUSEOF
+chmod +x /usr/local/bin/screentinker-status
+
+cat > /usr/local/bin/screentinker-logs << 'LOGSEOF'
+#!/bin/bash
+case "${1:-server}" in
+  server) journalctl -u screentinker-server.service -f --no-hostname ;;
+  kiosk) journalctl -u screentinker-kiosk.service -f --no-hostname ;;
+  all) journalctl -u screentinker-server.service -u screentinker-kiosk.service -f --no-hostname ;;
+  *) echo "Usage: screentinker-logs [server|kiosk|all]" ;;
+esac
+LOGSEOF
+chmod +x /usr/local/bin/screentinker-logs
+fi
+
+cat > /etc/motd << 'MOTDEOF'
+
+____ _____ _
+/ ___| ___ _ __ ___ ___ |_ _|_ _ __ | | _____ _ __
+\___ \ / __| '__/ _ \/ _ \ | || | '_ \| |/ / _ \ '__|
+___) | (__| | | __/ __/ | || | | | | < __/ |
+|____/ \___|_| \___|\___| |_||_|_| |_|_|\_\___|_|
+
+Open-Source Digital Signage for Any Screen
+
+Commands:
+screentinker-status Show system info and URLs
+screentinker-update Pull latest and restart
+screentinker-logs Follow logs (server|kiosk|all)
+
+MOTDEOF
+
+if grep -q "#RuntimeWatchdogSec=0" /etc/systemd/system.conf 2>/dev/null; then
+  sed -i 's/#RuntimeWatchdogSec=0/RuntimeWatchdogSec=10/' /etc/systemd/system.conf
+  log "Hardware watchdog enabled (10s)"
+fi
+
+# Disable console blanking so the screen stays on during boot
+if [ -f /etc/default/grub ]; then
+  if ! grep -q "consoleblank=0" /etc/default/grub; then
+    sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"/GRUB_CMDLINE_LINUX_DEFAULT="\1 consoleblank=0"/' /etc/default/grub
+    update-grub >> "$LOG_FILE" 2>&1 && log "Console blanking disabled in GRUB" || warn "update-grub failed (non-fatal)"
+  fi
+fi
+
+echo ""
+echo -e "${GREEN}======================================${NC}"
+echo -e "${GREEN} ScreenTinker Setup Complete!${NC}"
+echo -e "${GREEN}======================================${NC}"
+echo ""
+
+IP=$(hostname -I | awk '{print $1}')
+
+if [ "$MODE" = "both" ]; then
+  echo "Mode: Server + Player"
+  echo "Dashboard: http://${IP}:${SCREENTINKER_PORT}"
+  echo "Player: http://${IP}:${SCREENTINKER_PORT}/player"
+elif [ "$MODE" = "server" ]; then
+  echo "Mode: Server Only"
+  echo "Dashboard: http://${IP}:${SCREENTINKER_PORT}"
+else
+  echo "Mode: Player Only"
+  echo "Server: $SERVER_URL"
+fi
+
+echo ""
+echo "Services:"
+if [ "$NEED_SERVER" = true ]; then
+  echo " sudo systemctl [start|stop|restart] screentinker-server"
+fi
+if [ "$NEED_PLAYER" = true ]; then
+  echo " sudo systemctl [start|stop|restart] screentinker-kiosk"
+fi
+echo ""
+echo -e "${YELLOW}Reboot to start: sudo reboot${NC}"
+echo ""
@@ -280,7 +280,7 @@ fi
 if echo "\$KIOSK_URL" | grep -q "localhost"; then
   echo "Waiting for ScreenTinker server..."
   for i in \$(seq 1 30); do
-    if curl -sf "http://localhost:${SCREENTINKER_PORT}/api/health" >/dev/null 2>&1; then
+    if curl -sf "http://localhost:${SCREENTINKER_PORT}/api/status" >/dev/null 2>&1; then
       echo "Server ready"
       break
     fi
@@ -288,8 +288,19 @@ if echo "\$KIOSK_URL" | grep -q "localhost"; then
   done
 fi

+# Detect screen resolution so Chromium fills the display on minimal X11 (no WM)
+SCREEN_RES=\$(xrandr 2>/dev/null | grep ' connected' | grep -oE '[0-9]+x[0-9]+' | head -1)
+SCREEN_W=\${SCREEN_RES%%x*}
+SCREEN_H=\${SCREEN_RES##*x}
+if [ -z "\$SCREEN_W" ] || [ -z "\$SCREEN_H" ]; then
+  SCREEN_W=1920
+  SCREEN_H=1080
+fi
+
 exec ${CHROMIUM_BIN} \\
   --kiosk \\
+  --window-position=0,0 \\
+  --window-size=\${SCREEN_W},\${SCREEN_H} \\
   --noerrdialogs \\
   --disable-infobars \\
   --disable-session-crashed-bubble \\
@@ -298,7 +309,6 @@ exec ${CHROMIUM_BIN} \\
   --check-for-update-interval=31536000 \\
   --autoplay-policy=no-user-gesture-required \\
   --no-first-run \\
-  --start-fullscreen \\
   --disable-pinch \\
   --overscroll-history-navigation=0 \\
   --disable-translate \\
@@ -90,4 +90,63 @@ module.exports = {
   // on MSP-style deployments where an admin/operator assigns users to existing
   // orgs after signup instead.
   autoCreateOrgOnSignup: !['false', '0'].includes(String(process.env.AUTO_CREATE_ORG_ON_SIGNUP || '').toLowerCase()),
+
+  // #142 event-loop lag telemetry (services/loop-lag.js). perf_hooks
+  // monitorEventLoopDelay is C++-backed, so continuous sampling is cheap. Each
+  // window's p99 is persisted to event_loop_lag (bounded: indexed + pruned from
+  // day one) and drives the banded load level the reconnect throttle reads.
+  lagSampleIntervalMs: parseInt(process.env.LAG_SAMPLE_INTERVAL_MS) || 1000,
+  lagResolutionMs: parseInt(process.env.LAG_RESOLUTION_MS) || 20,
+  lagTelemetryRetentionDays: parseFloat(process.env.LAG_TELEMETRY_RETENTION_DAYS) || 3,
+  lagPruneIntervalMs: parseInt(process.env.LAG_PRUNE_INTERVAL_MS) || 3600000,
+  // Banded load levels from the window p99 (ms). Asymmetric by design: a band is
+  // entered immediately when its up-threshold is crossed (tighten fast), but
+  // released only one step at a time after lagReleaseSamples consecutive samples
+  // fall below a deadband (release slow), so small fluctuations don't flap it.
+  // Bands ONLY scale how hard an already-flagged device is throttled; a healthy
+  // device is never gated by global lag.
+  lagElevatedMs: parseInt(process.env.LAG_ELEVATED_MS) || 100,
+  lagCriticalMs: parseInt(process.env.LAG_CRITICAL_MS) || 250,
+  lagReleaseSamples: parseInt(process.env.LAG_RELEASE_SAMPLES) || 5,
+
+  // #142 load-aware per-device reconnect throttle (lib/reconnect-throttle.js).
+  // The verdict of WHO is misbehaving is ALWAYS per-device (keyed on device_id):
+  // a device is flagged only when it exceeds reconnectBaseMax genuine reconnects
+  // per reconnectWindowMs. Global lag never flags a healthy device — the lag band
+  // only MULTIPLIES how hard an already-flagged device is backed off.
+  reconnectWindowMs: parseInt(process.env.RECONNECT_WINDOW_MS) || 10000,
+  reconnectBaseMax: parseInt(process.env.RECONNECT_BASE_MAX) || 5,
+  // Absolute per-device ceiling, independent of band AND of warm-up: no device may
+  // exceed this many reconnects/window no matter what the adaptive logic computes,
+  // so a slow-ramp attacker can't train its way through.
+  reconnectHardCeiling: parseInt(process.env.RECONNECT_HARD_CEILING) || 20,
+  // Server-enforced backoff for a flagged device: baseBackoff * 2^(level-1) * band
+  // multiplier, capped at maxBackoff. Level escalates while it keeps storming
+  // (tighten fast) and decays one step per reconnectReleaseMs of calm (release slow).
+  reconnectBaseBackoffMs: parseInt(process.env.RECONNECT_BASE_BACKOFF_MS) || 1000,
+  reconnectMaxBackoffMs: parseInt(process.env.RECONNECT_MAX_BACKOFF_MS) || 60000,
+  reconnectMaxLevel: parseInt(process.env.RECONNECT_MAX_LEVEL) || 10,
+  reconnectReleaseMs: parseInt(process.env.RECONNECT_RELEASE_MS) || 30000,
+  // Cold start: for this long after process start, lag is high while the whole
+  // fleet reconnects at once. Treat leniently — force the 'normal' band and apply
+  // only the hard ceiling (no rate-band throttle) so a deploy can't throttle
+  // healthy screens. Throttle state is in-memory and resets on restart.
+  reconnectWarmupMs: parseInt(process.env.RECONNECT_WARMUP_MS) || 30000,
+  reconnectBandElevatedMult: parseFloat(process.env.RECONNECT_BAND_ELEVATED_MULT) || 2,
+  reconnectBandCriticalMult: parseFloat(process.env.RECONNECT_BAND_CRITICAL_MULT) || 4,
+
+  // #142 device_status_log retention. A GLOBAL scheduled sweep (pruneStatusLog in
+  // db/database.js, run on startup + the heartbeat interval) deletes rows older
+  // than this across ALL devices — covering what the per-device insert-time prune
+  // in deviceSocket.js misses: removed/idle devices that never insert again, and
+  // the heartbeat.js offline_timeout insert that bypasses logDeviceStatus. Default
+  // is LOWER than the old hardcoded 7 days (the reporter's bloat happened under 7d);
+  // 2-3 days is plenty for the dashboard's 24h uptime view + diagnostics.
+  statusLogRetentionDays: parseFloat(process.env.STATUS_LOG_RETENTION_DAYS) || 3,
+
+  // #142 content-ack dedup window (deviceSocket.js). A device (esp. older apps)
+  // can spam "content <id>: ready" for the same item; suppress identical
+  // (device_id, content_id, status) reports within this window. A status CHANGE
+  // has a different key and passes immediately. In-memory; resets on restart.
+  contentAckDedupMs: parseInt(process.env.CONTENT_ACK_DEDUP_MS) || 10000,
 };
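Reviewer sketch (not part of the commit): the "tighten fast, release slow" banding described in the config comments can be illustrated as a small state machine. `services/loop-lag.js` is not shown in this diff, so everything here is an assumption made for illustration: the `createBandTracker` name, the `deadbandRatio` of 0.8, and the rule that a non-calm sample resets the calm streak. Only the three thresholds (100 ms, 250 ms, 5 samples) come from the defaults above.

```javascript
// Hypothetical illustration of banded hysteresis; NOT the code in services/loop-lag.js.
const BANDS = ['normal', 'elevated', 'critical'];

function createBandTracker({
  elevatedMs = 100,     // lagElevatedMs default from the config hunk
  criticalMs = 250,     // lagCriticalMs default from the config hunk
  releaseSamples = 5,   // lagReleaseSamples default from the config hunk
  deadbandRatio = 0.8,  // ASSUMPTION: release threshold = 80% of the up-threshold
} = {}) {
  let level = 0;       // index into BANDS
  let calmStreak = 0;  // consecutive samples below the release threshold

  // Up-threshold that had to be crossed to enter the given level.
  const upThreshold = (lvl) => (lvl === 2 ? criticalMs : elevatedMs);

  return {
    update(p99Ms) {
      const target = p99Ms >= criticalMs ? 2 : p99Ms >= elevatedMs ? 1 : 0;
      if (target > level) {
        // Tighten fast: jump straight to the higher band.
        level = target;
        calmStreak = 0;
      } else if (level > 0 && p99Ms < upThreshold(level) * deadbandRatio) {
        // Release slow: step down one band after enough consecutive calm samples.
        calmStreak += 1;
        if (calmStreak >= releaseSamples) {
          level -= 1;
          calmStreak = 0;
        }
      } else {
        // Inside the deadband (or still hot): do not release, restart the streak.
        calmStreak = 0;
      }
      return BANDS[level];
    },
    getBand: () => BANDS[level],
  };
}
```

With these assumed parameters, a single 300 ms sample moves the tracker to `critical` at once, while returning to `normal` takes at least ten calm samples (five per step).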
@@ -216,6 +216,24 @@ const migrations = [
   // signal, so the two differ — surfacing both explains "reports 720 but monitor sees 1080".
   "ALTER TABLE devices ADD COLUMN render_width INTEGER",
   "ALTER TABLE devices ADD COLUMN render_height INTEGER",
+  // #139 Phase 2: device-reported OTA backoff status, so the dashboard can flag screens that
+  // can't self-install (Fire TV: no device-owner path) and need a hands-on update. ADD COLUMN
+  // with defaults is non-destructive in SQLite, and the apply loop below swallows "duplicate
+  // column" — so this is idempotent and upgrades an existing populated db without data loss.
+  // ota_updated_at = server receipt time (s), stamped on each register persist.
+  "ALTER TABLE devices ADD COLUMN ota_status TEXT DEFAULT 'none'",
+  "ALTER TABLE devices ADD COLUMN ota_target_version TEXT",
+  "ALTER TABLE devices ADD COLUMN ota_attempts INTEGER DEFAULT 0",
+  "ALTER TABLE devices ADD COLUMN ota_updated_at INTEGER",
+  // #142: index device_status_log for the per-device + time-window access pattern.
+  // schema.sql creates this on fresh installs; this migration covers existing DBs.
+  // Both the dashboard uptime query and the retention prune were full scans — the
+  // dashboard-degradation cause once the table reached 1M+ rows.
+  "CREATE INDEX IF NOT EXISTS idx_device_status_log_device_ts ON device_status_log(device_id, timestamp)",
+  // #142: event-loop lag telemetry table (bounded: indexed + scheduled prune).
+  // schema.sql creates these on fresh installs; this covers existing DBs.
+  "CREATE TABLE IF NOT EXISTS event_loop_lag (id INTEGER PRIMARY KEY AUTOINCREMENT, sampled_at INTEGER NOT NULL DEFAULT (strftime('%s','now')), mean_ms REAL NOT NULL, p50_ms REAL NOT NULL, p99_ms REAL NOT NULL, max_ms REAL NOT NULL, band TEXT NOT NULL DEFAULT 'normal')",
+  "CREATE INDEX IF NOT EXISTS idx_event_loop_lag_sampled ON event_loop_lag(sampled_at)",
 ];
 // Apply each ALTER idempotently. A "duplicate column name" / "already exists"
 // error means the column is already present (expected on a migrated DB) - benign.
@@ -732,6 +750,21 @@ const { applyTenantDeleteCascade } = require('../lib/tenant-cascade-migration');
   }
 })();

+// #142 GLOBAL device_status_log retention sweep across ALL devices. Run on startup
+// and on the heartbeat interval (services/heartbeat.js). This covers the rows the
+// per-device insert-time prune in deviceSocket.js misses: removed/idle devices that
+// never insert again, and the heartbeat offline_timeout insert that bypasses
+// logDeviceStatus. A plain time-range delete (like the play_logs prune) — runs off
+// the hot path; after the first sweep the table is small, so the cost is negligible.
+function pruneStatusLog() {
+  try {
+    const maxAgeSec = Math.round(config.statusLogRetentionDays * 86400);
+    const n = db.prepare("DELETE FROM device_status_log WHERE timestamp < strftime('%s','now') - ?").run(maxAgeSec).changes;
+    if (n > 0) console.log(`[status-log] pruned ${n} row(s) older than ${config.statusLogRetentionDays}d`);
+    return n;
+  } catch (_) { return 0; }
+}
+
 // Prune old telemetry (keep last 24h worth at 15s intervals = ~5760, cap at 6000)
 function pruneTelemetry(deviceId) {
   db.prepare(`
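Reviewer sketch (not part of the commit): the cutoff arithmetic in `pruneStatusLog` can be checked in isolation without a database. The helper names `statusLogCutoffSec` and `selectStaleRows` are hypothetical; the sketch only mirrors the `Math.round(days * 86400)` conversion and the `timestamp < now - maxAge` comparison used by the SQL above, with `now` injected so the result is deterministic.

```javascript
// Hypothetical helpers mirroring the retention arithmetic; not code from the repository.

// Unix-seconds cutoff: rows with timestamp strictly below this are considered stale.
function statusLogCutoffSec(retentionDays, nowMs = Date.now()) {
  const nowSec = Math.floor(nowMs / 1000);
  const maxAgeSec = Math.round(retentionDays * 86400); // fractional days are allowed
  return nowSec - maxAgeSec;
}

// In-memory stand-in for "DELETE ... WHERE timestamp < cutoff": returns the stale rows.
function selectStaleRows(rows, retentionDays, nowMs = Date.now()) {
  const cutoff = statusLogCutoffSec(retentionDays, nowMs);
  return rows.filter((row) => row.timestamp < cutoff);
}
```

Because the config reads the value with `parseFloat(...) || 3`, a fractional setting such as `0.5` (twelve hours) is honoured, whereas `0` is falsy in JavaScript and falls back to the default of 3 days.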
@@ -804,4 +837,4 @@ try {
 const { verifyAndRepairSchema } = require('../lib/schema-check');
 verifyAndRepairSchema(db);

-module.exports = { db, pruneTelemetry, pruneScreenshots };
+module.exports = { db, pruneTelemetry, pruneScreenshots, pruneStatusLog };
@@ -463,6 +463,27 @@ CREATE TABLE IF NOT EXISTS device_status_log (
   status TEXT NOT NULL,
   timestamp INTEGER NOT NULL DEFAULT (strftime('%s','now'))
 );
+-- #142: index the per-device + time-window access pattern. Both the dashboard
+-- uptime query (WHERE device_id=? AND timestamp>?) and the retention prune
+-- (WHERE device_id=? AND timestamp<?) were full table scans; at 1M+ rows that
+-- was the dashboard-degradation cause in the outage report.
+CREATE INDEX IF NOT EXISTS idx_device_status_log_device_ts ON device_status_log(device_id, timestamp);
+
+-- ===================== EVENT LOOP LAG (#142) =====================
+-- Event-loop delay telemetry from perf_hooks.monitorEventLoopDelay(). Bounded
+-- from day one: indexed on sampled_at and pruned on a schedule (see
+-- services/loop-lag.js, LAG_TELEMETRY_RETENTION_DAYS) so it can never become a
+-- second unbounded-growth table.
+CREATE TABLE IF NOT EXISTS event_loop_lag (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  sampled_at INTEGER NOT NULL DEFAULT (strftime('%s','now')),
+  mean_ms REAL NOT NULL,
+  p50_ms REAL NOT NULL,
+  p99_ms REAL NOT NULL,
+  max_ms REAL NOT NULL,
+  band TEXT NOT NULL DEFAULT 'normal'
+);
+CREATE INDEX IF NOT EXISTS idx_event_loop_lag_sampled ON event_loop_lag(sampled_at);

 -- ===================== DEVICE FINGERPRINTS =====================

@@ -484,13 +505,6 @@ CREATE TABLE IF NOT EXISTS alert_configs (
   created_at INTEGER NOT NULL DEFAULT (strftime('%s','now'))
 );

-CREATE TABLE IF NOT EXISTS device_status_log (
-  id INTEGER PRIMARY KEY AUTOINCREMENT,
-  device_id TEXT NOT NULL,
-  status TEXT NOT NULL,
-  timestamp INTEGER NOT NULL DEFAULT (strftime('%s','now'))
-);
-
 -- ===================== PLAYER DEBUG LOGS =====================
 -- Smart TVs (Tizen, WebOS, Fire TV, etc.) have no accessible devtools. The
 -- player captures errors into window.__debugLog client-side and POSTs them
98
server/lib/reconnect-throttle.js
Normal file
@@ -0,0 +1,98 @@
+// #142 step 3 — load-aware per-device reconnect throttle (the outage fix).
+//
+// A single device stuck in a tight websocket reconnect loop can flood the server
+// with full register cycles (DB writes + playlist build) and saturate the event
+// loop. This module gates genuine reconnects PER DEVICE, before that heavy work
+// runs in deviceSocket.js.
+//
+// Design (mirrors the issue's suggested mitigation + the lastPlayLogAt pattern):
+// - WHO is always per-device: a device is "flagged" only when it exceeds
+//   reconnectBaseMax genuine reconnects within reconnectWindowMs. Global lag
+//   NEVER flags a healthy device.
+// - Load-awareness is BANDED (normal/elevated/critical from services/loop-lag),
+//   not a continuous controller — deterministic and testable. The band only
+//   MULTIPLIES the backoff applied to an ALREADY-flagged device.
+// - Hysteresis: escalate immediately while storming (tighten fast); decay the
+//   escalation level one step per reconnectReleaseMs of calm (release slow).
+// - HARD CEILING: independent of band and of warm-up, no device may exceed
+//   reconnectHardCeiling/window — a slow-ramp attacker can't train through it.
+// - COLD START: for reconnectWarmupMs after process start, force the 'normal'
+//   band and apply only the hard ceiling, so a full-fleet reconnect right after
+//   a deploy doesn't throttle healthy screens.
+// - State is in-memory (resets on restart), like pair-lockout / totp-lockout.
+
+const config = require('../config');
+const loopLag = require('../services/loop-lag');
+
+// deviceId -> { hits: number[], level: number, blockedUntil: ms, lastThrottleAt: ms }
+const state = new Map();
+let startedAt = Date.now();
+
+function bandMultiplier(band) {
+  if (band === 'critical') return config.reconnectBandCriticalMult;
+  if (band === 'elevated') return config.reconnectBandElevatedMult;
+  return 1;
+}
+
+function reject(s, now, band, reason, observed, allowed) {
+  s.level = Math.min(s.level + 1, config.reconnectMaxLevel);
+  const backoff = Math.min(
+    config.reconnectBaseBackoffMs * Math.pow(2, s.level - 1) * bandMultiplier(band),
+    config.reconnectMaxBackoffMs
+  );
+  s.blockedUntil = now + backoff;
+  s.lastThrottleAt = now;
+  return { allow: false, retryAfterMs: backoff, reason, observed, allowed, band, level: s.level };
+}
+
+// Decide whether to allow a genuine reconnect for `deviceId`.
+// `now` and `bandOverride` are injectable for deterministic tests; production
+// passes only deviceId.
+function check(deviceId, now = Date.now(), bandOverride = null) {
+  const warmup = (now - startedAt) < config.reconnectWarmupMs;
+  const band = bandOverride !== null ? bandOverride : (warmup ? 'normal' : loopLag.getBand());
+
+  let s = state.get(deviceId);
+  if (!s) { s = { hits: [], level: 0, blockedUntil: 0, lastThrottleAt: 0 }; state.set(deviceId, s); }
+
+  // Already inside an enforced backoff window: reject and escalate (tighten fast).
+  if (now < s.blockedUntil) {
+    return reject(s, now, band, 'in-backoff', s.hits.length, config.reconnectBaseMax);
+  }
+
+  // Sliding window of genuine reconnects.
+  s.hits = s.hits.filter((t) => now - t < config.reconnectWindowMs);
+  s.hits.push(now);
+  const observed = s.hits.length;
+
+  // Hard ceiling — always enforced, regardless of band or warm-up.
+  if (observed > config.reconnectHardCeiling) {
+    return reject(s, now, band, 'hard-ceiling', observed, config.reconnectHardCeiling);
+  }
+
+  // Cold start: only the hard ceiling applies; never rate-throttle during warm-up.
+  if (warmup) return allow(s, now, band);
+
+  // Healthy device: under the per-device threshold -> always allowed.
+  if (observed <= config.reconnectBaseMax) return allow(s, now, band);
+
+  // Flagged: storming beyond the per-device threshold -> throttle (band-scaled).
+  return reject(s, now, band, 'rate', observed, config.reconnectBaseMax);
+}
+
+function allow(s, now, band) {
+  // Release slow: decay one escalation level per reconnectReleaseMs of calm.
+  if (s.level > 0 && now - s.lastThrottleAt > config.reconnectReleaseMs) {
+    s.level = Math.max(0, s.level - 1);
+    s.lastThrottleAt = now;
+  }
+  return { allow: true, band, level: s.level };
+}
+
+// Test-only: clear state and optionally rewind the warm-up origin.
+function __resetForTest(opts = {}) {
+  state.clear();
+  if (opts.startedAt !== undefined) startedAt = opts.startedAt;
+}
+
+module.exports = { check, __resetForTest };
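Reviewer sketch (not part of the commit): the backoff formula in `reject()` (base × 2^(level−1) × band multiplier, capped at the maximum) is easier to reason about as a table. The standalone function below restates it with the default values from the config hunk in this diff; the names `backoffForLevel` and `DEFAULTS` are hypothetical, and clamping the level to at least 1 is an assumption added so the helper is safe to call directly.

```javascript
// Hypothetical restatement of the reject() backoff formula using the config defaults.
const DEFAULTS = {
  baseBackoffMs: 1000,  // reconnectBaseBackoffMs
  maxBackoffMs: 60000,  // reconnectMaxBackoffMs
  maxLevel: 10,         // reconnectMaxLevel
  elevatedMult: 2,      // reconnectBandElevatedMult
  criticalMult: 4,      // reconnectBandCriticalMult
};

function bandMultiplierFor(band, cfg) {
  if (band === 'critical') return cfg.criticalMult;
  if (band === 'elevated') return cfg.elevatedMult;
  return 1;
}

// Backoff in ms for an escalation level (1-based) under a given load band.
function backoffForLevel(level, band = 'normal', cfg = DEFAULTS) {
  const clamped = Math.min(Math.max(level, 1), cfg.maxLevel); // ASSUMPTION: clamp to [1, maxLevel]
  const raw = cfg.baseBackoffMs * Math.pow(2, clamped - 1) * bandMultiplierFor(band, cfg);
  return Math.min(raw, cfg.maxBackoffMs);
}

// Print the escalation table for each band.
for (const band of ['normal', 'elevated', 'critical']) {
  const row = [];
  for (let level = 1; level <= DEFAULTS.maxLevel; level++) row.push(backoffForLevel(level, band));
  console.log(band.padEnd(9), row.join(' '));
}
```

With these defaults the 60 s cap is reached at level 7 in the `normal` band, level 6 in `elevated`, and level 5 in `critical`, so levels above that only affect how long the decay back to zero takes.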
4
server/package-lock.json
generated
@@ -1,12 +1,12 @@
 {
   "name": "screentinker",
-  "version": "1.9.1-beta6",
+  "version": "1.9.2-beta1",
   "lockfileVersion": 3,
   "requires": true,
   "packages": {
     "": {
       "name": "screentinker",
-      "version": "1.9.1-beta6",
+      "version": "1.9.2-beta1",
       "dependencies": {
         "@azure/msal-node": "^5.2.1",
         "archiver": "^7.0.1",
@@ -1,6 +1,6 @@
 {
   "name": "screentinker",
-  "version": "1.9.1-beta6",
+  "version": "1.9.2-beta1",
   "description": "ScreenTinker - Digital Signage Management Server",
   "main": "server.js",
   "scripts": {
@@ -160,20 +160,58 @@ function checkItemWrite(req, res) {
   return item;
 }

-// #129: real-time mute. Tell every device on this playlist to toggle the volume of the
-// matching currently-playing item NOW (decoupled from publish — the device matches by
-// content_id/widget_id and applies it live). The new value is also written to the row, so
-// it lands in the next published snapshot and persists across playlist reloads.
+// #129 + mute-fix: per-item mute has to do TWO things, because the device plays from
+// playlists.published_snapshot (deviceSocket.buildPlaylistPayload), NOT the draft
+// playlist_items the toggle writes:
+//   (1) LIVE — tell every device on this playlist to silence the matching currently-playing
+//       item NOW (device matches by content_id/widget_id). Mutes the in-progress playthrough.
+//   (2) PERSIST — patch the matching item's `muted` inside the published_snapshot the device
+//       actually plays, then re-push the playlist. Without this the snapshot kept muted=0, so
+//       every loop/reload re-applied full volume — the "icon red but audio plays across 3
+//       playthroughs" bug (Android re-loads each loop; web's native <video> loop masked it).
+// We patch the snapshot SURGICALLY (just the muted field of matching items) rather than calling
+// publishPlaylist, so a mute toggle can't prematurely publish other pending draft edits or flip
+// the playlist's draft/published status. muted is written as 0/1 to match buildSnapshotItems'
+// format (the player reads it via optInt). playlist_items.muted is still updated by the caller,
+// so a later full publish stays consistent.
 function emitMuteChanged(req, item, muted) {
   try {
     const io = req.app.get('io');
     if (!io) return;
     const deviceNs = io.of('/device');
+    const m = !!muted;
+
+    // (2) PERSIST: patch the published snapshot the device reads from.
+    const pl = db.prepare('SELECT published_snapshot FROM playlists WHERE id = ?').get(item.playlist_id);
+    if (pl && pl.published_snapshot) {
+      let snap = null;
+      try { snap = JSON.parse(pl.published_snapshot); } catch (e) { snap = null; }
+      if (Array.isArray(snap)) {
+        let changed = false;
+        for (const s of snap) {
+          const match = item.content_id ? s.content_id === item.content_id
+            : (item.widget_id ? s.widget_id === item.widget_id : false);
+          if (match && (s.muted ? 1 : 0) !== (m ? 1 : 0)) { s.muted = m ? 1 : 0; changed = true; }
+        }
+        if (changed) {
+          db.prepare('UPDATE playlists SET published_snapshot = ? WHERE id = ?')
+            .run(JSON.stringify(snap), item.playlist_id);
+        }
+      }
+    }
+
+    // (1) LIVE toggle + re-deliver the patched snapshot so loops re-apply the correct flag.
+    // Lazy require (matches playlists.pushToDevices) to avoid a route<->ws circular import.
+    const { buildPlaylistPayload } = require('../ws/deviceSocket');
+    const commandQueue = require('../lib/command-queue');
     const devices = db.prepare('SELECT id FROM devices WHERE playlist_id = ?').all(item.playlist_id);
-    const payload = { content_id: item.content_id || null, widget_id: item.widget_id || null, muted: !!muted };
-    for (const d of devices) deviceNs.to(d.id).emit('device:mute-changed', payload);
-    console.log(`[mute] item ${item.id} (content ${item.content_id || item.widget_id}) -> ${muted ? 'MUTED' : 'unmuted'}; notified ${devices.length} device(s)`);
-  } catch (e) { /* best-effort live toggle; the published snapshot is the source of truth */ }
+    const payload = { content_id: item.content_id || null, widget_id: item.widget_id || null, muted: m };
+    for (const d of devices) {
+      deviceNs.to(d.id).emit('device:mute-changed', payload); // current playthrough
+      commandQueue.queueOrEmitPlaylistUpdate(deviceNs, d.id, buildPlaylistPayload); // future loads (no reload of current item)
+    }
+    console.log(`[mute] item ${item.id} (content ${item.content_id || item.widget_id}) -> ${m ? 'MUTED' : 'unmuted'}; snapshot patched + notified ${devices.length} device(s)`);
+  } catch (e) { /* best-effort; playlist_items.muted is still updated for the next full publish */ }
 }

 // Update playlist item
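Reviewer sketch (not part of the commit): the "surgical" snapshot patch inside `emitMuteChanged` could be expressed as a pure function, which makes the matching and change-detection rules testable without a database or socket. The function name `patchSnapshotMuted` and its return shape are hypothetical; the matching logic (prefer `content_id`, fall back to `widget_id`, write `muted` as 0/1) follows the hunk above.

```javascript
// Hypothetical pure-function version of the snapshot patch; not code from the repository.
// Takes the stored JSON string and returns { changed, json } without touching any database.
function patchSnapshotMuted(snapshotJson, item, muted) {
  let snap;
  try {
    snap = JSON.parse(snapshotJson);
  } catch (e) {
    return { changed: false, json: snapshotJson }; // unparseable snapshot: leave untouched
  }
  if (!Array.isArray(snap)) return { changed: false, json: snapshotJson };

  const flag = muted ? 1 : 0; // stored as 0/1, as in the hunk above
  let changed = false;
  for (const s of snap) {
    const match = item.content_id
      ? s.content_id === item.content_id
      : (item.widget_id ? s.widget_id === item.widget_id : false);
    if (match && (s.muted ? 1 : 0) !== flag) {
      s.muted = flag;
      changed = true;
    }
  }
  return { changed, json: changed ? JSON.stringify(snap) : snapshotJson };
}
```

Returning `changed` lets a caller skip the `UPDATE` when the snapshot already carries the requested flag, which is the same short-circuit the committed code performs inline.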
@@ -7,6 +7,7 @@ const fs = require('fs');
 const config = require('../config');
 const VERSION = require('../version');
 const { PLATFORM_ROLES } = require('../middleware/auth');
+const loopLag = require('../services/loop-lag');

 // Public status page
 router.get('/', (req, res) => {
@@ -24,6 +25,9 @@ router.get('/', (req, res) => {
     version,
     uptime_human: formatUptime(uptime),
     timestamp: new Date().toISOString(),
+    // #142: current event-loop lag snapshot, so site lag is diagnosable from the
+    // health endpoint independent of any throttling. Cheap (in-memory read).
+    loop_lag: loopLag.getLag(),
   });
 });

@@ -625,6 +625,10 @@ app.set('io', io);
 const { startHeartbeatChecker } = require('./services/heartbeat');
 startHeartbeatChecker(io);

+// #142: start event-loop lag sampling (feeds /api/status + the reconnect throttle)
+const { startLoopLagMonitor } = require('./services/loop-lag');
+startLoopLagMonitor();
+
 // Start command-queue sweep (prunes expired entries for offline devices)
 const commandQueue = require('./lib/command-queue');
 commandQueue.startSweep();
@ -710,13 +714,22 @@ function resolveApkPath() {
|
||||||
return null;
|
return null;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// #139: a device that can't silently install re-downloads the APK every check cycle. Don't
|
||||||
|
// word a download as "in progress" (it may be a stuck loop, not progress), and rate-limit the
|
||||||
|
// line to once per IP per window so a looping device can't flood the log.
|
||||||
|
const otaDownloadLoggedAt = new Map(); // ip -> last-logged ms
|
||||||
|
const OTA_DOWNLOAD_LOG_WINDOW_MS = 10 * 60 * 1000;
|
||||||
|
|
||||||
// Serve APK download
|
// Serve APK download
|
||||||
app.get('/download/apk', (req, res) => {
|
app.get('/download/apk', (req, res) => {
|
||||||
const apkPath = resolveApkPath();
|
const apkPath = resolveApkPath();
|
||||||
if (apkPath) {
|
if (apkPath) {
|
||||||
// #96: an APK download means a device is actually applying an OTA - log it so the
|
const ip = getClientIp(req);
|
||||||
// update is observable end to end (check -> download -> [relaunch]).
|
const now = Date.now();
|
||||||
console.log(`[ota] APK download by ${getClientIp(req)} (${fs.statSync(apkPath).size} bytes) - OTA update in progress`);
|
if (now - (otaDownloadLoggedAt.get(ip) || 0) > OTA_DOWNLOAD_LOG_WINDOW_MS) {
|
||||||
|
otaDownloadLoggedAt.set(ip, now);
|
||||||
|
console.log(`[ota] APK served to ${ip} (${fs.statSync(apkPath).size} bytes)`);
|
||||||
|
}
|
||||||
res.setHeader('Content-Type', 'application/vnd.android.package-archive');
|
res.setHeader('Content-Type', 'application/vnd.android.package-archive');
|
||||||
res.setHeader('Content-Disposition', 'attachment; filename="ScreenTinker.apk"');
|
res.setHeader('Content-Disposition', 'attachment; filename="ScreenTinker.apk"');
|
||||||
res.setHeader('Cache-Control', 'no-cache');
|
res.setHeader('Cache-Control', 'no-cache');
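The once-per-IP-per-window logging in this hunk can be sketched in isolation as below. `createLogLimiter`, `shouldLog` and the `maxKeys` bound are hypothetical names for this example. The pruning step is also an addition: the visible hunk does not show the map being trimmed, and whether that matters depends on how many distinct IPs reach the endpoint.

```javascript
// Hypothetical helper: allow one log line per key per window.
function createLogLimiter(windowMs, maxKeys = 10000) {
  const lastLoggedAt = new Map(); // key -> last-logged ms

  return function shouldLog(key, now = Date.now()) {
    const last = lastLoggedAt.get(key);
    // Still inside the window for this key: suppress the line.
    if (last !== undefined && now - last <= windowMs) return false;

    // Assumption, not in the diff: once the map is large, drop entries whose
    // window has already expired so it cannot grow without bound.
    if (lastLoggedAt.size >= maxKeys) {
      for (const [k, t] of lastLoggedAt) {
        if (now - t > windowMs) lastLoggedAt.delete(k);
      }
    }

    lastLoggedAt.set(key, now);
    return true;
  };
}
```

A handler would call `shouldLog(ip)` and emit the log line only when it returns `true`. The APK itself is still served on every request.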
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
const { db } = require('../db/database');
|
const { db, pruneStatusLog } = require('../db/database');
|
||||||
const config = require('../config');
|
const config = require('../config');
|
||||||
const { deviceRoom, emitToWorkspace } = require('../lib/socket-rooms');
|
const { deviceRoom, emitToWorkspace } = require('../lib/socket-rooms');
|
||||||
|
|
||||||
|
|
@ -6,6 +6,10 @@ const { deviceRoom, emitToWorkspace } = require('../lib/socket-rooms');
|
||||||
const deviceConnections = new Map();
|
const deviceConnections = new Map();
|
||||||
|
|
||||||
function startHeartbeatChecker(io) {
|
function startHeartbeatChecker(io) {
|
||||||
|
// #142: sweep stale device_status_log rows once at startup (recovers a bloated
|
||||||
|
// table immediately after a deploy), then again on each interval below.
|
||||||
|
pruneStatusLog();
|
||||||
|
|
||||||
setInterval(() => {
|
setInterval(() => {
|
||||||
const now = Date.now();
|
const now = Date.now();
|
||||||
const dashboardNs = io.of('/dashboard');
|
const dashboardNs = io.of('/dashboard');
|
||||||
|
|
@ -36,19 +40,18 @@ function startHeartbeatChecker(io) {
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
// Cleanup: delete unclaimed provisioning devices older than 24 hours
|
// Cleanup: delete unclaimed provisioning devices older than 24 hours.
|
||||||
// Keep imported devices (they have user_id set) so users can re-pair them
|
pruneProvisioningDevices();
|
||||||
db.prepare(`
|
|
||||||
DELETE FROM devices WHERE status = 'provisioning'
|
|
||||||
AND user_id IS NULL
|
|
||||||
AND created_at < strftime('%s','now') - (365 * 86400)
|
|
||||||
`).run();
|
|
||||||
|
|
||||||
// Cleanup: prune play logs older than 90 days
|
// Cleanup: prune play logs older than 90 days
|
||||||
db.prepare(`
|
db.prepare(`
|
||||||
DELETE FROM play_logs WHERE started_at < strftime('%s','now') - (90 * 86400)
|
DELETE FROM play_logs WHERE started_at < strftime('%s','now') - (90 * 86400)
|
||||||
`).run();
|
`).run();
|
||||||
|
|
||||||
|
// #142: global device_status_log retention sweep (all devices, incl. removed/idle
|
||||||
|
// and the offline_timeout insert path that bypasses the per-device prune).
|
||||||
|
pruneStatusLog();
|
||||||
|
|
||||||
// Cleanup: expired team invites
|
// Cleanup: expired team invites
|
||||||
db.prepare(`
|
db.prepare(`
|
||||||
DELETE FROM team_invites WHERE expires_at < strftime('%s','now')
|
DELETE FROM team_invites WHERE expires_at < strftime('%s','now')
|
||||||
|
|
@ -83,11 +86,25 @@ function getAllConnections() {
|
||||||
return deviceConnections;
|
return deviceConnections;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// #142: sweep unclaimed provisioning devices older than 24h. The window previously
|
||||||
|
// read `365 * 86400` (a YEAR), contradicting its own "older than 24 hours" comment,
|
||||||
|
// so socket-register pairing junk lingered far longer than intended. Imported
|
||||||
|
// devices keep a user_id and are preserved so they can be re-paired. Extracted from
|
||||||
|
// the interval above so the correctness fix is unit-testable. Returns rows deleted.
|
||||||
|
function pruneProvisioningDevices() {
|
||||||
|
return db.prepare(`
|
||||||
|
DELETE FROM devices
|
||||||
|
WHERE status = 'provisioning' AND user_id IS NULL
|
||||||
|
AND created_at < strftime('%s','now') - (24 * 3600)
|
||||||
|
`).run().changes;
|
||||||
|
}
|
||||||
|
|
||||||
module.exports = {
|
module.exports = {
|
||||||
startHeartbeatChecker,
|
startHeartbeatChecker,
|
||||||
registerConnection,
|
registerConnection,
|
||||||
updateHeartbeat,
|
updateHeartbeat,
|
||||||
removeConnection,
|
removeConnection,
|
||||||
getConnection,
|
getConnection,
|
||||||
getAllConnections
|
getAllConnections,
|
||||||
|
pruneProvisioningDevices
|
||||||
};
|
};
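The effect of the window correction is easy to check with plain arithmetic. The sketch below uses a hypothetical `isStale` helper and an arbitrary fixed timestamp to compare the two windows for a 25-hour-old unclaimed row. It mirrors the SQL comparison `created_at < now - window` but does not touch a database.

```javascript
const HOUR = 3600;
const DAY = 86400;

// True when a row created at `createdAt` (unix seconds) is older than `windowSec`.
function isStale(createdAt, nowSec, windowSec) {
  return createdAt < nowSec - windowSec;
}

const now = 1700000000;           // arbitrary fixed "now" for the example
const created = now - 25 * HOUR;  // a 25-hour-old unclaimed provisioning row

const sweptByNewWindow = isStale(created, now, 24 * HOUR); // corrected 24h window
const sweptByOldWindow = isStale(created, now, 365 * DAY); // previous year-long window
```

With the corrected window the row is swept. With the old `365 * 86400` window it would have survived for roughly another year.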
|
||||||
|
|
|
||||||
107
server/services/loop-lag.js
Normal file
|
|
@ -0,0 +1,107 @@
// #142 — Event-loop lag telemetry (the data subsystem; ships before the throttle).
//
// Continuously samples event-loop delay via perf_hooks.monitorEventLoopDelay()
// (a C++-backed histogram — cheap). Each window we read mean/p50/p99/max, persist
// a row to the bounded `event_loop_lag` table, and recompute a coarse load BAND
// (normal | elevated | critical) from the window p99.
//
// The band is consumed by the reconnect throttle (#142 step 3), but this module
// has standalone value: getLag() is surfaced on /api/status and band changes are
// logged, so site connectivity/lag is diagnosable independent of any throttling.
//
// Band transitions are deliberately asymmetric (see nextBand): jump UP immediately
// when an up-threshold is crossed (tighten fast), step DOWN only one level at a
// time after lagReleaseSamples consecutive calm samples below a deadband (release
// slow). This avoids band flap from transient blips.

const { monitorEventLoopDelay } = require('perf_hooks');
const { db } = require('../db/database');
const config = require('../config');

const NS_PER_MS = 1e6;
// A band releases only once p99 falls below this fraction of the band's entry
// threshold — the deadband that stops small fluctuations from flapping the band.
const DEADBAND = 0.5;
const LEVEL = { normal: 0, elevated: 1, critical: 2 };

let histogram = null;
let band = 'normal';
let calmSamples = 0;
let current = { mean_ms: 0, p50_ms: 0, p99_ms: 0, max_ms: 0, band: 'normal', sampled_at: 0 };

// Pure band-transition function (exported for deterministic unit tests). Given the
// current band, the window p99 (ms), and the running calm-sample count, returns the
// next [band, calmSamples]. Up is immediate (may skip a level); down is one step
// per release window, gated by a deadband.
function nextBand(cur, p99, calm) {
  const level = LEVEL[cur] ?? 0;
  // UP — immediate, tighten fast (normal can jump straight to critical).
  if (p99 >= config.lagCriticalMs && level < LEVEL.critical) return ['critical', 0];
  if (p99 >= config.lagElevatedMs && level < LEVEL.elevated) return ['elevated', 0];
  // DOWN — slow, one step, only below the current band's deadband.
  if (level === LEVEL.critical && p99 <= config.lagCriticalMs * DEADBAND) {
    const c = calm + 1;
    return c >= config.lagReleaseSamples ? ['elevated', 0] : ['critical', c];
  }
  if (level === LEVEL.elevated && p99 <= config.lagElevatedMs * DEADBAND) {
    const c = calm + 1;
    return c >= config.lagReleaseSamples ? ['normal', 0] : ['elevated', c];
  }
  // Hold (inside deadband, or already normal): reset the calm counter.
  return [cur, 0];
}

const round2 = (x) => Math.round(x * 100) / 100;

function sample() {
  const p99 = histogram.percentile(99) / NS_PER_MS;
  const snap = {
    mean_ms: round2(histogram.mean / NS_PER_MS),
    p50_ms: round2(histogram.percentile(50) / NS_PER_MS),
    p99_ms: round2(p99),
    max_ms: round2(histogram.max / NS_PER_MS),
  };
  histogram.reset();

  const prev = band;
  [band, calmSamples] = nextBand(band, snap.p99_ms, calmSamples);
  current = { ...snap, band, sampled_at: Math.floor(Date.now() / 1000) };

  try {
    db.prepare(
      'INSERT INTO event_loop_lag (sampled_at, mean_ms, p50_ms, p99_ms, max_ms, band) VALUES (?, ?, ?, ?, ?, ?)'
    ).run(current.sampled_at, snap.mean_ms, snap.p50_ms, snap.p99_ms, snap.max_ms, band);
  } catch (_) { /* table may not exist on a partially-migrated DB */ }

  // Observable: log whenever we're loaded or when the band changes (incl. back to
  // normal). Healthy steady state stays quiet.
  if (band !== 'normal' || prev !== 'normal') {
    const tag = band !== prev ? ` (was ${prev})` : '';
    console.log(`[loop-lag] band=${band}${tag} mean=${snap.mean_ms}ms p99=${snap.p99_ms}ms max=${snap.max_ms}ms`);
  }
}

function pruneLag() {
  try {
    const cutoff = Math.floor(Date.now() / 1000) - Math.round(config.lagTelemetryRetentionDays * 86400);
    const n = db.prepare('DELETE FROM event_loop_lag WHERE sampled_at < ?').run(cutoff).changes;
    if (n > 0) console.log(`[loop-lag] pruned ${n} sample(s) older than ${config.lagTelemetryRetentionDays}d`);
  } catch (_) { /* ignore */ }
}

function startLoopLagMonitor() {
  if (histogram) return; // idempotent
  histogram = monitorEventLoopDelay({ resolution: config.lagResolutionMs });
  histogram.enable();
  const t1 = setInterval(sample, config.lagSampleIntervalMs);
  pruneLag(); // sweep stale rows on boot
  const t2 = setInterval(pruneLag, config.lagPruneIntervalMs);
  // Don't keep the process alive on these timers (matters for tests / clean exit).
  if (t1.unref) t1.unref();
  if (t2.unref) t2.unref();
}

function getBand() { return band; }
function getLag() { return { ...current }; }

module.exports = { startLoopLagMonitor, getBand, getLag, nextBand };
|
||||||
|
|
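To make the asymmetric band transitions in `nextBand` concrete, here is a standalone sketch that replays a series of p99 samples. The thresholds (100 ms, 250 ms, 5 release samples, 0.5 deadband) are the defaults quoted in the unit-test comment. `CFG`, `step` and `trace` are hypothetical names, and the logic is a re-statement for illustration rather than the module's own code.

```javascript
// Assumed thresholds, copied from the defaults quoted in the unit tests.
const CFG = { elevatedMs: 100, criticalMs: 250, releaseSamples: 5, deadband: 0.5 };
const ORDER = ['normal', 'elevated', 'critical'];

// Returns [nextBand, calmCount] for one window's p99 (ms).
function step(cur, p99, calm, cfg = CFG) {
  const level = Math.max(0, ORDER.indexOf(cur));
  // UP: immediate, and may skip a level.
  if (p99 >= cfg.criticalMs && level < 2) return ['critical', 0];
  if (p99 >= cfg.elevatedMs && level < 1) return ['elevated', 0];
  // DOWN: one level at a time, only after enough consecutive calm samples
  // below the deadband of the current band's entry threshold.
  if (level > 0) {
    const entry = level === 2 ? cfg.criticalMs : cfg.elevatedMs;
    if (p99 <= entry * cfg.deadband) {
      const c = calm + 1;
      return c >= cfg.releaseSamples ? [ORDER[level - 1], 0] : [cur, c];
    }
  }
  // HOLD: inside the deadband or already normal, so the calm count resets.
  return [cur, 0];
}

// Replay a series of p99 samples and collect the band after each one.
function trace(samples, start = 'normal') {
  let band = start;
  let calm = 0;
  return samples.map((p99) => {
    [band, calm] = step(band, p99, calm);
    return band;
  });
}
```

Feeding `trace` one 300 ms spike followed by ten 10 ms samples shows the intended shape: the band jumps to `critical` at once, needs five calm samples to fall to `elevated`, and five more to reach `normal`.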
@ -259,6 +259,32 @@ test('device WS: wrong device_token is rejected (auth-error, never registered)',
|
||||||
assert.ok(!got.registered, 'wrong token must not register');
|
assert.ok(!got.registered, 'wrong token must not register');
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// #139 Phase 2 (Option B): event-driven OTA status. Registers (which, with no ota fields in
|
||||||
|
// device_info, persists ota_status='none' via the backstop), then emits a valid ota-status and
|
||||||
|
// a foreign-id one in order on the authenticated socket.
|
||||||
|
function deviceOtaSeq(payload, otaEvents, timeoutMs = 4000) {
|
||||||
|
return new Promise((resolve) => {
|
||||||
|
const sock = ioClient(`${BASE}/device`, { transports: ['websocket'], reconnection: false, forceNew: true });
|
||||||
|
const finish = () => { try { sock.close(); } catch { /* */ } resolve(); };
|
||||||
|
sock.on('connect', () => sock.emit('device:register', payload));
|
||||||
|
sock.on('device:registered', () => { for (const e of otaEvents) sock.emit('device:ota-status', e); setTimeout(finish, 500); });
|
||||||
|
sock.on('device:auth-error', finish);
|
||||||
|
setTimeout(finish, timeoutMs);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
test('device WS: device:ota-status persists the fields; a foreign device_id is a safe no-op (#139)', async () => {
|
||||||
|
await deviceOtaSeq(
|
||||||
|
{ device_id: S.deviceId, device_token: S.deviceToken, device_info: { app_version: 'test' } },
|
||||||
|
[
|
||||||
|
{ device_id: S.deviceId, ota_status: 'manual_update_required', ota_target_version: '1.9.1-beta6', ota_attempts: 3 },
|
||||||
|
{ device_id: 'nope-not-a-device', ota_status: 'none', ota_target_version: null, ota_attempts: 0 }, // foreign id -> no-op, no throw
|
||||||
|
]);
|
||||||
|
const dev = await jfetch(`/api/devices/${S.deviceId}`, auth(S.jwt));
|
||||||
|
assert.equal(dev.body.ota_status, 'manual_update_required', 'valid ota-status persisted');
|
||||||
|
assert.equal(dev.body.ota_target_version, '1.9.1-beta6');
|
||||||
|
assert.equal(dev.body.ota_attempts, 3, 'and the foreign-id event did not overwrite it');
|
||||||
|
});
|
||||||
|
|
||||||
// ───────────────────────── TIER 4: #92 FOLLOW-UP COVERAGE ─────────────────────────
|
// ───────────────────────── TIER 4: #92 FOLLOW-UP COVERAGE ─────────────────────────
|
||||||
// The non-security gaps named in the self-review (issue #92): the gap-fix fields + the
|
// The non-security gaps named in the self-review (issue #92): the gap-fix fields + the
|
||||||
// cross-tenant guard (the security-relevant one), docs serving, and the token lifecycle
|
// cross-tenant guard (the security-relevant one), docs serving, and the token lifecycle
|
||||||
|
|
|
||||||
85
server/test/content-ack-dedup.test.js
Normal file
|
|
@ -0,0 +1,85 @@
|
||||||
|
'use strict';
|
||||||
|
|
||||||
|
// #142 step 5 — content-ack dedup. Repeated identical (device_id, content_id, status)
|
||||||
|
// reports are suppressed within config.contentAckDedupMs; a status change or a report
|
||||||
|
// after the window passes. Observed via the server log (the handler logs+emits only
|
||||||
|
// when it does NOT dedup). Unique PORT (3984) so it cannot collide with other test servers.
|
||||||
|
|
||||||
|
const { test, before, after } = require('node:test');
|
||||||
|
const assert = require('node:assert/strict');
|
||||||
|
const { spawn } = require('node:child_process');
|
||||||
|
const path = require('node:path');
|
||||||
|
const os = require('node:os');
|
||||||
|
const fs = require('node:fs');
|
||||||
|
const crypto = require('node:crypto');
|
||||||
|
const ioClient = require('socket.io-client');
|
||||||
|
|
||||||
|
const PORT = 3984;
|
||||||
|
const BASE = `http://127.0.0.1:${PORT}`;
|
||||||
|
const DATA_DIR = path.join(os.tmpdir(), 'st-ack-' + crypto.randomBytes(4).toString('hex'));
|
||||||
|
const LOG = path.join(os.tmpdir(), 'st-ack-' + crypto.randomBytes(4).toString('hex') + '.log');
|
||||||
|
const DEDUP_MS = 600;
|
||||||
|
let proc;
|
||||||
|
|
||||||
|
const sleep = (ms) => new Promise(r => setTimeout(r, ms));
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
const logFd = fs.openSync(LOG, 'w');
|
||||||
|
proc = spawn('node', ['server.js'], {
|
||||||
|
cwd: path.join(__dirname, '..'),
|
||||||
|
env: { ...process.env, DATA_DIR, SELF_HOSTED: 'true', PORT: String(PORT), NODE_ENV: 'test', CONTENT_ACK_DEDUP_MS: String(DEDUP_MS) },
|
||||||
|
stdio: ['ignore', logFd, logFd],
|
||||||
|
});
|
||||||
|
let up = false;
|
||||||
|
for (let i = 0; i < 80; i++) {
|
||||||
|
try { const r = await fetch(BASE + '/api/status'); if (r.ok) { up = true; break; } } catch { /* */ }
|
||||||
|
await sleep(250);
|
||||||
|
}
|
||||||
|
if (!up) throw new Error('server did not boot:\n' + fs.readFileSync(LOG, 'utf8').slice(-2000));
|
||||||
|
});
|
||||||
|
|
||||||
|
after(() => { try { proc.kill('SIGKILL'); } catch { /* */ } });
|
||||||
|
|
||||||
|
function provision() {
|
||||||
|
const code = String(crypto.randomInt(100000, 1000000));
|
||||||
|
return new Promise((resolve) => {
|
||||||
|
const sock = ioClient(`${BASE}/device`, { transports: ['websocket'], reconnection: false, forceNew: true });
|
||||||
|
sock.on('connect', () => sock.emit('device:register', { pairing_code: code }));
|
||||||
|
sock.on('device:registered', (d) => { try { sock.close(); } catch { /* */ } resolve({ id: d.device_id, token: d.device_token }); });
|
||||||
|
setTimeout(() => resolve(null), 4000);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function openRegistered(dev) {
|
||||||
|
return new Promise((resolve, reject) => {
|
||||||
|
const sock = ioClient(`${BASE}/device`, { transports: ['websocket'], reconnection: false, forceNew: true });
|
||||||
|
sock.on('connect', () => sock.emit('device:register', { device_id: dev.id, device_token: dev.token, device_info: { app_version: 'test' } }));
|
||||||
|
sock.on('device:registered', () => resolve(sock));
|
||||||
|
sock.on('device:auth-error', () => reject(new Error('auth-error')));
|
||||||
|
setTimeout(() => reject(new Error('register timeout')), 4000);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
test('repeated identical content-acks are deduped; window-expiry and status-change pass', async () => {
|
||||||
|
const dev = await provision();
|
||||||
|
assert.ok(dev, 'device provisioned');
|
||||||
|
const sock = await openRegistered(dev);
|
||||||
|
const cid = 'cid-' + crypto.randomBytes(3).toString('hex');
|
||||||
|
|
||||||
|
// 5 rapid identical "ready" within the dedup window -> only ONE should log/emit
|
||||||
|
for (let i = 0; i < 5; i++) { sock.emit('device:content-ack', { device_id: dev.id, content_id: cid, status: 'ready' }); await sleep(40); }
|
||||||
|
// wait past the window, then "ready" again -> passes (a fresh report)
|
||||||
|
await sleep(DEDUP_MS + 250);
|
||||||
|
sock.emit('device:content-ack', { device_id: dev.id, content_id: cid, status: 'ready' });
|
||||||
|
// a status CHANGE has a different key -> passes immediately
|
||||||
|
await sleep(60);
|
||||||
|
sock.emit('device:content-ack', { device_id: dev.id, content_id: cid, status: 'error' });
|
||||||
|
await sleep(400);
|
||||||
|
try { sock.close(); } catch { /* */ }
|
||||||
|
|
||||||
|
const log = fs.readFileSync(LOG, 'utf8');
|
||||||
|
const ready = (log.match(new RegExp(`content ${cid}: ready`, 'g')) || []).length;
|
||||||
|
const err = (log.match(new RegExp(`content ${cid}: error`, 'g')) || []).length;
|
||||||
|
assert.equal(ready, 2, 'a burst of identical "ready" collapses to one; a second after the window passes -> 2 total');
|
||||||
|
assert.equal(err, 1, 'a status change is not deduped');
|
||||||
|
});
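The handler that performs the dedup is not part of this excerpt, so the following is only a guess at its shape: a map keyed on `(device_id, content_id, status)` that rejects a repeat inside the window. `createAckDedup` and `accept` are hypothetical names, and measuring the window from the last accepted report (rather than the last seen one) is an assumption.

```javascript
// Hypothetical sketch of a content-ack dedup window.
function createAckDedup(windowMs) {
  const lastAccepted = new Map(); // serialized key -> last accepted ms

  return function accept({ device_id, content_id, status }, now = Date.now()) {
    // JSON keeps the three parts unambiguous even if an id contains a separator.
    const key = JSON.stringify([device_id, content_id, status]);
    const prev = lastAccepted.get(key);
    if (prev !== undefined && now - prev < windowMs) return false; // duplicate
    lastAccepted.set(key, now);
    return true;
  };
}
```

Replaying the timeline of the test above against this sketch (five `ready` reports 40 ms apart, one more after the 600 ms window, then an `error`) accepts two `ready` reports and one `error`, matching the counts the test asserts on the server log.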
|
||||||
64
server/test/loop-lag-integration.test.js
Normal file
|
|
@ -0,0 +1,64 @@
|
||||||
|
'use strict';
|
||||||
|
|
||||||
|
// #142 step 2 — integration: the lag monitor samples, persists to a BOUNDED table,
|
||||||
|
// and surfaces current lag on /api/status. Boots the real server with fast sampling
|
||||||
|
// and a tiny (fractional-day) retention so the prune is observable within the test.
|
||||||
|
|
||||||
|
const { test, before, after } = require('node:test');
|
||||||
|
const assert = require('node:assert/strict');
|
||||||
|
const { spawn } = require('node:child_process');
|
||||||
|
const path = require('node:path');
|
||||||
|
const os = require('node:os');
|
||||||
|
const fs = require('node:fs');
|
||||||
|
const crypto = require('node:crypto');
|
||||||
|
const Database = require('better-sqlite3');
|
||||||
|
|
||||||
|
const PORT = 3982;
|
||||||
|
const BASE = `http://127.0.0.1:${PORT}`;
|
||||||
|
const DATA_DIR = path.join(os.tmpdir(), 'st-lag-int-' + crypto.randomBytes(4).toString('hex'));
|
||||||
|
const LOG = path.join(os.tmpdir(), 'st-lag-int-' + crypto.randomBytes(4).toString('hex') + '.log');
|
||||||
|
let proc;
|
||||||
|
|
||||||
|
before(async () => {
|
||||||
|
const logFd = fs.openSync(LOG, 'w');
|
||||||
|
proc = spawn('node', ['server.js'], {
|
||||||
|
cwd: path.join(__dirname, '..'),
|
||||||
|
env: {
|
||||||
|
...process.env, DATA_DIR, SELF_HOSTED: 'true', PORT: String(PORT), NODE_ENV: 'test',
|
||||||
|
LAG_SAMPLE_INTERVAL_MS: '200', // sample fast
|
||||||
|
LAG_TELEMETRY_RETENTION_DAYS: '0.00001', // ~0.86s retention
|
||||||
|
LAG_PRUNE_INTERVAL_MS: '400', // prune often
|
||||||
|
},
|
||||||
|
stdio: ['ignore', logFd, logFd],
|
||||||
|
});
|
||||||
|
let up = false;
|
||||||
|
for (let i = 0; i < 80; i++) {
|
||||||
|
try { const r = await fetch(BASE + '/api/status'); if (r.ok) { up = true; break; } } catch { /* not yet */ }
|
||||||
|
await new Promise(r => setTimeout(r, 250));
|
||||||
|
}
|
||||||
|
if (!up) throw new Error('server did not boot:\n' + fs.readFileSync(LOG, 'utf8').slice(-2000));
|
||||||
|
});
|
||||||
|
|
||||||
|
after(() => { try { proc.kill('SIGKILL'); } catch { /* */ } });
|
||||||
|
|
||||||
|
test('/api/status exposes a current loop_lag snapshot', async () => {
|
||||||
|
const r = await fetch(BASE + '/api/status');
|
||||||
|
const body = await r.json();
|
||||||
|
assert.ok(body.loop_lag, 'loop_lag present on /api/status');
|
||||||
|
assert.ok(['normal', 'elevated', 'critical'].includes(body.loop_lag.band), 'band is a valid level');
|
||||||
|
assert.equal(typeof body.loop_lag.p99_ms, 'number', 'p99_ms is numeric');
|
||||||
|
assert.equal(typeof body.loop_lag.mean_ms, 'number', 'mean_ms is numeric');
|
||||||
|
});
|
||||||
|
|
||||||
|
test('lag samples are persisted AND bounded by retention prune (not unbounded)', async () => {
|
||||||
|
// Let it sample for ~3s. At 200ms/sample that is ~15 inserts, but with ~0.86s
|
||||||
|
// retention pruned every 400ms the table must stay small — proving the table
|
||||||
|
// can never become a second unbounded-growth table.
|
||||||
|
await new Promise(r => setTimeout(r, 1800));
|
||||||
|
const dbPath = path.join(DATA_DIR, 'db', 'remote_display.db');
|
||||||
|
const db = new Database(dbPath, { readonly: true });
|
||||||
|
const count = db.prepare('SELECT COUNT(*) c FROM event_loop_lag').get().c;
|
||||||
|
db.close();
|
||||||
|
assert.ok(count >= 1, 'lag samples are being persisted');
|
||||||
|
assert.ok(count < 15, `table is bounded by the prune (held ${count} rows over ~3s of 200ms sampling)`);
|
||||||
|
});
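One detail worth spelling out is how the fractional-day retention used by this test turns into a cutoff. `pruneLag()` rounds the retention to whole seconds, so `0.00001` days (0.864 s) becomes a 1 s window. The sketch below restates that arithmetic with hypothetical helper names, and it assumes the config layer parses the environment variable as a float.

```javascript
// Retention is given in days and rounded to whole seconds, as in pruneLag().
function retentionSeconds(days) {
  return Math.round(days * 86400);
}

// Rows with sampled_at (unix seconds) below this value would be deleted.
function cutoff(nowMs, days) {
  return Math.floor(nowMs / 1000) - retentionSeconds(days);
}
```

Because `sampled_at` is stored in whole seconds as well, the table in this test should hold samples from roughly the last one to two seconds, plus whatever accumulates between prune runs.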
|
||||||
57
server/test/loop-lag.test.js
Normal file
|
|
@ -0,0 +1,57 @@
'use strict';

// #142 step 2 — deterministic unit tests for the event-loop-lag band transitions.
// Pure function, no sockets/timing. Isolate the DB to a temp dir BEFORE requiring
// the module (requiring it pulls in db/database, which initialises a DB on load).

const os = require('node:os');
const path = require('node:path');
const crypto = require('node:crypto');
process.env.DATA_DIR = path.join(os.tmpdir(), 'st-lag-unit-' + crypto.randomBytes(4).toString('hex'));

const { test } = require('node:test');
const assert = require('node:assert/strict');
const { nextBand } = require('../services/loop-lag');

// config defaults exercised here: elevated=100ms, critical=250ms, releaseSamples=5,
// deadband=0.5 -> release-below thresholds: elevated@50ms, critical@125ms.

test('UP is immediate and can skip a level (tighten fast)', () => {
  assert.deepEqual(nextBand('normal', 50, 0), ['normal', 0], 'below elevated stays normal');
  assert.deepEqual(nextBand('normal', 100, 0), ['elevated', 0], 'crossing elevated up-threshold jumps immediately');
  assert.deepEqual(nextBand('normal', 250, 0), ['critical', 0], 'a big spike jumps normal->critical in one sample');
  assert.deepEqual(nextBand('elevated', 250, 0), ['critical', 0]);
});

test('deadband holds the band for small fluctuations (no flap)', () => {
  // elevated, p99 between release(50) and up(100) -> hold elevated, calm reset
  assert.deepEqual(nextBand('elevated', 80, 3), ['elevated', 0]);
  // critical, p99 between release(125) and up(250) -> hold critical
  assert.deepEqual(nextBand('critical', 200, 4), ['critical', 0]);
});

test('DOWN is slow: requires lagReleaseSamples calm samples below the deadband', () => {
  // elevated -> normal only after 5 consecutive calm samples
  let band = 'elevated', calm = 0;
  for (let i = 0; i < 4; i++) {
    [band, calm] = nextBand(band, 20, calm);
    assert.equal(band, 'elevated', `still elevated after ${i + 1} calm sample(s)`);
  }
  [band, calm] = nextBand(band, 20, calm); // 5th
  assert.deepEqual([band, calm], ['normal', 0], 'drops to normal on the 5th calm sample');
});

test('DOWN releases one level at a time: critical -> elevated -> normal', () => {
  let band = 'critical', calm = 0;
  for (let i = 0; i < 5; i++) [band, calm] = nextBand(band, 10, calm);
  assert.equal(band, 'elevated', 'critical releases to elevated, never straight to normal');
  for (let i = 0; i < 5; i++) [band, calm] = nextBand(band, 10, calm);
  assert.equal(band, 'normal', 'then elevated releases to normal');
});

test('a single calm sample does not release (calm counter resets on a non-calm sample)', () => {
  let [band, calm] = nextBand('elevated', 20, 0); // calm=1
  assert.deepEqual([band, calm], ['elevated', 1]);
  [band, calm] = nextBand(band, 80, calm); // back inside deadband -> reset
  assert.deepEqual([band, calm], ['elevated', 0], 'one blip resets the release counter');
});
|
||||||
|
|
@ -91,6 +91,24 @@ test('muted reaches the device via the published snapshot (buildSnapshotItems)',
|
||||||
assert.equal(item.muted, 1, 'snapshot (device payload) carries muted=1');
|
assert.equal(item.muted, 1, 'snapshot (device payload) carries muted=1');
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test('mute toggle patches the published snapshot WITHOUT a manual republish (the beta7 bug)', async () => {
|
||||||
|
// Baseline: publish once so the device has a snapshot carrying muted=0.
|
||||||
|
await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { muted: false }));
|
||||||
|
await jfetch(`/api/playlists/${S.playlistId}/publish`, post(S.jwt, {}));
|
||||||
|
const read = () => JSON.parse(db.prepare('SELECT published_snapshot FROM playlists WHERE id = ?').get(S.playlistId).published_snapshot)
|
||||||
|
.find((i) => i.content_id === S.contentId).muted;
|
||||||
|
assert.equal(read(), 0, 'baseline: snapshot the device plays carries muted=0');
|
||||||
|
|
||||||
|
// The actual bug: a mute toggle ALONE (no /publish) must reach the played snapshot.
|
||||||
|
// On beta7 this stayed 0 (markDraft only) so every loop re-applied full volume.
|
||||||
|
await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { muted: true }));
|
||||||
|
assert.equal(read(), 1, 'mute toggle patched the snapshot the device plays — no manual republish needed');
|
||||||
|
|
||||||
|
// Unmute toggle reverts the snapshot too.
|
||||||
|
await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { muted: false }));
|
||||||
|
assert.equal(read(), 0, 'unmute toggle patched the snapshot back to 0');
|
||||||
|
});
|
||||||
|
|
||||||
test('PUT ignoring muted (other field) leaves muted untouched', async () => {
|
test('PUT ignoring muted (other field) leaves muted untouched', async () => {
|
||||||
await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { muted: true }));
|
await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { muted: true }));
|
||||||
const r = await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { duration_sec: 15 }));
|
const r = await jfetch(`/api/assignments/${S.itemId}`, put(S.jwt, { duration_sec: 15 }));
|
||||||
|
|
|
||||||
41
server/test/provisioning-cleanup.test.js
Normal file
|
|
@ -0,0 +1,41 @@
'use strict';

// #142 (cut 2) — provisioning-row cleanup window correctness. The sweep deletes
// UNCLAIMED provisioning devices older than 24h (it previously used 365*86400 — a
// year — contradicting its own comment). Imported devices (user_id set) and
// non-provisioning devices are preserved. Deterministic, in-process (no server).

const os = require('node:os');
const path = require('node:path');
const crypto = require('node:crypto');
process.env.DATA_DIR = path.join(os.tmpdir(), 'st-provclean-' + crypto.randomBytes(4).toString('hex'));

const { test } = require('node:test');
const assert = require('node:assert/strict');
const { db } = require('../db/database');
const { pruneProvisioningDevices } = require('../services/heartbeat');

test('sweeps unclaimed provisioning devices older than 24h, keeps the rest', () => {
  db.pragma('foreign_keys = OFF'); // seed user_id without a real users row
  db.exec('DELETE FROM devices');
  const ins = db.prepare("INSERT INTO devices (id, status, user_id, created_at) VALUES (?, ?, ?, strftime('%s','now') - ?)");
  ins.run('old-unclaimed', 'provisioning', null, 25 * 3600); // >24h, unclaimed -> SWEPT
  ins.run('new-unclaimed', 'provisioning', null, 1 * 3600); // <24h, unclaimed -> kept
  ins.run('old-imported', 'provisioning', 'u-imported', 25 * 3600); // >24h but imported (user_id) -> kept
  ins.run('old-online', 'online', null, 25 * 3600); // >24h but not provisioning -> kept
  db.pragma('foreign_keys = ON');

  assert.equal(db.prepare('SELECT COUNT(*) c FROM devices').get().c, 4, 'seeded 4');

  const deleted = pruneProvisioningDevices();
  assert.equal(deleted, 1, 'only the >24h unclaimed provisioning device is swept');

  const ids = db.prepare('SELECT id FROM devices ORDER BY id').all().map(r => r.id);
  assert.deepEqual(ids, ['new-unclaimed', 'old-imported', 'old-online']);
  // regression guard: a 25h-old row sits well inside the OLD 365-day window, so this
  // would have survived before the fix.
});

test('idempotent: a second sweep with nothing stale deletes nothing', () => {
  assert.equal(pruneProvisioningDevices(), 0);
});
|
||||||
113
server/test/reconnect-throttle-integration.test.js
Normal file
|
|
@@ -0,0 +1,113 @@
'use strict';

// #142 step 3 — REQUIRED GATE TEST + storm + neighbor, over real sockets.
//
// Boots the real server with warm-up ACTIVE (default) so the whole suite runs in
// the cold-start window — the exact "right after a deploy" scenario. Hard ceiling
// and window are tightened so the storm trips quickly without thousands of connects;
// fleet devices stay well under the ceiling.

const { test, before, after } = require('node:test');
const assert = require('node:assert/strict');
const { spawn } = require('node:child_process');
const path = require('node:path');
const os = require('node:os');
const fs = require('node:fs');
const crypto = require('node:crypto');
const ioClient = require('socket.io-client');

const PORT = 3983;
const BASE = `http://127.0.0.1:${PORT}`;
const DATA_DIR = path.join(os.tmpdir(), 'st-thr-int-' + crypto.randomBytes(4).toString('hex'));
const LOG = path.join(os.tmpdir(), 'st-thr-int-' + crypto.randomBytes(4).toString('hex') + '.log');
let proc;

before(async () => {
  const logFd = fs.openSync(LOG, 'w');
  proc = spawn('node', ['server.js'], {
    cwd: path.join(__dirname, '..'),
    env: {
      ...process.env, DATA_DIR, SELF_HOSTED: 'true', PORT: String(PORT), NODE_ENV: 'test',
      // warm-up left at default (30s) so the whole test runs in the cold-start window
      RECONNECT_HARD_CEILING: '8',
      RECONNECT_WINDOW_MS: '5000',
      RECONNECT_BASE_MAX: '3',
    },
    stdio: ['ignore', logFd, logFd],
  });
  let up = false;
  for (let i = 0; i < 80; i++) {
    try { const r = await fetch(BASE + '/api/status'); if (r.ok) { up = true; break; } } catch { /* */ }
    await new Promise(r => setTimeout(r, 250));
  }
  if (!up) throw new Error('server did not boot:\n' + fs.readFileSync(LOG, 'utf8').slice(-2000));
});

after(() => { try { proc.kill('SIGKILL'); } catch { /* */ } });

// Provision a brand-new device via a UNIQUE pairing code -> returns {device_id, device_token}.
function provision() {
  const code = String(crypto.randomInt(100000, 1000000));
  return new Promise((resolve) => {
    const sock = ioClient(`${BASE}/device`, { transports: ['websocket'], reconnection: false, forceNew: true });
    sock.on('connect', () => sock.emit('device:register', { pairing_code: code }));
    sock.on('device:registered', (d) => { try { sock.close(); } catch { /* */ } resolve({ id: d.device_id, token: d.device_token }); });
    setTimeout(() => { try { sock.close(); } catch { /* */ } resolve(null); }, 4000);
  });
}

// One genuine reconnect (new socket). Resolves {registered, throttled}.
function reconnect(dev) {
  return new Promise((resolve) => {
    const sock = ioClient(`${BASE}/device`, { transports: ['websocket'], reconnection: false, forceNew: true });
    let done = false;
    const finish = (r) => { if (done) return; done = true; try { sock.close(); } catch { /* */ } resolve(r); };
    sock.on('connect', () => sock.emit('device:register', { device_id: dev.id, device_token: dev.token, device_info: { app_version: 'test' } }));
    sock.on('device:registered', () => finish({ registered: true, throttled: false }));
    sock.on('device:throttled', () => finish({ registered: false, throttled: true }));
    setTimeout(() => finish({ registered: false, throttled: false }), 1500);
  });
}

test('GATE: full-fleet reconnect right after restart throttles NO healthy device', async () => {
  // 12 distinct devices, each reconnecting twice in quick succession — a deploy-time
  // herd. The loop is transiently busy, but per-device keying means none is flagged.
  const fleet = [];
  for (let i = 0; i < 12; i++) { const d = await provision(); assert.ok(d, 'device provisioned'); fleet.push(d); }

  let registered = 0, throttled = 0;
  // two reconnect rounds across the whole fleet
  for (let round = 0; round < 2; round++) {
    const results = await Promise.all(fleet.map(reconnect));
    for (const r of results) { if (r.registered) registered++; if (r.throttled) throttled++; }
  }
  assert.equal(throttled, 0, 'NO healthy fleet device may be throttled at cold start');
  assert.equal(registered, 24, 'every fleet reconnect registered');
});

test('a single device storming IS throttled (backoff engages)', async () => {
  const dev = await provision();
  assert.ok(dev);
  let registered = 0, throttled = 0;
  // 12 sequential reconnects within the 5s window -> exceeds the hard ceiling (8)
  for (let i = 0; i < 12; i++) {
    const r = await reconnect(dev);
    if (r.registered) registered++;
    if (r.throttled) throttled++;
  }
  assert.ok(throttled >= 1, `storming device must be throttled (got ${throttled} throttle(s))`);
  assert.ok(registered < 12, `not all storm reconnects should succeed (got ${registered}/12)`);
});

test('neighbor isolation: a healthy device is unaffected while another storms', async () => {
  const stormer = await provision();
  const neighbor = await provision();
  assert.ok(stormer && neighbor);
  // storm the stormer hard
  for (let i = 0; i < 12; i++) await reconnect(stormer);
  // neighbor reconnects normally a couple of times -> must still register
  const a = await reconnect(neighbor);
  const b = await reconnect(neighbor);
  assert.ok(a.registered && b.registered, 'neighbor must register normally while another device storms');
  assert.ok(!a.throttled && !b.throttled, 'neighbor must not be throttled by another device');
});
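The `reconnect` helper in this test only records that `device:throttled` arrived. On the server side the event carries a `retry_after_ms` hint, so a player could use it to schedule its next attempt. The following is a minimal sketch of that idea, not code from any of the player apps: `nextReconnectDelay`, the fallback delay, the cap, and the jitter factor are all hypothetical.

```javascript
// Hypothetical client-side helper (not part of this change): choose how long to
// wait before the next reconnect after the server emits `device:throttled`.
function nextReconnectDelay(payload, opts = {}) {
  const { fallbackMs = 5000, capMs = 120000, jitter = Math.random } = opts;
  const hinted = Number(payload && payload.retry_after_ms);
  // Use the server hint when it is a positive number, otherwise a fixed fallback.
  const base = Number.isFinite(hinted) && hinted > 0 ? hinted : fallbackMs;
  // Add up to 10% jitter so several throttled devices do not retry in lockstep.
  const withJitter = base + Math.floor(base * 0.1 * jitter());
  return Math.min(withJitter, capMs);
}
```

With socket.io-client this could be wired up as `sock.on('device:throttled', (p) => setTimeout(connectAgain, nextReconnectDelay(p)))`, where `connectAgain` is whatever function the player uses to open a fresh socket (also an assumption here).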
98
server/test/reconnect-throttle.test.js
Normal file
@@ -0,0 +1,98 @@
'use strict';

// #142 step 3 — deterministic unit tests for the per-device reconnect throttle.
// Pure logic with injected `now` / band; isolate the DB before require (the module
// pulls in services/loop-lag -> db/database which initialises a DB on load).

const os = require('node:os');
const path = require('node:path');
const crypto = require('node:crypto');
process.env.DATA_DIR = path.join(os.tmpdir(), 'st-thr-unit-' + crypto.randomBytes(4).toString('hex'));

const { test, beforeEach } = require('node:test');
const assert = require('node:assert/strict');
const throttle = require('../lib/reconnect-throttle');

// config defaults: window=10000, baseMax=5, hardCeiling=20, baseBackoff=1000,
// maxBackoff=60000, releaseMs=30000, warmup=30000, elevMult=2, critMult=4.
const T0 = 1_000_000; // arbitrary epoch-ms origin for the warm-up clock
const POST = T0 + 40_000; // safely past the 30s warm-up
const WARM = T0 + 1_000; // inside the warm-up window

beforeEach(() => throttle.__resetForTest({ startedAt: T0 }));

test('healthy device is never throttled (<= baseMax genuine reconnects)', () => {
  for (let i = 0; i < 5; i++) {
    const v = throttle.check('A', POST + i, 'normal');
    assert.ok(v.allow, `reconnect ${i + 1} (<=baseMax) must be allowed`);
  }
});

test('a per-device storm IS throttled and the backoff GROWS (tighten fast)', () => {
  let v;
  for (let i = 0; i < 5; i++) v = throttle.check('B', POST + i, 'normal'); // 5 allowed
  v = throttle.check('B', POST + 5, 'normal'); // 6th -> flagged
  assert.equal(v.allow, false);
  assert.equal(v.reason, 'rate');
  assert.equal(v.observed, 6);
  assert.equal(v.allowed, 5);
  const b1 = v.retryAfterMs;
  // keep hammering while blocked -> escalate, longer backoff each time
  const b2 = throttle.check('B', POST + 6, 'normal').retryAfterMs;
  const b3 = throttle.check('B', POST + 7, 'normal').retryAfterMs;
  assert.ok(b2 > b1 && b3 > b2, `backoff must grow: ${b1} < ${b2} < ${b3}`);
});

test('lag band multiplies an already-flagged device\'s backoff (critical > normal)', () => {
  let v;
  for (let i = 0; i < 5; i++) throttle.check('N', POST + i, 'normal');
  v = throttle.check('N', POST + 5, 'normal');
  const normalBackoff = v.retryAfterMs;

  throttle.__resetForTest({ startedAt: T0 });
  for (let i = 0; i < 5; i++) throttle.check('C', POST + i, 'critical');
  v = throttle.check('C', POST + 5, 'critical');
  assert.ok(v.retryAfterMs > normalBackoff, `critical backoff ${v.retryAfterMs} > normal ${normalBackoff}`);
});

test('a healthy device is NOT throttled even when the band is critical (lag never gates the healthy)', () => {
  for (let i = 0; i < 5; i++) {
    const v = throttle.check('H', POST + i, 'critical');
    assert.ok(v.allow, 'healthy device stays allowed regardless of band');
  }
});

test('COLD START: during warm-up, moderate flapping (>baseMax, <ceiling) is NOT throttled', () => {
  for (let i = 0; i < 12; i++) { // 12 > baseMax(5) but < hardCeiling(20)
    const v = throttle.check('W', WARM + i, 'critical'); // band forced normal in warm-up anyway
    assert.ok(v.allow, `warm-up reconnect ${i + 1} must be lenient`);
  }
});

test('HARD CEILING is enforced even during warm-up (slow-ramp cannot train through)', () => {
  let v;
  for (let i = 0; i < 20; i++) {
    v = throttle.check('K', WARM + i, 'normal');
    assert.ok(v.allow, `warm-up reconnect ${i + 1} (<=ceiling) allowed`);
  }
  v = throttle.check('K', WARM + 20, 'normal'); // 21st -> over ceiling(20)
  assert.equal(v.allow, false);
  assert.equal(v.reason, 'hard-ceiling');
});

test('neighbor isolation: one device storming does not throttle another', () => {
  for (let i = 0; i < 10; i++) throttle.check('STORM', POST + i, 'normal'); // STORM gets throttled
  const v = throttle.check('NEIGHBOR', POST + 11, 'normal');
  assert.ok(v.allow, 'a different device must be unaffected');
});

test('release slow: escalation level decays after a calm period', () => {
  let v;
  for (let i = 0; i < 6; i++) v = throttle.check('R', POST + i, 'normal'); // flagged, level 1
  assert.ok(v.level >= 1);
  const peak = v.level;
  // a calm reconnect well past the window AND past releaseMs(30000)
  v = throttle.check('R', POST + 6 + 40_000, 'normal');
  assert.ok(v.allow, 'calm reconnect after the storm is allowed');
  assert.ok(v.level < peak, `level decays after calm: ${v.level} < ${peak}`);
});
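The unit tests above pin down the throttle's observable behavior without showing how `lib/reconnect-throttle.js` achieves it. As an illustration only, here is a minimal sketch of a per-device sliding-window throttle that would satisfy the same kind of checks. It is not the project's implementation: `createThrottle`, the internal state shape, the doubling backoff formula, and the one-level-per-calm-period decay are assumptions; only the default numbers are taken from the config comment in the test file.

```javascript
// Illustrative sketch only; NOT the code in lib/reconnect-throttle.js.
// Defaults mirror the values listed in the unit test's config comment.
function createThrottle(overrides = {}) {
  const c = {
    windowMs: 10000, baseMax: 5, hardCeiling: 20,
    baseBackoffMs: 1000, maxBackoffMs: 60000, releaseMs: 30000, warmupMs: 30000,
    bandMult: { normal: 1, elevated: 2, critical: 4 },
    startedAt: Date.now(),
    ...overrides,
  };
  const state = new Map(); // deviceId -> { hits: number[], level, lastFlaggedAt }

  function check(deviceId, now = Date.now(), band = 'normal') {
    let s = state.get(deviceId);
    if (!s) { s = { hits: [], level: 0, lastFlaggedAt: 0 }; state.set(deviceId, s); }

    // Slide the window: keep only this device's reconnects inside windowMs.
    s.hits = s.hits.filter((t) => now - t < c.windowMs);
    s.hits.push(now);

    // Release slowly: drop one escalation level per calm releaseMs (assumed policy).
    if (s.level > 0 && now - s.lastFlaggedAt >= c.releaseMs) {
      s.level -= 1;
      s.lastFlaggedAt = now;
    }

    // During warm-up only the hard ceiling applies; afterwards baseMax does.
    const warm = now - c.startedAt < c.warmupMs;
    const allowed = warm ? c.hardCeiling : c.baseMax;
    const observed = s.hits.length;
    if (observed <= allowed) return { allow: true, observed, allowed, level: s.level };

    // Over the limit: escalate. Lag band only scales an already-flagged device.
    s.level += 1;
    s.lastFlaggedAt = now;
    const mult = warm ? 1 : (c.bandMult[band] || 1);
    const retryAfterMs = Math.min(c.maxBackoffMs, c.baseBackoffMs * 2 ** (s.level - 1) * mult);
    const reason = observed > c.hardCeiling ? 'hard-ceiling' : 'rate';
    return { allow: false, reason, observed, allowed, level: s.level, retryAfterMs };
  }

  return { check };
}
```

Because all state lives in a `Map` keyed by device id, one device's storm cannot change another device's verdict, which is the property the neighbor-isolation tests assert.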
48
server/test/status-log-prune.test.js
Normal file
@@ -0,0 +1,48 @@
'use strict';

// #142 step 4 — global device_status_log retention sweep. Deterministic, in-process
// (no server/port). Isolate the DB and set retention BEFORE requiring the module
// (config reads env at load; database.js initialises a DB on load).

const os = require('node:os');
const path = require('node:path');
const crypto = require('node:crypto');
process.env.DATA_DIR = path.join(os.tmpdir(), 'st-statusprune-' + crypto.randomBytes(4).toString('hex'));
process.env.STATUS_LOG_RETENTION_DAYS = '2';

const { test } = require('node:test');
const assert = require('node:assert/strict');
const { db, pruneStatusLog } = require('../db/database');

test('global sweep deletes rows older than retention across ALL devices, keeps recent', () => {
  db.exec('DELETE FROM device_status_log'); // clean slate
  const old = db.prepare("INSERT INTO device_status_log (device_id, status, timestamp) VALUES (?, ?, strftime('%s','now') - ?)");

  // 5 days old (> 2d retention): an active device, a device NOT in the devices
  // table (removed/idle — what the per-device insert-time prune never revisits),
  // and the heartbeat offline_timeout status that bypasses logDeviceStatus.
  old.run('live-dev', 'online', 5 * 86400);
  old.run('removed-idle-dev', 'offline', 5 * 86400);
  old.run('hb-dev', 'offline_timeout', 5 * 86400);
  // recent (< retention): must survive, regardless of device existence / status.
  old.run('live-dev', 'online', 0);
  old.run('hb-dev', 'offline_timeout', 3600);

  assert.equal(db.prepare('SELECT COUNT(*) c FROM device_status_log').get().c, 5, 'seeded 5 rows');

  const deleted = pruneStatusLog();
  assert.equal(deleted, 3, 'the 3 over-retention rows pruned (incl. removed-idle + offline_timeout paths)');

  const remaining = db.prepare('SELECT device_id, status FROM device_status_log ORDER BY device_id').all();
  assert.equal(remaining.length, 2);
  // both survivors are the recent rows; no old row of any device/status survived
  assert.deepEqual(remaining.map(r => r.device_id).sort(), ['hb-dev', 'live-dev']);
  const oldestNow = db.prepare("SELECT MIN(timestamp) m FROM device_status_log").get().m;
  const cutoff = Math.floor(Date.now() / 1000) - 2 * 86400;
  assert.ok(oldestNow >= cutoff, 'no surviving row is older than the retention cutoff');
});

test('sweep is safe and idempotent on an empty/already-clean table', () => {
  db.exec('DELETE FROM device_status_log');
  assert.equal(pruneStatusLog(), 0, 'nothing to delete -> 0, no throw');
});
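The retention rule this test exercises comes down to one cutoff comparison on epoch-second timestamps, applied to every row regardless of device or status. A minimal in-memory sketch of that arithmetic follows, assuming rows exactly at the cutoff are kept; `pruneOlderThan` is a hypothetical stand-in and not the `pruneStatusLog` in `db/database`.

```javascript
// Hypothetical in-memory stand-in for the global retention sweep.
// rows: [{ device_id, status, timestamp }] with timestamp in epoch seconds.
function pruneOlderThan(rows, retentionDays, nowSec = Math.floor(Date.now() / 1000)) {
  const cutoff = nowSec - retentionDays * 86400; // 86400 seconds per day
  // Global sweep: the filter ignores device_id and status entirely.
  const kept = rows.filter((r) => r.timestamp >= cutoff);
  return { kept, deleted: rows.length - kept.length };
}
```

Seeding it with the same five rows as the test (three at 5 days old, two recent) and a 2-day retention gives three deletions and two survivors, matching the assertions above.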
@@ -6,6 +6,7 @@ const { db, pruneTelemetry, pruneScreenshots } = require('../db/database');
 const config = require('../config');
 const heartbeat = require('../services/heartbeat');
 const commandQueue = require('../lib/command-queue');
+const reconnectThrottle = require('../lib/reconnect-throttle');

 // Debounce window for marking a device offline on socket disconnect. Brief
 // flap (Wi-Fi blip, Engine.IO ping miss, server-side eviction-then-reconnect)
@@ -27,6 +28,12 @@ const OFFLINE_DEBOUNCE_MS = 5000;
 // event is still forwarded every time, so the UI is unaffected. In-memory only.
 const lastPlayLogAt = new Map();
 const PLAY_LOG_MIN_GAP_MS = 2000;
+
+// #142 content-ack dedup. An older app can spam "content <id>: ready" for the same
+// item; each was logged + emitted individually (secondary load). Suppress identical
+// (device_id, content_id, status) reports within config.contentAckDedupMs. A status
+// CHANGE has a different key and passes immediately. In-memory; resets on restart.
+const lastContentAck = new Map();
 const { getUserPlan, getUserDeviceCount } = require('../middleware/subscription');
 // Phase 2.3: deviceRoom() resolves a device_id to its workspace room so
 // dashboardNs.emit can be scoped instead of broadcast platform-wide.
@@ -353,6 +360,23 @@ module.exports = function setupDeviceSocket(io) {
     return;
   }

+  // #142: per-device reconnect throttle. Only GENUINE reconnects (a new
+  // socket) count — same-socket playlist refreshes (isPlaylistRefresh) are
+  // exempt. This runs BEFORE the heavy register work (DB writes, playlist
+  // build) so a single flapping device cannot saturate the event loop. The
+  // verdict is per-device; global lag only scales an already-flagged
+  // device's backoff, never gates a healthy one.
+  if (!isPlaylistRefresh) {
+    const verdict = reconnectThrottle.check(device_id);
+    if (!verdict.allow) {
+      console.warn(`[throttle] device ${device_id} reconnect throttled: reason=${verdict.reason} band=${verdict.band} observed=${verdict.observed}/${verdict.allowed} per ${config.reconnectWindowMs}ms -> backoff ${verdict.retryAfterMs}ms (level ${verdict.level})`);
+      socket.emit('device:throttled', { retry_after_ms: verdict.retryAfterMs, reason: 'reconnect_rate' });
+      // nextTick disconnect so the throttle notice flushes first.
+      process.nextTick(() => { try { socket.disconnect(true); } catch (_) { /* */ } });
+      return;
+    }
+  }
+
   currentDeviceId = device_id;
   authenticated = true;
   // Cancel any pending offline timer - device is back in the grace window
@@ -372,8 +396,12 @@ module.exports = function setupDeviceSocket(io) {
   }

   if (device_info) {
-    db.prepare('UPDATE devices SET android_version = ?, app_version = ?, screen_width = ?, screen_height = ?, render_width = ?, render_height = ? WHERE id = ?')
-      .run(device_info.android_version, device_info.app_version, device_info.screen_width, device_info.screen_height, device_info.render_width ?? null, device_info.render_height ?? null, device_id);
+    db.prepare(`UPDATE devices SET android_version = ?, app_version = ?, screen_width = ?, screen_height = ?, render_width = ?, render_height = ?,
+      ota_status = ?, ota_target_version = ?, ota_attempts = ?, ota_updated_at = strftime('%s','now') WHERE id = ?`)
+      .run(device_info.android_version, device_info.app_version, device_info.screen_width, device_info.screen_height, device_info.render_width ?? null, device_info.render_height ?? null,
+        // #139 Phase 2: older APKs don't send these — default to a clean 'none' state.
+        device_info.ota_status ?? 'none', device_info.ota_target_version ?? null, device_info.ota_attempts ?? 0,
+        device_id);
   }

   heartbeat.registerConnection(device_id, socket.id);
@@ -557,6 +585,13 @@ module.exports = function setupDeviceSocket(io) {
   if (!requireDeviceAuth()) return;
   const { device_id, content_id, status } = data;
   if (device_id !== currentDeviceId) return;
+  // #142: drop repeats of the same (device, content, status) within the dedup
+  // window. Only a change (new content/status) or a report after the window
+  // logs+emits, so a device spamming the same "ready" can't add load.
+  const ackKey = `${device_id}|${content_id}|${status}`;
+  const nowAck = Date.now();
+  if (nowAck - (lastContentAck.get(ackKey) || 0) < config.contentAckDedupMs) return;
+  lastContentAck.set(ackKey, nowAck);
   console.log(`Device ${device_id} content ${content_id}: ${status}`);
   emitToDeviceWorkspace(dashboardNs, device_id, 'dashboard:content-ack', { device_id, content_id, status });
 });
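The dedup check in this hunk can be exercised in isolation by lifting it into a small closure with an injectable clock. The sketch below follows the same key and comparison as the diff; the wrapper itself (`createAckDedup`, `shouldProcess`) is hypothetical, and the 10000 ms default is taken from the changelog's stated `CONTENT_ACK_DEDUP_MS` default.

```javascript
// Hypothetical wrapper around the dedup logic shown in the hunk above.
function createAckDedup(dedupMs = 10000) {
  const lastSeen = new Map(); // key -> epoch ms of the last report that passed
  return function shouldProcess({ device_id, content_id, status }, now = Date.now()) {
    const key = `${device_id}|${content_id}|${status}`;
    // Identical report inside the window: suppress (no log, no emit).
    if (now - (lastSeen.get(key) || 0) < dedupMs) return false;
    lastSeen.set(key, now);
    return true;
  };
}
```

Suppressed repeats do not refresh the stored timestamp, so the window is measured from the last report that actually passed, and a status change produces a different key and passes immediately.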
@@ -585,6 +620,20 @@ module.exports = function setupDeviceSocket(io) {
   });
 });

+// #139 Phase 2 (Option B): event-driven OTA status. The device announces a status TRANSITION
+// ('manual_update_required' on enter-backoff, 'none' on clear) so the dashboard badge updates
+// promptly without waiting for a reconnect. The register path still persists these fields too
+// (the reconnect backstop if a transition event is missed). Same columns + ?? defaults.
+socket.on('device:ota-status', (data) => {
+  if (!requireDeviceAuth()) return;
+  const { device_id, ota_status, ota_target_version, ota_attempts } = data || {};
+  // Unknown / forged / mismatched id -> no-op. WHERE id = ? also makes an unregistered id a
+  // 0-row update (never throws), so a stray event can't error the socket.
+  if (!device_id || device_id !== currentDeviceId) return;
+  db.prepare("UPDATE devices SET ota_status = ?, ota_target_version = ?, ota_attempts = ?, ota_updated_at = strftime('%s','now') WHERE id = ?")
+    .run(ota_status ?? 'none', ota_target_version ?? null, ota_attempts ?? 0, device_id);
+});
+
 // Play event logging (proof-of-play)
 socket.on('device:play-event', (data) => {
   if (!requireDeviceAuth()) return;
@@ -1,6 +1,6 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <widget xmlns="http://www.w3.org/ns/widgets" xmlns:tizen="http://tizen.org/ns/widgets"
-  id="http://screentinker.com/player" version="1.9.1" viewmodes="maximized">
+  id="http://screentinker.com/player" version="1.9.2" viewmodes="maximized">
   <tizen:application id="ScrnTinkr1.ScreenTinker" package="ScrnTinkr1" required_version="2.4"/>
   <tizen:profile name="tv"/>
   <name>ScreenTinker</name>