if --no-dedup was enabled in a volume which already contained
symlinked duplicate files, renaming/moving folders could fail
this is due to folder contents being moved one file at a time
(which is how symlink breakage is prevented) except the links
are moved assuming the final directory layout, meaning they
may be intermittently broken during the movie
with no-dedup, the symlinks are converted into full files as
each symlink is encountered, but a temporarily broken symlink
would crash the procedure
fix this by giving `_symlink` a new parameter `fsrc`
which is a known valid inode for data copying purposes
previously, the assumption was made that the database and filesystem
would not desync, and that an upload could safely be substituted with
a symlink to an existing copy on-disk, assuming said copy still
existed on-disk at all
this is fine if copyparty is the only software that makes changes to
the filesystem, but that is a shitty assumption to make in hindsight
add `--safe-dedup` which takes a "safety level", and by default (50)
it will no longer blindly expect that the filesystem has not been
altered through other means; the file contents will now be hashed
and compared to the database
deduplication can be much slower as a result, but definitely worth it
as this avoids some potentially very unpleasant surprises
the previous behavior can be restored with `--safe-dedup 1`
* v1.13.8 broke collision resolving for non-identical files;
the correct filename was reserved but not symlinked to
the original file, leaving a zerobyte file instead.
See v1.14.3 github release notes for remediation info
* add sanchecks for early detection of index/fs desync;
saves performance and gives less confusing logs
* navpane would always feed the vproxy paths into the tree
instead of only when necessary (the initial load)
* mkdir would return `X-New-Dir` without the `rp-loc` prefix
* chpw and some other redirects also sent raw vpaths
Reported-by: @iridial
hooks can now interrupt or redirect actions, and initiate
related actions, by printing json on stdout with commands
mainly to mitigate limitations such as sharex/sharex#3992
xbr/xau can redirect uploads to other destinations with `reloc`
and most hooks can initiate indexing or deletion of additional
files by giving a list of vpaths in json-keys `idx` or `del`
there are limitations;
* xbu/xau effects don't apply to ftp, tftp, smb
* xau will intentionally fail if a reloc destination exists
* xau effects do not apply to up2k
also provides more details for hooks:
* xbu/xau: basic-uploader vpath with filename
* xbr/xar: add client ip
rather than sending each file chunk as a separate HTTP request,
sibling chunks will now be fused together into larger HTTP POSTs
which results in unreasonably huge speed boosts on some routes
( `2.6x` from Norway to US-East, `1.6x` from US-West to Finland )
the `x-up2k-hash` request header now takes a comma-separated list
of chunk hashes, which must all be sibling chunks, resulting in
one large consecutive range of file data as the post body
a new global-option `--u2sz`, default `1,64,96`, sets the target
request size as 64 MiB, allowing the settings ui to specify any
value between 1 and 96 MiB, which is cloudflare's max value
this does not cause any issues for resumable uploads; thanks to the
streaming HTTP POST parser, each chunk will be verified and written
to disk as they arrive, meaning only the untransmitted chunks will
have to be resent in the event of a connection drop -- of course
assuming there are no misconfigured WAFs or caching-proxies
the previous up2k approach of uploading each chunk in a separate HTTP
POST was inefficient in many real-world scenarios, mainly due to TCP
window-scaling behaving erratically in some IXPs / along some routes
a particular link from Norway to Virginia,US is unusably slow for
the first 4 MiB, only reaching optimal speeds after 100 MiB, and
then immediately resets the scale when the request has been sent;
connection reuse does not help in this case
on this route, the basic-uploader was somehow faster than up2k
with 6 parallel uploads; only time i've seen this
was intentionally skipped to avoid complexity but enough people have
asked why it doesn't work that it's time to do something about it
turns out it wasn't that bad
* template-based title formatting
* picture embeds are no longer ant-sized
* `--og-color` sets accent color; default #333
* `--og-s-title` forces default title, ignoring e2t
* add a music indicator to song titles because discord doesn't
currently only being used to workaround discord discarding
query strings in opengraph tags, but i'm sure there will be
plenty more wonderful usecases for this atrocity
if a given filesystem were to disappear (e.g. removable storage)
followed by another filesystem appearing at the same location,
this would not get noticed by up2k in a timely manner
fix this by discarding the mtab cache after `--mtab-age` seconds and
rebuild it from scratch, unless the previous values are definitely
correct (as indicated by identical output from `/bin/mount`)
probably reduces windows performance by an acceptable amount
counterpart of `--s-wr-sz` which existed already
the default (256 KiB) appears optimal in the most popular scenario
(linux host with storage on local physical disk, usually NVMe)
was previously 32 KiB, so large uploads should now use 17% less CPU
also adds sanchecks for values of `--iobuf`, `--s-rd-sz`, `--s-wr-sz`
also adds file-overwrite feature for multipart posts
the default (256 KiB) appears optimal in the most popular scenario
(linux host with storage on local physical disk, usually NVMe)
was previously a mix of 64 and 512 KiB;
now the same value is enforced everywhere
download-as-tar is now 20% faster with the default value
it is now possible to grant access to users other than `${u}`
(the user which the volume belongs to)
previously, permissions did not apply correctly to IdP volumes due to
the way `${u}` and `${g}` was expanded, which was a funky iteration
over all known users/groups instead of... just expanding them?
also adds another sanchk that a volume's URL must contain a
`${u}` to be allowed to mention `${u}` in the accs list, and
similarly for `${g}` / `@${g}` since users can be in multiple groups
to abort an upload, refresh the page and access the unpost tab,
which now includes unfinished uploads (sorted before completed ones)
can be configured through u2abort (global or volflag);
by default it requires both the IP and account to match
https://a.ocv.me/pub/g/nerd-stuff/2024-0310-stoltzekleiven.jpg
as each chunk is written to the file, httpcli calls
up2k.confirm_chunk to register the chunk as completed, and the reply
indicates whether that was the final outstanding chunk, in which case
httpcli closes the file descriptors since there's nothing more to write
the issue is that the final chunk is registered as completed before the
file descriptors are closed, meaning there could be writes that haven't
finished flushing to disk yet
if the client decides to issue another handshake during this window,
up2k sees that all chunks are complete and calls up2k.finish_upload
even as some threads might still be flushing the final writes to disk
so the conditions to hit this bug were as follows (all must be true):
* multiprocessing is disabled
* there is a reverse-proxy
* a client has several idle connections and reuses one of those
* the server's filesystem is EXTREMELY slow, to the point where
closing a file takes over 30 seconds
the fix is to stop handshakes from being processed while a file is
being closed, which is unfortunately a small bottleneck in that it
prohibits initiating another upload while one is being finalized, but
the required complexity to handle this better is probably not worth it
(a separate mutex for each upload session or something like that)
this issue is mostly harmless, partially because it is super tricky to
hit (only aware of it happening synthetically), and because there is
usually no harmful consequences; the worst-case is if this were to
happen exactly as the server OS decides to crash, which would make the
file appear to be fully uploaded even though it's missing some data
(all extremely unlikely, but not impossible)
there is no performance impact; if anything it should now accept
new tcp connections slightly faster thanks to more granular locking
features which should be good to go:
* user groups
* assigning permissions by group
* dynamically created volumes based on username/groupname
* rebuild vfs when new users/groups appear
but several important features still pending;
* detect dangerous configurations
* dynamic vol below readable path
* remember volumes created during previous runs
* helps prevent unintended access
* correct filesystem-scan on startup
* allow mounting `/` (the entire filesystem) as a volume
* not that you should (really, you shouldn't)
* improve `-v` helptext
* change IdP group symbol to @ because % is used for file inclusion
* not technically necessary but is less confusing in docs