the up2k databases are, by default, stored in a `.hist` subfolder
inside each volume, next to thumbnails and transcoded audio
add a new option for storing the databases in a separate location,
making it possible to tune the underlying filesystem for optimal
performance characteristics
the `--hist` global-option and `hist` volflag still behave like
before, but `--dbpath` and volflag `dbpath` will override the
histpath for the up2k-db and up2k-snap exclusivey
specifically google, but also some others, have started ignoring
rel="nofollow" while also understanding just enough javascript to
try viewing binary files as text
download-as-tar-gz becomes 2.4x faster in docker
segfaults on windows, so don't use it there
does not affect fedora or gentoo,
since zlib-ng is already system-default on those
also adds a global-option to write list of successful
binds to a textfile, for automation / smoketest purposes
too restrictive, blocking editing through webdav and ftp
but since logues and readmes can be used as helptext for users
with write-only access, it makes sense to block logue/readme
uploads from write-only users
users with write-only access can still upload any file as before,
but the filename prefix `_wo_` is added onto files named either
README.md | PREADME.md | .prologue.html | .epilogue.html
the new option `--wo-up-readme` restores previous behavior, and
will not add the filename-prefix for readmes/logues
previously, the native python-error was printed when reading
the contents of a textfile using the wrong character encoding
while technically correct, it could be confusing for end-users
add a helper to produce a more helpful errormessage when
someone (for example) tries to load a latin-1 config file
support for "owa", audio-only webm, was introduced in iOS 17.5
owa is a more compliant alternative to opus-caf from iOS 11,
which was technically limited to CBR opus, a limitation which
we ignored since it worked mostly fine for regular opus too
being the new officially-recommended way to do things,
we'll default to owa for iOS 18 and later, even though
iOS still has some bugs affecting our use specifically:
if a weba file is preloaded into a 2nd audio object,
safari will throw a spurious exception as playback is
initiated, even as the file is playing just fine
the `.ld` stuff is an attempt at catching and ignoring this
spurious error without eating any actual network exceptions
try to decode some malicious xml on startup; if this succeeds,
then force-disable all xml-based features (primarily WebDAV)
this is paranoid future-proofing against unanticipated changes
in future versions of python, specifically if the importlib or
xml.etree.ET behavior changes in a way that somehow reenables
entity expansion, which (still hypothetically) would probably
be caused by failing to unload the `_elementtree` c-module
no past or present python versions are affected by this change
url-param / header `ck` specifies hashing algo;
md5 sha1 sha256 sha512 b2 blake2 b2s blake2s
value 'no' or blank disables checksumming,
for when copyparty is running on ancient gear
and you don't really care about file integrity
* if free ram on startup is less than 2 GiB,
use smaller chunks for parallel file hashing
* if --th-max-ram is lower than 0.25 (256 MiB),
print a warning that thumbnails will not work
* make thumbnail cleaner immediately do a sweep on startup,
forgetting any failed conversions so they can be retried
in case the memory limit was increased since last run
PUT uploads, as used by webdav, would stat the absolute
path of the file to be created, which would throw ENOENT
strip components until the path is an existing directory
and also try to enforce disk space / volume size limits
even when the incoming file is of unknown size
previously, the biggest file that could be uploaded through
cloudflare was 383 GiB, due to max num chunks being 4096
`--u2sz`, which takes three ints (min-size, target, max-size)
can now be used to enforce a max chunksize; chunks larger
than max-size gets split into smaller subchunks / chunklets
subchunks cannot be stitched/joined, and subchunks of the
same chunk must be uploaded sequentially, one at a time
if a subchunk fails due to bitflips or connection-loss,
then the entire chunk must (and will) be reuploaded
* pyz: yeet the resource tar which is now pointless thanks to pkgres
* cache impresource stuff because pyz lookups are Extremely slow
* prefer tx_file when possible for slightly better performance
* use hardcoded list of expected resources instead of dynamic
discovery at runtime; much simpler and probably safer
* fix some forgotten resources (copying.txt, insecure.pem)
* fix loading jinja templates on windows
add support for reading webdeps and jinja-templates using either
importlib_resources or pkg_resources, which removes the need for
extracting these to a temporary folder on the filesystem
* util: add helper functions to abstract embedded resource access
* http*: serve embedded resources through resource abstraction
* main: check webdeps through resource abstraction
* httpconn: remove unused method `respath(name)`
* use __package__ to find package resources
* util: use importlib_resources backport if available
* pass E.pkg as module object for importlib_resources compatibility
* util: add pkg_resources compatibility to resource abstraction
* exponentially slow upload handshakes caused by lack of rd+fn
sqlite index; became apparent after a volume hit 200k files
* listing big folders 5% faster due to `_quotep3b`
* optimize `unquote`, 20% faster but only used rarely
* reindex on startup 150x faster in some rare cases
(same filename in MANY folders)
the database is now around 10% larger (likely worst-case)
dedup is still encouraged and fully supported, but
being default-enabled has caused too many surprises
enabling `--dedup` restores the previous default behavior
also renames `--never-symlink` to `--hardlink-only`
hooks can now interrupt or redirect actions, and initiate
related actions, by printing json on stdout with commands
mainly to mitigate limitations such as sharex/sharex#3992
xbr/xau can redirect uploads to other destinations with `reloc`
and most hooks can initiate indexing or deletion of additional
files by giving a list of vpaths in json-keys `idx` or `del`
there are limitations;
* xbu/xau effects don't apply to ftp, tftp, smb
* xau will intentionally fail if a reloc destination exists
* xau effects do not apply to up2k
also provides more details for hooks:
* xbu/xau: basic-uploader vpath with filename
* xbr/xar: add client ip
hooks can be restricted to users with certain permissions, for example
`--xm aw,notify-send` will only `notify-send` if user has write-access
the user's list of permissions are now also included in the json
that is passed to the hook if enabled; `--xm aw,j,notify-send`
will now also stop parsing flags when encountering a blank value,
allowing to specify any initial arguments to the command:
`--xm aw,j,,notify-send,hey` would run `notify-send` with `hey`
as its first argument, and the json would be the 2nd argument,
similarly `--xm ,notify-send,hey` when no flags specified
this is somewhat explained in `--help-hooks`, but
additional related features are planned in the near future
and will all be better documented when the dust settles
* upgrade to partftpy 0.4.0
* workarounds for buggy clients/servers
* improved ipv6 support, especially on macos
* improved robustness on unreliable networks
* make `--tftp4` separate from `--ftp4`
when there was more than ~700 active connections,
* sendfile (non-https downloads) could fail
* mdns and ssdp could fail to reinitialize on network changes
...because `select` can't handle FDs higher than 512 on windows
(1024 on linux/macos), so prefer `poll` where possible (linux/macos)
but apple keeps breaking and unbreaking `poll` in macos,
so use `--no-poll` if necessary to force `select` instead
use sigmasks to block SIGINT, SIGTERM, SIGUSR1 from all other threads
also initiate shutdown by calling sighandler directly,
in case this misses anything and that is still unreliable
(discovered by `--exit=idx` being noop once in a blue moon)
currently only being used to workaround discord discarding
query strings in opengraph tags, but i'm sure there will be
plenty more wonderful usecases for this atrocity
counterpart of `--s-wr-sz` which existed already
the default (256 KiB) appears optimal in the most popular scenario
(linux host with storage on local physical disk, usually NVMe)
was previously 32 KiB, so large uploads should now use 17% less CPU
also adds sanchecks for values of `--iobuf`, `--s-rd-sz`, `--s-wr-sz`
also adds file-overwrite feature for multipart posts
the default (256 KiB) appears optimal in the most popular scenario
(linux host with storage on local physical disk, usually NVMe)
was previously a mix of 64 and 512 KiB;
now the same value is enforced everywhere
download-as-tar is now 20% faster with the default value
some clients (clonezilla-webdav) rapidly create and delete files;
this fails if copyparty is still hashing the file (usually the case)
and the same thing can probably happen due to antivirus etc
add global-option --rm-retry (volflag rm_retry) specifying
for how long (and how quickly) to keep retrying the deletion
default: retry for 5sec on windows, 0sec (disabled) on everything else
because this is only a problem on windows
* webdav: extend applesan regex with more stuff to exclude
* on macos, set applesan as default `--no-idx` to avoid indexing them
(they didn't show up in search since they're dotfiles, but still)
igloo irc has an absolute time limit of 2 minutes before it just
disconnects mid-upload and that kinda looked like it had a buggy
multipart generator instead of just being funny
anticipating similar events in the future, also log the
client-selected boundary value to eyeball its yoloness
primarily to support uploading from Igloo IRC but also generally useful
(not actually tested with Igloo IRC yet because it's a paid feature
so just gonna wait for spiky to wake up and tell me it didn't work)
* permission `.` grants dotfile visibility if user has `r` too
* `-ed` will grant dotfiles to all `r` accounts (same as before)
* volflag `dots` likewise
also drops compatibility for pre-0.12.0 `-v` syntax
(`-v .::red` will no longer translate to `-v .::r,ed`)
will probably fail when some devices (sup iphone) stream to car stereos
but at least passwords won't end up somewhere unexpected this way
(plus, the js no longer uses the jank url to request waveforms)
* some malicious requests are now answered with HTTP 422,
so that they count against --ban-422
* do not include request headers when replying to invalid requests,
in case there is a reverse-proxy inserting something interesting
not even the deprecationwarning that got silently generated burning
20~30% of all CPU-time without actually displaying it anywhere, nice
python 3.12.0 is now only 5% slower than 3.11.6
also fixes some other, less-performance-fatal deprecations
* --ban-url: URLs which 404 and also match --sus-urls (bot-scan)
* --ban-403: trying to access volumes that dont exist or require auth
* --ban-422: invalid POST messages, fuzzing and such
* --nonsus-urls: regex of 404s which shouldn't trigger --ban-404
in may situations it makes sense to handle this logic inside copyparty,
since stuff like cloudflare and running copyparty on another physical
box than the nginx frontend is on becomes fairly clunky