Commit graph

184 commits

Author SHA1 Message Date
ed b2e8bf6e89 selftest dxml on startup:
try to decode some malicious xml on startup; if this succeeds,
then force-disable all xml-based features (primarily WebDAV)

this is paranoid future-proofing against unanticipated changes
in future versions of python, specifically if the importlib or
xml.etree.ET behavior changes in a way that somehow reenables
entity expansion, which (still hypothetically) would probably
be caused by failing to unload the `_elementtree` c-module

no past or present python versions are affected by this change
2025-01-17 06:06:36 +00:00
ed 87598dcd7f recent-uploads: move rendering to js
* loads 50% faster, reducing server-load by 30%

* inhibits search engines from indexing it

* eyecandy (filter applies automatically on edit)
2024-12-20 23:52:03 +00:00
ed 73f7249c5f decode and log request URLs; closes #125
as processing of a HTTP request begins (GET, HEAD, PUT, POST, ...),
the original query line is printed in its encoded form. This makes
debugging easier, since there is no ambiguity in how the client
phrased its request.

however, this results in very opaque logs for non-ascii languages;
basically a wall of percent-encoded characters. Avoid this issue
by printing an additional log-message if the URL contains `%`,
immediately below the original url-encoded entry.

also fix tests on macos, and an unrelated bad logmsg in up2k
2024-12-16 00:53:22 +01:00
ed 016708276c add sorting granularity options for media URLs 2024-12-03 20:01:19 +00:00
ed 42fd66675e tests: improve specificity 2024-12-01 15:36:35 +00:00
ed 21a3f3699b webdav: add tests + fix minor edgecases
* allow depth:0 at top of unmapped root

* cannot use the Referer header to identify
   graphical browsers since rclone sends it
2024-12-01 14:44:41 +00:00
ed 2ce8233921 webdav: auth-challenge clients correctly:
* return 403 instead of 404 in the following sitations:
  * viewing an RSS feed without necessary auth
  * accessing a file with the wrong filekey
  * accessing a file/folder without necessary auth
     (would previously 404 for intentional ambiguity)

* only allow PROPFIND if user has either read or write;
   previously a blank response was returned if user has
   get-access, but this could confuse webdav clients into
   skipping authentication (for example AuthPass)

* return 401 basic-challenge instead of 403 if the client
   appears to be non-graphical, because many webdav clients
   do not provide the credentials until they're challenged.
   There is a heavy bias towards assuming the client is a
   browser, because browsers must NEVER EVER get a 401
   (tricky state that is near-impossible to deal with)

* return 401 basic-challenge instead of 403 if a PUT
   is attempted without any credentials included; this
   should be safe, as graphical browsers never do that

this fixes the interoperability issues mentioned in
https://github.com/authpass/authpass/issues/379
where AuthPass would GET files without providing the
password because it expected a 401 instead of a 403;
AuthPass is behaving correctly, this is not a bug
2024-11-27 22:07:53 +00:00
ed 697a4fa8a4 exclude search results by regex (#120)
a better alternative to using `--no-idx` for this purpose since
this also excludes recent uploads, not just during fs-indexing,
and it doesn't prevent deduplication

also speeds up searches by a tiny amount due to building the
sanchecks into the exclude-filter while parsing the config,
instead of during each search query
2024-11-26 23:57:01 +00:00
ed 2ab8924e2d tests/debug: plug some resource leaks 2024-11-22 22:21:43 +00:00
ed 8aba5aed4f list active downloads in controlpanel 2024-11-10 02:12:18 +00:00
ed cacec9c1f3 support copying files/folders; closes #115
behaves according to the target volume's deduplication config;
will create symlinks / hardlinks instead if dedup is enabled
2024-11-07 21:41:53 +00:00
ed dd6dbdd90a http 304: client-option to force-disable cache
an extremely brutish workaround for issues such as #110 where
browsers receive an HTTP 304 and misinterpret as HTTP 200

option `--no304=1` adds the button `no304` to the controlpanel
which can be enabled to force-disable caching in that browser

the button is default-disabled; by specifying `--no304=2`
instead of `--no304=1` the button becomes default-enabled

can also always be enabled by accessing `/?setck=no304=y`
2024-10-26 17:56:54 +00:00
ed 7678a91b0e debug: --ohead (log response headers) 2024-10-25 20:00:19 +00:00
ed 7ffd805a03 add RSS feed output; closes #109 2024-10-18 23:24:12 +00:00
ed b7f9bf5a28 cidr-based autologin 2024-10-13 21:56:26 +00:00
ed 3d7facd774 add option to entirely disable dedup
global-option `--no-clone` / volflag `noclone` entirely disables
serverside deduplication; clients will then fully upload dupe files

can be useful when `--safe-dedup=1` is not an option due to other
software tampering with the on-disk files, and your filesystem has
prohibitively slow or expensive reads
2024-10-08 21:27:19 +00:00
ed 1ff14b4e05 optimizations, failsafes, formatting 2024-10-02 21:59:53 +00:00
ed 678675a9a6 fix prometheus metrics; broke in 609c5921 2024-09-16 21:04:58 +00:00
ed 427597b603 show total directory size in listings
sizes are computed during `-e2ds` indexing, and new uploads
are counted, but a rescan is necessary after a move or delete
2024-09-15 23:01:18 +00:00
ed d67e9cc507 sqlite and misc optimizations:
* exponentially slow upload handshakes caused by lack of rd+fn
   sqlite index; became apparent after a volume hit 200k files
* listing big folders 5% faster due to `_quotep3b`
* optimize `unquote`, 20% faster but only used rarely
* reindex on startup 150x faster in some rare cases
   (same filename in MANY folders)

the database is now around 10% larger (likely worst-case)
2024-09-15 13:18:43 +00:00
ed 609c5921d4 list incoming files + ETA in controlpanel 2024-09-10 21:24:05 +00:00
ed b5405174ec add login sessions 2024-09-09 23:39:20 +00:00
ed 6eee601521 fix u2c --ow (overwrite/replace)
the u2c flag to overwrite files on the server became no-op in v1.13.8
2024-09-09 19:40:38 +00:00
ed a2e0f98693 disable upload deduplication by default;
dedup is still encouraged and fully supported, but
being default-enabled has caused too many surprises

enabling `--dedup` restores the previous default behavior

also renames `--never-symlink` to `--hardlink-only`
2024-09-08 17:09:14 +00:00
ed 1111153f06 test dedup relinking 2024-09-08 12:55:27 +00:00
ed 4401de0413 fix mv with --no-dedup in volumes with dupes;
if --no-dedup was enabled in a volume which already contained
symlinked duplicate files, renaming/moving folders could fail

this is due to folder contents being moved one file at a time
(which is how symlink breakage is prevented) except the links
are moved assuming the final directory layout, meaning they
may be intermittently broken during the movie

with no-dedup, the symlinks are converted into full files as
each symlink is encountered, but a temporarily broken symlink
would crash the procedure

fix this by giving `_symlink` a new parameter `fsrc`
which is a known valid inode for data copying purposes
2024-09-07 00:47:12 +00:00
ed 6e671c5245 verify on-disk contents before dedup;
previously, the assumption was made that the database and filesystem
would not desync, and that an upload could safely be substituted with
a symlink to an existing copy on-disk, assuming said copy still
existed on-disk at all

this is fine if copyparty is the only software that makes changes to
the filesystem, but that is a shitty assumption to make in hindsight

add `--safe-dedup` which takes a "safety level", and by default (50)
it will no longer blindly expect that the filesystem has not been
altered through other means; the file contents will now be hashed
and compared to the database

deduplication can be much slower as a result, but definitely worth it
as this avoids some potentially very unpleasant surprises

the previous behavior can be restored with `--safe-dedup 1`
2024-09-06 19:08:14 +00:00
ed 5a8c3b8be0 optimize test_httpcli.py too, from 1.64 to 1.51s 2024-08-31 22:03:06 +00:00
ed 1c9c17fb9b optimize test_dedup.py
* 7.71s originally
* 4.51s with fstab reuse
* 4.34s without db_wd
* 4.02s with no pp start
* 3.73s with Cfg reuse
2024-08-31 21:54:47 +00:00
ed 3da62ec234 fix dedup bug as of v1.13.8:
* v1.13.8 broke collision resolving for non-identical files;
   the correct filename was reserved but not symlinked to
   the original file, leaving a zerobyte file instead.
   See v1.14.3 github release notes for remediation info

* add sanchecks for early detection of index/fs desync;
   saves performance and gives less confusing logs
2024-08-30 22:06:25 +00:00
ed 7c2beba555 add file/folder sharing; closes #84 2024-08-18 22:49:13 +00:00
ed 0b46b1a614 fix some vproxy issues (#93):
* navpane would always feed the vproxy paths into the tree
   instead of only when necessary (the initial load)

* mkdir would return `X-New-Dir` without the `rp-loc` prefix
  * chpw and some other redirects also sent raw vpaths

Reported-by: @iridial
2024-08-17 18:17:40 +00:00
ed 83fb569d61 make passwords user-changeable; closes #92 2024-08-14 20:09:57 +00:00
ed ee9aad82dd support listening on unix sockets 2024-08-12 21:58:02 +00:00
ed 6c94a63f1c add hook side-effects; closes #86
hooks can now interrupt or redirect actions, and initiate
related actions, by printing json on stdout with commands

mainly to mitigate limitations such as sharex/sharex#3992

xbr/xau can redirect uploads to other destinations with `reloc`
and most hooks can initiate indexing or deletion of additional
files by giving a list of vpaths in json-keys `idx` or `del`

there are limitations;
* xbu/xau effects don't apply to ftp, tftp, smb
* xau will intentionally fail if a reloc destination exists
* xau effects do not apply to up2k

also provides more details for hooks:
* xbu/xau: basic-uploader vpath with filename
* xbr/xar: add client ip
2024-08-11 14:52:32 +00:00
ed d5c9c8ebbd make it 5% faster 2024-07-31 17:51:53 +00:00
ed 746229846d add test for zip-download 2024-07-30 22:44:29 +00:00
ed 0219eada23 cleanup: strip trailing whitespace 2024-07-26 19:33:56 +00:00
ed 132a83501e add chunk stitching; twice as fast long-distance uploads:
rather than sending each file chunk as a separate HTTP request,
sibling chunks will now be fused together into larger HTTP POSTs
which results in unreasonably huge speed boosts on some routes
( `2.6x` from Norway to US-East,  `1.6x` from US-West to Finland )

the `x-up2k-hash` request header now takes a comma-separated list
of chunk hashes, which must all be sibling chunks, resulting in
one large consecutive range of file data as the post body

a new global-option `--u2sz`, default `1,64,96`, sets the target
request size as 64 MiB, allowing the settings ui to specify any
value between 1 and 96 MiB, which is cloudflare's max value

this does not cause any issues for resumable uploads; thanks to the
streaming HTTP POST parser, each chunk will be verified and written
to disk as they arrive, meaning only the untransmitted chunks will
have to be resent in the event of a connection drop -- of course
assuming there are no misconfigured WAFs or caching-proxies

the previous up2k approach of uploading each chunk in a separate HTTP
POST was inefficient in many real-world scenarios, mainly due to TCP
window-scaling behaving erratically in some IXPs / along some routes

a particular link from Norway to Virginia,US is unusably slow for
the first 4 MiB, only reaching optimal speeds after 100 MiB, and
then immediately resets the scale when the request has been sent;
connection reuse does not help in this case

on this route, the basic-uploader was somehow faster than up2k
with 6 parallel uploads; only time i've seen this
2024-07-21 23:35:37 +00:00
ed 1cdb170290 order-significant --th-covers;
the first matching filename as listed in the
`--th-covers` global-option will always be selected
2024-07-13 00:54:38 +02:00
ed 9a87ee2fe4 add gsel option; closes #85
global-option `--gsel`, volflag `gsel` default-enables the
client setting to select files by ctrl-clicking them in the grid
2024-06-18 22:47:17 +02:00
ed 52e06226a2 make thumbnails compatible with dirkeys/filekeys
was intentionally skipped to avoid complexity but enough people have
asked why it doesn't work that it's time to do something about it

turns out it wasn't that bad
2024-06-16 21:35:43 +02:00
ed 5ad65450c4 more intuitive df option/volflag, closes #83 2024-06-01 01:15:34 +00:00
ed 560d7b6672 option to add or change mimetype mappings 2024-05-08 21:12:14 +00:00
ed da091aec85 "volume" is too overloaded, make it --au-vol instead 2024-05-05 21:27:07 +00:00
ed 36f2c446af opengraph stuff:
* template-based title formatting
* picture embeds are no longer ant-sized
* `--og-color` sets accent color; default #333
* `--og-s-title` forces default title, ignoring e2t
* add a music indicator to song titles because discord doesn't
2024-05-03 00:11:40 +00:00
ed 69517e4624 add general-purpose query-string parcelling;
currently only being used to workaround discord discarding
query strings in opengraph tags, but i'm sure there will be
plenty more wonderful usecases for this atrocity
2024-05-02 22:49:27 +00:00
ed ea270ab9f2 add og / opengraph / discord embeds 2024-05-01 23:40:56 +00:00
ed e8db3dd37f fix tests on windows 2024-04-25 22:25:38 +00:00
ed f6e693f0f5 reevaluate support for sparse files periodically
if a given filesystem were to disappear (e.g. removable storage)
followed by another filesystem appearing at the same location,
this would not get noticed by up2k in a timely manner

fix this by discarding the mtab cache after `--mtab-age` seconds and
rebuild it from scratch, unless the previous values are definitely
correct (as indicated by identical output from `/bin/mount`)

probably reduces windows performance by an acceptable amount
2024-04-24 21:18:26 +00:00