zip-download: eagerly use 64bit data-descriptors; closes #155

this avoids a false-positive in the info-zip unzip zipbomb detector.

unfortunately,

* now impossible to extract large (over 4 GiB) zipfiles using old software
   (WinXP, macos 10.12)

* now less viable to stream download-as-zip into a zipfile unpacker
   (please use download-as-tar for that purpose)
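
a minimal sketch of that tar alternative, assuming a python client. `iter_tar_stream` is a hypothetical helper (not part of copyparty); it relies on the stdlib `tarfile` stream mode `"r|"`, which only reads forward and so should work on a non-seekable download:

```python
import io
import tarfile
from typing import BinaryIO, Iterator, Tuple


def iter_tar_stream(fileobj: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, data) for each regular file in a forward-only tar stream."""
    # mode "r|" never calls seek(), so fileobj only needs a read() method
    with tarfile.open(fileobj=fileobj, mode="r|") as tf:
        for member in tf:
            if not member.isfile():
                continue  # skip directories, links, etc.
            extracted = tf.extractfile(member)
            if extracted is not None:
                # in stream mode each member must be read before moving on;
                # a real client would copy in chunks instead of read()-ing it all
                yield member.name, extracted.read()
```

for example, the response object returned by `urllib.request.urlopen()` for a download-as-tar URL could be passed as `fileobj`; the exact URL depends on the server and is not shown here.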

context:

the zipfile specification (APPNOTE.TXT) is slightly ambiguous as to when
data-descriptor (0x504b0708) filesize-fields change from 32bit to 64bit;
both copyparty and libarchive independently made the same interpretation
that this is only when the local header is zip64, AND the size-fields
are both 0xFFFFFFFF. This makes sense because the data descriptor is
only necessary when that particular file-to-be-added exceeds 4 GiB,
and/or when the crc32 is not known ahead of time.

another interpretation, seen in an early version of the patchset
to fix CVE-2019-13232 (zip-bombs) in the info-zip unzip command,
believes the only requirement is that the local header is zip64.
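
the two readings can be sketched in python as follows. `pack_data_descriptor` and `descriptor_is_64bit` are hypothetical names for illustration (not copyparty, libarchive or unzip code), and the sketch assumes stored (uncompressed) entries so that both size fields carry the same value:

```python
import struct

# data-descriptor signature as it appears on disk: 50 4b 07 08
DD_SIG = b"PK\x07\x08"


def pack_data_descriptor(crc32: int, size: int, z64: bool) -> bytes:
    """Signature, crc32, then compressed and uncompressed size."""
    # assumption: stored entries, so compressed size == uncompressed size
    fmt = "<LQQ" if z64 else "<LLL"  # 8-byte sizes when zip64, else 4-byte
    return DD_SIG + struct.pack(fmt, crc32, size, size)


def descriptor_is_64bit(
    local_hdr_is_zip64: bool,
    csize_field: int,
    usize_field: int,
    early_unzip_patch: bool,
) -> bool:
    """Which descriptor width a reader would expect under each reading."""
    if early_unzip_patch:
        # reading described for the early unzip patchset:
        # a zip64 local header alone is enough
        return local_hdr_is_zip64
    # reading described for copyparty and libarchive:
    # zip64 local header AND both size fields set to the 0xFFFFFFFF placeholder
    return local_hdr_is_zip64 and csize_field == usize_field == 0xFFFFFFFF
```

the 32bit form is 16 bytes and the 64bit form is 24 bytes, so a reader that expects the wrong width would misparse whatever follows the descriptor.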

in many linux distributions, the unzip command would thus fail on
zipfiles created by copyparty, since they (by default) satisfy
the three requirements to hit the zipbomb false-positive:

* total filesize exceeds 4 GiB, and...
* a mix of regular (32bit) and zip64 entries, and...
* streaming-mode zipfile (not made with ?zip=crc)

this issue no longer exists in a more recent version of that patchset,
https://github.com/madler/unzip/commit/af0d07f95809653b
but this fix has not yet made it into most linux distros
ed 2025-04-17 18:52:47 +00:00
parent e1c20c7a18
commit db33d68d42
2 changed files with 11 additions and 6 deletions


@@ -731,6 +731,7 @@ select which type of archive you want in the `[⚙️] config` tab:
 * `up2k.db` and `dir.txt` is always excluded
 * bsdtar supports streaming unzipping: `curl foo?zip | bsdtar -xv`
   * good, because copyparty's zip is faster than tar on small files
+  * but `?tar` is better for large files, especially if the total exceeds 4 GiB
 * `zip_crc` will take longer to download since the server has to read each file twice
   * this is only to support MS-DOS PKZIP v2.04g (october 1993) and older
     * how are you accessing copyparty actually


@@ -54,6 +54,7 @@ def gen_fdesc(sz: int, crc32: int, z64: bool) -> bytes:
 def gen_hdr(
     h_pos: Optional[int],
+    z64: bool,
     fn: str,
     sz: int,
     lastmod: int,
@@ -70,7 +71,6 @@ def gen_hdr(
     # appnote 4.5 / zip 3.0 (2008) / unzip 6.0 (2009) says to add z64
     # extinfo for values which exceed H, but that becomes an off-by-one
     # (can't tell if it was clamped or exactly maxval), make it obvious
-    z64 = sz >= 0xFFFFFFFF
     z64v = [sz, sz] if z64 else []
     if h_pos and h_pos >= 0xFFFFFFFF:
         # central, also consider ptr to original header
@@ -244,6 +244,7 @@ class StreamZip(StreamArc):
         sz = st.st_size
         ts = st.st_mtime
+        h_pos = self.pos
         crc = 0
         if self.pre_crc:
@@ -252,8 +253,12 @@ class StreamZip(StreamArc):
             crc &= 0xFFFFFFFF
-        h_pos = self.pos
-        buf = gen_hdr(None, name, sz, ts, self.utf8, crc, self.pre_crc)
+        # some unzip-programs expect a 64bit data-descriptor
+        # even if the only 32bit-exceeding value is the offset,
+        # so force that by placeholdering the filesize too
+        z64 = h_pos >= 0xFFFFFFFF or sz >= 0xFFFFFFFF
+        buf = gen_hdr(None, z64, name, sz, ts, self.utf8, crc, self.pre_crc)
         yield self._ct(buf)
         for buf in yieldfile(src, self.args.iobuf):
@@ -266,8 +271,6 @@ class StreamZip(StreamArc):
         self.items.append((name, sz, ts, crc, h_pos))
-        z64 = sz >= 4 * 1024 * 1024 * 1024
         if z64 or not self.pre_crc:
             buf = gen_fdesc(sz, crc, z64)
             yield self._ct(buf)
@@ -306,7 +309,8 @@ class StreamZip(StreamArc):
         cdir_pos = self.pos
         for name, sz, ts, crc, h_pos in self.items:
-            buf = gen_hdr(h_pos, name, sz, ts, self.utf8, crc, self.pre_crc)
+            z64 = h_pos >= 0xFFFFFFFF or sz >= 0xFFFFFFFF
+            buf = gen_hdr(h_pos, z64, name, sz, ts, self.utf8, crc, self.pre_crc)
             mbuf += self._ct(buf)
             if len(mbuf) >= 16384:
                 yield mbuf
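
as a rough illustration of what the hunks above change, the sketch below compares the old per-file rule (`sz >= 0xFFFFFFFF`) with the new one (`h_pos >= 0xFFFFFFFF or sz >= 0xFFFFFFFF`). `z64_flags` is a hypothetical helper, and it approximates `h_pos` by summing file sizes only, ignoring the headers and data-descriptors that the real offset also includes:

```python
from typing import List, Tuple

LIMIT = 0xFFFFFFFF  # largest value a 32bit zip field can hold


def z64_flags(sizes: List[int]) -> List[Tuple[bool, bool]]:
    """For each file, return the (old_rule, new_rule) zip64 decisions."""
    out = []
    h_pos = 0  # assumption: offset approximated as the sum of earlier file sizes
    for sz in sizes:
        old = sz >= LIMIT                     # rule removed by this commit
        new = h_pos >= LIMIT or sz >= LIMIT   # rule added by this commit
        out.append((old, new))
        h_pos += sz
    return out
```

with three files of 3, 3 and 1 GiB, the old rule marks none of them as zip64 while the new rule marks the third, since its approximate offset is past the 32bit limit.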