Summary
During testing, I discovered that SlideLoader’s chunked upload pipeline can silently produce corrupted slide files when chunks arrive out of order or when the browser sends multiple parallel chunk uploads.
This happens because SlideServer writes chunks purely based on the client-provided offset without validating:
- whether the chunk order is correct
- whether the offset matches the current temp file size
- whether chunks overlap
- whether the client retried a previous offset
- whether the full file content matches the original
This results in corrupted slides being saved without any error.
What works
✔ Sequential Python upload
Uploading a large slide (>2GB) using a simple Python script sending sequential chunks works correctly.
✔ Sequential browser upload
A browser-based uploader (using Fetch API) also works when chunks are strictly uploaded one-by-one.
What breaks
❌ Out-of-order chunks
If chunks are intentionally sent in the wrong order (e.g., 0 → 2 → 1 → 3), SlideServer still accepts them and produces a corrupted file.
Example mismatch:
Original SHA256: 01B1513BFB09718D270EC02B75985B0EE32091BA2CB50D286E760BAB9534DDF1
Uploaded SHA256: 2B8E2616B4668452A2A6C6373BC9E490DDA6D4C8F2F1DED9164E6864E7957824
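A minimal sketch of how such a comparison can be made, assuming both the original and the uploaded file are readable from the local filesystem. The helper names `sha256_of_file` and `files_match` are illustrative only and are not part of SlideLoader.

```python
import hashlib


def sha256_of_file(path, block_size=1024 * 1024):
    """Return the uppercase hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size blocks so a multi-gigabyte slide is never
        # loaded into memory all at once.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest().upper()


def files_match(original_path, uploaded_path):
    """True when both files have identical SHA-256 digests."""
    return sha256_of_file(original_path) == sha256_of_file(uploaded_path)
```

Calling `files_match(original, uploaded)` after the upload finishes gives a pass/fail result equivalent to comparing the two digests by hand.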
❌ Parallel browser upload
Using a browser uploader that issues chunk uploads concurrently (similar to how Chrome behaves during heavy uploads or retry conditions), uploads also become corrupted.
The server still returns 200 and proceeds to finish, even though the resulting file does not match the original.
To Reproduce
1. Python Out-of-Order Script
Upload a slide using a Python script that intentionally sends chunks in the wrong order.
Chunk order example:
chunk 0 (offset 0)
chunk 2 (offset 2*chunk_size)
chunk 1 (offset 1*chunk_size)
then chunks 3+
SlideServer will:
- accept all chunks
- return 200
- save a corrupted file
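The send order can be generated with a small helper like the one below. This is only a sketch: it assumes each chunk is sent with its true byte offset (`index * chunk_size`) and only the send order is shuffled. The function name `out_of_order_plan` is hypothetical, and the HTTP calls are deliberately left out because the endpoint details depend on the SlideServer deployment.

```python
def out_of_order_plan(total_size, chunk_size, swap=(1, 2)):
    """Return (index, offset, length) tuples with two send positions swapped."""
    count = (total_size + chunk_size - 1) // chunk_size  # ceiling division
    order = list(range(count))
    i, j = swap
    if max(i, j) < count:
        # Swap the send positions; the offsets below stay tied to the index.
        order[i], order[j] = order[j], order[i]
    plan = []
    for index in order:
        offset = index * chunk_size
        length = min(chunk_size, total_size - offset)  # last chunk may be short
        plan.append((index, offset, length))
    return plan


if __name__ == "__main__":
    for index, offset, length in out_of_order_plan(total_size=10, chunk_size=3):
        print(f"chunk {index} (offset {offset}, {length} bytes)")
```

Iterating over the plan and posting each `(offset, data[offset:offset + length])` pair in that order reproduces the 0 → 2 → 1 → 3 sequence described above.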
2. Browser Aggressive Upload
Create a small browser uploader using Fetch that uploads chunks concurrently (like Chrome does).
With concurrency = 6, I observed:
- out-of-order requests
- overlapping offsets
- retries
- corrupted final file
### Expected Behavior
SlideServer should:
- Reject out-of-order chunks (409 Offset Mismatch)
- Validate offset == current_temp_file_size
- Prevent overlapping writes
- Optionally verify SHA256 at /finish
- Expose correct resume behavior if validation fails
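The first, second, third, and last points could look roughly like the following framework-free sketch. It assumes the server appends to one temp file per upload and that a chunk is valid only when its offset equals the current file size. The function name `append_chunk` and the response fields `expected_offset` and `next_offset` are illustrative and are not existing SlideServer fields.

```python
import os


def append_chunk(tmppath, offset, data):
    """Append `data` only if `offset` equals the current temp file size.

    Returns (status_code, body) in the style of an HTTP handler.
    """
    current_size = os.path.getsize(tmppath) if os.path.exists(tmppath) else 0
    if offset != current_size:
        # Out-of-order, overlapping, or retried chunk: refuse it and tell
        # the client which offset to resume from.
        return 409, {"error": "Offset Mismatch", "expected_offset": current_size}
    with open(tmppath, "ab") as f:
        f.write(data)
    return 200, {"next_offset": current_size + len(data)}
```

Because the only accepted offset is the current end of the file, gaps, overlaps, and replays of an earlier offset are all rejected by the same check, and the `expected_offset` value gives the client enough information to resume.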
### Actual Behavior
- Out-of-order chunks are written without checking
- Final file is corrupted but server returns success
- No warnings or log messages
- No way for client to detect corruption
- Browser concurrency triggers silent corruption
Desktop:
- OS: Windows 11 (64-bit)
- Browser: Google Chrome
- Browser Version: 119.0.6045.200
- Python: Anaconda Python 3.10 (used for running SlideServer and tests)
- SlideLoader setup: Local clone on Windows filesystem (D:\Gsoc\SlideLoader)
Suggested Fix
I can provide a PR including:
- offset validation:

      current_size = os.path.getsize(tmppath)
      if offset != current_size:
          return 409
- per-token write locks
- optional SHA256 integrity check
- improved error responses
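As a rough illustration of the per-token lock idea, the sketch below serializes the size check and the write for each upload token so that two concurrent requests cannot both pass validation. It assumes a single server process handling requests in threads. The names `lock_for` and `locked_append` are hypothetical.

```python
import os
import threading

_registry_lock = threading.Lock()  # guards the dictionary of per-token locks
_token_locks = {}


def lock_for(token):
    """Return the lock for `token`, creating it on first use."""
    with _registry_lock:
        lock = _token_locks.get(token)
        if lock is None:
            lock = _token_locks[token] = threading.Lock()
        return lock


def locked_append(token, tmppath, offset, data):
    """Validate the offset and append under the token's lock.

    Returns an HTTP-style status code: 200 on success, 409 on mismatch.
    """
    with lock_for(token):
        current_size = os.path.getsize(tmppath) if os.path.exists(tmppath) else 0
        if offset != current_size:
            return 409
        with open(tmppath, "ab") as f:
            f.write(data)
        return 200
```

A `threading.Lock` only protects requests inside one process. A deployment with several worker processes would need an inter-process mechanism such as file locking instead.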
Let me know if you’d like a PR.
