Firefox‐based HTTP fetching bridge proof-of-concept #7800
This proposal introduces a Firefox‐based HTTP fetching bridge for gallery-dl, leveraging a real browser (Firefox ESR 128+) instead of pure-Python HTTP libraries. By routing requests through a Firefox extension and native messaging host, we can bypass aggressive anti-scraping measures on sites like Fanbox or Patreon, without resorting to brittle TLS‐signature emulation or OAuth workarounds.
Motivation
Many modern websites employ sophisticated bot-detection techniques, ranging from HTTP/2 fingerprinting to dynamic JavaScript challenges, that are extremely difficult to replicate in pure Python. For example, when scraping Fanbox, only the first post downloads successfully; subsequent requests return HTTP 403. Rather than impersonate a browser, we can use a real browser engine to perform the HTTP fetches, inheriting all of its native features (TLS stack, HTTP/2 support, cookies, JavaScript execution, etc.).
Architecture Overview
Firefox Extension
fetch() in the page context, inheriting cookies, JS, and TLS/HTTP2.
Native Messaging Host (Python)
127.0.0.1:8888).
HTTPS → HTTP URL Rewriting Trick
https://… → http://… before sending the request. The extension then restores the original scheme inside Firefox and issues the secure fetch.
Installation & Setup (Linux)
Unpack & Inspect the XPI
Get the bridge here: https://github.com/joeydominic/ff-fetch-bridge
Review manifest.json and JS to verify there’s no malicious code.
Install the Firefox Extension
about:addons → “Install Add-on From File…”
xpinstall.signatures.required = false in about:config (see Mozilla docs).
Register the Native Messaging Host
/opt/ff-fetch-bridge/ipc.py.
Launch Firefox
127.0.0.1:8888.
Usage in gallery-dl
Add or update your config under the relevant extractor (e.g. fanbox):
Developer Notes & Debugging
Extension Debug Console
about:debugging#/runtime/this-firefox → “Inspect” on the fetch bridge.
Enable Verbose Logging
127.0.0.1:18888.
console.log) appear in the extension console.
/tmp/dbg.log and to stderr (viewable in the global browser console via Ctrl + Shift + J).
Timeout Considerations
"timeout" as needed (e.g. 600 s).
Large File Downloads
Known Limitations
Both extension and proxy fully buffer each response in RAM. Large files on low-RAM systems may trigger OOM errors.
Pure CONNECT requests can’t be used; the HTTP rewrite hack is mandatory.
Sites behind Cloudflare may present interstitial JS challenges that require user interaction or headless JS completion.
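The HTTP rewrite mentioned in the limitations above exists because an HTTP proxy only sees an opaque CONNECT tunnel for https:// URLs, whereas a plain http:// request carries the full target URL to the proxy. A minimal sketch of the scheme swap follows. The function names are hypothetical and are not taken from this PR or from the extension, and the restore step assumes every bridged request was originally HTTPS, which may not match what the extension actually does.

```python
from urllib.parse import urlsplit, urlunsplit

# Proxy address as given in this write-up.
BRIDGE_PROXY = "http://127.0.0.1:8888"


def downgrade_scheme(url: str) -> str:
    """Client side (hypothetical helper): turn https:// into http://
    so the proxy receives the full URL instead of a CONNECT tunnel."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


def restore_scheme(url: str) -> str:
    """Bridge side (hypothetical helper): put https:// back before the
    real fetch. Assumes all bridged requests were originally HTTPS."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


if __name__ == "__main__":
    original = "https://example.com/posts?page=2"
    sent = downgrade_scheme(original)
    print(sent)                  # URL as it would be sent via BRIDGE_PROXY
    print(restore_scheme(sent))  # URL the browser would actually fetch
```

Only the scheme changes; host, port, path, query, and fragment pass through untouched, so the swap is reversible as long as the stated assumption holds.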
TODO
Cloudflare Challenge Handling
Detect Cloudflare blocks, automatically open the challenge page in a new tab, and either:
Plea for Core Integration & Community Support
I kindly urge the gallery-dl maintainers to merge FF Fetch Bridge support into the main branch. I bet I implemented the integration poorly, sorry for that. With built-in extension detection and automatic HTTPS→HTTP rewriting, users would enjoy out-of-the-box resilience against modern anti-bot defenses.
Moreover, I invite the community to adopt and maintain the Firefox extension, for example by improving cross-platform support and Cloudflare handling. I'm a newbie to extensions and wrote this using ChatGPT :)
Thank you for considering this enhancement! I’m eager to collaborate on refinement, address questions, or help integrate it into gallery-dl’s codebase.