Restructure marker and paddleocr packages
#43
Pull Request Overview
This PR restructures the `marker` and `paddleocr` packages by moving them from the `src/ocr/` directory to dedicated workspace packages in `packages/ocr/`. The change improves dependency management by using `uv` workspaces and updates the Docker configurations for better caching.
- Converted `marker` and `paddleocr` from nested modules to standalone workspace packages
- Updated Docker configurations to use multi-stage builds with dependency caching
- Simplified environment variable handling using `DATA_FOLDER` consistently across services
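As an illustration of what consistent `DATA_FOLDER` handling could look like on the Python side, here is a minimal sketch. The helper names (`get_data_folder`, `resolve_input`) and the `data` fallback are assumptions made for this example; they are not taken from the PR.

```python
import os
from pathlib import Path

# Hypothetical fallback: the PR does not state a default location.
DEFAULT_DATA_FOLDER = "data"


def get_data_folder(env=None) -> Path:
    """Return the absolute folder named by DATA_FOLDER, or the fallback."""
    env = os.environ if env is None else env
    raw = env.get("DATA_FOLDER", "").strip()
    return Path(raw or DEFAULT_DATA_FOLDER).expanduser().resolve()


def resolve_input(filename: str, env=None) -> Path:
    """Resolve a file name inside the data folder, rejecting path escapes."""
    base = get_data_folder(env)
    candidate = (base / filename).resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    if not candidate.is_relative_to(base):
        raise ValueError(f"{filename!r} is outside the data folder")
    return candidate
```

Reading the variable in one place means every service derives its paths from the same value that the compose file uses for the volume mount, rather than each router keeping its own variable.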
Reviewed Changes
Copilot reviewed 21 out of 25 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/ocr/paddleocr/* | Removed old paddleocr implementation files |
| src/ocr/marker/* | Removed old marker implementation files |
| packages/ocr/paddleocr/* | New paddleocr workspace package with pyproject.toml |
| packages/ocr/marker/* | New marker workspace package with simplified API |
| src/api/app/routers/*.py | Updated to use aiohttp for async requests and simplified environment handling |
| docker-compose.yml | Updated build contexts and volume mounting for new package structure |
| pyproject.toml | Added new workspace members and optional dependencies |
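The router change to `aiohttp` summarised in the table might look roughly like the sketch below, which forwards a PDF to an OCR service without blocking the event loop. The function names, the `file` form field, and the five-minute timeout are assumptions for illustration only; the actual routers in `src/api/app/routers/` may differ.

```python
from pathlib import Path


def service_url(host: str, port: int, endpoint: str) -> str:
    """Build the URL of an OCR service from its host, port, and endpoint."""
    return f"http://{host}:{port}/{endpoint.lstrip('/')}"


async def forward_pdf(url: str, pdf_path: Path, timeout_s: float = 300.0) -> dict:
    """POST a PDF as multipart form data and return the decoded JSON reply."""
    # Third-party dependency, imported lazily so that service_url can be
    # used without aiohttp installed.
    import aiohttp

    form = aiohttp.FormData()
    # "file" is a hypothetical field name; use whatever the service expects.
    form.add_field(
        "file",
        pdf_path.read_bytes(),
        filename=pdf_path.name,
        content_type="application/pdf",
    )
    timeout = aiohttp.ClientTimeout(total=timeout_s)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.post(url, data=form) as response:
            response.raise_for_status()  # raise on 4xx and 5xx replies
            return await response.json()
```

A caller inside an async route would `await forward_pdf(service_url("marker", 8000, "inference"), path)`, where the host, port, and endpoint shown here are placeholders rather than values from the PR.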
Co-authored-by: Copilot <[email protected]>
Follows on from #42:
- `packages/ocr/marker` and `packages/ocr/paddleocr` packages as `uv` workspaces
- `pyonb_paddleocr` API and Dockerfile to be consistent with the other OCR tools in `pyonb`
- `pyonb_marker` and `pyonb_paddleocr` Dockerfiles to cache dependencies
- `DATA_FOLDER` for mounting the `marker` and `paddleocr` data volumes
- `pyonb` and running inference with `marker` and `paddleocr`
- `marker` and `docling`