@Zhangg7723
Collaborator

Summary

  • Added a new PowerRAG SDK for simplified access to PowerRAG API functionalities, including document processing, knowledge base management, and extraction.
  • Enhanced documentation and tests for the new SDK and proxy features.

This commit lays the groundwork for improved user experience and functionality in the PowerRAG ecosystem.

Solution Description

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a PowerRAG SDK and API proxy layer to simplify access to PowerRAG functionalities. The main additions include a Python SDK for programmatic access and a proxy layer that routes PowerRAG API requests through the main RAGFlow service.

Key Changes:

  • Added a comprehensive Python SDK (powerrag/sdk/) with modules for knowledge base, document, chunk, extraction, RAPTOR, knowledge graph, and retrieval management
  • Introduced an API proxy (api/apps/sdk/powerrag_proxy.py) to forward PowerRAG requests from RAGFlow to the PowerRAG server
  • Migrated the server from Flask to Quart for async support and updated all route handlers accordingly
  • Added an extensive test suite and documentation for the SDK

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| powerrag/sdk/client.py | Main SDK client providing HTTP request methods and manager module initialization |
| powerrag/sdk/modules/*_manager.py | Manager classes for each functional module (7 modules total) |
| powerrag/sdk/modules/*.py | TypedDict data models for type safety |
| powerrag/sdk/tests/*.py | Comprehensive test suite covering all modules |
| api/apps/sdk/powerrag_proxy.py | API proxy for forwarding requests to PowerRAG server |
| powerrag/server/app.py | Migration from Flask to Quart with async support |
| powerrag/server/routes/*.py | Updated route handlers to async functions |
| powerrag/server/services/parse_to_md_task_manager.py | New task manager for async parse_to_md operations |

Comments suppressed due to low confidence (2)

powerrag/server/routes/powerrag_routes.py:1

  • This route handler is not declared as async while all other route handlers in this file have been updated to async functions. For consistency with the Quart migration and to avoid potential issues with the async application context, this should also be an async function.

powerrag/server/services/split_service.py:1

  • The comment mentions '直接引用同一模块中定义的函数' (directly reference functions defined in the same module), but the actual code that referenced these functions (the global statement) was removed. The comment should be updated to explain why the global statement was removed or what the new approach is.
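
The first suppressed comment asks that every handler in the routes module be an async function. One way to catch a missed handler early is a small check built on `inspect.iscoroutinefunction`. The sketch below is only an illustration: the handler names and the `HANDLERS` list are hypothetical stand-ins, not the functions defined in powerrag_routes.py.

```python
import inspect
from typing import Callable, Iterable, List


async def parse_document() -> dict:
    """Hypothetical handler already migrated to async."""
    return {"code": 0}


def health_check() -> dict:
    """Hypothetical handler that was left synchronous."""
    return {"code": 0}


def find_sync_handlers(handlers: Iterable[Callable]) -> List[str]:
    """Return the names of handlers that are not coroutine functions."""
    return [h.__name__ for h in handlers if not inspect.iscoroutinefunction(h)]


# Hypothetical registry; a real check would collect the app's view functions.
HANDLERS = [parse_document, health_check]

print(find_sync_handlers(HANDLERS))  # ['health_check']
```

A check like this could run in the test suite so that a handler left synchronous is flagged before review.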

page_size: int = 30,
orderby: str = "create_time",
desc: bool = True,
) -> tuple[List[KnowledgeBaseInfo], int]:

Copilot AI Jan 4, 2026

Using tuple[...] syntax requires Python 3.9+. For better compatibility with Python 3.8 (as stated in README), use Tuple[...] from the typing module instead.
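
On Python 3.8, a return annotation such as `tuple[List[X], int]` is evaluated when the function is defined and raises `TypeError: 'type' object is not subscriptable`. A minimal sketch of the suggested fix follows. `KnowledgeBaseInfo` here is a simplified stand-in TypedDict and `list_kbs` is a hypothetical function, not the SDK's actual method.

```python
from typing import List, Tuple, TypedDict


class KnowledgeBaseInfo(TypedDict):
    """Simplified stand-in for the SDK's data model (assumption)."""
    id: str
    name: str


def list_kbs(page: int = 1, page_size: int = 30) -> Tuple[List[KnowledgeBaseInfo], int]:
    """Return one page of items plus the total count, using the 3.8-safe Tuple."""
    items: List[KnowledgeBaseInfo] = [{"id": str(i), "name": f"kb-{i}"} for i in range(5)]
    start = (page - 1) * page_size
    return items[start:start + page_size], len(items)


kbs, total = list_kbs(page=1, page_size=2)
print(len(kbs), total)  # 2 5
```

An alternative is to add `from __future__ import annotations` at the top of each module, which defers evaluation of annotations so that `tuple[...]` no longer fails at definition time on 3.8. Code that resolves the hints at runtime, for example through `typing.get_type_hints`, would still fail there.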

keywords: Optional[str] = None,
page: int = 1,
page_size: int = 30,
) -> tuple[List[ChunkInfo], int, Dict[str, Any]]:

Copilot AI Jan 4, 2026

Using tuple[...] syntax requires Python 3.9+. For better compatibility with Python 3.8 (as stated in README), use Tuple[...] from the typing module instead.

create_time_to: int = 0,
suffix: Optional[List[str]] = None,
run: Optional[List[str]] = None,
) -> tuple[List[DocumentInfo], int]:

Copilot AI Jan 4, 2026

Using tuple[...] syntax requires Python 3.9+. For better compatibility with Python 3.8 (as stated in README), use Tuple[...] from the typing module instead.

# Create an async HTTP client (uses a connection pool for better performance)
_http_client = httpx.AsyncClient(
timeout=httpx.Timeout(300.0, connect=10.0), # 5分钟总超时,10秒连接超时

Copilot AI Jan 4, 2026

The comment states '5分钟总超时' (5-minute total timeout), and the configured value of 300.0 seconds is indeed 5 minutes, so the comment and the code agree. The comment would be clearer if it also gave the value in seconds, for example '300秒(5分钟)总超时' (300-second (5-minute) total timeout).

Suggested change
- timeout=httpx.Timeout(300.0, connect=10.0), # 5分钟总超时,10秒连接超时
+ timeout=httpx.Timeout(300.0, connect=10.0), # 300秒(5分钟)总超时,10秒连接超时
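
A note on the wording "total timeout": as the httpx documentation describes it, the first positional argument of `httpx.Timeout` is the default applied to each phase (connect, read, write, pool) rather than a budget for the whole request, so `Timeout(300.0, connect=10.0)` allows up to 300 seconds per read, not per request. If an overall deadline is actually wanted, one option is to wrap the call in `asyncio.wait_for`. The sketch below uses only the standard library. `slow_upstream_call` is a hypothetical stand-in for the proxied request, and the short durations are chosen only so the example finishes quickly.

```python
import asyncio


async def slow_upstream_call(delay: float) -> str:
    """Hypothetical stand-in for a forwarded request that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "ok"


async def call_with_deadline(delay: float, deadline: float) -> str:
    """Enforce one overall deadline on the whole call, independent of per-phase timeouts."""
    try:
        return await asyncio.wait_for(slow_upstream_call(delay), timeout=deadline)
    except asyncio.TimeoutError:
        return "timed out"


print(asyncio.run(call_with_deadline(delay=0.01, deadline=0.5)))  # ok
print(asyncio.run(call_with_deadline(delay=0.5, deadline=0.01)))  # timed out
```

Whether the proxy needs such a deadline depends on how long the upstream operations are expected to run. The review comment itself only concerns the wording of the inline comment.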

@whhe merged commit 8f3aff0 into oceanbase:main on Jan 4, 2026
7 of 8 checks passed
