06 Jan 07:19

Zhangg7723

058ef91

v0.2.1 Latest

What's Changed

fix: chat with knowledge base returns not found by @Zhangg7723 in #21
feat: add PowerRAG SDK and API Proxy by @Zhangg7723 in #23
hotfix: use locally built deps image in GHA by @whhe in #24
ci: change to publish Docker images to DockerHub for main branch and tags by @whhe in #26
Add powerrag sdk publish action by @Zhangg7723 in #27
fix MinerU API configuration error by @Zhangg7723 in #28

Full Changelog: v0.2.0...v0.2.1

Contributors

Zhangg7723 and whhe

16 Dec 04:00

whhe

v0.2.0

575d9a2

Release Notes

v0.2.0

RAGFlow Integration

This release integrates RAGFlow updates from v0.21.1 to v0.22.1, bringing the following improvements:

From RAGFlow v0.22.1:

Agent: Supports exporting Agent outputs in Word or Markdown formats
Agent: Adds a List operations component
Agent: Adds a Variable aggregator component
Data sources: Supports S3-compatible data sources, e.g., MinIO
Data sources: Adds data synchronization with JIRA
Continues the redesign of the Profile page layouts
Upgrades the Flask web framework from synchronous to asynchronous, increasing concurrency and preventing blocking issues caused when requesting upstream LLM services
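
The concurrency benefit of the synchronous-to-asynchronous change noted above can be illustrated with a minimal sketch. This is not RAGFlow's or PowerRAG's code: it uses only Python's standard `asyncio`, and `call_upstream_llm`, `handle_requests`, and `run_demo` are hypothetical names, with `asyncio.sleep` standing in for a slow upstream LLM request.

```python
import asyncio
import time


async def call_upstream_llm(prompt: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a slow upstream LLM request (no network I/O)."""
    await asyncio.sleep(delay)  # yields control so other requests can proceed
    return f"answer to: {prompt}"


async def handle_requests(prompts: list[str]) -> list[str]:
    """Serve several requests concurrently instead of one after another."""
    return await asyncio.gather(*(call_upstream_llm(p) for p in prompts))


def run_demo() -> tuple[list[str], float]:
    """Run three simulated requests and report the total wall-clock time."""
    start = time.perf_counter()
    answers = asyncio.run(handle_requests(["a", "b", "c"]))
    return answers, time.perf_counter() - start
```

Because each simulated request awaits instead of blocking, the three calls overlap and the total time stays close to one request's delay rather than the sum of all three.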

From RAGFlow v0.22.0:

Dataset: Supports data synchronization from five online sources (AWS S3, Google Drive, Notion, Confluence, and Discord)
Dataset: RAPTOR can be built across an entire dataset or on individual documents
Ingestion pipeline: Supports Docling document parsing in the Parser component
Launches a new administrative Web UI dashboard for graphical user management and service status monitoring
Agent: Supports structured output
Agent: Supports metadata filtering in the Retrieval component
Agent: Introduces a Variable aggregator component with data operation and session variable definition capabilities
Upgrades RAGFlow's document engine Infinity to v0.6.5

New Features

Optimized Gotenberg Functions (#7)
- Enhanced document conversion capabilities
OceanBase Docker Configuration (#4)
- Updated OceanBase docker configuration for better deployment
Enhanced Search Performance (#17)
- Improved search functionality for better performance

Improvements

Refactored Title and Regex Based Chunk Method (#16)
- Improved chunking logic for better document processing
Updated Merging Logic in split_with_title_chunks (#8)
- Enhanced chunk merging algorithm
Simplified String Escaping (#13)
- Refactored string escaping in get_value_str and OBConnection for better maintainability
Docker Configuration and Documentation (#12)
- Updated docker configurations and README
Build Workflow (#9)
- Added workflow to build dev docker image
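
As a rough illustration of title- and regex-based chunking with a merge step, the sketch below starts a new chunk at each heading line and folds undersized chunks together. It is an assumption-laden example, not the project's `split_with_title_chunks`: the function name `split_by_title`, the heading pattern `TITLE_RE`, and the `min_chars` threshold are all hypothetical.

```python
import re

# Hypothetical heading pattern: Markdown "#" headings or numbered titles like "1.2 Scope".
TITLE_RE = re.compile(r"^(#{1,6}\s+\S.*|\d+(\.\d+)*\s+\S.*)$")


def split_by_title(text: str, min_chars: int = 40) -> list[str]:
    """Start a new chunk at each title line, then merge undersized chunks."""
    chunks: list[list[str]] = []
    for line in text.splitlines():
        if TITLE_RE.match(line) or not chunks:
            chunks.append([line])  # a title (or the very first line) opens a chunk
        else:
            chunks[-1].append(line)

    merged: list[str] = []
    for lines in chunks:
        body = "\n".join(lines).strip()
        if not body:
            continue
        if merged and len(merged[-1]) < min_chars:
            # The previous chunk is too small on its own, so it absorbs this one.
            merged[-1] = merged[-1] + "\n" + body
        else:
            merged.append(body)
    return merged
```

Merging small chunks into their neighbor keeps short sections, such as a heading followed by a single line, from becoming retrieval units with too little context.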

Bug Fixes

Fixed PowerRAG Server Timeout Error (#5)
- Resolved timeout issues in PowerRAG server
Fixed Image Source Lost in Smart Chunks (#11)
- Fixed issue where image sources were lost during smart chunking
Fixed Security Alerts and Chunk Saved Error (#19)
- Resolved security issues and chunk saving errors

Contributors

Thanks to all contributors who made this release possible.

Full Changelog: v0.1.0...v0.2.0

18 Nov 04:45

whhe

v0.1.0

4158263

New features

New parsing and processing workflow: Introduces a custom parsing and processing pipeline, providing more flexible document parsing and chunking strategies. This feature includes a complete pipeline of parsing, extraction, conversion, and splitting, enabling users to customize data processing pipelines according to business requirements.
Custom Parsers: Adds support for multiple parsing methods including MinerU, vLLM, and DotsOCR. The MinerU parser supports calling remote services via HTTP API, the vLLM parser enables document understanding using large language models, and the DotsOCR parser specializes in processing documents containing charts and formulas.
Flow Components: Includes core components such as converters, extractors, parsers, and splitters. Converters handle document format conversion, extractors support entity extraction and metadata extraction, parsers process various document formats, and splitters provide intelligent chunking strategies.
Server and APIs: Provides complete backend service support, including a standalone PowerRAG server, RESTful API interfaces, and task queue management. Supports asynchronous task processing and task status queries.
Frontend Updates: Adds PowerRAG-related configuration interfaces and operational components, including parser configuration forms, flow design interfaces, and task monitoring panels, improving user experience.
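
The converter, parser, extractor, and splitter components described above can be pictured as interchangeable stages applied in sequence. The following is a minimal sketch under stated assumptions, not PowerRAG's actual API: the `Document` class, the four stage functions, `run_pipeline`, and the stage order are all hypothetical, and each stage does only trivial work.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Hypothetical container passed from stage to stage."""
    text: str
    metadata: dict[str, str] = field(default_factory=dict)
    chunks: list[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def convert(doc: Document) -> Document:
    # Converter: normalize the input format (here, only line endings).
    doc.text = doc.text.replace("\r\n", "\n")
    return doc


def parse(doc: Document) -> Document:
    # Parser: strip surrounding whitespace from the raw text.
    doc.text = doc.text.strip()
    return doc


def extract(doc: Document) -> Document:
    # Extractor: record simple metadata (the first line as the title).
    doc.metadata["title"] = doc.text.split("\n", 1)[0]
    return doc


def split(doc: Document) -> Document:
    # Splitter: one chunk per blank-line-separated paragraph.
    doc.chunks = [p.strip() for p in doc.text.split("\n\n") if p.strip()]
    return doc


def run_pipeline(doc: Document, stages: list[Stage]) -> Document:
    """Apply each stage in order; callers choose which stages to include."""
    for stage in stages:
        doc = stage(doc)
    return doc
```

Passing the stages as a list is one way to let callers reorder or swap components, for example `run_pipeline(Document(raw_text), [convert, parse, extract, split])`.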

Releases: oceanbase/powerrag