feat: Data backup #4105

RC-CHN · 2025-12-18T06:54:22Z

Modifications / 改动点

添加了从orm导出数据库，备份插件目录以及再导入的相关方法，在web上添加了相关页面与路由允许用户导入导出astrbot备份。

This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

😊 如果 PR 中有新加入的功能，已经通过 Issue / 邮件等方式和作者讨论过。/ If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
👀 我的更改经过了良好的测试，并已在上方提供了“验证步骤”和“运行截图”。/ My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
🤓 我确保没有引入新依赖库，或者引入了新依赖库的同时将其添加到了 requirements.txt 和 pyproject.toml 文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
😮 我的更改没有引入恶意代码。/ My changes do not introduce malicious code.

Summary by Sourcery

在 AstrBot 核心和控制台中新增完整的数据备份与恢复能力，涵盖后端 API、导入/导出逻辑，以及设置中的 Web UI 对话框。

新功能：

引入后端导出和导入模块，以 ZIP 归档的形式创建和恢复完整的 AstrBot 备份，包括数据库、配置、附件、插件和知识库数据。
暴露备份管理 HTTP API，用于列出、创建、导入、下载和删除备份文件，并支持异步任务跟踪和进度报告。
在控制台 UI 的设置中新增备份对话框，用于导出、导入、列出、下载和删除备份，并提供进度指示和重启集成功能。

优化改进：

扩展控制台设置页面，新增备份入口和对话组件，并接入现有的重启流程。

测试：

新增单元测试和集成测试，覆盖备份导出/导入器行为、清单生成、错误处理以及导出-导入的完整闭环流程。

Original summary in English

Summary by Sourcery

Add a full data backup and restore capability across the AstrBot core and dashboard, including backend APIs, import/export logic, and a web UI dialog in settings.

New Features:

Introduce backend export and import modules to create and restore complete AstrBot backups as ZIP archives, including databases, configuration, attachments, plugins, and knowledge base data.
Expose backup management HTTP APIs for listing, creating, importing, downloading, and deleting backup files with async task tracking and progress reporting.
Add a settings dialog in the dashboard UI to export, import, list, download, and delete backups, with progress indicators and restart integration.

Enhancements:

Extend the dashboard settings page with a backup entry point and dialog component wired into the existing restart flow.

Tests:

Add unit and integration tests covering backup exporter/importer behavior, manifest generation, error handling, and an export-import roundtrip.

sourcery-ai

你好，我已经查看了你的改动，有一些问题需要处理。

在 AstrBotImporter._import_directories 中，backup_version 目前以字符串形式比较（例如 if backup_version < "1.1"），对于多位数版本（如 "1.10" 和 "1.2"）会出现错误行为；建议在比较前先将其解析为语义化版本或数字元组。
在 BackupDialog.vue 中，v-file-input 通常会返回一个 File 对象数组，但 startImport 直接将 importFile.value 追加到 FormData；请确保正确处理：可以使用数组的第一个元素（例如 importFile.value[0]），或者将 v-file-input 配置为 multiple=false 并显式处理其类型。
BackupRoute 中内存里的 backup_tasks 和 backup_progress 字典从未被清理，因此长时间运行的实例可能会无限积累任务状态；建议为已完成/失败的任务设置过期或定期清理，或限制保留的历史数量。

给 AI Agent 的提示词

请根据以下代码审查中的评论进行修改：

## 总体评论
- 在 `AstrBotImporter._import_directories` 中，`backup_version` 目前以字符串形式比较（例如 `if backup_version < "1.1"`），对于多位数版本（如 "1.10" 和 "1.2"）会出现错误行为；建议在比较前先将其解析为语义化版本或数字元组。
- 在 `BackupDialog.vue` 中，`v-file-input` 通常会返回一个 `File` 对象数组，但 `startImport` 直接将 `importFile.value` 追加到 `FormData`；请确保正确处理：可以使用数组的第一个元素（例如 `importFile.value[0]`），或者将 `v-file-input` 配置为 `multiple=false` 并显式处理其类型。
- `BackupRoute` 中内存里的 `backup_tasks` 和 `backup_progress` 字典从未被清理，因此长时间运行的实例可能会无限积累任务状态；建议为已完成/失败的任务设置过期或定期清理，或限制保留的历史数量。

## 逐条评论

### 评论 1
<location> `astrbot/dashboard/routes/backup.py:273-279` </location>
<code_context>
+                if "file" not in files:
+                    return Response().error("缺少备份文件").__dict__
+
+                file = files["file"]
+                if not file.filename or not file.filename.endswith(".zip"):
+                    return Response().error("请上传 ZIP 格式的备份文件").__dict__
+
+                # 保存上传的文件
+                Path(self.backup_dir).mkdir(parents=True, exist_ok=True)
+                zip_path = os.path.join(self.backup_dir, file.filename)
+                await file.save(zip_path)
+            else:
</code_context>

<issue_to_address>
**🚨 issue (security):** 直接使用上传文件名而不做清洗，会带来路径问题和意外覆盖的风险。

这里将文件写入 `self.backup_dir` 时使用了未经处理的 `file.filename`。即使目录是固定的，由客户端控制的文件名在某些平台上仍可能导致路径遍历问题，并可能覆盖已有备份。建议对文件名进行规范化处理（例如使用类似 `secure_filename` 的工具方法），并/或生成唯一的内部文件名（UUID/时间戳），如有需要，仅在元数据中保留原始文件名用于展示。
</issue_to_address>

### 评论 2
<location> `astrbot/core/backup/importer.py:523-524` </location>
<code_context>
+        dir_stats: dict[str, int] = {}
+
+        # 检查备份版本是否支持目录备份
+        backup_version = manifest.get("version", "1.0")
+        if backup_version < "1.1":
+            logger.info("备份版本不支持目录备份，跳过目录导入")
+            return dir_stats
</code_context>

<issue_to_address>
**issue (bug_risk):** 使用字符串比较 manifest 版本过于脆弱，在多位数版本号时会出现错误行为。

这里对 `backup_version` 使用字符串比较（例如 `"1.10" < "1.2"` 为 `True`），在多位数版本时会产生错误行为。由于该条件决定是否导入目录备份，建议将版本解析为数值组件（例如 `major, minor = map(int, backup_version.split('.'))`）再进行比较，或者使用专门的语义化版本比较工具方法来处理。
</issue_to_address>

### 评论 3
<location> `astrbot/core/backup/exporter.py:63-70` </location>
<code_context>
+}
+
+
+# 需要备份的目录列表
+BACKUP_DIRECTORIES = {
+    "plugins": "data/plugins",  # 插件本体
+    "plugin_data": "data/plugin_data",  # 插件数据
+    "config": "data/config",  # 配置目录
+    "t2i_templates": "data/t2i_templates",  # T2I 模板
+    "webchat": "data/webchat",  # WebChat 数据
+    "temp": "data/temp",  # 临时文件
+}
+
</code_context>

<issue_to_address>
**suggestion:** `BACKUP_DIRECTORIES` 映射在导出端和导入端都有一份重复定义，建议进行集中管理。

在两个文件中复制这份映射，存在两者配置逐渐不一致的风险（例如新增了导出的目录但忘记在导入端添加）。建议只在一个共享模块（或其中一个文件）中暴露统一常量，并在两边复用，以保持行为一致。
</issue_to_address>

### 评论 4
<location> `astrbot/dashboard/routes/backup.py:420-424` </location>
<code_context>
+            if not os.path.exists(file_path):
+                return Response().error("备份文件不存在").__dict__
+
+            return await send_file(
+                file_path,
+                as_attachment=True,
+                attachment_filename=filename,
+            )
+        except Exception as e:
</code_context>

<issue_to_address>
**suggestion:** 在较新的 Werkzeug/Quart 版本中，`send_file` 的 `attachment_filename` 参数已经被弃用。

为避免弃用警告，并保持与新版本 Werkzeug/Quart 的兼容性，请将此参数替换为 `download_name=filename`，同时继续保留 `as_attachment=True`。`attachment_filename` 已被弃用，并可能在未来版本中移除。

```suggestion
            return await send_file(
                file_path,
                as_attachment=True,
                download_name=filename,
            )
```
</issue_to_address>

### 评论 5
<location> `astrbot/core/backup/importer.py:43` </location>
<code_context>
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+    "platform_stats": PlatformStat,
+    "conversations": ConversationV2,
</code_context>

<issue_to_address>
**issue (complexity):** 建议抽取共享映射/工具方法，并将较大的 `import_all` 流程拆分成更小的辅助函数，以在保持行为不变的前提下降低重复和结构复杂度。

可以在不改变任何功能的前提下，通过几个小而集中的重构显著降低结构复杂度：

---

### 1. 和导出逻辑去重映射/常量

`MAIN_DB_MODELS`、`KB_METADATA_MODELS` 和 `BACKUP_DIRECTORIES` 都是纯数据，并且和导出端重复。可以将它们提取到一个共享模块中，然后在导入/导出两侧引用：

```python
# astrbot/core/backup/common.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
    Attachment,
    CommandConfig,
    CommandConflict,
    ConversationV2,
    Persona,
    PlatformMessageHistory,
    PlatformSession,
    PlatformStat,
    Preference,
)
from astrbot.core.knowledge_base.models import (
    KBDocument,
    KBMedia,
    KnowledgeBase,
)

MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
    "platform_stats": PlatformStat,
    "conversations": ConversationV2,
    "personas": Persona,
    "preferences": Preference,
    "platform_message_history": PlatformMessageHistory,
    "platform_sessions": PlatformSession,
    "attachments": Attachment,
    "command_configs": CommandConfig,
    "command_conflicts": CommandConflict,
}

KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
    "knowledge_bases": KnowledgeBase,
    "kb_documents": KBDocument,
    "kb_media": KBMedia,
}

BACKUP_DIRECTORIES = {
    "plugins": "data/plugins",
    "plugin_data": "data/plugin_data",
    "config": "data/config",
    "t2i_templates": "data/t2i_templates",
    "webchat": "data/webchat",
    "temp": "data/temp",
}
```

然后在 importer 中：

```python
# from astrbot.core.db.po import ...
# from astrbot.core.knowledge_base.models import ...
from astrbot.core.backup.common import (
    MAIN_DB_MODELS,
    KB_METADATA_MODELS,
    BACKUP_DIRECTORIES,
)
```

这样可以去除重复，并让 schema/备份策略的修改更易于理解和维护。

---

### 2. 将 `import_all` 的编排逻辑拆分为更小的私有方法

`import_all` 目前在一个长方法中串联了很多职责。你可以通过拆分为多个“步骤”方法，在保持行为一致的情况下，让代码更易读、更易测试：

```python
# inside AstrBotImporter

async def import_all(self, zip_path: str, mode: str = "replace", progress_callback: Any | None = None) -> ImportResult:
    result = ImportResult()

    if not os.path.exists(zip_path):
        result.add_error(f"备份文件不存在: {zip_path}")
        return result

    logger.info(f"开始从 {zip_path} 导入备份")

    try:
        with zipfile.ZipFile(zip_path, "r") as zf:
            manifest = await self._load_and_validate_manifest(zf, result, progress_callback)
            if not result.success:
                return result

            main_data = await self._import_main_db_from_zip(zf, mode, result, progress_callback)
            await self._import_kb_from_zip(zf, manifest, main_data, result, mode, progress_callback)
            await self._import_config_from_zip(zf, result, progress_callback)
            await self._import_attachments_from_zip(zf, main_data, result, progress_callback)
            await self._import_directories_from_zip(zf, manifest, result, progress_callback)

        logger.info(f"备份导入完成: {result.to_dict()}")
        return result
    except zipfile.BadZipFile:
        result.add_error("无效的 ZIP 文件")
        return result
    except Exception as e:
        result.add_error(f"导入失败: {e}")
        return result
```

配合一些小的辅助方法（这里只展示签名和关键行）：

```python
async def _load_and_validate_manifest(self, zf, result, progress_callback) -> dict:
    if progress_callback:
        await progress_callback("validate", 0, 100, "正在验证备份文件...")
    try:
        manifest_data = zf.read("manifest.json")
        manifest = json.loads(manifest_data)
        self._validate_version(manifest)
    except (KeyError, json.JSONDecodeError, ValueError) as e:
        result.add_error(f"manifest 错误: {e}")
        return {}
    if progress_callback:
        await progress_callback("validate", 100, 100, "验证完成")
    return manifest

async def _import_main_db_from_zip(self, zf, mode, result, progress_callback) -> dict:
    if progress_callback:
        await progress_callback("main_db", 0, 100, "正在导入主数据库...")
    main_data = json.loads(zf.read("databases/main_db.json"))
    if mode == "replace":
        await self._clear_main_db()
    imported = await self._import_main_database(main_data)
    result.imported_tables.update(imported)
    if progress_callback:
        await progress_callback("main_db", 100, 100, "主数据库导入完成")
    return main_data
```

其它步骤（`_import_kb_from_zip`、`_import_config_from_zip` 等）也可以用类似方式封装现有内部调用，将编排样板代码从 `import_all` 中移出。

---

### 3. 将日期时间转换逻辑提取为可复用工具

`_convert_datetime_fields` 同时处理 SQLAlchemy 反射和时间解析。将其移动到模块级辅助函数（或共享备份工具）会让类更精简，并允许导出端/测试复用：

```python
# astrbot/core/backup/datetime_utils.py
from datetime import datetime
from typing import Any
from sqlalchemy import inspect as sa_inspect, DateTime

def convert_datetime_fields(row: dict[str, Any], model_class: type) -> dict[str, Any]:
    result = row.copy()
    try:
        mapper = sa_inspect(model_class)
        for column in mapper.columns:
            if column.name in result and result[column.name] is not None:
                if isinstance(column.type, DateTime) and isinstance(result[column.name], str):
                    result[column.name] = datetime.fromisoformat(result[column.name])
    except Exception:
        pass
    return result
```

然后在 importer 方法中：

```python
from astrbot.core.backup.datetime_utils import convert_datetime_fields

# ...
row = convert_datetime_fields(row, model_class)
```

---

### 4. 在 `_import_directories` 中避免基于字符串的版本比较

`backup_version < "1.1"` 这样的字符串比较在后续版本中不够健壮。切换到 `packaging.version` 可以让逻辑更可靠：

```python
from packaging.version import Version

# inside _import_directories
backup_version_str = manifest.get("version", "1.0")
backup_version = Version(backup_version_str)

if backup_version < Version("1.1"):
    logger.info("备份版本不支持目录备份，跳过目录导入")
    return dir_stats
```

这是一个对现有行为影响极小的修改，但可以避免未来出现像 "1.10" 和 "1.2" 这样的细微问题。

---

这些改动在保留现有行为的前提下，使 importer 更易于阅读、测试和演进，并且直接解决了此前指出的结构/重复问题，而无需进行大规模重构。
</issue_to_address>

### 评论 6
<location> `astrbot/core/backup/exporter.py:42` </location>
<code_context>
+    from astrbot.core.knowledge_base.kb_mgr import KnowledgeBaseManager
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+    "platform_stats": PlatformStat,
</code_context>

<issue_to_address>
**issue (complexity):** 建议将共享常量、序列化逻辑和 manifest 构建辅助方法提取到可复用模块中，从而在不改变行为的前提下，让导出/导入代码更易读、更易维护。

你可以通过几个不改变行为的定向抽取，降低这里的理解成本。

### 1. 在导入/导出间共享常量

`MAIN_DB_MODELS`、`KB_METADATA_MODELS` 和 `BACKUP_DIRECTORIES` 都是纯数据，并且已经与 importer 中的定义完全相同。将它们提取到共享模块可以消除重复，又不会改变行为。

```python
# astrbot/core/backup/constants.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
    Attachment, CommandConfig, CommandConflict, ConversationV2,
    Persona, PlatformMessageHistory, PlatformSession, PlatformStat, Preference,
)
from astrbot.core.knowledge_base.models import KBDocument, KBMedia, KnowledgeBase

MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
    "platform_stats": PlatformStat,
    "conversations": ConversationV2,
    "personas": Persona,
    "preferences": Preference,
    "platform_message_history": PlatformMessageHistory,
    "platform_sessions": PlatformSession,
    "attachments": Attachment,
    "command_configs": CommandConfig,
    "command_conflicts": CommandConflict,
}

KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
    "knowledge_bases": KnowledgeBase,
    "kb_documents": KBDocument,
    "kb_media": KBMedia,
}

BACKUP_DIRECTORIES = {
    "plugins": "data/plugins",
    "plugin_data": "data/plugin_data",
    "config": "data/config",
    "t2i_templates": "data/t2i_templates",
    "webchat": "data/webchat",
    "temp": "data/temp",
}
```

然后在 importer/exporter 中统一：

```python
from astrbot.core.backup.constants import (
    MAIN_DB_MODELS, KB_METADATA_MODELS, BACKUP_DIRECTORIES,
)
```

这可以直接解决审查意见中的第 (2) 点，并简化未来的 schema/目录调整。

### 2. 将模型序列化提取为共享工具

`_model_to_dict` 是通用逻辑，并不特定于导出流程。将其移动到共享工具模块，可以将 ORM 细节从 exporter 中抽离，并允许 importer 复用：

```python
# astrbot/core/backup/serialization.py
from datetime import datetime
from typing import Any, Dict

from sqlalchemy import inspect as sa_inspect

def model_to_dict(record: Any) -> Dict:
    # 优先使用 SQLModel 自带的 model_dump
    if hasattr(record, "model_dump"):
        data = record.model_dump(mode="python")
    else:
        data = {}
        mapper = sa_inspect(record.__class__)
        for column in mapper.columns:
            data[column.name] = getattr(record, column.name)

    # 将 datetime 统一转换为 ISO 字符串
    for key, value in list(data.items()):
        if isinstance(value, datetime):
            data[key] = value.isoformat()
    return data
```

在 exporter 中：

```python
# 原来：
# export_data[table_name] = [self._model_to_dict(record) for record in records]

from astrbot.core.backup.serialization import model_to_dict

export_data[table_name] = [model_to_dict(record) for record in records]
```

在 importer 中也可以同样使用，然后就可以完全删除 exporter 里的 `_model_to_dict`。

### 3. 将 `_generate_manifest` 拆分为更小的辅助方法

在保持类整体结构不变的前提下，可以通过提取收集逻辑为专用助手，使 manifest 构建代码更易于理解和测试，同时保持行为完全一致。

```python
def _collect_attachment_entries(self, main_data: dict[str, list[dict]]) -> list[str]:
    files: list[str] = []
    for attachment in main_data.get("attachments", []):
        attachment_id = attachment.get("attachment_id", "")
        path = attachment.get("path", "")
        if attachment_id and path:
            ext = os.path.splitext(path)[1]
            files.append(f"{attachment_id}{ext}")
    return files

def _collect_kb_media_entries(self) -> dict[str, list[str]]:
    kb_media_files: dict[str, list[str]] = {}
    if not self.kb_manager:
        return kb_media_files

    for kb_id, kb_helper in self.kb_manager.kb_insts.items():
        media_dir = kb_helper.kb_medias_dir
        if not media_dir.exists():
            continue
        files: list[str] = []
        for root, _, filenames in os.walk(media_dir):
            for f in filenames:
                files.append(f)
        if files:
            kb_media_files[kb_id] = files
    return kb_media_files

def _collect_kb_document_tables(self) -> dict[str, str]:
    if not self.kb_manager:
        return {}
    return {kb_id: "documents" for kb_id in self.kb_manager.kb_insts.keys()}
```

然后 `_generate_manifest` 主要是数据组装：

```python
def _generate_manifest(
    self,
    main_data: dict[str, list[dict]],
    kb_meta_data: dict[str, list[dict]],
    dir_stats: dict[str, dict[str, int]] | None = None,
) -> dict:
    if dir_stats is None:
        dir_stats = {}

    manifest = {
        "version": "1.1",
        "astrbot_version": VERSION,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "schema_version": {"main_db": "v4", "kb_db": "v1"},
        "tables": {
            "main_db": list(main_data.keys()),
            "kb_metadata": list(kb_meta_data.keys()),
            "kb_documents": self._collect_kb_document_tables(),
        },
        "files": {
            "attachments": self._collect_attachment_entries(main_data),
            "kb_media": self._collect_kb_media_entries(),
        },
        "directories": list(dir_stats.keys()),
        "checksums": self._checksums,
        "statistics": {
            "main_db": {t: len(rs) for t, rs in main_data.items()},
            "kb_metadata": {t: len(rs) for t, rs in kb_meta_data.items()},
            "directories": dir_stats,
        },
    }
    return manifest
```

这可以在不改变 manifest 结构的前提下，解决复杂度方面的第 (3) 点意见。

---

如果之后希望再进一步重构，下一步可以按照之前建议，将目录导出和知识库导出拆分为小的协作类。但目前上述三点已经在不产生较大变更的前提下，去掉了相当多的重复和本地复杂度。
</issue_to_address>

Sourcery 对开源项目免费使用——如果你觉得这些审查有帮助，欢迎分享 ✨

_{请帮我变得更有用！欢迎对每条评论点 👍 或 👎，我会根据反馈改进后续的审查质量。}

Original comment in English

Hey there - I've reviewed your changes and found some issues that need to be addressed.

In AstrBotImporter._import_directories, backup_version is compared as a string (e.g., if backup_version < "1.1"), which will behave incorrectly for multi-digit versions ("1.10" vs "1.2"); consider parsing it as a semantic version or numeric tuple before comparing.
In BackupDialog.vue, v-file-input typically returns an array of File objects, but startImport appends importFile.value directly to FormData; ensure you account for this by using the first element (e.g., importFile.value[0]) or configuring v-file-input with multiple=false and handling the type explicitly.
The in-memory backup_tasks and backup_progress dictionaries in BackupRoute are never cleaned up, so long-running instances may accumulate task state indefinitely; consider expiring or pruning completed/failed tasks after some time or limiting the stored history.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `AstrBotImporter._import_directories`, `backup_version` is compared as a string (e.g., `if backup_version < "1.1"`), which will behave incorrectly for multi-digit versions ("1.10" vs "1.2"); consider parsing it as a semantic version or numeric tuple before comparing.
- In `BackupDialog.vue`, `v-file-input` typically returns an array of `File` objects, but `startImport` appends `importFile.value` directly to `FormData`; ensure you account for this by using the first element (e.g., `importFile.value[0]`) or configuring `v-file-input` with `multiple=false` and handling the type explicitly.
- The in-memory `backup_tasks` and `backup_progress` dictionaries in `BackupRoute` are never cleaned up, so long-running instances may accumulate task state indefinitely; consider expiring or pruning completed/failed tasks after some time or limiting the stored history.

## Individual Comments

### Comment 1
<location> `astrbot/dashboard/routes/backup.py:273-279` </location>
<code_context>
+                if "file" not in files:
+                    return Response().error("缺少备份文件").__dict__
+
+                file = files["file"]
+                if not file.filename or not file.filename.endswith(".zip"):
+                    return Response().error("请上传 ZIP 格式的备份文件").__dict__
+
+                # 保存上传的文件
+                Path(self.backup_dir).mkdir(parents=True, exist_ok=True)
+                zip_path = os.path.join(self.backup_dir, file.filename)
+                await file.save(zip_path)
+            else:
</code_context>

<issue_to_address>
**🚨 issue (security):** Using the uploaded filename directly without sanitization risks path issues and unintended overwrites.

Here the file is written to `self.backup_dir` using the raw `file.filename`. Even with a fixed directory, client-controlled names can enable path traversal on some platforms and can overwrite existing backups. Normalize the filename (e.g., `secure_filename`-style helper) and/or generate a unique internal name (UUID/timestamp), retaining the original name only in metadata if needed for display.
</issue_to_address>

### Comment 2
<location> `astrbot/core/backup/importer.py:523-524` </location>
<code_context>
+        dir_stats: dict[str, int] = {}
+
+        # 检查备份版本是否支持目录备份
+        backup_version = manifest.get("version", "1.0")
+        if backup_version < "1.1":
+            logger.info("备份版本不支持目录备份，跳过目录导入")
+            return dir_stats
</code_context>

<issue_to_address>
**issue (bug_risk):** String comparison for manifest version is brittle and can misbehave for multi-digit versions.

Here `backup_version` is compared as a string (e.g., `"1.10" < "1.2"` is `True`), which will misbehave for multi-digit versions. Since this controls whether directory backups are imported, consider parsing the version into numeric components (e.g., `major, minor = map(int, backup_version.split('.'))`) for comparison, or delegating to a helper that does semantic version comparison.
</issue_to_address>

### Comment 3
<location> `astrbot/core/backup/exporter.py:63-70` </location>
<code_context>
+}
+
+
+# 需要备份的目录列表
+BACKUP_DIRECTORIES = {
+    "plugins": "data/plugins",  # 插件本体
+    "plugin_data": "data/plugin_data",  # 插件数据
+    "config": "data/config",  # 配置目录
+    "t2i_templates": "data/t2i_templates",  # T2I 模板
+    "webchat": "data/webchat",  # WebChat 数据
+    "temp": "data/temp",  # 临时文件
+}
+
</code_context>

<issue_to_address>
**suggestion:** The BACKUP_DIRECTORIES mapping is duplicated in both exporter and importer; consider centralizing it.

Duplicating this mapping in both files risks them drifting out of sync (e.g., new directories exported but not imported). Exposing a single constant (from a shared module or one of the files) and reusing it in both places would keep behavior aligned.
</issue_to_address>

### Comment 4
<location> `astrbot/dashboard/routes/backup.py:420-424` </location>
<code_context>
+            if not os.path.exists(file_path):
+                return Response().error("备份文件不存在").__dict__
+
+            return await send_file(
+                file_path,
+                as_attachment=True,
+                attachment_filename=filename,
+            )
+        except Exception as e:
</code_context>

<issue_to_address>
**suggestion:** The `attachment_filename` argument to send_file is deprecated in modern Werkzeug/Quart versions.

To avoid deprecation warnings and stay compatible with newer Werkzeug/Quart releases, please switch this to `download_name=filename` while keeping `as_attachment=True`. `attachment_filename` is deprecated and may be removed in future versions.

```suggestion
            return await send_file(
                file_path,
                as_attachment=True,
                download_name=filename,
            )
```
</issue_to_address>

### Comment 5
<location> `astrbot/core/backup/importer.py:43` </location>
<code_context>
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+    "platform_stats": PlatformStat,
+    "conversations": ConversationV2,
</code_context>

<issue_to_address>
**issue (complexity):** Consider extracting shared mappings/utilities and splitting the large import_all flow into smaller helpers to reduce duplication and structural complexity while preserving behavior.

You can keep all functionality and significantly reduce structural complexity with a few small, focused refactors:

---

### 1. Deduplicate mappings/constants with exporter

`MAIN_DB_MODELS`, `KB_METADATA_MODELS`, and `BACKUP_DIRECTORIES` are pure data and are duplicated with the exporter. Extract them to a shared module and import from both sides:

```python
# astrbot/core/backup/common.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
    Attachment,
    CommandConfig,
    CommandConflict,
    ConversationV2,
    Persona,
    PlatformMessageHistory,
    PlatformSession,
    PlatformStat,
    Preference,
)
from astrbot.core.knowledge_base.models import (
    KBDocument,
    KBMedia,
    KnowledgeBase,
)

MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
    "platform_stats": PlatformStat,
    "conversations": ConversationV2,
    "personas": Persona,
    "preferences": Preference,
    "platform_message_history": PlatformMessageHistory,
    "platform_sessions": PlatformSession,
    "attachments": Attachment,
    "command_configs": CommandConfig,
    "command_conflicts": CommandConflict,
}

KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
    "knowledge_bases": KnowledgeBase,
    "kb_documents": KBDocument,
    "kb_media": KBMedia,
}

BACKUP_DIRECTORIES = {
    "plugins": "data/plugins",
    "plugin_data": "data/plugin_data",
    "config": "data/config",
    "t2i_templates": "data/t2i_templates",
    "webchat": "data/webchat",
    "temp": "data/temp",
}
```

Then in the importer:

```python
# from astrbot.core.db.po import ...
# from astrbot.core.knowledge_base.models import ...
from astrbot.core.backup.common import (
    MAIN_DB_MODELS,
    KB_METADATA_MODELS,
    BACKUP_DIRECTORIES,
)
```

This removes duplication and makes schema/backup changes easier to reason about.

---

### 2. Split `import_all` orchestration into smaller private methods

`import_all` currently wires many concerns together in one long method. You can preserve behavior while making it much easier to follow and test by extracting “step” methods:

```python
# inside AstrBotImporter

async def import_all(self, zip_path: str, mode: str = "replace", progress_callback: Any | None = None) -> ImportResult:
    result = ImportResult()

    if not os.path.exists(zip_path):
        result.add_error(f"备份文件不存在: {zip_path}")
        return result

    logger.info(f"开始从 {zip_path} 导入备份")

    try:
        with zipfile.ZipFile(zip_path, "r") as zf:
            manifest = await self._load_and_validate_manifest(zf, result, progress_callback)
            if not result.success:
                return result

            main_data = await self._import_main_db_from_zip(zf, mode, result, progress_callback)
            await self._import_kb_from_zip(zf, manifest, main_data, result, mode, progress_callback)
            await self._import_config_from_zip(zf, result, progress_callback)
            await self._import_attachments_from_zip(zf, main_data, result, progress_callback)
            await self._import_directories_from_zip(zf, manifest, result, progress_callback)

        logger.info(f"备份导入完成: {result.to_dict()}")
        return result
    except zipfile.BadZipFile:
        result.add_error("无效的 ZIP 文件")
        return result
    except Exception as e:
        result.add_error(f"导入失败: {e}")
        return result
```

With small helpers (showing only signatures + key lines):

```python
async def _load_and_validate_manifest(self, zf, result, progress_callback) -> dict:
    if progress_callback:
        await progress_callback("validate", 0, 100, "正在验证备份文件...")
    try:
        manifest_data = zf.read("manifest.json")
        manifest = json.loads(manifest_data)
        self._validate_version(manifest)
    except (KeyError, json.JSONDecodeError, ValueError) as e:
        result.add_error(f"manifest 错误: {e}")
        return {}
    if progress_callback:
        await progress_callback("validate", 100, 100, "验证完成")
    return manifest

async def _import_main_db_from_zip(self, zf, mode, result, progress_callback) -> dict:
    if progress_callback:
        await progress_callback("main_db", 0, 100, "正在导入主数据库...")
    main_data = json.loads(zf.read("databases/main_db.json"))
    if mode == "replace":
        await self._clear_main_db()
    imported = await self._import_main_database(main_data)
    result.imported_tables.update(imported)
    if progress_callback:
        await progress_callback("main_db", 100, 100, "主数据库导入完成")
    return main_data
```

The other steps (`_import_kb_from_zip`, `_import_config_from_zip`, etc.) can similarly just wrap existing internal calls and move orchestration boilerplate out of `import_all`.

---

### 3. Extract datetime conversion to a reusable utility

`_convert_datetime_fields` mixes SQLAlchemy reflection and parsing. Moving it to a module-level helper (or shared backup utility) makes the class leaner and allows reuse by exporter/tests:

```python
# astrbot/core/backup/datetime_utils.py
from datetime import datetime
from typing import Any
from sqlalchemy import inspect as sa_inspect, DateTime

def convert_datetime_fields(row: dict[str, Any], model_class: type) -> dict[str, Any]:
    result = row.copy()
    try:
        mapper = sa_inspect(model_class)
        for column in mapper.columns:
            if column.name in result and result[column.name] is not None:
                if isinstance(column.type, DateTime) and isinstance(result[column.name], str):
                    result[column.name] = datetime.fromisoformat(result[column.name])
    except Exception:
        pass
    return result
```

Then in the importer methods:

```python
from astrbot.core.backup.datetime_utils import convert_datetime_fields

# ...
row = convert_datetime_fields(row, model_class)
```

---

### 4. Avoid string-based version comparison in `_import_directories`

The `backup_version < "1.1"` string comparison is fragile for future versions. Switching to `packaging.version` keeps logic robust:

```python
from packaging.version import Version

# inside _import_directories
backup_version_str = manifest.get("version", "1.0")
backup_version = Version(backup_version_str)

if backup_version < Version("1.1"):
    logger.info("备份版本不支持目录备份，跳过目录导入")
    return dir_stats
```

This is a minimal change that keeps behavior the same now but avoids subtle future issues (e.g., "1.10" vs "1.2").

---

These changes keep all existing behavior but make the importer easier to read, test, and evolve, and they directly address the flagged structural/duplication issues without a large refactor.
</issue_to_address>

### Comment 6
<location> `astrbot/core/backup/exporter.py:42` </location>
<code_context>
+    from astrbot.core.knowledge_base.kb_mgr import KnowledgeBaseManager
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+    "platform_stats": PlatformStat,
</code_context>

<issue_to_address>
**issue (complexity):** Consider extracting the shared constants, serialization logic, and manifest-assembly helpers into reusable modules to make the exporter/importer code easier to read and maintain without changing behavior.

You can reduce the cognitive load here with a few targeted extractions that don’t change behavior.

### 1. Share constants between importer/exporter

`MAIN_DB_MODELS`, `KB_METADATA_MODELS` and `BACKUP_DIRECTORIES` are pure data and already identical to the importer’s copies. Moving them into a shared module removes duplication without changing behavior.

```python
# astrbot/core/backup/constants.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
    Attachment, CommandConfig, CommandConflict, ConversationV2,
    Persona, PlatformMessageHistory, PlatformSession, PlatformStat, Preference,
)
from astrbot.core.knowledge_base.models import KBDocument, KBMedia, KnowledgeBase

MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
    "platform_stats": PlatformStat,
    "conversations": ConversationV2,
    "personas": Persona,
    "preferences": Preference,
    "platform_message_history": PlatformMessageHistory,
    "platform_sessions": PlatformSession,
    "attachments": Attachment,
    "command_configs": CommandConfig,
    "command_conflicts": CommandConflict,
}

KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
    "knowledge_bases": KnowledgeBase,
    "kb_documents": KBDocument,
    "kb_media": KBMedia,
}

BACKUP_DIRECTORIES = {
    "plugins": "data/plugins",
    "plugin_data": "data/plugin_data",
    "config": "data/config",
    "t2i_templates": "data/t2i_templates",
    "webchat": "data/webchat",
    "temp": "data/temp",
}
```

Then in both importer/exporter:

```python
from astrbot.core.backup.constants import (
    MAIN_DB_MODELS, KB_METADATA_MODELS, BACKUP_DIRECTORIES,
)
```

This directly addresses point (2) from the review and simplifies future schema/directory changes.

### 2. Extract model serialization into a shared utility

`_model_to_dict` is generic and not specific to exporting. Moving it into a shared utility removes ORM details from the exporter and allows reuse in the importer.

```python
# astrbot/core/backup/serialization.py
from datetime import datetime
from typing import Any, Dict

from sqlalchemy import inspect as sa_inspect

def model_to_dict(record: Any) -> Dict:
    # Prefer SQLModel's model_dump when available
    if hasattr(record, "model_dump"):
        data = record.model_dump(mode="python")
    else:
        data = {}
        mapper = sa_inspect(record.__class__)
        for column in mapper.columns:
            data[column.name] = getattr(record, column.name)

    # Normalize datetime to ISO strings
    for key, value in list(data.items()):
        if isinstance(value, datetime):
            data[key] = value.isoformat()
    return data
```

Exporter:

```python
# from:
# export_data[table_name] = [self._model_to_dict(record) for record in records]

from astrbot.core.backup.serialization import model_to_dict

export_data[table_name] = [model_to_dict(record) for record in records]
```

You can do the same in the importer, then delete `_model_to_dict` from the exporter entirely.

### 3. Split `_generate_manifest` into small helpers

You can keep the class monolithic for now but simplify the manifest construction by extracting the collection logic into dedicated helpers. This keeps behavior identical but makes `_generate_manifest` much easier to scan and test.

```python
def _collect_attachment_entries(self, main_data: dict[str, list[dict]]) -> list[str]:
    files: list[str] = []
    for attachment in main_data.get("attachments", []):
        attachment_id = attachment.get("attachment_id", "")
        path = attachment.get("path", "")
        if attachment_id and path:
            ext = os.path.splitext(path)[1]
            files.append(f"{attachment_id}{ext}")
    return files

def _collect_kb_media_entries(self) -> dict[str, list[str]]:
    kb_media_files: dict[str, list[str]] = {}
    if not self.kb_manager:
        return kb_media_files

    for kb_id, kb_helper in self.kb_manager.kb_insts.items():
        media_dir = kb_helper.kb_medias_dir
        if not media_dir.exists():
            continue
        files: list[str] = []
        for root, _, filenames in os.walk(media_dir):
            for f in filenames:
                files.append(f)
        if files:
            kb_media_files[kb_id] = files
    return kb_media_files

def _collect_kb_document_tables(self) -> dict[str, str]:
    if not self.kb_manager:
        return {}
    return {kb_id: "documents" for kb_id in self.kb_manager.kb_insts.keys()}
```

Then `_generate_manifest` becomes mostly data assembly:

```python
def _generate_manifest(
    self,
    main_data: dict[str, list[dict]],
    kb_meta_data: dict[str, list[dict]],
    dir_stats: dict[str, dict[str, int]] | None = None,
) -> dict:
    if dir_stats is None:
        dir_stats = {}

    manifest = {
        "version": "1.1",
        "astrbot_version": VERSION,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "schema_version": {"main_db": "v4", "kb_db": "v1"},
        "tables": {
            "main_db": list(main_data.keys()),
            "kb_metadata": list(kb_meta_data.keys()),
            "kb_documents": self._collect_kb_document_tables(),
        },
        "files": {
            "attachments": self._collect_attachment_entries(main_data),
            "kb_media": self._collect_kb_media_entries(),
        },
        "directories": list(dir_stats.keys()),
        "checksums": self._checksums,
        "statistics": {
            "main_db": {t: len(rs) for t, rs in main_data.items()},
            "kb_metadata": {t: len(rs) for t, rs in kb_meta_data.items()},
            "directories": dir_stats,
        },
    }
    return manifest
```

This addresses complexity point (3) without changing the manifest structure.

---

If you want to go a step further later, the next low-risk refactor would be to turn the directory export and KB export into small collaborator classes (as the reviewer suggested) but the three changes above already remove a good amount of duplication and local complexity with minimal churn.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/dashboard/routes/backup.py

astrbot/core/backup/importer.py

astrbot/core/backup/exporter.py

sourcery-ai · 2025-12-18T06:56:43Z

astrbot/dashboard/routes/backup.py

+            return await send_file(
+                file_path,
+                as_attachment=True,
+                attachment_filename=filename,
+            )


suggestion: 在较新的 Werkzeug/Quart 版本中，send_file 的 attachment_filename 参数已经被弃用。

为避免弃用警告，并保持与新版本 Werkzeug/Quart 的兼容性，请将此参数替换为 download_name=filename，同时继续保留 as_attachment=True。attachment_filename 已被弃用，并可能在未来版本中移除。

Suggested change

return await send_file(

file_path,

as_attachment=True,

attachment_filename=filename,

)

return await send_file(

file_path,

as_attachment=True,

download_name=filename,

)

Original comment in English

suggestion: The attachment_filename argument to send_file is deprecated in modern Werkzeug/Quart versions.

To avoid deprecation warnings and stay compatible with newer Werkzeug/Quart releases, please switch this to download_name=filename while keeping as_attachment=True. attachment_filename is deprecated and may be removed in future versions.

Suggested change

return await send_file(

file_path,

as_attachment=True,

attachment_filename=filename,

)

return await send_file(

file_path,

as_attachment=True,

download_name=filename,

)

astrbot/core/backup/importer.py

astrbot/core/backup/exporter.py

RC-CHN added 3 commits December 18, 2025 11:10

feat: 添加数据迁移功能

fa8e202

test: 添加迁移相关测试

70cda9a

feat: 备份插件及相关持久化目录

57bacab

auto-assign bot requested review from LIghtJUNction and anka-afk December 18, 2025 06:54

sourcery-ai bot reviewed Dec 18, 2025

View reviewed changes

RC-CHN added 6 commits December 18, 2025 15:07

fix: 修复版本号比较逻辑，添加相关测试

372628a

fix: 清洗文件名，添加相关测试

ce1af61

fix: 修复安全文件名测试用例断言

0d592af

refactor: 优化代码，为备份模块提取公用常量

39af9ab

feat: 修改备份版本校验逻辑，允许强制小版本间导入

25cb864

fix: 修复备份创建时间读取，修复备份相关i18n

40e9cfe

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat: Data backup #4105

feat: Data backup #4105

Uh oh!

RC-CHN commented Dec 18, 2025 •

edited by sourcery-ai bot

Loading

Uh oh!

sourcery-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sourcery-ai bot Dec 18, 2025

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

feat: Data backup #4105

Are you sure you want to change the base?

feat: Data backup #4105

Uh oh!

Conversation

RC-CHN commented Dec 18, 2025 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Modifications / 改动点

Screenshots or Test Results / 运行截图或测试结果

Checklist / 检查清单

Summary by Sourcery

Summary by Sourcery

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sourcery-ai bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

RC-CHN commented Dec 18, 2025 •

edited by sourcery-ai bot

Loading