-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat: Data backup #4105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: Data backup #4105
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
你好,我已经查看了你的改动,有一些问题需要处理。
- 在
AstrBotImporter._import_directories中,backup_version目前以字符串形式比较(例如if backup_version < "1.1"),对于多位数版本(如 "1.10" 和 "1.2")会出现错误行为;建议在比较前先将其解析为语义化版本或数字元组。 - 在
BackupDialog.vue中,v-file-input通常会返回一个File对象数组,但startImport直接将importFile.value追加到FormData;请确保正确处理:可以使用数组的第一个元素(例如importFile.value[0]),或者将v-file-input配置为multiple=false并显式处理其类型。 BackupRoute中内存里的backup_tasks和backup_progress字典从未被清理,因此长时间运行的实例可能会无限积累任务状态;建议为已完成/失败的任务设置过期或定期清理,或限制保留的历史数量。
给 AI Agent 的提示词
请根据以下代码审查中的评论进行修改:
## 总体评论
- 在 `AstrBotImporter._import_directories` 中,`backup_version` 目前以字符串形式比较(例如 `if backup_version < "1.1"`),对于多位数版本(如 "1.10" 和 "1.2")会出现错误行为;建议在比较前先将其解析为语义化版本或数字元组。
- 在 `BackupDialog.vue` 中,`v-file-input` 通常会返回一个 `File` 对象数组,但 `startImport` 直接将 `importFile.value` 追加到 `FormData`;请确保正确处理:可以使用数组的第一个元素(例如 `importFile.value[0]`),或者将 `v-file-input` 配置为 `multiple=false` 并显式处理其类型。
- `BackupRoute` 中内存里的 `backup_tasks` 和 `backup_progress` 字典从未被清理,因此长时间运行的实例可能会无限积累任务状态;建议为已完成/失败的任务设置过期或定期清理,或限制保留的历史数量。
## 逐条评论
### 评论 1
<location> `astrbot/dashboard/routes/backup.py:273-279` </location>
<code_context>
+ if "file" not in files:
+ return Response().error("缺少备份文件").__dict__
+
+ file = files["file"]
+ if not file.filename or not file.filename.endswith(".zip"):
+ return Response().error("请上传 ZIP 格式的备份文件").__dict__
+
+ # 保存上传的文件
+ Path(self.backup_dir).mkdir(parents=True, exist_ok=True)
+ zip_path = os.path.join(self.backup_dir, file.filename)
+ await file.save(zip_path)
+ else:
</code_context>
<issue_to_address>
**🚨 issue (security):** 直接使用上传文件名而不做清洗,会带来路径问题和意外覆盖的风险。
这里将文件写入 `self.backup_dir` 时使用了未经处理的 `file.filename`。即使目录是固定的,由客户端控制的文件名在某些平台上仍可能导致路径遍历问题,并可能覆盖已有备份。建议对文件名进行规范化处理(例如使用类似 `secure_filename` 的工具方法),并/或生成唯一的内部文件名(UUID/时间戳),如有需要,仅在元数据中保留原始文件名用于展示。
</issue_to_address>
### 评论 2
<location> `astrbot/core/backup/importer.py:523-524` </location>
<code_context>
+ dir_stats: dict[str, int] = {}
+
+ # 检查备份版本是否支持目录备份
+ backup_version = manifest.get("version", "1.0")
+ if backup_version < "1.1":
+ logger.info("备份版本不支持目录备份,跳过目录导入")
+ return dir_stats
</code_context>
<issue_to_address>
**issue (bug_risk):** 使用字符串比较 manifest 版本过于脆弱,在多位数版本号时会出现错误行为。
这里对 `backup_version` 使用字符串比较(例如 `"1.10" < "1.2"` 为 `True`),在多位数版本时会产生错误行为。由于该条件决定是否导入目录备份,建议将版本解析为数值组件(例如 `major, minor = map(int, backup_version.split('.'))`)再进行比较,或者使用专门的语义化版本比较工具方法来处理。
</issue_to_address>
### 评论 3
<location> `astrbot/core/backup/exporter.py:63-70` </location>
<code_context>
+}
+
+
+# 需要备份的目录列表
+BACKUP_DIRECTORIES = {
+ "plugins": "data/plugins", # 插件本体
+ "plugin_data": "data/plugin_data", # 插件数据
+ "config": "data/config", # 配置目录
+ "t2i_templates": "data/t2i_templates", # T2I 模板
+ "webchat": "data/webchat", # WebChat 数据
+ "temp": "data/temp", # 临时文件
+}
+
</code_context>
<issue_to_address>
**suggestion:** `BACKUP_DIRECTORIES` 映射在导出端和导入端都有一份重复定义,建议进行集中管理。
在两个文件中复制这份映射,存在两者配置逐渐不一致的风险(例如新增了导出的目录但忘记在导入端添加)。建议只在一个共享模块(或其中一个文件)中暴露统一常量,并在两边复用,以保持行为一致。
</issue_to_address>
### 评论 4
<location> `astrbot/dashboard/routes/backup.py:420-424` </location>
<code_context>
+ if not os.path.exists(file_path):
+ return Response().error("备份文件不存在").__dict__
+
+ return await send_file(
+ file_path,
+ as_attachment=True,
+ attachment_filename=filename,
+ )
+ except Exception as e:
</code_context>
<issue_to_address>
**suggestion:** 在较新的 Werkzeug/Quart 版本中,`send_file` 的 `attachment_filename` 参数已经被弃用。
为避免弃用警告,并保持与新版本 Werkzeug/Quart 的兼容性,请将此参数替换为 `download_name=filename`,同时继续保留 `as_attachment=True`。`attachment_filename` 已被弃用,并可能在未来版本中移除。
```suggestion
return await send_file(
file_path,
as_attachment=True,
download_name=filename,
)
```
</issue_to_address>
### 评论 5
<location> `astrbot/core/backup/importer.py:43` </location>
<code_context>
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+ "platform_stats": PlatformStat,
+ "conversations": ConversationV2,
</code_context>
<issue_to_address>
**issue (complexity):** 建议抽取共享映射/工具方法,并将较大的 `import_all` 流程拆分成更小的辅助函数,以在保持行为不变的前提下降低重复和结构复杂度。
可以在不改变任何功能的前提下,通过几个小而集中的重构显著降低结构复杂度:
---
### 1. 和导出逻辑去重映射/常量
`MAIN_DB_MODELS`、`KB_METADATA_MODELS` 和 `BACKUP_DIRECTORIES` 都是纯数据,并且和导出端重复。可以将它们提取到一个共享模块中,然后在导入/导出两侧引用:
```python
# astrbot/core/backup/common.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
Attachment,
CommandConfig,
CommandConflict,
ConversationV2,
Persona,
PlatformMessageHistory,
PlatformSession,
PlatformStat,
Preference,
)
from astrbot.core.knowledge_base.models import (
KBDocument,
KBMedia,
KnowledgeBase,
)
MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
"platform_stats": PlatformStat,
"conversations": ConversationV2,
"personas": Persona,
"preferences": Preference,
"platform_message_history": PlatformMessageHistory,
"platform_sessions": PlatformSession,
"attachments": Attachment,
"command_configs": CommandConfig,
"command_conflicts": CommandConflict,
}
KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
"knowledge_bases": KnowledgeBase,
"kb_documents": KBDocument,
"kb_media": KBMedia,
}
BACKUP_DIRECTORIES = {
"plugins": "data/plugins",
"plugin_data": "data/plugin_data",
"config": "data/config",
"t2i_templates": "data/t2i_templates",
"webchat": "data/webchat",
"temp": "data/temp",
}
```
然后在 importer 中:
```python
# from astrbot.core.db.po import ...
# from astrbot.core.knowledge_base.models import ...
from astrbot.core.backup.common import (
MAIN_DB_MODELS,
KB_METADATA_MODELS,
BACKUP_DIRECTORIES,
)
```
这样可以去除重复,并让 schema/备份策略的修改更易于理解和维护。
---
### 2. 将 `import_all` 的编排逻辑拆分为更小的私有方法
`import_all` 目前在一个长方法中串联了很多职责。你可以通过拆分为多个“步骤”方法,在保持行为一致的情况下,让代码更易读、更易测试:
```python
# inside AstrBotImporter
async def import_all(self, zip_path: str, mode: str = "replace", progress_callback: Any | None = None) -> ImportResult:
result = ImportResult()
if not os.path.exists(zip_path):
result.add_error(f"备份文件不存在: {zip_path}")
return result
logger.info(f"开始从 {zip_path} 导入备份")
try:
with zipfile.ZipFile(zip_path, "r") as zf:
manifest = await self._load_and_validate_manifest(zf, result, progress_callback)
if not result.success:
return result
main_data = await self._import_main_db_from_zip(zf, mode, result, progress_callback)
await self._import_kb_from_zip(zf, manifest, main_data, result, mode, progress_callback)
await self._import_config_from_zip(zf, result, progress_callback)
await self._import_attachments_from_zip(zf, main_data, result, progress_callback)
await self._import_directories_from_zip(zf, manifest, result, progress_callback)
logger.info(f"备份导入完成: {result.to_dict()}")
return result
except zipfile.BadZipFile:
result.add_error("无效的 ZIP 文件")
return result
except Exception as e:
result.add_error(f"导入失败: {e}")
return result
```
配合一些小的辅助方法(这里只展示签名和关键行):
```python
async def _load_and_validate_manifest(self, zf, result, progress_callback) -> dict:
if progress_callback:
await progress_callback("validate", 0, 100, "正在验证备份文件...")
try:
manifest_data = zf.read("manifest.json")
manifest = json.loads(manifest_data)
self._validate_version(manifest)
except (KeyError, json.JSONDecodeError, ValueError) as e:
result.add_error(f"manifest 错误: {e}")
return {}
if progress_callback:
await progress_callback("validate", 100, 100, "验证完成")
return manifest
async def _import_main_db_from_zip(self, zf, mode, result, progress_callback) -> dict:
if progress_callback:
await progress_callback("main_db", 0, 100, "正在导入主数据库...")
main_data = json.loads(zf.read("databases/main_db.json"))
if mode == "replace":
await self._clear_main_db()
imported = await self._import_main_database(main_data)
result.imported_tables.update(imported)
if progress_callback:
await progress_callback("main_db", 100, 100, "主数据库导入完成")
return main_data
```
其它步骤(`_import_kb_from_zip`、`_import_config_from_zip` 等)也可以用类似方式封装现有内部调用,将编排样板代码从 `import_all` 中移出。
---
### 3. 将日期时间转换逻辑提取为可复用工具
`_convert_datetime_fields` 同时处理 SQLAlchemy 反射和时间解析。将其移动到模块级辅助函数(或共享备份工具)会让类更精简,并允许导出端/测试复用:
```python
# astrbot/core/backup/datetime_utils.py
from datetime import datetime
from typing import Any
from sqlalchemy import inspect as sa_inspect, DateTime
def convert_datetime_fields(row: dict[str, Any], model_class: type) -> dict[str, Any]:
result = row.copy()
try:
mapper = sa_inspect(model_class)
for column in mapper.columns:
if column.name in result and result[column.name] is not None:
if isinstance(column.type, DateTime) and isinstance(result[column.name], str):
result[column.name] = datetime.fromisoformat(result[column.name])
except Exception:
pass
return result
```
然后在 importer 方法中:
```python
from astrbot.core.backup.datetime_utils import convert_datetime_fields
# ...
row = convert_datetime_fields(row, model_class)
```
---
### 4. 在 `_import_directories` 中避免基于字符串的版本比较
`backup_version < "1.1"` 这样的字符串比较在后续版本中不够健壮。切换到 `packaging.version` 可以让逻辑更可靠:
```python
from packaging.version import Version
# inside _import_directories
backup_version_str = manifest.get("version", "1.0")
backup_version = Version(backup_version_str)
if backup_version < Version("1.1"):
logger.info("备份版本不支持目录备份,跳过目录导入")
return dir_stats
```
这是一个对现有行为影响极小的修改,但可以避免未来出现像 "1.10" 和 "1.2" 这样的细微问题。
---
这些改动在保留现有行为的前提下,使 importer 更易于阅读、测试和演进,并且直接解决了此前指出的结构/重复问题,而无需进行大规模重构。
</issue_to_address>
### 评论 6
<location> `astrbot/core/backup/exporter.py:42` </location>
<code_context>
+ from astrbot.core.knowledge_base.kb_mgr import KnowledgeBaseManager
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+ "platform_stats": PlatformStat,
</code_context>
<issue_to_address>
**issue (complexity):** 建议将共享常量、序列化逻辑和 manifest 构建辅助方法提取到可复用模块中,从而在不改变行为的前提下,让导出/导入代码更易读、更易维护。
你可以通过几个不改变行为的定向抽取,降低这里的理解成本。
### 1. 在导入/导出间共享常量
`MAIN_DB_MODELS`、`KB_METADATA_MODELS` 和 `BACKUP_DIRECTORIES` 都是纯数据,并且已经与 importer 中的定义完全相同。将它们提取到共享模块可以消除重复,又不会改变行为。
```python
# astrbot/core/backup/constants.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
Attachment, CommandConfig, CommandConflict, ConversationV2,
Persona, PlatformMessageHistory, PlatformSession, PlatformStat, Preference,
)
from astrbot.core.knowledge_base.models import KBDocument, KBMedia, KnowledgeBase
MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
"platform_stats": PlatformStat,
"conversations": ConversationV2,
"personas": Persona,
"preferences": Preference,
"platform_message_history": PlatformMessageHistory,
"platform_sessions": PlatformSession,
"attachments": Attachment,
"command_configs": CommandConfig,
"command_conflicts": CommandConflict,
}
KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
"knowledge_bases": KnowledgeBase,
"kb_documents": KBDocument,
"kb_media": KBMedia,
}
BACKUP_DIRECTORIES = {
"plugins": "data/plugins",
"plugin_data": "data/plugin_data",
"config": "data/config",
"t2i_templates": "data/t2i_templates",
"webchat": "data/webchat",
"temp": "data/temp",
}
```
然后在 importer/exporter 中统一:
```python
from astrbot.core.backup.constants import (
MAIN_DB_MODELS, KB_METADATA_MODELS, BACKUP_DIRECTORIES,
)
```
这可以直接解决审查意见中的第 (2) 点,并简化未来的 schema/目录调整。
### 2. 将模型序列化提取为共享工具
`_model_to_dict` 是通用逻辑,并不特定于导出流程。将其移动到共享工具模块,可以将 ORM 细节从 exporter 中抽离,并允许 importer 复用:
```python
# astrbot/core/backup/serialization.py
from datetime import datetime
from typing import Any, Dict
from sqlalchemy import inspect as sa_inspect
def model_to_dict(record: Any) -> Dict:
# 优先使用 SQLModel 自带的 model_dump
if hasattr(record, "model_dump"):
data = record.model_dump(mode="python")
else:
data = {}
mapper = sa_inspect(record.__class__)
for column in mapper.columns:
data[column.name] = getattr(record, column.name)
# 将 datetime 统一转换为 ISO 字符串
for key, value in list(data.items()):
if isinstance(value, datetime):
data[key] = value.isoformat()
return data
```
在 exporter 中:
```python
# 原来:
# export_data[table_name] = [self._model_to_dict(record) for record in records]
from astrbot.core.backup.serialization import model_to_dict
export_data[table_name] = [model_to_dict(record) for record in records]
```
在 importer 中也可以同样使用,然后就可以完全删除 exporter 里的 `_model_to_dict`。
### 3. 将 `_generate_manifest` 拆分为更小的辅助方法
在保持类整体结构不变的前提下,可以通过提取收集逻辑为专用助手,使 manifest 构建代码更易于理解和测试,同时保持行为完全一致。
```python
def _collect_attachment_entries(self, main_data: dict[str, list[dict]]) -> list[str]:
files: list[str] = []
for attachment in main_data.get("attachments", []):
attachment_id = attachment.get("attachment_id", "")
path = attachment.get("path", "")
if attachment_id and path:
ext = os.path.splitext(path)[1]
files.append(f"{attachment_id}{ext}")
return files
def _collect_kb_media_entries(self) -> dict[str, list[str]]:
kb_media_files: dict[str, list[str]] = {}
if not self.kb_manager:
return kb_media_files
for kb_id, kb_helper in self.kb_manager.kb_insts.items():
media_dir = kb_helper.kb_medias_dir
if not media_dir.exists():
continue
files: list[str] = []
for root, _, filenames in os.walk(media_dir):
for f in filenames:
files.append(f)
if files:
kb_media_files[kb_id] = files
return kb_media_files
def _collect_kb_document_tables(self) -> dict[str, str]:
if not self.kb_manager:
return {}
return {kb_id: "documents" for kb_id in self.kb_manager.kb_insts.keys()}
```
然后 `_generate_manifest` 主要是数据组装:
```python
def _generate_manifest(
self,
main_data: dict[str, list[dict]],
kb_meta_data: dict[str, list[dict]],
dir_stats: dict[str, dict[str, int]] | None = None,
) -> dict:
if dir_stats is None:
dir_stats = {}
manifest = {
"version": "1.1",
"astrbot_version": VERSION,
"exported_at": datetime.now(timezone.utc).isoformat(),
"schema_version": {"main_db": "v4", "kb_db": "v1"},
"tables": {
"main_db": list(main_data.keys()),
"kb_metadata": list(kb_meta_data.keys()),
"kb_documents": self._collect_kb_document_tables(),
},
"files": {
"attachments": self._collect_attachment_entries(main_data),
"kb_media": self._collect_kb_media_entries(),
},
"directories": list(dir_stats.keys()),
"checksums": self._checksums,
"statistics": {
"main_db": {t: len(rs) for t, rs in main_data.items()},
"kb_metadata": {t: len(rs) for t, rs in kb_meta_data.items()},
"directories": dir_stats,
},
}
return manifest
```
这可以在不改变 manifest 结构的前提下,解决复杂度方面的第 (3) 点意见。
---
如果之后希望再进一步重构,下一步可以按照之前建议,将目录导出和知识库导出拆分为小的协作类。但目前上述三点已经在不产生较大变更的前提下,去掉了相当多的重复和本地复杂度。
</issue_to_address>请帮我变得更有用!欢迎对每条评论点 👍 或 👎,我会根据反馈改进后续的审查质量。
Original comment in English
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
AstrBotImporter._import_directories,backup_versionis compared as a string (e.g.,if backup_version < "1.1"), which will behave incorrectly for multi-digit versions ("1.10" vs "1.2"); consider parsing it as a semantic version or numeric tuple before comparing. - In
BackupDialog.vue,v-file-inputtypically returns an array ofFileobjects, butstartImportappendsimportFile.valuedirectly toFormData; ensure you account for this by using the first element (e.g.,importFile.value[0]) or configuringv-file-inputwithmultiple=falseand handling the type explicitly. - The in-memory
backup_tasksandbackup_progressdictionaries inBackupRouteare never cleaned up, so long-running instances may accumulate task state indefinitely; consider expiring or pruning completed/failed tasks after some time or limiting the stored history.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `AstrBotImporter._import_directories`, `backup_version` is compared as a string (e.g., `if backup_version < "1.1"`), which will behave incorrectly for multi-digit versions ("1.10" vs "1.2"); consider parsing it as a semantic version or numeric tuple before comparing.
- In `BackupDialog.vue`, `v-file-input` typically returns an array of `File` objects, but `startImport` appends `importFile.value` directly to `FormData`; ensure you account for this by using the first element (e.g., `importFile.value[0]`) or configuring `v-file-input` with `multiple=false` and handling the type explicitly.
- The in-memory `backup_tasks` and `backup_progress` dictionaries in `BackupRoute` are never cleaned up, so long-running instances may accumulate task state indefinitely; consider expiring or pruning completed/failed tasks after some time or limiting the stored history.
## Individual Comments
### Comment 1
<location> `astrbot/dashboard/routes/backup.py:273-279` </location>
<code_context>
+ if "file" not in files:
+ return Response().error("缺少备份文件").__dict__
+
+ file = files["file"]
+ if not file.filename or not file.filename.endswith(".zip"):
+ return Response().error("请上传 ZIP 格式的备份文件").__dict__
+
+ # 保存上传的文件
+ Path(self.backup_dir).mkdir(parents=True, exist_ok=True)
+ zip_path = os.path.join(self.backup_dir, file.filename)
+ await file.save(zip_path)
+ else:
</code_context>
<issue_to_address>
**🚨 issue (security):** Using the uploaded filename directly without sanitization risks path issues and unintended overwrites.
Here the file is written to `self.backup_dir` using the raw `file.filename`. Even with a fixed directory, client-controlled names can enable path traversal on some platforms and can overwrite existing backups. Normalize the filename (e.g., `secure_filename`-style helper) and/or generate a unique internal name (UUID/timestamp), retaining the original name only in metadata if needed for display.
</issue_to_address>
### Comment 2
<location> `astrbot/core/backup/importer.py:523-524` </location>
<code_context>
+ dir_stats: dict[str, int] = {}
+
+ # 检查备份版本是否支持目录备份
+ backup_version = manifest.get("version", "1.0")
+ if backup_version < "1.1":
+ logger.info("备份版本不支持目录备份,跳过目录导入")
+ return dir_stats
</code_context>
<issue_to_address>
**issue (bug_risk):** String comparison for manifest version is brittle and can misbehave for multi-digit versions.
Here `backup_version` is compared as a string (e.g., `"1.10" < "1.2"` is `True`), which will misbehave for multi-digit versions. Since this controls whether directory backups are imported, consider parsing the version into numeric components (e.g., `major, minor = map(int, backup_version.split('.'))`) for comparison, or delegating to a helper that does semantic version comparison.
</issue_to_address>
### Comment 3
<location> `astrbot/core/backup/exporter.py:63-70` </location>
<code_context>
+}
+
+
+# 需要备份的目录列表
+BACKUP_DIRECTORIES = {
+ "plugins": "data/plugins", # 插件本体
+ "plugin_data": "data/plugin_data", # 插件数据
+ "config": "data/config", # 配置目录
+ "t2i_templates": "data/t2i_templates", # T2I 模板
+ "webchat": "data/webchat", # WebChat 数据
+ "temp": "data/temp", # 临时文件
+}
+
</code_context>
<issue_to_address>
**suggestion:** The BACKUP_DIRECTORIES mapping is duplicated in both exporter and importer; consider centralizing it.
Duplicating this mapping in both files risks them drifting out of sync (e.g., new directories exported but not imported). Exposing a single constant (from a shared module or one of the files) and reusing it in both places would keep behavior aligned.
</issue_to_address>
### Comment 4
<location> `astrbot/dashboard/routes/backup.py:420-424` </location>
<code_context>
+ if not os.path.exists(file_path):
+ return Response().error("备份文件不存在").__dict__
+
+ return await send_file(
+ file_path,
+ as_attachment=True,
+ attachment_filename=filename,
+ )
+ except Exception as e:
</code_context>
<issue_to_address>
**suggestion:** The `attachment_filename` argument to send_file is deprecated in modern Werkzeug/Quart versions.
To avoid deprecation warnings and stay compatible with newer Werkzeug/Quart releases, please switch this to `download_name=filename` while keeping `as_attachment=True`. `attachment_filename` is deprecated and may be removed in future versions.
```suggestion
return await send_file(
file_path,
as_attachment=True,
download_name=filename,
)
```
</issue_to_address>
### Comment 5
<location> `astrbot/core/backup/importer.py:43` </location>
<code_context>
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+ "platform_stats": PlatformStat,
+ "conversations": ConversationV2,
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting shared mappings/utilities and splitting the large import_all flow into smaller helpers to reduce duplication and structural complexity while preserving behavior.
You can keep all functionality and significantly reduce structural complexity with a few small, focused refactors:
---
### 1. Deduplicate mappings/constants with exporter
`MAIN_DB_MODELS`, `KB_METADATA_MODELS`, and `BACKUP_DIRECTORIES` are pure data and are duplicated with the exporter. Extract them to a shared module and import from both sides:
```python
# astrbot/core/backup/common.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
Attachment,
CommandConfig,
CommandConflict,
ConversationV2,
Persona,
PlatformMessageHistory,
PlatformSession,
PlatformStat,
Preference,
)
from astrbot.core.knowledge_base.models import (
KBDocument,
KBMedia,
KnowledgeBase,
)
MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
"platform_stats": PlatformStat,
"conversations": ConversationV2,
"personas": Persona,
"preferences": Preference,
"platform_message_history": PlatformMessageHistory,
"platform_sessions": PlatformSession,
"attachments": Attachment,
"command_configs": CommandConfig,
"command_conflicts": CommandConflict,
}
KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
"knowledge_bases": KnowledgeBase,
"kb_documents": KBDocument,
"kb_media": KBMedia,
}
BACKUP_DIRECTORIES = {
"plugins": "data/plugins",
"plugin_data": "data/plugin_data",
"config": "data/config",
"t2i_templates": "data/t2i_templates",
"webchat": "data/webchat",
"temp": "data/temp",
}
```
Then in the importer:
```python
# from astrbot.core.db.po import ...
# from astrbot.core.knowledge_base.models import ...
from astrbot.core.backup.common import (
MAIN_DB_MODELS,
KB_METADATA_MODELS,
BACKUP_DIRECTORIES,
)
```
This removes duplication and makes schema/backup changes easier to reason about.
---
### 2. Split `import_all` orchestration into smaller private methods
`import_all` currently wires many concerns together in one long method. You can preserve behavior while making it much easier to follow and test by extracting “step” methods:
```python
# inside AstrBotImporter
async def import_all(self, zip_path: str, mode: str = "replace", progress_callback: Any | None = None) -> ImportResult:
result = ImportResult()
if not os.path.exists(zip_path):
result.add_error(f"备份文件不存在: {zip_path}")
return result
logger.info(f"开始从 {zip_path} 导入备份")
try:
with zipfile.ZipFile(zip_path, "r") as zf:
manifest = await self._load_and_validate_manifest(zf, result, progress_callback)
if not result.success:
return result
main_data = await self._import_main_db_from_zip(zf, mode, result, progress_callback)
await self._import_kb_from_zip(zf, manifest, main_data, result, mode, progress_callback)
await self._import_config_from_zip(zf, result, progress_callback)
await self._import_attachments_from_zip(zf, main_data, result, progress_callback)
await self._import_directories_from_zip(zf, manifest, result, progress_callback)
logger.info(f"备份导入完成: {result.to_dict()}")
return result
except zipfile.BadZipFile:
result.add_error("无效的 ZIP 文件")
return result
except Exception as e:
result.add_error(f"导入失败: {e}")
return result
```
With small helpers (showing only signatures + key lines):
```python
async def _load_and_validate_manifest(self, zf, result, progress_callback) -> dict:
if progress_callback:
await progress_callback("validate", 0, 100, "正在验证备份文件...")
try:
manifest_data = zf.read("manifest.json")
manifest = json.loads(manifest_data)
self._validate_version(manifest)
except (KeyError, json.JSONDecodeError, ValueError) as e:
result.add_error(f"manifest 错误: {e}")
return {}
if progress_callback:
await progress_callback("validate", 100, 100, "验证完成")
return manifest
async def _import_main_db_from_zip(self, zf, mode, result, progress_callback) -> dict:
if progress_callback:
await progress_callback("main_db", 0, 100, "正在导入主数据库...")
main_data = json.loads(zf.read("databases/main_db.json"))
if mode == "replace":
await self._clear_main_db()
imported = await self._import_main_database(main_data)
result.imported_tables.update(imported)
if progress_callback:
await progress_callback("main_db", 100, 100, "主数据库导入完成")
return main_data
```
The other steps (`_import_kb_from_zip`, `_import_config_from_zip`, etc.) can similarly just wrap existing internal calls and move orchestration boilerplate out of `import_all`.
---
### 3. Extract datetime conversion to a reusable utility
`_convert_datetime_fields` mixes SQLAlchemy reflection and parsing. Moving it to a module-level helper (or shared backup utility) makes the class leaner and allows reuse by exporter/tests:
```python
# astrbot/core/backup/datetime_utils.py
from datetime import datetime
from typing import Any
from sqlalchemy import inspect as sa_inspect, DateTime
def convert_datetime_fields(row: dict[str, Any], model_class: type) -> dict[str, Any]:
result = row.copy()
try:
mapper = sa_inspect(model_class)
for column in mapper.columns:
if column.name in result and result[column.name] is not None:
if isinstance(column.type, DateTime) and isinstance(result[column.name], str):
result[column.name] = datetime.fromisoformat(result[column.name])
except Exception:
pass
return result
```
Then in the importer methods:
```python
from astrbot.core.backup.datetime_utils import convert_datetime_fields
# ...
row = convert_datetime_fields(row, model_class)
```
---
### 4. Avoid string-based version comparison in `_import_directories`
The `backup_version < "1.1"` string comparison is fragile for future versions. Switching to `packaging.version` keeps logic robust:
```python
from packaging.version import Version
# inside _import_directories
backup_version_str = manifest.get("version", "1.0")
backup_version = Version(backup_version_str)
if backup_version < Version("1.1"):
logger.info("备份版本不支持目录备份,跳过目录导入")
return dir_stats
```
This is a minimal change that keeps behavior the same now but avoids subtle future issues (e.g., "1.10" vs "1.2").
---
These changes keep all existing behavior but make the importer easier to read, test, and evolve, and they directly address the flagged structural/duplication issues without a large refactor.
</issue_to_address>
### Comment 6
<location> `astrbot/core/backup/exporter.py:42` </location>
<code_context>
+ from astrbot.core.knowledge_base.kb_mgr import KnowledgeBaseManager
+
+
+# 主数据库模型类映射
+MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
+ "platform_stats": PlatformStat,
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the shared constants, serialization logic, and manifest-assembly helpers into reusable modules to make the exporter/importer code easier to read and maintain without changing behavior.
You can reduce the cognitive load here with a few targeted extractions that don’t change behavior.
### 1. Share constants between importer/exporter
`MAIN_DB_MODELS`, `KB_METADATA_MODELS` and `BACKUP_DIRECTORIES` are pure data and already identical to the importer’s copies. Moving them into a shared module removes duplication without changing behavior.
```python
# astrbot/core/backup/constants.py
from sqlmodel import SQLModel
from astrbot.core.db.po import (
Attachment, CommandConfig, CommandConflict, ConversationV2,
Persona, PlatformMessageHistory, PlatformSession, PlatformStat, Preference,
)
from astrbot.core.knowledge_base.models import KBDocument, KBMedia, KnowledgeBase
MAIN_DB_MODELS: dict[str, type[SQLModel]] = {
"platform_stats": PlatformStat,
"conversations": ConversationV2,
"personas": Persona,
"preferences": Preference,
"platform_message_history": PlatformMessageHistory,
"platform_sessions": PlatformSession,
"attachments": Attachment,
"command_configs": CommandConfig,
"command_conflicts": CommandConflict,
}
KB_METADATA_MODELS: dict[str, type[SQLModel]] = {
"knowledge_bases": KnowledgeBase,
"kb_documents": KBDocument,
"kb_media": KBMedia,
}
BACKUP_DIRECTORIES = {
"plugins": "data/plugins",
"plugin_data": "data/plugin_data",
"config": "data/config",
"t2i_templates": "data/t2i_templates",
"webchat": "data/webchat",
"temp": "data/temp",
}
```
Then in both importer/exporter:
```python
from astrbot.core.backup.constants import (
MAIN_DB_MODELS, KB_METADATA_MODELS, BACKUP_DIRECTORIES,
)
```
This directly addresses point (2) from the review and simplifies future schema/directory changes.
### 2. Extract model serialization into a shared utility
`_model_to_dict` is generic and not specific to exporting. Moving it into a shared utility removes ORM details from the exporter and allows reuse in the importer.
```python
# astrbot/core/backup/serialization.py
from datetime import datetime
from typing import Any, Dict
from sqlalchemy import inspect as sa_inspect
def model_to_dict(record: Any) -> Dict:
# Prefer SQLModel's model_dump when available
if hasattr(record, "model_dump"):
data = record.model_dump(mode="python")
else:
data = {}
mapper = sa_inspect(record.__class__)
for column in mapper.columns:
data[column.name] = getattr(record, column.name)
# Normalize datetime to ISO strings
for key, value in list(data.items()):
if isinstance(value, datetime):
data[key] = value.isoformat()
return data
```
Exporter:
```python
# from:
# export_data[table_name] = [self._model_to_dict(record) for record in records]
from astrbot.core.backup.serialization import model_to_dict
export_data[table_name] = [model_to_dict(record) for record in records]
```
You can do the same in the importer, then delete `_model_to_dict` from the exporter entirely.
### 3. Split `_generate_manifest` into small helpers
You can keep the class monolithic for now but simplify the manifest construction by extracting the collection logic into dedicated helpers. This keeps behavior identical but makes `_generate_manifest` much easier to scan and test.
```python
def _collect_attachment_entries(self, main_data: dict[str, list[dict]]) -> list[str]:
files: list[str] = []
for attachment in main_data.get("attachments", []):
attachment_id = attachment.get("attachment_id", "")
path = attachment.get("path", "")
if attachment_id and path:
ext = os.path.splitext(path)[1]
files.append(f"{attachment_id}{ext}")
return files
def _collect_kb_media_entries(self) -> dict[str, list[str]]:
kb_media_files: dict[str, list[str]] = {}
if not self.kb_manager:
return kb_media_files
for kb_id, kb_helper in self.kb_manager.kb_insts.items():
media_dir = kb_helper.kb_medias_dir
if not media_dir.exists():
continue
files: list[str] = []
for root, _, filenames in os.walk(media_dir):
for f in filenames:
files.append(f)
if files:
kb_media_files[kb_id] = files
return kb_media_files
def _collect_kb_document_tables(self) -> dict[str, str]:
if not self.kb_manager:
return {}
return {kb_id: "documents" for kb_id in self.kb_manager.kb_insts.keys()}
```
Then `_generate_manifest` becomes mostly data assembly:
```python
def _generate_manifest(
self,
main_data: dict[str, list[dict]],
kb_meta_data: dict[str, list[dict]],
dir_stats: dict[str, dict[str, int]] | None = None,
) -> dict:
if dir_stats is None:
dir_stats = {}
manifest = {
"version": "1.1",
"astrbot_version": VERSION,
"exported_at": datetime.now(timezone.utc).isoformat(),
"schema_version": {"main_db": "v4", "kb_db": "v1"},
"tables": {
"main_db": list(main_data.keys()),
"kb_metadata": list(kb_meta_data.keys()),
"kb_documents": self._collect_kb_document_tables(),
},
"files": {
"attachments": self._collect_attachment_entries(main_data),
"kb_media": self._collect_kb_media_entries(),
},
"directories": list(dir_stats.keys()),
"checksums": self._checksums,
"statistics": {
"main_db": {t: len(rs) for t, rs in main_data.items()},
"kb_metadata": {t: len(rs) for t, rs in kb_meta_data.items()},
"directories": dir_stats,
},
}
return manifest
```
This addresses complexity point (3) without changing the manifest structure.
---
If you want to go a step further later, the next low-risk refactor would be to turn the directory export and KB export into small collaborator classes (as the reviewer suggested) but the three changes above already remove a good amount of duplication and local complexity with minimal churn.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| return await send_file( | ||
| file_path, | ||
| as_attachment=True, | ||
| attachment_filename=filename, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: 在较新的 Werkzeug/Quart 版本中,send_file 的 attachment_filename 参数已经被弃用。
为避免弃用警告,并保持与新版本 Werkzeug/Quart 的兼容性,请将此参数替换为 download_name=filename,同时继续保留 as_attachment=True。attachment_filename 已被弃用,并可能在未来版本中移除。
| return await send_file( | |
| file_path, | |
| as_attachment=True, | |
| attachment_filename=filename, | |
| ) | |
| return await send_file( | |
| file_path, | |
| as_attachment=True, | |
| download_name=filename, | |
| ) |
Original comment in English
suggestion: The attachment_filename argument to send_file is deprecated in modern Werkzeug/Quart versions.
To avoid deprecation warnings and stay compatible with newer Werkzeug/Quart releases, please switch this to download_name=filename while keeping as_attachment=True. attachment_filename is deprecated and may be removed in future versions.
| return await send_file( | |
| file_path, | |
| as_attachment=True, | |
| attachment_filename=filename, | |
| ) | |
| return await send_file( | |
| file_path, | |
| as_attachment=True, | |
| download_name=filename, | |
| ) |
Modifications / 改动点
添加了从orm导出数据库,备份插件目录以及再导入的相关方法,在web上添加了相关页面与路由允许用户导入导出astrbot备份。
Screenshots or Test Results / 运行截图或测试结果
Checklist / 检查清单
requirements.txt和pyproject.toml文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations inrequirements.txtandpyproject.toml.Summary by Sourcery
在 AstrBot 核心和控制台中新增完整的数据备份与恢复能力,涵盖后端 API、导入/导出逻辑,以及设置中的 Web UI 对话框。
新功能:
优化改进:
测试:
Original summary in English
Summary by Sourcery
Add a full data backup and restore capability across the AstrBot core and dashboard, including backend APIs, import/export logic, and a web UI dialog in settings.
New Features:
Enhancements:
Tests: