2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -26,6 +26,6 @@ jobs:
         TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
         TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
       run: |
-        python setup.py sdist bdist_wheel
+        python3 -m build
         twine check dist/*
         twine upload dist/* --skip-existing
3 changes: 3 additions & 0 deletions .github/workflows/test.yml
@@ -37,6 +37,9 @@ jobs:
     - name: Test with pytest and coverage
       run: |
         pip install coverage
+        mkdir -p .logs
+        nohup chattool.capture-server --daemon --port 8000 &
+        sleep 1
         coverage run -m pytest -s tests/
     - name: Upload coverage to Codecov
       uses: codecov/codecov-action@v4
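
The `sleep 1` gives the capture server a moment to bind its port before pytest starts, but a fixed delay can race on slow runners. A more robust wait, sketched here as a standalone Python snippet (standard library only; port 8000 is the one passed above, the timeout value is an arbitrary choice):

```python
import socket
import time

def wait_for_port(port: int, host: str = "127.0.0.1", timeout: float = 10.0) -> None:
    """Poll until a TCP server accepts connections, or fail after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return  # the server is accepting connections
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"port {port} not ready after {timeout}s")
            time.sleep(0.1)

wait_for_port(8000)
```
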
138 changes: 31 additions & 107 deletions README.md
@@ -38,15 +38,6 @@ export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export OPENAI_API_BASE="https://api.example.com/v1"
export OPENAI_API_BASE_URL="https://api.example.com" # optional
```
-
-Or set them in code:
-
-```py
-import chattool
-chattool.api_key = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
-chattool.api_base = "https://api.example.com/v1"
-```
-
Note: the environment variable `OPENAI_API_BASE` takes precedence over `OPENAI_API_BASE_URL`; setting one of the two is enough.
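
For illustration, the precedence rule can be written out as a short sketch (`resolve_api_base` is a hypothetical helper for this README, not part of the chattool API):

```python
import os
from typing import Optional

def resolve_api_base() -> Optional[str]:
    # OPENAI_API_BASE (full base, including /v1) wins when both are set;
    # OPENAI_API_BASE_URL (bare host) is only the fallback.
    return os.environ.get("OPENAI_API_BASE") or os.environ.get("OPENAI_API_BASE_URL")
```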

### Examples
@@ -55,12 +46,12 @@

```python
# first chat
-chat = Chat("Hello, GPT-3.5!")
-resp = chat.getresponse()
+chat = Chat("Hello!")
+resp = chat.get_response()

# continue the chat
chat.user("How are you?")
-next_resp = chat.getresponse()
+next_resp = chat.get_response()

# add a response manually
chat.user("What's your name?")
@@ -76,114 +67,47 @@ chat.print_log()
Example 2, batch-process data (serially), using the checkpoint file `chat.jsonl`:

-```python
-# write the processing function
-def data2chat(msg):
-    chat = Chat()
-    chat.system("You are a skilled translator of numbers.")
-    chat.user(f"Translate this number to Roman numerals: {msg}")
-    # note: fetch the response inside the function
-    chat.getresponse()
-    return chat
-
-checkpoint = "chat.jsonl"  # name of the checkpoint file
-msgs = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
-# process the data; if the checkpoint exists, resume from the last interruption
-continue_chats = process_chats(msgs, data2chat, checkpoint)
-```
-
-Example 3, batch-process data (asynchronous, parallel): print hello in different languages, with two coroutines:
-
-```python
-from chattool import chat_completion_async, load_chats, Chat
-
-langs = ["python", "java", "Julia", "C++"]
-def data2chat(msg):
-    chat = Chat()
-    chat.user("Please print hello world using %s" % msg)
-    # note: no getresponse here; it is left to the asynchronous processing
-    return chat
-
-chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat)
-chats = load_chats("async_chat.jsonl")
-```
-
-When running in a Jupyter Notebook, due to its [special event loop](https://stackoverflow.com/questions/47518874/how-do-i-run-python-asyncio-code-in-a-jupyter-notebook), use the `await` keyword and the `wait=True` parameter:
-
-```python
-await chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat, wait=True)
-```
+```python
+# serial processing (save as you go)
+msgs = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
+results = []
+for m in msgs:
+    chat = Chat()
+    chat.system("You are a skilled translator of numbers.")
+    resp = chat.user(f"Translate this number to Roman numerals: {m}").get_response()
+    results.append(resp.content)
+    chat.save("chat.jsonl", mode="a")
+```
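
Because every chat is appended to `chat.jsonl` as it finishes, a later run can skip the work that is already done. A minimal resume sketch, assuming one saved chat per line (which is what the repeated `save(..., mode="a")` calls above suggest):

```python
msgs = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]

done = 0
try:
    with open("chat.jsonl", encoding="utf-8") as f:
        done = sum(1 for line in f if line.strip())  # one saved chat per non-empty line
except FileNotFoundError:
    pass  # no checkpoint yet: start from the beginning

remaining = msgs[done:]  # only the messages without a saved chat
```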

-### Tool calls
-
-Define the functions:
-
-```python
-def add(a: int, b: int) -> int:
-    """
-    This function adds two numbers.
-
-    Parameters:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The sum of the two numbers.
-    """
-    return a + b
-
-def mult(a: int, b: int) -> int:
-    """This function multiplies two numbers.
-    It is a useful calculator!
-
-    Args:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The product of the two numbers.
-    """
-    return a * b
-```
-
-Add the functions to the `Chat` object:
-
-```py
-from chattool import Chat
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([add, mult])
-```
-
-Run the tools automatically, deciding from the returned message whether to stop; `maxturns` defaults to 3:
-
-```py
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Use the general-purpose function `python`:
-
-```py
-from chattool.functioncall import python
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([python])
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Note that executing arbitrary code generated by the model carries potential risks.
+Example 3, asynchronous concurrency and streaming output:
+
+```python
+import asyncio
+from chattool import Chat
+
+async def run():
+    # concurrent Q&A
+    base = Chat().system("You are a helpful assistant")
+    tasks = [base.copy().user(f"Explain topic {i}").async_get_response() for i in range(2)]
+    responses = await asyncio.gather(*tasks)
+    for r in responses:
+        print(r.content)
+
+    # streaming output
+    print("Streaming: ", end="")
+    async for chunk in Chat().user("Write a short poem about spring").async_get_response_stream():
+        if chunk.delta_content:
+            print(chunk.delta_content, end="", flush=True)
+    print()
+
+asyncio.run(run())
+```
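
When fanning out many requests as in Example 3 above, it can help to cap the number of in-flight calls. A sketch with `asyncio.Semaphore` (only `async_get_response` from the example is assumed; the limit of 5 and the prompts are placeholders):

```python
import asyncio
from chattool import Chat

async def bounded_ask(sem: asyncio.Semaphore, prompt: str) -> str:
    async with sem:  # at most 5 coroutines hold the semaphore at once
        resp = await Chat().user(prompt).async_get_response()
        return resp.content

async def main():
    sem = asyncio.Semaphore(5)
    prompts = [f"Explain topic {i}" for i in range(20)]
    answers = await asyncio.gather(*(bounded_ask(sem, p) for p in prompts))
    print(f"collected {len(answers)} answers")

asyncio.run(main())
```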

## License

Open-sourced under the MIT license.

## Changelog

-- Current version `3.2.1`: simplified the serial and asynchronous processing interfaces and renamed submodules to avoid conflicts
-- Version `2.3.0`: support for calling external tools, asynchronous data processing, and model fine-tuning
-- Since version `2.0.0`, renamed to `chattool`
-- Since version `1.0.0`, asynchronous data processing is supported
-- Since version `0.6.0`, the [function call](https://platform.openai.com/docs/guides/gpt/function-calling) feature is supported
-- Since version `0.5.0`, data can be processed with `process_chats`, using a `msg2chat` function and a `checkpoint` file
-- Since version `0.4.0`, maintenance moved to the [CubeNLP](https://github.com/cubenlp) organization account
-- Since version `0.3.0`, the `openai.py` module is no longer a dependency; requests are sent directly with `requests`
-    - each `Chat` can use its own API key
-    - proxy links are supported
-- Version `0.2.0` switched to the `Chat` type as the central interaction object
+- Current version `4.1.0`: unified `Chat` API (sync/async/streaming), environment-variable defaults for configuration, improved retry and debugging tools
+- History: the `2.x`–`3.x` series incrementally improved async processing and batch usage
+- For earlier history, see the repository commit log
143 changes: 33 additions & 110 deletions README_en.md
@@ -43,29 +43,20 @@ export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export OPENAI_API_BASE="https://api.example.com/v1"
export OPENAI_API_BASE_URL="https://api.example.com" # optional
```

-Or in Python code:
-
-```py
-import chattool
-chattool.api_key = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
-chattool.api_base = "https://api.example.com/v1"
-```
-
-Note: `OPENAI_API_BASE` is prior to `OPENAI_API_BASE_URL`, and you only need to set one of them.
+Note: `OPENAI_API_BASE` takes precedence over `OPENAI_API_BASE_URL`. Set one.

## Examples

Example 1, simulate multi-turn dialogue:

```python
# first chat
-chat = Chat("Hello, GPT-3.5!")
-resp = chat.getresponse()
+chat = Chat("Hello!")
+resp = chat.get_response()

# continue the chat
chat.user("How are you?")
-next_resp = chat.getresponse()
+next_resp = chat.get_response()

# add response manually
chat.user("What's your name?")
@@ -78,117 +69,49 @@ chat.save("chat.json", mode="w") # default to "a"
chat.print_log()
```

-Example 2, process data in batch, and use a checkpoint file `checkpoint`:
+Example 2, process data in batch (serial), and append to a checkpoint file `chat.jsonl`:

-```python
-# write a function to process the data
-def msg2chat(msg):
-    chat = Chat()
-    chat.system("You are a helpful translator for numbers.")
-    chat.user(f"Please translate the digit to Roman numerals: {msg}")
-    # We need to call `getresponse` here to get the response
-    chat.getresponse()
-    return chat
-
-checkpoint = "chat.jsonl"
-msgs = ["%d" % i for i in range(1, 10)]
-# process the data in batch, if the checkpoint file exists, it will continue from the last checkpoint
-continue_chats = process_chats(msgs, msg2chat, checkpoint)
-```
-
-Example 3, process data in batch (asynchronous), print hello using different languages, and use two coroutines:
-
-```python
-from chattool import chat_completion_async, load_chats
-
-langs = ["python", "java", "Julia", "C++"]
-def data2chat(msg):
-    chat = Chat()
-    chat.user("Please print hello world using %s" % msg)
-    # Note that we don't need to call `getresponse` here, and leave it to the asynchronous processing
-    return chat
-
-chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat)
-chats = load_chats("async_chat.jsonl")
-```
-
-when using `chat_completion_async` in Jupyter notebook, you should use the `await` keyword and the `wait=True` parameter:
-
-```python
-await chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat, wait=True)
-```
+```python
+msgs = [str(i) for i in range(1, 10)]
+results = []
+for m in msgs:
+    chat = Chat()
+    chat.system("You are a helpful translator for numbers.")
+    resp = chat.user(f"Translate this digit to Roman numerals: {m}").get_response()
+    results.append(resp.content)
+    chat.save("chat.jsonl", mode="a")
+```
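
Serial batch runs can die midway on a transient network error, which is exactly when the checkpoint file pays off. A small retry wrapper is one way to ride out such errors; a sketch (the attempt count and backoff are arbitrary, and only `get_response` from the example above is assumed):

```python
import time
from chattool import Chat

def ask_with_retry(prompt: str, attempts: int = 3) -> str:
    for i in range(attempts):
        try:
            return Chat().user(prompt).get_response().content
        except Exception:
            if i == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(2 ** i)  # 1s, 2s, ... simple exponential backoff

print(ask_with_retry("Translate this digit to Roman numerals: 4"))
```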


-### Tool Call
-
-Define functions:
-
-```python
-def add(a: int, b: int) -> int:
-    """
-    This function adds two numbers.
-
-    Parameters:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The sum of the two numbers.
-    """
-    return a + b
-
-def mult(a: int, b: int) -> int:
-    """This function multiplies two numbers.
-    It is a useful calculator!
-
-    Args:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The product of the two numbers.
-    """
-    return a * b
-```
-
-Add functions to the `Chat` object:
-
-```python
-from chattool import Chat
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([add, mult])
-```
-
-Automatically execute the tool based on the return information. The default value for `maxturns` is 3:
-
-```python
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Use the general function `python`:
-
-```python
-from chattool.functioncall import python
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([python])
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Note that executing any code generated by the model has potential risks.
+Example 3, asynchronous concurrency and streaming:
+
+```python
+import asyncio
+from chattool import Chat
+
+async def run():
+    # concurrent Q&A
+    base = Chat().system("You are a helpful assistant")
+    tasks = [base.copy().user(f"Explain topic {i}").async_get_response() for i in range(2)]
+    responses = await asyncio.gather(*tasks)
+    for r in responses:
+        print(r.content)
+
+    # streaming output
+    print("Streaming: ", end="")
+    async for chunk in Chat().user("Write a short poem about spring").async_get_response_stream():
+        if chunk.delta_content:
+            print(chunk.delta_content, end="", flush=True)
+    print()
+
+asyncio.run(run())
+```
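
The streamed deltas can be accumulated into the complete reply while they are printed. A sketch reusing only `async_get_response_stream` and `delta_content` from the example above:

```python
import asyncio
from chattool import Chat

async def stream_and_collect(prompt: str) -> str:
    parts = []
    async for chunk in Chat().user(prompt).async_get_response_stream():
        if chunk.delta_content:
            print(chunk.delta_content, end="", flush=True)
            parts.append(chunk.delta_content)
    print()
    return "".join(parts)  # the full reply, reassembled from the deltas

full_reply = asyncio.run(stream_and_collect("Write a short poem about spring"))
```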

## License

This package is licensed under the MIT license. See the LICENSE file for more details.

-## update log
+## Update Log

-Current version: `2.3.0`. The features of function call, asynchronous processing, and finetuning are supported.
-
-### Beta version
-- Since version `0.2.0`, `Chat` type is used to handle data
-- Since version `0.3.0`, you can use different API Key to send requests.
-- Since version `0.4.0`, this package is mantained by [cubenlp](https://github.com/cubenlp).
-- Since version `0.5.0`, one can use `process_chats` to process the data, with a customized `msg2chat` function and a checkpoint file.
-- Since version `0.6.0`, the feature [function call](https://platform.openai.com/docs/guides/gpt/function-calling) is added.
-- Since version `1.0.0`, the feature [function call](https://platform.openai.com/docs/guides/gpt/function-calling) is removed, and the asynchronous processing tool is added.
-- Since version `2.0.0`, the package is renamed to `chattool`, and the asynchronous processing tool is improved.
+- Current version `4.1.0`: unified `Chat` API (sync/async/stream), default env-based configuration, improved retries and debugging helpers.
+- History `2.x–3.x`: iterative improvements to async and batch usage.
+- For earlier changes, please refer to the repository commits.