2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -26,6 +26,6 @@ jobs:
         TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
         TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
       run: |
-        python setup.py sdist bdist_wheel
+        python3 -m build
         twine check dist/*
         twine upload dist/* --skip-existing
3 changes: 3 additions & 0 deletions .github/workflows/test.yml
@@ -37,6 +37,9 @@ jobs:
     - name: Test with pytest and coverage
       run: |
         pip install coverage
+        mkdir -p .logs
+        nohup chattool.capture-server --daemon --port 8000 &
+        sleep 1
         coverage run -m pytest -s tests/
     - name: Upload coverage to Codecov
       uses: codecov/codecov-action@v4
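
The `sleep 1` gives the capture server a moment to bind its port before pytest starts, but a fixed delay can race on slow runners. A more robust wait, sketched here as a standalone Python snippet (standard library only; port 8000 is the one passed above, the timeout value is an arbitrary choice):

```python
import socket
import time

def wait_for_port(port: int, host: str = "127.0.0.1", timeout: float = 10.0) -> None:
    """Poll until a TCP server accepts connections, or fail after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return  # the server is accepting connections
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"port {port} not ready after {timeout}s")
            time.sleep(0.1)

wait_for_port(8000)
```
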
138 changes: 31 additions & 107 deletions README.md
@@ -38,15 +38,6 @@ export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export OPENAI_API_BASE="https://api.example.com/v1"
export OPENAI_API_BASE_URL="https://api.example.com" # optional
```
-
-Or set them in code:
-
-```py
-import chattool
-chattool.api_key = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
-chattool.api_base = "https://api.example.com/v1"
-```
-
Note: the environment variable `OPENAI_API_BASE` takes precedence over `OPENAI_API_BASE_URL`; setting one of the two is enough.
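
For illustration, the precedence rule can be written out as a short sketch (`resolve_api_base` is a hypothetical helper for this README, not part of the chattool API):

```python
import os
from typing import Optional

def resolve_api_base() -> Optional[str]:
    # OPENAI_API_BASE (full base, including /v1) wins when both are set;
    # OPENAI_API_BASE_URL (bare host) is only the fallback.
    return os.environ.get("OPENAI_API_BASE") or os.environ.get("OPENAI_API_BASE_URL")
```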

### Examples
@@ -55,12 +46,12 @@

```python
# first chat
-chat = Chat("Hello, GPT-3.5!")
-resp = chat.getresponse()
+chat = Chat("Hello!")
+resp = chat.get_response()

# continue the chat
chat.user("How are you?")
-next_resp = chat.getresponse()
+next_resp = chat.get_response()

# add a response manually
chat.user("What's your name?")
@@ -76,114 +67,47 @@ chat.print_log()
Example 2, batch-process data (serially), using the checkpoint file `chat.jsonl`:

-```python
-# write the processing function
-def data2chat(msg):
-    chat = Chat()
-    chat.system("You are a skilled translator of numbers.")
-    chat.user(f"Translate this number to Roman numerals: {msg}")
-    # note: fetch the response inside the function
-    chat.getresponse()
-    return chat
-
-checkpoint = "chat.jsonl"  # name of the checkpoint file
-msgs = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
-# process the data; if the checkpoint exists, resume from the last interruption
-continue_chats = process_chats(msgs, data2chat, checkpoint)
-```
-
-Example 3, batch-process data (asynchronous, parallel): print hello in different languages, with two coroutines:
-
-```python
-from chattool import chat_completion_async, load_chats, Chat
-
-langs = ["python", "java", "Julia", "C++"]
-def data2chat(msg):
-    chat = Chat()
-    chat.user("Please print hello world using %s" % msg)
-    # note: no getresponse here; it is left to the asynchronous processing
-    return chat
-
-chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat)
-chats = load_chats("async_chat.jsonl")
-```
-
-When running in a Jupyter Notebook, due to its [special event loop](https://stackoverflow.com/questions/47518874/how-do-i-run-python-asyncio-code-in-a-jupyter-notebook), use the `await` keyword and the `wait=True` parameter:
-
-```python
-await chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat, wait=True)
-```
+```python
+# serial processing (save as you go)
+msgs = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
+results = []
+for m in msgs:
+    chat = Chat()
+    chat.system("You are a skilled translator of numbers.")
+    resp = chat.user(f"Translate this number to Roman numerals: {m}").get_response()
+    results.append(resp.content)
+    chat.save("chat.jsonl", mode="a")
+```
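
Because every chat is appended to `chat.jsonl` as it finishes, a later run can skip the work that is already done. A minimal resume sketch, assuming one saved chat per line (which is what the repeated `save(..., mode="a")` calls above suggest):

```python
msgs = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]

done = 0
try:
    with open("chat.jsonl", encoding="utf-8") as f:
        done = sum(1 for line in f if line.strip())  # one saved chat per non-empty line
except FileNotFoundError:
    pass  # no checkpoint yet: start from the beginning

remaining = msgs[done:]  # only the messages without a saved chat
```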

-### Tool calls
-
-Define the functions:
-
-```python
-def add(a: int, b: int) -> int:
-    """
-    This function adds two numbers.
-
-    Parameters:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The sum of the two numbers.
-    """
-    return a + b
-
-def mult(a: int, b: int) -> int:
-    """This function multiplies two numbers.
-    It is a useful calculator!
-
-    Args:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The product of the two numbers.
-    """
-    return a * b
-```
-
-Add the functions to the `Chat` object:
-
-```py
-from chattool import Chat
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([add, mult])
-```
-
-Run the tools automatically, deciding from the returned message whether to stop; `maxturns` defaults to 3:
-
-```py
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Use the general-purpose function `python`:
-
-```py
-from chattool.functioncall import python
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([python])
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Note that executing arbitrary code generated by the model carries potential risks.
+Example 3, asynchronous concurrency and streaming output:
+
+```python
+import asyncio
+from chattool import Chat
+
+async def run():
+    # concurrent Q&A
+    base = Chat().system("You are a helpful assistant")
+    tasks = [base.copy().user(f"Explain topic {i}").async_get_response() for i in range(2)]
+    responses = await asyncio.gather(*tasks)
+    for r in responses:
+        print(r.content)
+
+    # streaming output
+    print("Streaming: ", end="")
+    async for chunk in Chat().user("Write a short poem about spring").async_get_response_stream():
+        if chunk.delta_content:
+            print(chunk.delta_content, end="", flush=True)
+    print()
+
+asyncio.run(run())
+```
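
When fanning out many requests as in Example 3 above, it can help to cap the number of in-flight calls. A sketch with `asyncio.Semaphore` (only `async_get_response` from the example is assumed; the limit of 5 and the prompts are placeholders):

```python
import asyncio
from chattool import Chat

async def bounded_ask(sem: asyncio.Semaphore, prompt: str) -> str:
    async with sem:  # at most 5 coroutines hold the semaphore at once
        resp = await Chat().user(prompt).async_get_response()
        return resp.content

async def main():
    sem = asyncio.Semaphore(5)
    prompts = [f"Explain topic {i}" for i in range(20)]
    answers = await asyncio.gather(*(bounded_ask(sem, p) for p in prompts))
    print(f"collected {len(answers)} answers")

asyncio.run(main())
```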

## License

Open-sourced under the MIT license.

## Changelog

-- Current version `3.2.1`: simplified the serial and asynchronous processing interfaces and renamed submodules to avoid conflicts
-- Version `2.3.0`: support for calling external tools, asynchronous data processing, and model fine-tuning
-- Since version `2.0.0`, renamed to `chattool`
-- Since version `1.0.0`, asynchronous data processing is supported
-- Since version `0.6.0`, the [function call](https://platform.openai.com/docs/guides/gpt/function-calling) feature is supported
-- Since version `0.5.0`, data can be processed with `process_chats`, using a `msg2chat` function and a `checkpoint` file
-- Since version `0.4.0`, maintenance moved to the [CubeNLP](https://github.com/cubenlp) organization account
-- Since version `0.3.0`, the `openai.py` module is no longer a dependency; requests are sent directly with `requests`
-    - each `Chat` can use its own API key
-    - proxy links are supported
-- Version `0.2.0` switched to the `Chat` type as the central interaction object
+- Current version `4.1.0`: unified `Chat` API (sync/async/streaming), environment-variable defaults for configuration, improved retry and debugging tools
+- History: the `2.x`–`3.x` series incrementally improved async processing and batch usage
+- For earlier history, see the repository commit log
143 changes: 33 additions & 110 deletions README_en.md
@@ -43,29 +43,20 @@ export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export OPENAI_API_BASE="https://api.example.com/v1"
export OPENAI_API_BASE_URL="https://api.example.com" # optional
```

-Or in Python code:
-
-```py
-import chattool
-chattool.api_key = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
-chattool.api_base = "https://api.example.com/v1"
-```
-
-Note: `OPENAI_API_BASE` is prior to `OPENAI_API_BASE_URL`, and you only need to set one of them.
+Note: `OPENAI_API_BASE` takes precedence over `OPENAI_API_BASE_URL`. Set one.

## Examples

Example 1, simulate multi-turn dialogue:

```python
# first chat
-chat = Chat("Hello, GPT-3.5!")
-resp = chat.getresponse()
+chat = Chat("Hello!")
+resp = chat.get_response()

# continue the chat
chat.user("How are you?")
-next_resp = chat.getresponse()
+next_resp = chat.get_response()

# add response manually
chat.user("What's your name?")
@@ -78,117 +69,49 @@ chat.save("chat.json", mode="w") # default to "a"
chat.print_log()
```

-Example 2, process data in batch, and use a checkpoint file `checkpoint`:
+Example 2, process data in batch (serial), and append to a checkpoint file `chat.jsonl`:

-```python
-# write a function to process the data
-def msg2chat(msg):
-    chat = Chat()
-    chat.system("You are a helpful translator for numbers.")
-    chat.user(f"Please translate the digit to Roman numerals: {msg}")
-    # We need to call `getresponse` here to get the response
-    chat.getresponse()
-    return chat
-
-checkpoint = "chat.jsonl"
-msgs = ["%d" % i for i in range(1, 10)]
-# process the data in batch, if the checkpoint file exists, it will continue from the last checkpoint
-continue_chats = process_chats(msgs, msg2chat, checkpoint)
-```
-
-Example 3, process data in batch (asynchronous), print hello using different languages, and use two coroutines:
-
-```python
-from chattool import chat_completion_async, load_chats
-
-langs = ["python", "java", "Julia", "C++"]
-def data2chat(msg):
-    chat = Chat()
-    chat.user("Please print hello world using %s" % msg)
-    # Note that we don't need to call `getresponse` here, and leave it to the asynchronous processing
-    return chat
-
-chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat)
-chats = load_chats("async_chat.jsonl")
-```
-
-when using `chat_completion_async` in Jupyter notebook, you should use the `await` keyword and the `wait=True` parameter:
-
-```python
-await chat_completion_async(langs, chkpoint="async_chat.jsonl", nproc=2, data2chat=data2chat, wait=True)
-```
+```python
+msgs = [str(i) for i in range(1, 10)]
+results = []
+for m in msgs:
+    chat = Chat()
+    chat.system("You are a helpful translator for numbers.")
+    resp = chat.user(f"Translate this digit to Roman numerals: {m}").get_response()
+    results.append(resp.content)
+    chat.save("chat.jsonl", mode="a")
+```
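
Serial batch runs can die midway on a transient network error, which is exactly when the checkpoint file pays off. A small retry wrapper is one way to ride out such errors; a sketch (the attempt count and backoff are arbitrary, and only `get_response` from the example above is assumed):

```python
import time
from chattool import Chat

def ask_with_retry(prompt: str, attempts: int = 3) -> str:
    for i in range(attempts):
        try:
            return Chat().user(prompt).get_response().content
        except Exception:
            if i == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(2 ** i)  # 1s, 2s, ... simple exponential backoff

print(ask_with_retry("Translate this digit to Roman numerals: 4"))
```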


-### Tool Call
-
-Define functions:
-
-```python
-def add(a: int, b: int) -> int:
-    """
-    This function adds two numbers.
-
-    Parameters:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The sum of the two numbers.
-    """
-    return a + b
-
-def mult(a: int, b: int) -> int:
-    """This function multiplies two numbers.
-    It is a useful calculator!
-
-    Args:
-        a (int): The first number.
-        b (int): The second number.
-
-    Returns:
-        int: The product of the two numbers.
-    """
-    return a * b
-```
-
-Add functions to the `Chat` object:
-
-```python
-from chattool import Chat
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([add, mult])
-```
-
-Automatically execute the tool based on the return information. The default value for `maxturns` is 3:
-
-```python
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Use the general function `python`:
-
-```python
-from chattool.functioncall import python
-chat = Chat("find the value of (23723 * 1322312 ) + 12312")
-chat.settools([python])
-chat.autoresponse(display=True, tool_type='tool_choice', maxturns=3)
-```
-
-Note that executing any code generated by the model has potential risks.
+Example 3, asynchronous concurrency and streaming:
+
+```python
+import asyncio
+from chattool import Chat
+
+async def run():
+    # concurrent Q&A
+    base = Chat().system("You are a helpful assistant")
+    tasks = [base.copy().user(f"Explain topic {i}").async_get_response() for i in range(2)]
+    responses = await asyncio.gather(*tasks)
+    for r in responses:
+        print(r.content)
+
+    # streaming output
+    print("Streaming: ", end="")
+    async for chunk in Chat().user("Write a short poem about spring").async_get_response_stream():
+        if chunk.delta_content:
+            print(chunk.delta_content, end="", flush=True)
+    print()
+
+asyncio.run(run())
+```
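
The streamed deltas can be accumulated into the complete reply while they are printed. A sketch reusing only `async_get_response_stream` and `delta_content` from the example above:

```python
import asyncio
from chattool import Chat

async def stream_and_collect(prompt: str) -> str:
    parts = []
    async for chunk in Chat().user(prompt).async_get_response_stream():
        if chunk.delta_content:
            print(chunk.delta_content, end="", flush=True)
            parts.append(chunk.delta_content)
    print()
    return "".join(parts)  # the full reply, reassembled from the deltas

full_reply = asyncio.run(stream_and_collect("Write a short poem about spring"))
```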

## License

This package is licensed under the MIT license. See the LICENSE file for more details.

-## update log
+## Update Log

-Current version: `2.3.0`. The features of function call, asynchronous processing, and finetuning are supported.
-
-### Beta version
-- Since version `0.2.0`, `Chat` type is used to handle data
-- Since version `0.3.0`, you can use different API Key to send requests.
-- Since version `0.4.0`, this package is mantained by [cubenlp](https://github.com/cubenlp).
-- Since version `0.5.0`, one can use `process_chats` to process the data, with a customized `msg2chat` function and a checkpoint file.
-- Since version `0.6.0`, the feature [function call](https://platform.openai.com/docs/guides/gpt/function-calling) is added.
-- Since version `1.0.0`, the feature [function call](https://platform.openai.com/docs/guides/gpt/function-calling) is removed, and the asynchronous processing tool is added.
-- Since version `2.0.0`, the package is renamed to `chattool`, and the asynchronous processing tool is improved.
+- Current version `4.1.0`: unified `Chat` API (sync/async/stream), default env-based configuration, improved retries and debugging helpers.
+- History `2.x–3.x`: iterative improvements to async and batch usage.
+- For earlier changes, please refer to the repository commits.