Commit 998968f

[Doc] Update parameters of serving
1 parent fe0e3f5 commit 998968f

2 files changed: +2 -0 lines changed

docs/online_serving/README.md

Lines changed: 1 addition & 0 deletions
@@ -94,6 +94,7 @@ The differences in request parameters between FastDeploy and the OpenAI protocol
 - `enable_thinking`: Optional[bool] = True (whether to enable reasoning for models that support deep thinking)
 - `repetition_penalty`: Optional[float] = None (coefficient for directly penalizing repeated token generation; >1 penalizes repetition, <1 encourages repetition)
 - `return_token_ids`: Optional[bool] = False (whether to return token ids as a list)
+- `include_stop_str_in_output`: Optional[bool] = False (whether to include the stop strings in the output text; defaults to False)

 > Note: For multimodal models, because the reasoning chain is enabled by default, outputs can become overly long; `max_tokens` can be set to the model's maximum output length, or the default value can be used.
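The extra parameters documented above ride alongside the standard OpenAI fields in the request body. As a minimal sketch, assuming a FastDeploy server exposing an OpenAI-compatible endpoint at http://localhost:8000/v1 (the URL, API key, and model name here are illustrative assumptions), the non-standard fields can be passed through the OpenAI Python client's `extra_body`:

```python
# Minimal sketch: passing FastDeploy-specific request parameters through
# the OpenAI Python client. The base_url, api_key, and model name are
# illustrative assumptions; adjust them to match your deployment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="default",  # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
    stop=["\n\n"],
    # Fields the OpenAI schema does not define are forwarded verbatim
    # to the server when placed in extra_body.
    extra_body={
        "enable_thinking": True,
        "repetition_penalty": 1.05,
        "return_token_ids": False,
        "include_stop_str_in_output": True,  # keep matched stop strings in the text
    },
)
print(response.choices[0].message.content)
```

The client merges `extra_body` into the top level of the JSON request, so the server sees these keys next to `model`, `messages`, and `stop`.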

docs/zh/online_serving/README.md

Lines changed: 1 addition & 0 deletions
@@ -93,6 +93,7 @@ The differences in request parameters between FastDeploy and the OpenAI protocol are as follows; the remaining request parameters will
 - `enable_thinking`: Optional[bool] = True (whether to enable thinking for models that support deep thinking)
 - `repetition_penalty`: Optional[float] = None (coefficient for directly penalizing repeated token generation; >1 penalizes repetition, <1 encourages repetition)
 - `return_token_ids`: Optional[bool] = False (whether to return the list of token ids)
+- `include_stop_str_in_output`: Optional[bool] = False (whether to include the stop strings in the output)

 > Note: For multimodal models, because the reasoning chain is enabled by default, outputs can become overly long; `max_tokens` can be set to the model's maximum output length, or the default value can be used.
