Merged
14 changes: 11 additions & 3 deletions app/client/platforms/openai.ts
@@ -200,6 +200,7 @@ export class ChatGPTApi implements LLMApi {
options.config.model.startsWith("o1") ||
options.config.model.startsWith("o3") ||
options.config.model.startsWith("o4-mini");
const isGpt5 = options.config.model.startsWith("gpt-5");
if (isDalle3) {
const prompt = getMessageTextContent(
options.messages.slice(-1)?.pop() as any,
@@ -230,15 +231,21 @@ export class ChatGPTApi implements LLMApi {
messages,
stream: options.config.stream,
model: modelConfig.model,
temperature: !isO1OrO3 ? modelConfig.temperature : 1,
temperature: (!isO1OrO3 && !isGpt5) ? modelConfig.temperature : 1,
presence_penalty: !isO1OrO3 ? modelConfig.presence_penalty : 0,
frequency_penalty: !isO1OrO3 ? modelConfig.frequency_penalty : 0,
top_p: !isO1OrO3 ? modelConfig.top_p : 1,
// max_tokens: Math.max(modelConfig.max_tokens, 1024),
Comment on lines +234 to 238
Contributor


💡 Verification agent

🧩 Analysis chain

Confirm intent: forcing temperature=1 for GPT‑5

You’re forcing temperature=1 for GPT‑5 (similar to o1/o3/o4-mini) but leaving other sampling params (top_p, penalties) unchanged. Verify this matches GPT‑5 API semantics.

Optionally, extract a single guard to clarify intent:

-        temperature: (!isO1OrO3 && !isGpt5) ? modelConfig.temperature : 1,
+        // If GPT‑5 requires fixed temperature, keep 1; otherwise pass-through.
+        temperature: (isO1OrO3 || isGpt5) ? 1 : modelConfig.temperature,

🌐 Web query:

For OpenAI GPT-5 chat/completions, should temperature be fixed (e.g., 1 or 0) or is it supported like standard GPT models? Are presence_penalty, frequency_penalty, and top_p supported for GPT-5?

💡 Result:

Yes, OpenAI's GPT-5 model supports the temperature, top_p, presence_penalty, and frequency_penalty parameters in its chat/completions API, similar to previous GPT models. These parameters allow you to control various aspects of the model's output:

  • temperature: Controls the randomness of the model's responses. Lower values (e.g., 0.2) make the output more deterministic, while higher values (e.g., 0.8) increase randomness and creativity. (medium.com)

  • top_p: Also known as nucleus sampling, this parameter determines the cumulative probability threshold for token selection. A value of 0.9 means the model considers the smallest set of tokens whose cumulative probability is at least 90%. (medium.com)

  • presence_penalty: Encourages the model to introduce new topics by penalizing tokens that have already appeared in the text. Positive values increase the likelihood of discussing new topics. (notes.kodekloud.com)

  • frequency_penalty: Reduces the model's tendency to repeat the same tokens by penalizing tokens based on their existing frequency in the text. Positive values decrease the likelihood of repetition. (notes.kodekloud.com)

These parameters are designed to be flexible and can be adjusted to suit your specific application needs. For example, setting temperature to 0.0 and top_p to 0.0 can produce deterministic outputs, while higher values can generate more diverse and creative responses. (notes.kodekloud.com)

It's important to note that while these parameters are supported in GPT-5, the optimal settings may vary depending on your specific use case and desired output characteristics.



Remove GPT-5 special-case for temperature

GPT-5 supports temperature, top_p, presence_penalty, and frequency_penalty just like other GPT models. We shouldn’t force temperature = 1 only for GPT-5—let it use modelConfig.temperature unless we have a clear reason to pin it. Please update in app/client/platforms/openai.ts around lines 234–238:

-        temperature: (!isO1OrO3 && !isGpt5) ? modelConfig.temperature : 1,
+        // Only force temperature=1 for O1/O3; GPT-5 should respect the configured value.
+        temperature: isO1OrO3 ? 1 : modelConfig.temperature,
         presence_penalty: !isO1OrO3 ? modelConfig.presence_penalty : 0,
         frequency_penalty: !isO1OrO3 ? modelConfig.frequency_penalty : 0,
         top_p: !isO1OrO3 ? modelConfig.top_p : 1,

This removes the isGpt5 check so GPT-5 calls remain configurable.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
temperature: (!isO1OrO3 && !isGpt5) ? modelConfig.temperature : 1,
presence_penalty: !isO1OrO3 ? modelConfig.presence_penalty : 0,
frequency_penalty: !isO1OrO3 ? modelConfig.frequency_penalty : 0,
top_p: !isO1OrO3 ? modelConfig.top_p : 1,
// max_tokens: Math.max(modelConfig.max_tokens, 1024),
// Only force temperature=1 for O1/O3; GPT-5 should respect the configured value.
temperature: isO1OrO3 ? 1 : modelConfig.temperature,
presence_penalty: !isO1OrO3 ? modelConfig.presence_penalty : 0,
frequency_penalty: !isO1OrO3 ? modelConfig.frequency_penalty : 0,
top_p: !isO1OrO3 ? modelConfig.top_p : 1,
// max_tokens: Math.max(modelConfig.max_tokens, 1024),
🤖 Prompt for AI Agents
In app/client/platforms/openai.ts around lines 234 to 238, remove the
special-case check for GPT-5 in the temperature setting. Update the temperature
assignment to use modelConfig.temperature for all models, including GPT-5, by
eliminating the isGpt5 condition. Ensure presence_penalty, frequency_penalty,
and top_p remain conditional only on isO1OrO3 as before, without affecting GPT-5
handling.
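The suggested guard can be sketched in isolation. The helper name and config shape below are illustrative, not the file's actual structure:

```typescript
// Sketch of the suggested behavior: only the o1/o3/o4-mini family pins
// sampling params; gpt-5 passes the configured values through unchanged.
function samplingParams(
  model: string,
  cfg: { temperature: number; top_p: number },
) {
  const isO1OrO3 =
    model.startsWith("o1") ||
    model.startsWith("o3") ||
    model.startsWith("o4-mini");
  return {
    temperature: isO1OrO3 ? 1 : cfg.temperature,
    top_p: isO1OrO3 ? 1 : cfg.top_p,
  };
}

console.log(samplingParams("gpt-5", { temperature: 0.7, top_p: 0.9 }));
// { temperature: 0.7, top_p: 0.9 }
console.log(samplingParams("o3-mini", { temperature: 0.7, top_p: 0.9 }));
// { temperature: 1, top_p: 1 }
```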

// Please do not ask me why not send max_tokens, no reason, this param is just shit, I dont want to explain anymore.
};

if (isO1OrO3) {
if (isGpt5) {
// Remove max_tokens if present
delete requestPayload.max_tokens;
// Add max_completion_tokens (or max_completion_tokens if that's what you meant)
requestPayload["max_completion_tokens"] = modelConfig.max_tokens;

Comment on lines +242 to +247
Contributor


🛠️ Refactor suggestion

TS safety and clarity: avoid deleting from a union-typed payload; set max_completion_tokens via a narrowed alias

delete on a union (RequestPayload | DalleRequestPayload) can trip TS checks. Also the comment repeats “max_completion_tokens”.

Apply:

-      if (isGpt5) {
-  	// Remove max_tokens if present
-  	delete requestPayload.max_tokens;
-  	// Add max_completion_tokens (or max_completion_tokens if that's what you meant)
-  	requestPayload["max_completion_tokens"] = modelConfig.max_tokens;
-      } else if (isO1OrO3) {
+      if (isGpt5) {
+        // GPT‑5 uses max_completion_tokens
+        const payload = requestPayload as RequestPayload;
+        payload.max_completion_tokens = modelConfig.max_tokens;
+      } else if (isO1OrO3) {

This avoids delete on a union and fixes the comment.

📝 Committable suggestion


Suggested change
if (isGpt5) {
// Remove max_tokens if present
delete requestPayload.max_tokens;
// Add max_completion_tokens (or max_completion_tokens if that's what you meant)
requestPayload["max_completion_tokens"] = modelConfig.max_tokens;
if (isGpt5) {
// GPT-5 uses max_completion_tokens
const payload = requestPayload as RequestPayload;
payload.max_completion_tokens = modelConfig.max_tokens;
} else if (isO1OrO3) {
🤖 Prompt for AI Agents
In app/client/platforms/openai.ts around lines 242 to 247, avoid using delete on
a union-typed requestPayload as it causes TypeScript errors. Instead, create a
narrowed alias for the payload type that allows safe modification, then set
max_completion_tokens on that alias. Also, correct the comment to remove the
repeated phrase "max_completion_tokens" for clarity.

} else if (isO1OrO3) {
// by default the o1/o3 models will not attempt to produce output that includes markdown formatting
// manually add "Formatting re-enabled" developer message to encourage markdown inclusion in model responses
// (https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/reasoning?tabs=python-secure#markdown-output)
@@ -251,8 +258,9 @@ export class ChatGPTApi implements LLMApi {
requestPayload["max_completion_tokens"] = modelConfig.max_tokens;
}


// add max_tokens to vision model
if (visionModel && !isO1OrO3) {
if (visionModel && !isO1OrO3 && !isGpt5) {
requestPayload["max_tokens"] = Math.max(modelConfig.max_tokens, 4000);
}
}
6 changes: 6 additions & 0 deletions app/constant.ts
@@ -493,6 +493,7 @@ export const VISION_MODEL_REGEXES = [
/o3/,
/o4-mini/,
/grok-4/i,
/gpt-5/
];
Comment on lines +496 to 497
Contributor


💡 Verification agent

🧩 Analysis chain

Avoid blanket vision classification for all GPT‑5 variants

Using /gpt-5/ will flag nano/mini variants as vision-capable, which may cause image payloads to be sent to text-only models. Narrow the regex.

Apply:

-  /grok-4/i,
-  /gpt-5/
+  /grok-4/i,
+  /gpt-5(?!-(nano|mini))/

If specific GPT‑5 vision SKUs differ, switch to an explicit allowlist instead of a broad regex. Please verify GPT‑5 nano/mini image-input support in current docs.


🌐 Web query:

Which OpenAI GPT-5 model variants support image inputs via the chat/completions API? Specifically, do "gpt-5-mini" and "gpt-5-nano" accept vision (image_url) content?

💡 Result:

GPT-5 Full supports image (vision) inputs via the chat/completions API, but gpt-5-mini and gpt-5-nano are either limited or do not have native support for image (image_url) content as input.

Details and evidence:

  • GPT-5 Full is clearly described in official model documentation and API guides as accepting both text and image inputs for chat/completions, with a 400,000 token context window[3][5]. Multiple sources confirm that the flagship GPT-5 is a multimodal model designed to handle simultaneous text and vision inputs directly in the chat/completions endpoint[3][5][8][9].

  • GPT-5 Mini offers a downsized model for cost and speed, but public technical docs and third-party guides do not confirm vision input support via the chat/completions API. The most detailed API specification and model overview for gpt-5-mini emphasize text processing, large context windows, and quick response times, but do not state it accepts image_url or vision content as input[2][4]. This is also consistent with third-party documentation and community reports[6][10].

  • GPT-5 Nano is even further optimized for edge and real-time text tasks. Official model guides state it processes single images, but only for constrained use cases such as mobile inference—not for general multimodal chat/completions (e.g., not by passing "image_url" in the typical API structure)[2]. This suggests the Nano variant's image handling is not equivalent to the flagship's fully supported multimodal input[2][6].

  • Table summary (based on official and developer sources):

| Model Variant | Text Input | Image Input via Chat/Completions API |
| --- | --- | --- |
| gpt-5 (Full) | Yes | Yes |
| gpt-5-mini | Yes | No (not supported or very limited) |
| gpt-5-nano | Yes | No (not supported or very limited) |
  • Additional Clarification:
    Official OpenAI documentation and reputable blog reviews clarify that GPT-5 supports both text and image (vision) input, but audio and image generation remain covered by specialized models like GPT-4o Audio and GPT Image 1[8]. There is no public evidence that gpt-5-mini and gpt-5-nano offer full multimodal chat/completions support for vision input (image_url)[2][6][8].

Conclusion:
You can use image inputs with GPT-5 Full via the chat/completions API. gpt-5-mini and gpt-5-nano do not natively accept vision (image_url) content as input in the chat/completions endpoint; they are text-specialized models, with only minimal or separate image processing capabilities[2][6][8].



Narrow GPT-5 Regex to Exclude Non-Vision Variants
GPT-5 Full accepts image inputs via the chat/completions API, but gpt-5-mini and gpt-5-nano do not. Using a blanket /gpt-5/ will mistakenly flag those text-only variants as vision-capable.

Please update the regex in app/constant.ts (around lines 496–497):

-  /grok-4/i,
-  /gpt-5/
+  /grok-4/i,
+  /gpt-5(?!-(nano|mini))/

If OpenAI introduces new GPT-5 SKUs with vision support in the future, switch to an explicit allowlist for better precision.

🤖 Prompt for AI Agents
In app/constant.ts around lines 496 to 497, the regex /gpt-5/ is too broad and
incorrectly matches GPT-5 variants without vision capabilities like gpt-5-mini
and gpt-5-nano. Update the regex to explicitly match only the GPT-5 variants
that support vision inputs, excluding the non-vision ones. Consider using an
explicit allowlist pattern for the known vision-capable GPT-5 SKUs to ensure
precise matching and easier future updates.


export const EXCLUDE_VISION_MODEL_REGEXES = [/claude-3-5-haiku-20241022/];
@@ -517,6 +518,11 @@ const openaiModels = [
"gpt-4.1-nano-2025-04-14",
"gpt-4.5-preview",
"gpt-4.5-preview-2025-02-27",
"gpt-5-chat",
"gpt-5-mini",
"gpt-5-nano",
"gpt-5",
"gpt-5-chat-2025-01-01-preview",
"gpt-4o",
"gpt-4o-2024-05-13",
"gpt-4o-2024-08-06",