Description
Is your feature request related to a problem? Please describe.
Currently, converters such as CharSwapConverter and the other prompt perturbation tools in PyRIT can only be applied before a prompt is sent to the model. When using the OpenAI Responses API with function/tool calling enabled, the model has no way to dynamically request text transformations (perturbations, mutations, or obfuscation) during a conversation.
This limits experimentation with model-driven robustness, self-evaluation, and adversarial reasoning patterns, especially in multi-agent scenarios where tool flexibility is key.
Describe the solution you'd like
I propose exposing converters (such as CharSwapConverter) as function-callable tools, so they can be registered in the custom_functions registry of OpenAIResponseTarget.
This would allow the model (or one of the agents) to call a converter dynamically via tool calling, enabling dynamic, contextual adversarial attacks or robustness tests during the conversation.
For example:
    converter_tool = {
        "type": "function",
        "name": "apply_char_swap",
        "description": "Apply character swaps to words for robustness testing",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Input text to perturb"},
            },
            "required": ["text"],
        },
    }

The corresponding Python implementation can wrap the CharSwapConverter internally and return the perturbed text.
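A minimal sketch of what such a wrapper might look like. It assumes the converter exposes an async `convert_async(prompt=..., input_type="text")` method whose result has an `output_text` attribute, and that a custom_functions entry is an async callable taking the parsed arguments as a dict and returning a dict. `StubSwapConverter`, `make_converter_tool`, and the `converted_text` key are hypothetical names used only for illustration. The stub stands in for CharSwapConverter so the example runs without PyRIT installed.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict


@dataclass
class StubResult:
    # Assumption: mirrors a converter result that carries `output_text`.
    output_text: str


class StubSwapConverter:
    """Hypothetical stand-in for CharSwapConverter (deterministic for the demo)."""

    async def convert_async(self, *, prompt: str, input_type: str = "text") -> StubResult:
        # Swap the 2nd and 3rd characters of every word longer than 3 characters.
        words = [
            w[0] + w[2] + w[1] + w[3:] if len(w) > 3 else w
            for w in prompt.split()
        ]
        return StubResult(output_text=" ".join(words))


ToolFn = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


def make_converter_tool(converter: Any) -> ToolFn:
    """Wrap a converter in an async function usable as a tool implementation."""

    async def tool(args: Dict[str, Any]) -> Dict[str, Any]:
        # "text" matches the required parameter declared in the tool schema.
        result = await converter.convert_async(prompt=args["text"], input_type="text")
        return {"converted_text": result.output_text}

    return tool


# Assumed registry shape: tool name -> async callable.
custom_functions: Dict[str, ToolFn] = {
    "apply_char_swap": make_converter_tool(StubSwapConverter()),
}

print(asyncio.run(custom_functions["apply_char_swap"]({"text": "hello world"})))
# prints {'converted_text': 'hlelo wrold'}
```

With PyRIT available, the stub would presumably be replaced by a real converter instance; the wrapper itself does not change the converter.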
Additional context
This would integrate cleanly into the existing multi-agent orchestrator pipeline (#930), where:
- The Recon Agent gathers environmental/system info but doesn’t invoke transformations.
- The Strategy Agent analyzes context and decides which tools or mutations to apply.
- The AI Red Team Agent performs the actual tool call, such as invoking a converter, and delivers the perturbed prompt to the target model.
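The tool-call step performed by the AI Red Team Agent could be sketched roughly as follows. The example assumes the Responses API item shapes (`function_call` items carrying `name`, a JSON-encoded `arguments` string, and `call_id`, answered with a `function_call_output` item). The `dispatch_tool_call` helper, the `upper_tool` stand-in, and the registry contents are hypothetical and exist only to keep the sketch self-contained.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict

ToolFn = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


async def upper_tool(args: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for a converter-backed tool.
    return {"converted_text": args["text"].upper()}


registry: Dict[str, ToolFn] = {"apply_upper": upper_tool}


async def dispatch_tool_call(call: Dict[str, Any], tools: Dict[str, ToolFn]) -> Dict[str, Any]:
    """Run the tool named in a function_call item and build the output item."""
    fn = tools.get(call["name"])
    if fn is None:
        # Report unknown tools back to the model instead of raising.
        payload: Dict[str, Any] = {"error": f"unknown tool: {call['name']}"}
    else:
        payload = await fn(json.loads(call["arguments"]))
    return {
        "type": "function_call_output",
        "call_id": call["call_id"],
        "output": json.dumps(payload),
    }


call_item = {
    "type": "function_call",
    "name": "apply_upper",
    "arguments": json.dumps({"text": "probe"}),
    "call_id": "call_1",
}
print(asyncio.run(dispatch_tool_call(call_item, registry)))
```

Returning an error payload for unknown tool names, rather than raising, is one possible design choice: it lets the calling agent see the failure and pick a different tool.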
Note: This proposal does not replace or modify existing converters. Instead, it wraps them with lightweight async functions compatible with the Responses API's custom_functions registry. The converters are reused as-is, plugged into a callable interface, so the model can invoke them during conversation turns, just like any other tool.