v0.7.0

@jsong468 jsong468 released this 17 Mar 16:40
· 309 commits to main since this release

What's Changed

Targets:

  • [BREAKING] OpenAIChatTarget has been generalized to support a broader range of OpenAI-compatible models. See the blog post "A More Generalized OpenAIChatTarget" for a description of the changes.
  • If api_version is set to None when instantiating OpenAITarget objects, it will not be added as a query parameter to requests.
  • Added Google Gemini example environment variables to .env_example and added integration tests for Gemini/OpenAIChatTargets

Converters:

  • [New] AddImageVideoConverter: PyRIT's first video converter! It allows users to add an image to a video at a specified position. More video converters to come!
  • [New] InsertPunctuationConverter: Inserts various punctuation into a prompt to test model robustness to perturbations.
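
As a rough illustration of the punctuation perturbation idea, the sketch below appends a random punctuation mark to a fraction of the words in a prompt. It is a simplified stand-in, not the InsertPunctuationConverter itself: the function name, the `ratio` and `marks` parameters, and the choice to append marks only at word ends are all assumptions.

```python
import random
from typing import Optional


def insert_punctuation(
    prompt: str,
    ratio: float = 0.2,
    marks: str = ",.!?;:",
    seed: Optional[int] = None,
) -> str:
    """Append a random mark to roughly `ratio` of the words (hypothetical sketch)."""
    rng = random.Random(seed)
    words = prompt.split()
    if not words:
        return prompt
    # Perturb at least one word, and never more words than exist.
    count = min(len(words), max(1, round(len(words) * ratio)))
    chosen = set(rng.sample(range(len(words)), count))
    return " ".join(
        word + rng.choice(marks) if index in chosen else word
        for index, word in enumerate(words)
    )
```

Passing a fixed `seed` makes the perturbation reproducible, which helps when comparing model responses across runs.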

Orchestrators:

  • [New] ManyShotJailbreakOrchestrator: Prepend a faux dialogue between a human and an AI assistant within a single prompt for the target.
  • [New] [BREAKING] ContextComplianceOrchestrator: Update the context to prime an objective_chat_target to answer. The context is set using instructions defined in context_description_instructions_path, along with an adversarial_chat to generate the first turns to send.
  • [BREAKING] RolePlayOrchestrator improvements: Refactored for greater code re-use
  • FlipAttackOrchestrator improvement: Allow for additional converters applied after the flip attack
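
The many-shot idea of packing a faux dialogue into a single prompt can be sketched as follows. This is an illustration only: the function name, the `user`/`assistant` dictionary keys, and the "Human:"/"Assistant:" labels are assumptions, and the template PyRIT actually uses may differ.

```python
from typing import Dict, List


def build_many_shot_prompt(examples: List[Dict[str, str]], objective: str) -> str:
    """Join example exchanges and the final request into one prompt string.

    Hypothetical sketch; keys and speaker labels are assumed for illustration.
    """
    lines = []
    for example in examples:
        lines.append(f"Human: {example['user']}")
        lines.append(f"Assistant: {example['assistant']}")
    # The real objective goes last, after the faux dialogue.
    lines.append(f"Human: {objective}")
    return "\n".join(lines)


# Benign placeholder examples, used only to show the structure.
demo_examples = [
    {"user": "What is the capital of France?", "assistant": "Paris."},
    {"user": "What is 2 + 2?", "assistant": "4."},
]
demo_prompt = build_many_shot_prompt(demo_examples, "What color is the sky?")
```

The whole dialogue is sent to the target as one prompt rather than as separate conversation turns.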

Memory:

  • Multimodal Seed Prompts Encoding Metadata: Non-text seed prompts added to the database now have metadata populated automatically, including format (png, wav, etc.) and, for audio and video seed prompts, properties such as bitrate and duration.
  • SeedPrompt Duplicates: Duplicate seed prompts within the same dataset (identical dataset_name) will no longer be uploaded to memory.
  • Using Configured Paths for Multimodal Seed Prompts: Multimodal SeedPrompt file paths within .yaml files no longer use relative paths that break based on where the .yaml files are accessed. Instead, configured paths (located in paths.py) are used.
  • [BREAKING] Removed calls that dispose of memory engines in Orchestrator and Prompt Target objects and replaced them with atexit and weakref cleanup in the Memory interface, which ensures cleanup on process exit. Orchestrators and targets no longer support the context manager protocol.
  • Added get_values() method to the SeedPromptDataset class to simplify prompt values extraction from datasets. Optional filtering to retrieve the first and/or last N values has also been implemented.
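
The general pattern behind the memory cleanup change can be sketched with the standard library. The classes below are hypothetical stand-ins, not PyRIT's Memory interface; the sketch assumes `weakref.finalize`, whose callbacks run when the owner is garbage collected and, by default, at interpreter exit.

```python
import weakref


class FakeEngine:
    """Stand-in for a database engine (hypothetical)."""

    def __init__(self) -> None:
        self.disposed = False

    def dispose(self) -> None:
        self.disposed = True


class MemoryStore:
    """Owns an engine and guarantees it is disposed without a `with` block."""

    def __init__(self, engine: FakeEngine) -> None:
        self._engine = engine
        # The callback is bound to the engine, not to self, so it does not
        # keep this object alive. It runs on garbage collection or at exit.
        self._finalizer = weakref.finalize(self, engine.dispose)

    def close(self) -> None:
        # Explicit cleanup; calling a finalizer more than once is a no-op.
        self._finalizer()
```

Because cleanup is tied to the memory object's lifetime, callers do not need to wrap orchestrators or targets in a context manager to release the engine.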

Scorers:

  • [New] HumanInTheLoopScorerGradio: Creates scores from manual human input by running the Gradio interface in a separate process, and adds the scores to the database. For now, the possible scores that users can give are "safe" and "unsafe."

Datasets:

  • [New] Added new fetch function for Aya Red-Teaming Dataset
  • [New] Added Pliny's prompts from the l1b3rt4s repo as templates
  • [New] Added the Babelscape ALERT dataset
  • Added support for filtering based on harm categories for PKU-SafeRLHF and AdvBench datasets

Misc:

  • Other changes include various maintenance improvements and bug fixes, addition of integration tests, website enhancements, dependency updates, and doc improvements.

Full list of changes

  • FIX unblock test pipelines by skipping certain tests on Ubuntu and adding Windows additionally by @romanlutz in #727
  • MAINT: Update release version to 0.6.1.dev0 by @nina-msft in #731
  • MAINT: Upgrading DuckDB by @jbolor21 in #712
  • [FEAT][MAINT][4019] Make multi-modal easier to configure in seedprompt files by @shivenchawla in #696
  • FEAT: set favicon for the website by @paulinek13 in #717
  • FEAT: simplify extracting prompt values by @paulinek13 in #718
  • FEAT: add a fetch function for Aya Red-teaming Dataset by @paulinek13 in #713
  • MAINT update Roakey image to have transparent background by @romanlutz in #735
  • FEAT Moonshot Attack Module: Insert Punctuation Attack by @u7780339 in #475
  • FEAT: include scored_prompt_id in orchestrator_identifier of the system prompt by @NicolePell in #725
  • FEAT: Create many shot jailbreak orchestrator by @AdrGav941 in #709
  • MAINT pre-commit hook to remove notebook header from notebooks by @jbolor21 in #737
  • FEAT Add Encoding Data to Multimodal Seed Prompts by @jsong468 in #740
  • FEAT added Pliny's prompts from the l1b3rt4s repo as templates by @joaodunas in #710
  • FEAT Adding babelscape dataset by @Jarro01X in #738
  • FIX: Upgrading Packages by @rlundeen2 in #741
  • FIX: Increasing pipeline timout by @rlundeen2 in #743
  • FEAT PyRIT to not upload duplicate seed-prompts by @shivenchawla in #742
  • MAINT: Azure SQL Integration Test Misc. Updates by @nina-msft in #745
  • FIX Small bug fixes (renaming file, editing MANIFEST) by @jsong468 in #746
  • [BREAKING] FEAT: OpenAI Generalization Improvements by @rlundeen2 in #747
  • FEAT: Add example_count field to ManyShotJailbreakOrchestrator by @nina-msft in #748
  • DOC: Blog: A More Generalized OpenAIChatTarget by @rlundeen2 in #751
  • DOC: Updating git docs by @rlundeen2 in #753
  • FIX: Fixing integration tests broken with OpenAIChatTarget Update by @rlundeen2 in #755
  • FEAT Video Converter: Adding Images to Videos by @jbolor21 in #702
  • FIX: Adding back static js by @rlundeen2 in #761
  • [BREAKING] FEAT: RolePlayOrchestrator Improvements by @rlundeen2 in #758
  • [BREAKING] FIX: Dispose Memory in Memory vs Class Objects by @nina-msft in #752
  • MAINT clean up dependencies by @romanlutz in #757
  • FEAT Adding converter support to many shot jailbreak orchestrator by @AdrGav941 in #760
  • FIX: Default API Version for TTS Target by @jbolor21 in #749
  • [BREAKING] FEAT: Adding Context Compliance Orchestrator by @rlundeen2 in #763
  • DOC: Add Instructions for Tagging Breaking Changes in PR Template by @nina-msft in #765
  • FEAT: support filtering based on harm categories for PKU-SafeRLHF dataset by @paulinek13 in #756
  • DOC Update CCA Documentation for Clarity by @eugeniavkim in #773
  • DOC: Update OpenAI Environment Variable Names in Documentation by @nina-msft in #776
  • FEAT: add harm categories to AdvBench Dataset by @paulinek13 in #732
  • FIX: Allow api_version to be set to None when instantiating OpenAITarget objects by @LeoVrana in #764
  • MAINT standardize Hugging Face token environment variable, add integration tests for Google Gemini and Open AI by @romanlutz in #778
  • FEAT: Gradio HiTL Scorer by @mart123p in #722
  • DOC: clarify OpenAIChatTarget usage with Ollama by @jsdlm in #777
  • FIX: small edits to make integration tests pass by @jsong468 in #780
  • MAINT add notice generation to component governance by @romanlutz in #781
  • MAINT update NOTICE file by @romanlutz in #782

Full Changelog: releases/v0.6.0...releases/v0.7.0