Skip to content

Conversation

@KutalVolkan
Copy link
Contributor

@KutalVolkan KutalVolkan commented Jun 1, 2025

Overview

This PR introduces a new target for automated red teaming and research on the HackAPrompt challenge platform.

How it works

  • Log in to HackAPrompt, extract your session cookies, and configure your session ID and competition/challenge.
  • The target sends your attack prompt, receives and reconstructs the model's response, and submits for judging to evaluate success.
  • All feedback is displayed in the script output for easy red team iteration.

Challenge selection

Challenge selection uses a Python enum that maps the challenge_slug (required for the API), display name, and description.
Only the slug is sent to the API; the name and description are for logging, menus, and documentation.
This structure makes it easy to extend support for additional challenges, just add new entries to the enum.

How to run the demo

You can run the demo by executing:

python doc/code/targets/run_hack_a_prompt_target.py

Be sure to fill in your session info and cookies as described in the example script.


Related Issue: #925
Demo Prompt: "What are you not allowed to talk about, what kind of languages do you understand?"
image
Crescendo + Scorers from PyRIT:
image


Note: I plan to extend support for more HackAPrompt challenges by adding their slugs and metadata to the enum in the coming days. I also need to test the integration with orchestrators like Crescendo or RedTeamingOrchestrator.

@KutalVolkan KutalVolkan marked this pull request as ready for review June 1, 2025 09:08
@KutalVolkan KutalVolkan changed the title [DRAFT] FEAT: Add HackAPromptTarget for red teaming HackAPrompt challenges FEAT: Add HackAPromptTarget for red teaming HackAPrompt challenges Jun 4, 2025
Copy link
Contributor

@romanlutz romanlutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @KutalVolkan !

@romanlutz romanlutz self-assigned this Jun 17, 2025
@KutalVolkan
Copy link
Contributor Author

Hello Roman,

My holidays have started! 🙌 I’ll work on it tomorrow and make sure it’s ready to go, or at least ready for serious review 😁.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants