This is the code repository for our NeurIPS 2025 paper "On Evaluating LLM Alignment by Evaluating LLMs as Judges". This repository contains the necessary scripts and data to evaluate the alignment of large language models (LLMs) using the AlignEval framework.
- `README.md`: This file.
- `data/`: Contains the AlignEval datasets used for evaluation.
- `prompts/`: Contains the prompt templates used for evaluation.
- `results/`: Contains the results of the evaluation.
- `aligneval.py`: The main script for running the evaluation.
- `get_predictions.py`: A script for generating predictions using the LLMs.
- `get_predictions.sh`: A shell script for generating predictions using the LLMs.
- Running `aligneval.py` will evaluate the alignment of LLMs using the AlignEval datasets.
- Running `get_predictions.py` will generate predictions using the LLMs (see the example invocations after this list).
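A minimal sketch of a typical workflow is shown below. The exact command-line arguments accepted by the scripts are not documented in this section, so the invocations here are assumptions; consult the scripts (or `get_predictions.sh`) for the supported options before running.

```bash
# Sketch of a typical workflow (arguments omitted; check the scripts for options).

# Step 1: generate model predictions on the AlignEval datasets
# (alternatively, use the provided wrapper: bash get_predictions.sh).
python get_predictions.py

# Step 2: evaluate alignment from the generated predictions;
# outputs are written under results/.
python aligneval.py
```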