Reliability of Psychological Scales on LLMs

Execution Process

Create Utils File

An example utils.py:

api_key = "<API key>"
temperature = <model temperature>
delay_time = <the seconds between each request>
model = "<the name of the model>"
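The delay_time setting implies a pause between consecutive API requests. The following is only a minimal sketch of that idea, not the repository's actual request loop: the function name send_all and the fake sender are hypothetical, and the delay is kept tiny so the demo runs quickly.

```python
import time
from typing import Callable, Iterable, List

# Hypothetical stand-in for the value defined in utils.py.
delay_time = 0.01  # seconds between requests (kept tiny for the demo)


def send_all(prompts: Iterable[str],
             send: Callable[[str], str],
             delay: float = delay_time) -> List[str]:
    """Call `send` on each prompt, sleeping `delay` seconds between calls."""
    replies = []
    for i, prompt in enumerate(prompts):
        if i > 0:
            time.sleep(delay)  # pause before every request except the first
        replies.append(send(prompt))
    return replies


# A fake sender (assumption) so the sketch runs without network access or an API key.
replies = send_all(["q1", "q2", "q3"], send=lambda p: p.upper())
print(replies)
```

In a real run, the send callable would wrap the model call that uses api_key, model, and temperature.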

Specify Test Cases

In main.py, specify the server parameters:

template: a list of prompt templates.
version: a list of question versions.
language: a list of language versions.
label: a list of level option labels.
order: a list of level orders.
questionnaire_name: the selected questionnaire.
name_exp: name of the save file.

Instantiate the Server class. All pre-testing cases are created and stored in save/<name_exp>.json:

test = Server(questionnaire_name, template, version, language, label, order, name_exp)

To protect against test interruptions, load an existing save file as a new save:

test = load("<save_path>", "<new_save_name>")
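The resume step can be pictured as copying an existing save under a new name, so the original file stays untouched while the run continues from the copy. The sketch below illustrates that pattern only. The helper load_as_new_save, the JSON layout, and the file names are all hypothetical and do not reflect the repository's load function or save format.

```python
import json
import shutil
import tempfile
from pathlib import Path


def load_as_new_save(save_path: Path, new_save_name: str) -> dict:
    """Copy an existing save next to itself under a new name and return its contents."""
    new_path = save_path.with_name(f"{new_save_name}.json")
    shutil.copyfile(save_path, new_path)  # the original save is left untouched
    with new_path.open(encoding="utf-8") as f:
        return json.load(f)


# Demo in a throwaway directory with a made-up save layout (assumption).
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "bfi-save.json"
    original.write_text(json.dumps({"done": [0, 1], "pending": [2, 3]}), encoding="utf-8")

    state = load_as_new_save(original, "bfi-save-resumed")
    copied = (Path(tmp) / "bfi-save-resumed.json").exists()
    original_intact = json.loads(original.read_text(encoding="utf-8")) == state

print(state["pending"])
```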

Run all pre-testing cases:

test.run()

An Example Run

from server import *

template = ['t1','t2','t3','t4','t5']
version = ['v1','v2','v3','v4','v5']
language = ['En', 'Zh', 'Ko', 'Es', 'Fr', 'De', 'It', 'Ar', 'Ru', 'Ja']
label = ['n', 'al', 'au', 'rl', 'ru']
order = ['r', 'f']
questionnaire_name = 'BFI'
name_exp = 'bfi-save'

bfi_test = Server(questionnaire_name, template, version, language, label, order, name_exp)
bfi_test.run()
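To get a feel for the size of this run, the sketch below enumerates every combination of the five parameter lists. It assumes that each pre-testing case corresponds to one element of the full Cartesian product. That is an assumption about how Server builds its cases, not something confirmed by this README.

```python
from itertools import product

template = ['t1', 't2', 't3', 't4', 't5']
version = ['v1', 'v2', 'v3', 'v4', 'v5']
language = ['En', 'Zh', 'Ko', 'Es', 'Fr', 'De', 'It', 'Ar', 'Ru', 'Ja']
label = ['n', 'al', 'au', 'rl', 'ru']
order = ['r', 'f']

# One tuple per (template, version, language, label, order) combination.
cases = list(product(template, version, language, label, order))

print(len(cases))  # 5 * 5 * 10 * 5 * 2 = 2500
print(cases[0])    # ('t1', 'v1', 'En', 'n', 'r')
```

Under that assumption, shortening any one list reduces the number of cases proportionally, which is useful for a quick trial run.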

Rephrase the Statements

In main.py, execute:

rephrase("<questionnaire_name>", "<specified_language>")

References

For more details, please refer to our paper, cited below. If you find this work helpful, please cite it:

@inproceedings{huang2024reliability,
  title={On the reliability of psychological scales on large language models},
  author={Huang, Jen-tse and Jiao, Wenxiang and Lam, Man Ho and Li, Eric John and Wang, Wenxuan and Lyu, Michael},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing},
  pages={6152--6173},
  year={2024}
}


Repository: CUHK-ARISE/LLMPersonality
