Failed to reproduce llama3-8b result #14

I cannot reproduce the llama3-8b result by following your advice. I only get:
{'exact_match': 53.9604, 'num_predicted': 202, 'mean_prediction_length_characters': 1.0, 'LEval_score': 53.9604, 'display_keys': ['exact_match'], 'display': [53.9604]}

Here is my command:
python Baselines/llama2-chat-test.py \
    --metric exam_eval \
    --task_name quality \
    --max_length 4k

I also changed llama2-chat-test.py as follows:
elif args.metric == "exam_eval":
    context = "Document is as follows. {document} \nQuestion: {inst}. Please directly give the answer without any additional output or explanation "

message = "<|begin_of_text|>" + sys_prompt  # B_INST + B_SYS + sys_prompt + E_SYS + context + E_INST
message += "\nAnswer:"
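
One thing that may be worth checking: in the modified line, `message` is built from `sys_prompt` only, and the original `... + context + E_INST` part is commented out, so the document and question may never reach the model. Below is a minimal sketch of a prompt that does include the context, assuming the Llama 3 Instruct chat format is the intended one. `build_llama3_prompt` is a hypothetical helper and not part of the repository, and the `sys_prompt`, document, and question values are placeholders.

```python
# Hypothetical helper, not part of the repository: assembles a single-turn
# prompt using the Llama 3 Instruct special tokens (an assumption about the
# intended format for this model).
def build_llama3_prompt(sys_prompt: str, user_content: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{sys_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_content}<|eot_id|>"
        # Leave the assistant header open so the model generates the answer.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


# Same template string as in the modified script.
context = (
    "Document is as follows. {document} \nQuestion: {inst}. "
    "Please directly give the answer without any additional output or explanation "
)

# Placeholder values for illustration only.
sys_prompt = "<system prompt used by the script>"
user_content = context.format(document="<document text>", inst="<question text>")

message = build_llama3_prompt(sys_prompt, user_content)
print(message)
```

The trailing `"\nAnswer:"` suffix is left out here because the open assistant header already marks where the answer starts; whether to keep it is a judgment call.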
