dataset reading #1

https://github.com/AIDASLab/MMPB/blob/00eb15e6659fdb1bce4ed3a9cd95f5c20a1a8aac/VLMEvalKit/vlmeval/dataset/image_mcq.py#L1130-L1151

Thank you for sharing the dataset. The dataset available on HuggingFace is in the ".parquet" format, but the loader in this repository does not read that format: it loads a local CSV file ("/workspace/MMPB/dataset6.csv") and relies on several other hard-coded paths under "/workspace/MMPB". Are any additional processing steps needed to turn the HuggingFace release into the files this code expects?


	class MMPB(ImageBaseDataset):
	    TYPE = 'MCQ'
	    DATASET_URL = {
	        'MMPB': '',
	    }
	    # DATASET_MD5 = {
	    #     'MathVision': '93f6de14f7916e598aa1b7165589831e',
	    #     'MathVision_MINI': '060fe4fa5d868987ce179307bd5f8a33'
	    # }
	    def __init__(self, dataset='MMBench', skip_noimg=True, **dataset_kwargs):
	        self.dataset_name = dataset
	        self.data_path = "/workspace/MMPB/dataset6.csv"
	        # self.preference_path = "/workspace/MMPB/human/human_total_preference.json"
	        self.dataset_path = "/workspace/MMPB"
	        self.multi_turn_base_path = "/workspace/MMPB/multi_turn"
	        self.eval_prompting_path = "/workspace/MMPB/eval_prompt/eval_prompt.csv"
	        # with open(self.preference_path, "r", encoding="utf-8") as file:
	        #     self.total_preferences = json.load(file)

	        self.data = load(self.data_path)
	        # self.mt_data = load(self.multi_turn_path)
	        self.model = dataset_kwargs.get("model", None)[0]
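
To make the question concrete, the kind of conversion I have in mind is sketched below. This is only a guess at the missing step, not the authors' procedure: it assumes pandas is installed (plus a parquet engine such as pyarrow for ".parquet" input), and the helper names `read_table` and `export_csv` are hypothetical, not part of this repository or VLMEvalKit.

```python
from pathlib import Path

import pandas as pd


def read_table(path):
    """Load a table with the pandas reader that matches the file suffix."""
    suffix = Path(path).suffix.lower()
    if suffix == ".parquet":
        # Requires a parquet engine such as pyarrow or fastparquet.
        return pd.read_parquet(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix == ".tsv":
        return pd.read_csv(path, sep="\t")
    raise ValueError(f"unsupported table format: {suffix!r}")


def export_csv(src, dst):
    """Rewrite the table at src as a CSV at dst; return the row count."""
    frame = read_table(src)
    # Create the target directory if it does not exist yet.
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    frame.to_csv(dst, index=False)
    return len(frame)
```

With this, something like `export_csv("<downloaded file>.parquet", "/workspace/MMPB/dataset6.csv")` would produce a file at the path the loader reads. I do not know whether the resulting columns match what the loader expects, and if the parquet file stores images as binary data, writing them to CSV would only stringify them, so the image, "multi_turn", and "eval_prompt" files presumably still need to come from somewhere else.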
