First, thank you for the effort put into RAGTruth. There is a tremendous need for such a dataset.
Unfortunately, some of the labels are inaccurate. Consider Response ID 11898 as one example. This response has three annotated hallucinations, all with implicit_true set to false.
Consider the first:
- Stated Hallucination: "Cons include potentially earning less than those with graduate degrees."
- Annotator Explanation: "Passages have no mention of this earning less than those with graduate degrees."
- Supporting Text in Passage: "graduates who are able to find work end up making a lot more than their undergraduate counterparts"
In other words, the provided passage does state that those with graduate degrees can earn more than their undergraduate counterparts, which means that undergraduates can earn less than those with graduate degrees. Hence, the annotation is incorrect.
Consider the second:
- Stated Hallucination: "earning a higher income upon graduation"
- Annotator Explanation: "Passages have no mention of this detail."
- Supporting Text in Passage: "the graduates who are able to find work end up making a lot more than their undergraduate counterparts; the median annual salary plus bonus for a person fresh out of grad school with an MBA is $105,000"
Yet "fresh out of grad school" is equivalent to "upon graduation," and the surrounding context is "earning a higher income" ("making a lot more than their undergraduate counterparts"). Hence, the annotation is incorrect.
Finally, consider the third:
- Stated Hallucination: "gaining practical experience"
- Annotator Explanation: "Passages have no mention of this tip."
- Supporting Text in Passage: None
Hence, this annotation is correct.
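For anyone who wants to re-check these labels, here is a minimal sketch of how the annotations for a single response could be pulled out of a JSON Lines file for manual review. The field names (`id`, `labels`, `text`, `meta`, `implicit_true`) are my assumptions about the record layout, and the sample record below is hand-built from the quotes in this issue rather than copied from the dataset, so adjust both to match the actual files.

```python
import json
from typing import Iterable, Iterator


def labels_for_response(lines: Iterable[str], response_id: str) -> Iterator[dict]:
    """Yield the label dicts of the record whose id matches response_id.

    Assumes one JSON object per line with an "id" field and a "labels" list
    (hypothetical field names; check them against the real files).
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if str(record.get("id")) == str(response_id):
            yield from record.get("labels", [])


# Illustrative record reconstructed from the quotes in this issue.
# It is NOT an actual row from the dataset.
sample_line = json.dumps({
    "id": "11898",
    "labels": [
        {
            "text": "Cons include potentially earning less than those with graduate degrees.",
            "meta": "Passages have no mention of this earning less than those with graduate degrees.",
            "implicit_true": False,
        },
        {
            "text": "earning a higher income upon graduation",
            "meta": "Passages have no mention of this detail.",
            "implicit_true": False,
        },
        {
            "text": "gaining practical experience",
            "meta": "Passages have no mention of this tip.",
            "implicit_true": False,
        },
    ],
})

for label in labels_for_response([sample_line], "11898"):
    print(label["implicit_true"], "|", label["text"], "|", label["meta"])
```

With a real file, the same function can be given an open file handle instead of the one-element list, since it only needs an iterable of lines.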
Naturally, the value of the dataset depends directly on the correctness of the annotations. While I recognize the immense effort that has gone into this dataset, there is still a need for additional annotators to fix erroneous labels, of which there are many.
Kindly consider fixing the errant labels to make RAGTruth the incredible resource that it can be.