Dataset for inducing hallucinations or dataset with hallucinations in it

amperie · February 12, 2024, 3:39am

Hi all

I’m looking to test a hallucination detection library, so I’m looking for one or both of these things:

A dataset of inputs designed to induce hallucinations

and/or

A dataset that already has inputs and outputs in it (the outputs may or may not be hallucinations). Ideally this dataset would have multiple outputs for each input and would specify which model (open source or proprietary) created each output. Even better, it would include ChatGPT as one of the models.
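
To make the second option concrete, here is a rough sketch of the record layout I have in mind. The field names (`input_text`, `model`, `output_text`, `is_hallucination`), the model name `example-open-model`, and the sample rows are all made up for illustration and do not come from any existing dataset.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One model output for one input (hypothetical schema)."""
    input_text: str
    model: str                               # which model produced the output
    output_text: str
    is_hallucination: Optional[bool] = None  # None means "not labeled"


def group_by_input(records):
    """Collect all outputs that share the same input."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.input_text].append(record)
    return dict(grouped)


# Illustrative rows only, not real model outputs.
records = [
    Record("Who wrote Hamlet?", "chatgpt", "William Shakespeare.", False),
    Record("Who wrote Hamlet?", "example-open-model", "Christopher Marlowe.", True),
    Record("What is 2 + 2?", "chatgpt", "4", None),
]

grouped = group_by_input(records)
for input_text, outputs in grouped.items():
    print(input_text, [(r.model, r.is_hallucination) for r in outputs])
```

With data in this shape, a detection library could be scored per input and per model, and unlabeled rows could simply be skipped.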

Anything like this exist? Thanks!
