Hey all - I’ve been spending some time learning about RLHF and instruction tuning. I put together a list of human preference datasets and wanted to share it with this group. It’s on GitHub at glgh/awesome-llm-human-preference-datasets: a curated list of human preference datasets for LLM fine-tuning, RLHF, and eval. Give it a star if it seems useful!
For those of you who use such datasets, I’m curious: how well are the open source options serving your needs? Do any of you also use commercial services like Scale or MTurk? What’s one thing you would do to make such data more useful?
(P.S. If this is a topic you find interesting, feel free to grab some time with me on Calendly: G Liu.)