Where can I find wildchat-50m judgement data documentation?

chanansh · June 26, 2025, 9:26pm

I am looking for a dataset of prompts + model responses + evaluation scores (input, output, quality) across different generating models.
I found WildChat-50m Judgments - a nyu-dice-lab Collection.
However, I cannot find any good documentation to understand the structure of this data.
Where can I find documentation of this dataset?
For example:

The evaluation model sometimes replies with a grade that is not a number. How was it parsed?
How should conversations with length > 2 be interpreted?
What do all the fields mean? Some of them are almost always null.

John6666 · June 27, 2025, 12:34am

It seems there is no documentation other than the paper and the GitHub code…

github.com/penfever/wildchat-50m

src/core/generate_model_responses.py

main

import argparse
from itertools import islice
from pprint import pprint
from typing import Dict, Any, Optional, Tuple, List
import logging
import traceback
import time

from tqdm.auto import tqdm
from datasets import Dataset, load_dataset, concatenate_datasets, DatasetDict
from vllm import LLM, SamplingParams
from huggingface_hub.errors import HfHubHTTPError
from transformers import AutoTokenizer
import numpy as np

from judgment import get_judge_template
from utils import convert_dataset_types

cols_to_remove = [
    "timestamp",

This file has been truncated.
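
Since there is no field-level documentation, one way to get a feel for the structure is to inspect a sample of rows directly. The sketch below is only that: the helper names (`null_fraction`, `turn_count_histogram`, `parse_grade`) and the `conversation` key are my own assumptions, not taken from the wildchat-50m code, and the regex is just one plausible way to pull a number out of a judge reply, not necessarily how the authors parsed it.

```python
import re
from collections import Counter
from typing import Any, Dict, List, Optional


def null_fraction(rows: List[Dict[str, Any]]) -> Dict[str, float]:
    """Fraction of rows in which each field is missing or None."""
    if not rows:
        return {}
    fields = set().union(*(row.keys() for row in rows))
    return {
        field: sum(row.get(field) is None for row in rows) / len(rows)
        for field in sorted(fields)
    }


def turn_count_histogram(
    rows: List[Dict[str, Any]], key: str = "conversation"
) -> Counter:
    """Count how many rows have each conversation length.

    The "conversation" key is an assumption; check the real column names first.
    """
    return Counter(len(row.get(key) or []) for row in rows)


# Matches the first integer or decimal number in a string.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_grade(text: Optional[str]) -> Optional[float]:
    """Return the first number found in a judge reply, or None if there is none.

    Illustrative only; this is not necessarily the parsing used by the authors.
    """
    if not text:
        return None
    match = _NUMBER.search(text)
    return float(match.group()) if match else None
```

To try it, collect a few hundred rows as plain dicts, for example with `load_dataset(..., streaming=True)` from `datasets` and `itertools.islice`, using whichever repository ID and split from the collection you are interested in, then pass the list to the helpers. Fields with a `null_fraction` close to 1.0 are the "almost always null" ones, and the histogram shows how common conversations longer than two turns actually are.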
