Hi! We use this code to read CSV files in datasets: datasets/csv.py at f3b6697011cb6fc568b8f8b32f53501a8f2e8967 · huggingface/datasets · GitHub. As you can see, the files are processed in chunks, so this could mean some chunks in your data contain string labels and some integer labels. Please verify that's not the case.
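
One way to check this is to read the file in chunks with pandas and compare the dtype inferred for the label column in each chunk. This is only a rough sketch, not the loader's own code: the helper name `label_dtypes_per_chunk`, the column name `label`, and the chunk sizes are assumptions, so adjust them to match your data.

```python
import io

import pandas as pd


def label_dtypes_per_chunk(csv_source, label_column="label", chunksize=10_000):
    """Return the dtype pandas infers for `label_column` in each chunk.

    Hypothetical helper for illustration; `label_column` and `chunksize`
    are assumed values, not settings taken from the datasets library.
    """
    dtypes = []
    # With `chunksize`, read_csv yields DataFrames and infers dtypes per chunk.
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        dtypes.append(str(chunk[label_column].dtype))
    return dtypes


# Tiny in-memory example: integer labels first, string labels later.
sample = io.StringIO("text,label\na,0\nb,1\nc,pos\nd,neg\n")
dtypes = label_dtypes_per_chunk(sample, chunksize=2)
mixed = len(set(dtypes)) > 1  # True when chunks disagree on the label type
print(dtypes, mixed)
```

To run it on your own data, pass the path to your CSV file instead of the in-memory sample. If the chunks disagree, cleaning the label column so that it has a single type throughout should make the inferred types consistent.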