I think Phil was right that this is not the full error message. It seems only the first 1024 characters are returned, which is why it cuts off like that. I also thought there might be an issue with the load_dataset function, but I now doubt that, particularly because the training does seem to be happening. Instead, the error seems to occur at the upload stage. That is all the more peculiar because I was able to run a training job on the xsum dataset with the exact same training configuration. So if there is a problem in the upload, it seems to be isolated to custom datasets.