Cannot access RedPajama-Data-1T-Sample sub-file

The error message is indeed misleading.

The actual reason is that the dataset has a loading script (RedPajama-Data-1T-Sample.py in togethercomputer/RedPajama-Data-1T-Sample at main) that doesn't check the data_files config argument.
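To make the point concrete, here is a minimal sketch in plain Python of what honoring data_files could look like. It is not the code of the actual loading script: the function name resolve_data_files and the file names in KNOWN_FILES are hypothetical, chosen only for illustration. A script that ignores the argument behaves as if it always took the first branch and returned every file.

```python
# Hypothetical file names, used only for illustration.
KNOWN_FILES = ["arxiv_sample.jsonl", "book_sample.jsonl", "c4_sample.jsonl"]


def resolve_data_files(data_files=None, known_files=KNOWN_FILES):
    """Return the list of files a loader should read.

    Falls back to every known file when data_files is None or empty.
    """
    if not data_files:
        return list(known_files)

    # Normalize str / list / dict-of-splits into a flat list of names.
    if isinstance(data_files, str):
        requested = [data_files]
    elif isinstance(data_files, dict):
        requested = []
        for files in data_files.values():
            requested.extend([files] if isinstance(files, str) else files)
    else:
        requested = list(data_files)

    # Fail with an explicit message instead of silently ignoring the request.
    unknown = [name for name in requested if name not in known_files]
    if unknown:
        raise FileNotFoundError(f"Unknown data file(s): {unknown}")
    return requested
```

With a check like this, a request for a single sub-file either returns exactly that file or raises an error that names the file that could not be found.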