I used the following code to load arxiv_sample.jsonl from the 1B-sized RedPajama-Data-1T-Sample dataset, but got a FileNotFoundError. However, when I click the link, I can in fact download the .jsonl file manually. Any clue why this happens? How can I load the file from code?
from datasets import load_dataset
dataset = load_dataset("togethercomputer/RedPajama-Data-1T-Sample", data_files="arxiv_sample.jsonl")
FileNotFoundError: Unable to find 'https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T-Sample/resolve/main/arxiv_sample.jsonl'
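Since the manual download works, one possible stopgap is to read the downloaded copy locally. This is only a minimal sketch, not a fix for the underlying error: it assumes the file has been saved as arxiv_sample.jsonl in the working directory (a hypothetical local path), and read_jsonl is an illustrative helper name, not part of any library.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file into a list of records, skipping blank lines."""
    records = []
    with Path(path).open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Assumption: the file was downloaded manually to this hypothetical path.
    local_path = Path("arxiv_sample.jsonl")
    if local_path.exists():
        records = read_jsonl(local_path)
        print(f"loaded {len(records)} records")
    else:
        print(f"{local_path} not found; download it manually first")
```

The same local file can also be passed to the generic JSON loader, load_dataset("json", data_files="arxiv_sample.jsonl"), if a Dataset object is needed rather than a plain list.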