Can I download only the first half of a large dataset?

I would like to grab the first and second chunks of SlimPajama-627B, which make up 20% of the tokens.

Is there a way to tweak some configuration file to do this? I can only see how to load a fraction of the dataset once the whole dataset has been downloaded. Is there a way to get load_dataset and/or some configuration script to do this? Thank you.

You can choose a subset of files to load by passing data_files to load_dataset, so only the matching files are downloaded:

from datasets import load_dataset

data_files = ["train/chunk1/*.jsonl.zst", "train/chunk2/*.jsonl.zst"]
ds = load_dataset("cerebras/SlimPajama-627B", data_files=data_files)
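
If you want to change which chunks you pull without editing the list by hand, a small helper can build the patterns for you. This is only a sketch: chunk_patterns and load_chunks are hypothetical names I made up, and it assumes the files follow the train/chunkN/*.jsonl.zst layout used in the snippet above.

```python
def chunk_patterns(chunks, split="train"):
    """Build glob patterns for the given chunk numbers.

    Assumes the repository layout '<split>/chunk<N>/*.jsonl.zst'.
    """
    return [f"{split}/chunk{n}/*.jsonl.zst" for n in chunks]


def load_chunks(chunks, repo="cerebras/SlimPajama-627B"):
    """Load only the selected chunks (hypothetical convenience wrapper)."""
    # Imported lazily so chunk_patterns can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo, data_files=chunk_patterns(chunks))


if __name__ == "__main__":
    # Only prints the patterns; nothing is downloaded here.
    print(chunk_patterns([1, 2]))
```

Calling load_chunks([1, 2]) should then be equivalent to the two-line snippet above, and you can pass a different list of chunk numbers to grab other parts.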