Hi @PaulLerner and @dynamicwebpaige!
As far as I know, we already host datasets that are several terabytes in size. As Paige suggested, you can store your dataset in an alternate location, but I believe it is also possible to upload datasets above 5GB by running `huggingface-cli lfs-enable-largefiles` on your local clone of the dataset repository.
This is similar to the solution in the thread "Uploading files larger than 5GB to model hub".
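In case it is useful, here is a rough sketch of how you could check which files in a local clone are above the 5GB mark and print the git commands I would expect to run for them. The helper names (`find_large_files`, `upload_commands`), the exact byte threshold, and the commit message are my own assumptions for illustration and not part of any official tooling. The script only builds the command list; it does not run anything.

```python
import os

# Assumed threshold: the thread only says "5GB", so the exact byte count is a guess.
FIVE_GB = 5 * 1024**3


def find_large_files(repo_dir, limit_bytes=FIVE_GB):
    """Return repo-relative paths of files larger than limit_bytes, sorted."""
    large = []
    for root, dirs, files in os.walk(repo_dir):
        # Skip git's internal directory so only dataset files are considered.
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path) and os.path.getsize(path) > limit_bytes:
                large.append(os.path.relpath(path, repo_dir))
    return sorted(large)


def upload_commands(large_files):
    """Build (but do not run) the commands to execute inside the local clone."""
    cmds = ["git lfs install", "huggingface-cli lfs-enable-largefiles ."]
    # Track each large file with git-lfs before adding it.
    cmds += [f'git lfs track "{path}"' for path in large_files]
    # Hypothetical commit message; use whatever fits your repository.
    cmds += ["git add .", 'git commit -m "Add large dataset files"', "git push"]
    return cmds
```

You would call `upload_commands(find_large_files("."))` from the root of your clone and run the printed commands yourself, after checking that they match your setup.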
I hope this helps!