Hello everyone
I'm trying to use the PubMed dataset from the HF repository to do some information retrieval.
However, this dataset appears to be huge: hundreds of GBs. Moreover, extraction often crashes with the error "EOFError: Compressed file ended before the end-of-stream marker was reached".
The dataset card says "There are no splits in this dataset. It is given as is.", but I'll ask anyway: is there a way to download only part of it, or is a smaller version available for testing?
Thanks everyone.