What does the Wikipedia dataset with a specific language and date mean?

Hey there, I've been downloading the Wikipedia corpus, and I came across this script that downloads the Wikipedia dump for a specific language at a given date:
import datasets
lang_dataset = datasets.load_dataset("wikipedia", "20220301.hi", beam_runner="DirectRunner")

My question is: does this download all the text available on Wikipedia for the given language, or only the text that was added or updated on that specific date?

I actually need all the data Wikipedia has for the given language. How do I do that?

Hi! It downloads all the text that's available on Wikipedia for that language as of the given date :)

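More specifically, it downloads the Wikipedia dump from that date for the specified language. A dump is a full snapshot of the articles as they existed on that date, not just the pages edited that day.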
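
To make the naming concrete, here is a minimal sketch of how a config name like "20220301.hi" can be read as a dump date plus a language code. The helper parse_wikipedia_config is hypothetical (it is not part of the datasets library) and only illustrates the naming convention used in the question; it does not download anything.

```python
from datetime import date, datetime
from typing import Tuple


def parse_wikipedia_config(config_name: str) -> Tuple[date, str]:
    """Split a config name such as "20220301.hi" into (dump date, language code).

    Hypothetical helper for illustration only; not a datasets API.
    """
    dump_stamp, _, lang = config_name.partition(".")
    if not lang:
        raise ValueError(f"expected '<YYYYMMDD>.<lang>', got {config_name!r}")
    # The first part is the date the dump (full snapshot) was taken.
    dump_date = datetime.strptime(dump_stamp, "%Y%m%d").date()
    return dump_date, lang


dump_date, lang = parse_wikipedia_config("20220301.hi")
print(f"Full snapshot of the '{lang}' Wikipedia as of {dump_date.isoformat()}")
```

So the call in the question asks for the whole Hindi Wikipedia as it stood when the 2022-03-01 dump was taken, which should already be everything you need for that language.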