Searching for a Multilingual Dataset


I want to run some experiments on multilingual datasets, comparing a model's performance on English with its performance on a resource-poor language. The dataset should therefore include English text at a minimum. It is also important that the dataset was published in a paper, so that I can cite it.

Does anyone have any good or popular datasets in mind?