Hosting BookCorpus

I noticed that Hugging Face hosts BookCorpus (bookcorpus/bookcorpus on Hugging Face Datasets), with the license listed as ‘unknown’. Several developers seem to have used it to train or fine-tune their commercial models, but it contains copyrighted works scraped from smashwords.com without permission.

Is Hugging Face aware of this? What is the policy on making datasets like this available?