Can we collect a crowdsourced dataset via Hugging Face Datasets?

Hello,

Suppose a robotics company sells its robots in many countries. It lets users collect data with the robots and upload that data to Hugging Face repositories (each user has their own dataset repo). The company then downloads the data from Hugging Face and retrains the robot's control policy.
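As a rough sketch, the workflow could look something like this with the `huggingface_hub` Python client. The dataset name `robot-episodes` and the helper names are hypothetical, and the sketch assumes each repo lives under the user's own namespace:

```python
from pathlib import Path

DATASET_NAME = "robot-episodes"  # hypothetical repo name, the same for every user


def user_repo_id(username: str) -> str:
    """Build the dataset repo id, assuming one repo per user in their own namespace."""
    return f"{username}/{DATASET_NAME}"


def upload_episodes(username: str, folder: Path, token: str) -> None:
    """User side: push a local folder of recordings to the user's dataset repo."""
    # Imported lazily; requires `pip install huggingface_hub`.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    repo_id = user_repo_id(username)
    # exist_ok=True makes repeated uploads reuse the same repo.
    api.create_repo(repo_id, repo_type="dataset", exist_ok=True)
    api.upload_folder(folder_path=str(folder), repo_id=repo_id, repo_type="dataset")


def download_all(usernames: list[str], dest: Path) -> list[str]:
    """Company side: fetch every user's repo into dest/<username> before retraining."""
    from huggingface_hub import snapshot_download

    return [
        snapshot_download(
            repo_id=user_repo_id(name),
            repo_type="dataset",
            local_dir=str(dest / name),
        )
        for name in usernames
    ]
```

The company would still need its own list of usernames and some validation of the downloaded data before training on it; neither is shown here.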

Is this allowed by Hugging Face policy? Note that there may be thousands of users uploading data, which can be quite large. Malicious users might even upload bad data to perform a DDoS attack.

The upside is that all the data will be open source and will benefit this research area.

This is a use case we 100% support 🙂 I would also be happy to share this kind of initiative with the community if you want. I'm sure it can inspire many people in their own projects.

Btw, if you need more fine-grained permissions over who can read or write data in the repositories, feel free to reach out to us by email.