Available datasets online

Hi, I am a beginner. I was wondering whether there is a ready-made dataset containing the terms and conditions and/or privacy policies of various services that I could use to fine-tune a model. Thanks :slight_smile: :grin:

Hi! AFAIK there’s no such dataset available on the datasets hub yet.
Do you know of any dataset that we could add?

Hey. There is an annotated list of privacy policies.
The OPP-115 Corpus (Online Privacy Policies, set of 115) is a collection of website privacy policies (i.e., in natural language) with annotations that specify data practices in the text. Each privacy policy was read and annotated by three graduate students in law.

You can find it through the Usable Privacy Policy Project.
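
In case it helps with fine-tuning: since each policy has three annotators, you will probably want to collapse their labels into one label per text segment before training. Here is a minimal sketch using a majority vote. The record layout, the `majority_label` helper, and the example sentences are all made up for illustration and are not the corpus's actual file format, and the category names are only illustrative, so adapt it to whatever the downloaded files look like.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by more than half of the annotators, else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Hypothetical records: one text segment plus the category that each of
# three annotators assigned to it (not the real corpus layout).
segments = [
    {
        "text": "We collect your email address when you register.",
        "labels": ["First Party Collection/Use", "First Party Collection/Use", "Other"],
    },
    {
        "text": "We may share data with advertising partners.",
        "labels": ["Third Party Sharing/Collection"] * 3,
    },
    {
        "text": "Contact us with any questions.",
        "labels": ["Other", "Data Retention", "First Party Collection/Use"],
    },
]

# Keep only the segments where a strict majority of annotators agree.
training_pairs = []
for seg in segments:
    label = majority_label(seg["labels"])
    if label is not None:
        training_pairs.append((seg["text"], label))

for text, label in training_pairs:
    print(f"{label}: {text}")
```

Segments where all three annotators disagree are dropped here. Whether to drop them or keep them with a fallback label is a design choice that depends on how noisy you can afford your training set to be.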

This looks interesting! Feel free to open an issue on GitHub and mark it as a “Dataset request”.
I’m sure some people in the community can help you by adding it to the library.

If you’re interested in learning to add a dataset yourself, there’s also some documentation here.

There’s also a pretty recent question-answering dataset about privacy policies that might be of interest to you @theprincedrip:

