Biomedical Datasets Hackathon (Apr 22)

Hi everyone!
:cherry_blossom: BigScience is officially launching a Biomedical Datasets Hackathon next week!

2nd April 2022 - 15th April 2022
Large-scale language modeling and natural language prompting have demonstrated exciting capabilities for few-shot and zero-shot learning in NLP. However, translating these successes to specialized domains such as biomedicine remains challenging, due in part to biomedical NLP’s significant dataset debt: the technical costs associated with datasets that are not consistently documented or easily incorporated into popular machine learning frameworks. To help address these challenges, we are launching a hackathon to create an open-source community resource of over 150 biomedical datasets. We need your help :muscle:

We’re asking participants to implement data loaders from this curated list of biomedical datasets. If you are interested, you can sign up for specific datasets starting now!

Participants will also have the opportunity to be a co-author on our forthcoming academic paper, based on their level of contribution. See complete details on our project website and GitHub :rocket:

BigScience Biomedical Working Group