Share your projects!

After following the first section of the course, you should be able to fine-tune a model on a text classification problem and upload it back to the Hub. Share your creations here, and if you build a cool app using your model, please let us know!

6 Likes

It’s not exactly a project, but I’m super excited to share my first public Kaggle dataset!

With help from the good folks at HF, I was able to query the metadata available on the model hub and upload it as a Kaggle dataset.

It should be helpful to anyone looking to analyze the metadata of publicly available models or build EDA/text-processing notebooks on it. The dataset also contains the README model card data.
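For anyone curious how a dataset like this can be put together, here is a minimal, hypothetical sketch: it flattens per-model metadata records (lists joined into single cells) into CSV rows suitable for a Kaggle upload. The field names and sample record are illustrative assumptions, not the actual schema of the dataset above; the real records would come from the Hub API (shown only in a comment).

```python
import csv
import io

FIELDS = ["modelId", "downloads", "tags", "lastModified"]

def flatten_model_info(info: dict) -> dict:
    """Flatten one model-hub metadata record into a flat CSV row.

    `info` mimics the shape of a single model's metadata; the field
    selection here is a hypothetical example, not the real schema.
    """
    return {
        "modelId": info.get("modelId", ""),
        "downloads": info.get("downloads", 0),
        "tags": "|".join(info.get("tags", [])),        # list -> one cell
        "lastModified": info.get("lastModified", ""),  # ISO-8601 string
    }

def rows_to_csv(rows) -> str:
    """Serialize flattened rows to CSV text (e.g. for a Kaggle upload)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# In practice the records would come from the Hub, e.g. (not run here):
#   from huggingface_hub import HfApi
#   models = HfApi().list_models()
sample = [{
    "modelId": "bert-base-uncased",
    "downloads": 12345,
    "tags": ["pytorch", "fill-mask"],
    "lastModified": "2021-06-01T12:00:00+00:00",
}]
print(rows_to_csv([flatten_model_info(r) for r in sample]))
```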

Please have a look and provide feedback. :slight_smile:

5 Likes

@dk-crazydiv this is very cool! Would you like to add it as a HF dataset as well?

Here is the process in case you’re interested: Sharing your dataset — datasets 1.8.0 documentation

2 Likes

Yes. I was thinking of the following:

  • HF modelhub metadata as a HF dataset
  • HF datasets metadata as a HF dataset
  • HF datasets metadata as a Kaggle dataset

This should complete the inception loop. :smiley: Will update with progress soon.

2 Likes

Great, looking forward to those!

Hi Everyone,

I’ve uploaded the model hub data as an HF dataset. Please provide feedback. :slight_smile:

The documentation guide for creating and sharing a dataset was very good, informative, and helpful.

I faced a couple of issues (most of which I overcame) while porting the data from the Kaggle-style format to the datasets library:

  • I couldn’t find any datetime type in Features, so I stored timestamps as strings; I saw a couple of other datasets doing the same.
  • Since I chose to share it as “community provided”, I had pip-installed the datasets library; some of the datasets-cli commands in the doc, which expect a cloned datasets repo, didn’t work smoothly with relative paths but did work with absolute paths.
  • The “Explore dataset” button on the hub isn’t working for my dataset. Is this because it’s a community-provided dataset?

3 Likes

I wrote a beginner-friendly introduction to Hugging Face’s pipeline API on Kaggle.

4 Likes

Published my first notebook on the model hub dataset. Found a couple of interesting insights. :slight_smile: A README analysis attempt is still pending.

3 Likes

Hi All, I’ve also written a post on publishing a dataset.

The live sessions of the course are super-duper amazing (the course itself is 10x; the live speakers make it 100x; so far only comparable to the fastai sessions in my experience). Thank you, everyone, for that. Thanks to such high-quality livestreams, I finally feel that by fall transformers will be transparent to me, rather than just a pretrained plug-and-play black box.

More learning-along-the-way projects to follow :slight_smile:

3 Likes

This is great stuff, @dk-crazydiv, thanks for sharing!

Wrote an introductory article on using the Hugging Face Datasets library for your next NLP project (published on TDS).

3 Likes