Tips for Debugging Model Cards

I am no expert, but I have learned a few things.

Check your YAML front matter with this tool:
https://nodeca.github.io/js-yaml/
If it says “expected a single document in the stream, but found more”, your YAML is valid (presumably the tool just treats the closing --- as the start of a second document).
Any other error means your YAML is invalid.

For example, this is valid:

---
language: 
- en
- zh
tags:
- translation
license: apache-2.0
---
### My fancy model

Leaving out the space between the dash and the tag (like “-zh” instead of “- zh”) will make this invalid.
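
For what it's worth, the “-zh” mistake can be caught with a few lines of Python before pasting anything into the online tool. This is only a rough sketch, not a YAML parser: the function names `split_front_matter` and `find_dash_problems` are made up for illustration, and it assumes the front matter is the block between the first two --- lines with each list item on its own line. It flags dashes that are glued to the following text and nothing else.

```python
import re


def split_front_matter(text):
    """Return the lines between the opening and closing '---', or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # file does not start with front matter
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i]
    return None  # opening '---' without a closing one


def find_dash_problems(text):
    """List front matter lines where '-' is not followed by a space."""
    front_matter = split_front_matter(text)
    if front_matter is None:
        return ["no front matter block delimited by '---' lines"]
    problems = []
    # Line 1 of the file is the opening '---', so the block starts at line 2.
    for number, line in enumerate(front_matter, start=2):
        if re.match(r"^\s*-[^\s-]", line):
            problems.append(
                f"line {number}: missing space after '-': {line.strip()!r}"
            )
    return problems
```

Calling `find_dash_problems(open("README.md").read())` returns an empty list when no glued dashes are found. An empty list does not prove the YAML is valid, so the online tool is still worth a check.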

More to come as I learn more!

Also, new tags take 3 minutes to show up on the website.

What tags can you add?

---
language: "ISO 639-1 code for your language, or `multilingual`"
thumbnail: "url to a thumbnail used in social sharing"
tags:
- array
- of
- tags
license: "any valid license identifier"
datasets:
- array of dataset identifiers
metrics:
- array of metric identifiers
---

This template is very unintuitive for someone doing it for the first time.

I raised a bunch of questions here: https://github.com/huggingface/transformers/issues/7208

I think it’d make things easier if we had:

  1. a complete template that shows a fully filled-in example, with multiple entries wherever an array is expected/supported
  2. a document that explains each field’s format and variations.

Actually perhaps:

I think it’d make things easier if we had:

  1. a template with just fields and no values
  2. a complete demo template that shows a fully filled-in example, with multiple entries wherever an array is expected/supported
  3. a document that explains each field’s format and variations.

Currently the template doubles as an instruction document, and that doesn’t work well.

At the moment the most useful tool is grep with trailing context (the -A option). E.g. I wanted to find the format for the datasets entry, so I did:

grep -r -A2 datasets: model_cards
model_cards/asafaya/bert-mini-arabic/README.md:datasets:
model_cards/asafaya/bert-mini-arabic/README.md-- oscar
model_cards/asafaya/bert-mini-arabic/README.md-- wikipedia

This assumes that the cards are all valid; otherwise you may end up copying an invalid format from an existing card.
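
If grep is not at hand, roughly the same lookup can be sketched in Python. Assumptions: the cards are README.md files under a local model_cards directory, as in the grep output above, and `show_field` is a hypothetical helper name. Unlike grep, it only matches lines that start with the field name.

```python
from pathlib import Path


def show_field(root, field, after=2):
    """Collect each 'field:' line plus the `after` lines that follow it.

    Returns a list of (path, lines) tuples, one per match, for every
    README.md found under `root`.
    """
    results = []
    for path in sorted(Path(root).rglob("README.md")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for i, line in enumerate(lines):
            if line.startswith(f"{field}:"):
                results.append((str(path), lines[i:i + after + 1]))
    return results
```

For example, `show_field("model_cards", "datasets")` gives one entry per card that has a datasets field, which makes it easy to compare how different cards format it. The same caveat applies: the cards being copied from may themselves be invalid.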

but no urls

Agreed! You could draft a template, send a PR and add julien as a reviewer.
Then he could correct possible duckups.

Surely it can be made better, but here is the initial attempt:

I think just linking to a YAML tutorial would be enough here?

Because in my opinion the doc here and the one in the huggingface/model_card GitHub repo are pretty clear as is?

Anyway, we’ll have more validation built into the next version of the website.

Slightly off-topic: I would like to make the case for requiring a model card when submitting a model to the hub, with at least some basic information such as the language. If that is not given, it is very hard for a user to find the model. I would expect to find all relevant models when I look for the Dutch tag, but not all of those models have cards, and so they cannot be found this way.

Yes, it should be easy for people to upload new models but for the community’s sake it seems only fair that they also provide some information about those models.

I think stas’s PR is better than the current README because it shows working model cards.
I read the README and still screwed up a number of times. It’s easy for me now to think it’s obvious but a mere 3 weeks ago it was very not obvious to me.

I have already raised a request for a checklist here
https://github.com/huggingface/transformers/issues/6762.

Love it. Wholeheartedly agree!

FWIW, I couldn’t figure it out from those 2 docs.