Tips for Debugging Model Cards

sshleifer · August 21, 2020, 2:46pm

I am no expert but have learned a few things

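Check your YAML front matter with this tool:
https://nodeca.github.io/js-yaml/
If the only error it reports is “expected a single document in the stream, but found more”, your YAML is valid (the second --- starts another YAML document, which is what the tool is complaining about). Any other error means your YAML is invalid.

For example, this is valid: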

---
language: 
- en
- zh
tags:
- translation
license: apache-2.0
---
### My fancy model

Leaving out the space between the dash and the value (like “-zh”) will make this invalid.
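
A rough way to run the same kind of check locally, sketched in Python with only the standard library. It is not a YAML parser: it just pulls out the text between the two --- lines and flags list items where the dash is glued to the value, like the “-zh” case. The names split_front_matter and find_bad_list_items are made up for this example.

```python
import re


def split_front_matter(text):
    """Return (front_matter, body); front_matter is None if the card has none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return None, text  # opening --- without a closing one


# Heuristic: a dash immediately followed by a non-space character, e.g. "-zh".
BAD_ITEM = re.compile(r"^\s*-[^\s-]")


def find_bad_list_items(front_matter):
    """Return (line_number, line) pairs for list items missing the space after the dash."""
    return [
        (n, line)
        for n, line in enumerate(front_matter.splitlines(), start=1)
        if BAD_ITEM.match(line)
    ]


card = """---
language:
- en
-zh
tags:
- translation
license: apache-2.0
---
### My fancy model
"""

front_matter, body = split_front_matter(card)
print(find_bad_list_items(front_matter))
```

This should print [(3, '-zh')]. For a real validity check, the extracted front matter could then be handed to an actual YAML parser (for example PyYAML's yaml.safe_load), which is not done here.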

More to come as I learn more!

Also, the website takes about 3 minutes to reflect new tags.

sshleifer · August 21, 2020, 2:51pm

What tags can you add?

---
language: "ISO 639-1 code for your language, or `multilingual`"
thumbnail: "url to a thumbnail used in social sharing"
tags:
- array
- of
- tags
license: "any valid license identifier"
datasets:
- array of dataset identifiers
metrics:
- array of metric identifiers
---
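
A minimal sketch of how the template above could be turned into a sanity check on already-parsed metadata (a plain Python dict). The field names come from the template; the type rules (which fields are lists, which are strings) are my reading of it, not the Hub's actual validation, and check_metadata is a made-up name. A field that is "not in the template" is not necessarily invalid on the website.

```python
# Field names taken from the template above; the type rules are assumptions.
LIST_FIELDS = {"tags", "datasets", "metrics"}
STRING_FIELDS = {"thumbnail", "license"}
KNOWN_FIELDS = LIST_FIELDS | STRING_FIELDS | {"language"}


def _is_string_list(value):
    """True if value is a list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def check_metadata(meta):
    """Return a list of human-readable problems found in a parsed metadata dict."""
    problems = []
    for key, value in meta.items():
        if key not in KNOWN_FIELDS:
            problems.append(f"field not in template: {key}")
        elif key in LIST_FIELDS and not _is_string_list(value):
            problems.append(f"{key} should be a list of strings")
        elif key in STRING_FIELDS and not isinstance(value, str):
            problems.append(f"{key} should be a string")
        elif key == "language" and not (
            isinstance(value, str) or _is_string_list(value)
        ):
            problems.append("language should be a string or a list of strings")
    return problems


good = {"language": ["en", "zh"], "tags": ["translation"], "license": "apache-2.0"}
bad = {"tags": "translation", "licence": "mit"}

print(check_metadata(good))
print(check_metadata(bad))
```

The second call flags both the scalar tags value and the misspelled licence key, which are the kinds of mistakes that are easy to make when copying the template by hand.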

stas · September 17, 2020, 7:34pm

This template is very unintuitive for someone who is doing this for the first time.

I raised a bunch of questions here: https://github.com/huggingface/transformers/issues/7208

I think it’d make things easier if we had:

1. a complete template that shows a perfect entry, with multiple entries wherever an array is expected/supported
2. a document that explains each field’s format and variations.

Actually perhaps:

I think it’d make things easier if we had:

1. a template with just fields and no values
2. a complete demo template that shows a perfect entry, with multiple entries wherever an array is expected/supported
3. a document that explains each field’s format and variations.

Currently the template is also an instruction document, and it doesn’t work well.

stas · September 17, 2020, 7:36pm

At the moment the most useful tool is grep’s multi-line context option (-A). E.g. I wanted to find the format for the datasets entry, so I did:

grep -r -A2 datasets: model_cards

model_cards/asafaya/bert-mini-arabic/README.md:datasets:
model_cards/asafaya/bert-mini-arabic/README.md-- oscar
model_cards/asafaya/bert-mini-arabic/README.md-- wikipedia

This assumes that the cards are all valid; otherwise you may end up copying an invalid format from an existing card.

but no urls
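
For anyone without grep at hand, roughly the same search can be sketched in Python. Assumptions: the cards live in README.md files under a model_cards directory, as in the output above, and grep_after is a made-up helper name. Unlike grep -r, it only looks at README.md files and only matches the field at the start of a line.

```python
from pathlib import Path


def grep_after(root, needle, after=2):
    """Yield (path, lines) where lines is a matching line plus the next `after` lines."""
    for path in sorted(Path(root).rglob("README.md")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for i, line in enumerate(lines):
            if line.startswith(needle):
                yield path, lines[i:i + after + 1]


if __name__ == "__main__":
    # Rough equivalent of: grep -r -A2 datasets: model_cards
    for path, snippet in grep_after("model_cards", "datasets:"):
        print(path)
        for line in snippet:
            print("   ", line)
```

The same caveat applies: it only shows what existing cards do, not whether they are valid.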

sshleifer · September 17, 2020, 7:52pm

Agreed! You could draft a template, send a PR, and add julien as a reviewer.
Then he could correct possible duckups.

stas · September 17, 2020, 8:47pm

Surely it can be made better, but here is the initial attempt:

github.com/huggingface/model_card

[doc] Model card example, template and instructions

huggingface:master ← stas00:template

opened 08:45PM - 17 Sep 20 UTC

stas00

+228 -8

As discussed here: https://discuss.huggingface.co/t/tips-for-debugging-model-car…ds/814 this is a proposal which may make it easier to create model_cards. This PR splits the all-in-one approach which is difficult to use into 3:

1. a template with just fields and no values
2. a complete demo template that shows a very perfect entry with multiple entries where array is expected/supported
3. a document that explains each field’s format and variations.

Surely, what I made could use more work, but this is a start.

julien-c · September 18, 2020, 7:30am

I think just linking to a YAML tutorial would be enough here?

Because in my opinion the doc here and in the huggingface/model_card GitHub repo is pretty clear as is?

Anyway, we’ll have more validation built into the next version of the website.

BramVanroy · September 18, 2020, 10:01am

Slightly off-topic: I would like to make the case for requiring model cards when submitting a model to the hub, at least with some basic information such as the language. If it is not given, it is very hard, as a user, to find the model. I would expect to find all relevant models when I look for the Dutch tag, but not all of those models have cards and thus cannot be found this way.

Yes, it should be easy for people to upload new models but for the community’s sake it seems only fair that they also provide some information about those models.

sshleifer · September 18, 2020, 1:58pm

I think stas’s PR is better than the current README because it shows working model cards.
I read the README and still screwed up a number of times. It’s easy for me now to think it’s obvious, but a mere 3 weeks ago it was very much not obvious to me.

prajjwal1 · September 18, 2020, 2:32pm

I have already raised a request for a checklist here
https://github.com/huggingface/transformers/issues/6762.

BramVanroy · September 18, 2020, 3:05pm

Love it. Wholeheartedly agree!

stas · September 18, 2020, 4:06pm

FWIW, I couldn’t figure it out from those 2 docs.
