I am looking at: mosaicml/mpt-1b-redpajama-200b · Hugging Face
If I click the “Use in Transformers” button it shows:
from transformers import MosaicGPT
model = MosaicGPT.from_pretrained("mosaicml/mpt-1b-redpajama-200b")
This fails if I copy+paste it because MosaicGPT is not part of the transformers library.
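A quick sanity check (assuming a stock transformers install) confirms that:

import transformers

# MosaicGPT only exists as custom code inside the model repo, not in the
# transformers package itself, so the import in the button's snippet fails.
print(hasattr(transformers, "MosaicGPT"))             # False
print(hasattr(transformers, "AutoModelForCausalLM"))  # True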
If I look at the model card they give this example instead:
import transformers
model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-1b-redpajama-200b', trust_remote_code=True)
This works, i.e. AutoModelForCausalLM knows how to use the custom code in the model repo.
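As far as I can tell, the auto_map mechanism (shown in the config below) is what makes this work: the Auto* class reads config.json, follows the auto_map entry, downloads the referenced .py files from the repo, and instantiates the custom class. You can watch that resolution happen without downloading the weights by loading just the config (this is my understanding of the behaviour, not something I've verified in the transformers source):

from transformers import AutoConfig

# With trust_remote_code=True, AutoConfig follows auto_map["AutoConfig"] and
# builds the custom config class defined in configuration_mosaic_gpt.py.
config = AutoConfig.from_pretrained(
    "mosaicml/mpt-1b-redpajama-200b", trust_remote_code=True
)
print(type(config).__name__)  # expect something like "MosaicGPTConfig"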
If I look at the config.json for this model I can see:
{
  "architectures": [
    "MosaicGPT"
  ],
  "auto_map": {
    "AutoConfig": "configuration_mosaic_gpt.MosaicGPTConfig",
    "AutoModelForCausalLM": "mosaic_gpt.MosaicGPT"
  },
}
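You can pull the raw config.json down to check exactly which keys are there to be read (a sketch, assuming huggingface_hub is installed):

import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("mosaicml/mpt-1b-redpajama-200b", filename="config.json")
with open(path) as f:
    cfg = json.load(f)

print(cfg.get("architectures"))  # ["MosaicGPT"]
print(cfg.get("auto_map"))       # the AutoConfig / AutoModelForCausalLM mapping above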
So it seems like the “Use in Transformers” button is maybe just taking the architectures value and not making use of the auto_map?
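i.e. something roughly like this, which is purely my guess at the logic and not the Hub’s actual code:

# Hypothetical reconstruction of what the button might be doing for this repo:
def naive_snippet(config: dict, repo_id: str) -> str:
    cls = config["architectures"][0]  # "MosaicGPT"
    return (
        f"from transformers import {cls}\n"
        f'model = {cls}.from_pretrained("{repo_id}")'
    )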
But then if I look at distilbert-base-uncased-finetuned-sst-2-english · Hugging Face I see something different again. The “Use in Transformers” button gives:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
…providing both a tokenizer and the model, and both using Auto* classes.
The config.json for the model has only:
{
  "architectures": [
    "DistilBertForSequenceClassification"
  ],
}
and no auto_map.
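Presumably that’s fine for loading, because this class ships inside transformers itself, so no remote code or auto_map is needed (quick check, again assuming a standard install):

import transformers
from transformers import AutoModelForSequenceClassification

# DistilBertForSequenceClassification is part of the library, so the Auto* class
# can resolve it from the built-in registry without any auto_map entry.
print(hasattr(transformers, "DistilBertForSequenceClassification"))  # True

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
)
print(type(model).__name__)  # DistilBertForSequenceClassification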
So there’s clearly something more going on in how that button decides which snippet to show.