Hi,
I just fine-tuned TinyLlama as tiny-sajar, a little experiment to test fine-tuning. I ran the following code in Google Colab:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Replace with your model's path on the Hub
model = AutoModelForCausalLM.from_pretrained("Dagriffpatchfan/tiny-sajar")
tokenizer = AutoTokenizer.from_pretrained("Dagriffpatchfan/tiny-sajar")
It worked perfectly and loaded the model. I was then able to run the following code:
questions = [
    "Questions here",
]

for question in questions:
    prompt = f"{question}"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs.input_ids,
        max_length=100,          # Maximum number of tokens to generate
        num_return_sequences=1,  # Number of separate completions to generate
        temperature=0.7,         # Sampling temperature (lower is more focused, higher is more random)
        top_p=0.9,               # Nucleus sampling
        do_sample=True,          # Enable sampling
    )

    # Decode the generated text
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"**{question}**\n{generated_text}\n")
This generated text as expected. I then tried the same code in a JupyterLab Space, and to my complete surprise I got the following error when loading the model:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 4
      1 from transformers import AutoModelForCausalLM, AutoTokenizer
      3 # Replace with your model's path on the Hub
----> 4 model = AutoModelForCausalLM.from_pretrained("Dagriffpatchfan/tiny-sajar")
      5 tokenizer = AutoTokenizer.from_pretrained("Dagriffpatchfan/tiny-sajar")
      7 questions = [
      8     "Who are you, and what is your role in the story?",
      9     "How did you come to know David and the Avengers?",
   (...)
     17     "If you had to pick one person to go on a mission with, who would it be and why?"
     18 ]

File ~/miniconda/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py:531, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    528 if kwargs.get("quantization_config", None) is not None:
    529     _ = kwargs.pop("quantization_config")
--> 531 config, kwargs = AutoConfig.from_pretrained(
    532     pretrained_model_name_or_path,
    533     return_unused_kwargs=True,
    534     trust_remote_code=trust_remote_code,
    535     code_revision=code_revision,
    536     _commit_hash=commit_hash,
    537     **hub_kwargs,
    538     **kwargs,
    539 )
    541 # if torch_dtype=auto was passed here, ensure to pass it on
    542 if kwargs_orig.get("torch_dtype", None) == "auto":

File ~/miniconda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py:1151, in AutoConfig.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
   1148 if pattern in str(pretrained_model_name_or_path):
   1149     return CONFIG_MAPPING[pattern].from_dict(config_dict, **unused_kwargs)
-> 1151 raise ValueError(
   1152     f"Unrecognized model in {pretrained_model_name_or_path}. "
   1153     f"Should have a `model_type` key in its {CONFIG_NAME}, or contain one of the following strings "
   1154     f"in its name: {', '.join(CONFIG_MAPPING.keys())}"
   1155 )

ValueError: Unrecognized model in Dagriffpatchfan/tiny-sajar. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: albert, align, altclip, aria, aria_text, audio-spectrogram-transformer, autoformer, aya_vision, bamba, bark, bart, beit, bert, bert-generation, big_bird, bigbird_pegasus, biogpt, bit, blenderbot, blenderbot-small, blip, blip-2, bloom, bridgetower, bros, camembert, canine, chameleon, chinese_clip, chinese_clip_vision_model, clap, clip, clip_text_model, clip_vision_model, clipseg, clvp, code_llama, codegen, cohere, cohere2, colpali, conditional_detr, convbert, convnext, convnextv2, cpmant, ctrl, cvt, dab-detr, dac, data2vec-audio, data2vec-text, data2vec-vision, dbrx, deberta, deberta-v2, decision_transformer, deepseek_v3, deformable_detr, deit, depth_anything, depth_pro, deta, detr, diffllama, dinat, dinov2, dinov2_with_registers, distilbert, donut-swin, dpr, dpt, efficientformer, efficientnet, electra, emu3, encodec, encoder-decoder, ernie, ernie_m, esm, falcon, falcon_mamba, fastspeech2_conformer, flaubert, flava, fnet, focalnet, fsmt, funnel, fuyu, gemma, gemma2, gemma3, gemma3_text, git, glm, glm4, glpn, got_ocr2, gpt-sw3, gpt2, gpt_bigcode, gpt_neo, gpt_neox, gpt_neox_japanese, gptj, gptsan-japanese, granite, granitemoe, granitemoeshared, granitevision, graphormer, grounding-dino, groupvit, helium, hiera, hubert, ibert, idefics, idefics2, idefics3, idefics3_vision, ijepa, imagegpt, informer, instructblip, instructblipvideo, jamba, jetmoe, jukebox, kosmos-2, layoutlm, layoutlmv2, layoutlmv3, led, levit, lilt, llama, llama4, llama4_text, llava, llava_next, llava_next_video, llava_onevision, longformer, longt5, luke, lxmert, m2m_100, mamba, mamba2, marian, markuplm, mask2former, maskformer, maskformer-swin, mbart, mctct, mega, megatron-bert, mgp-str, mimi, mistral, mistral3, mixtral, mllama, mobilebert, mobilenet_v1, mobilenet_v2, mobilevit, mobilevitv2, modernbert, moonshine, moshi, mpnet, mpt, mra, mt5, musicgen, musicgen_melody, mvp, nat, nemotron, nezha, nllb-moe, nougat, nystromformer, olmo, olmo2, olmoe, omdet-turbo, oneformer, open-llama, openai-gpt, opt, owlv2, owlvit, paligemma, patchtsmixer, patchtst, pegasus, pegasus_x, perceiver, persimmon, phi, phi3, phi4_multimodal, phimoe, pix2struct, pixtral, plbart, poolformer, pop2piano, prompt_depth_anything, prophetnet, pvt, pvt_v2, qdqbert, qwen2, qwen2_5_vl, qwen2_audio, qwen2_audio_encoder, qwen2_moe, qwen2_vl, qwen3, qwen3_moe, rag, realm, recurrent_gemma, reformer, regnet, rembert, resnet, retribert, roberta, roberta-prelayernorm, roc_bert, roformer, rt_detr, rt_detr_resnet, rt_detr_v2, rwkv, sam, sam_vision_model, seamless_m4t, seamless_m4t_v2, segformer, seggpt, sew, sew-d, shieldgemma2, siglip, siglip2, siglip_vision_model, smolvlm, smolvlm_vision, speech-encoder-decoder, speech_to_text, speech_to_text_2, speecht5, splinter, squeezebert, stablelm, starcoder2, superglue, superpoint, swiftformer, swin, swin2sr, swinv2, switch_transformers, t5, table-transformer, tapas, textnet, time_series_transformer, timesformer, timm_backbone, timm_wrapper, trajectory_transformer, transfo-xl, trocr, tvlt, tvp, udop, umt5, unispeech, unispeech-sat, univnet, upernet, van, video_llava, videomae, vilt, vipllava, vision-encoder-decoder, vision-text-dual-encoder, visual_bert, vit, vit_hybrid, vit_mae, vit_msn, vitdet, vitmatte, vitpose, vitpose_backbone, vits, vivit, wav2vec2, wav2vec2-bert, wav2vec2-conformer, wavlm, whisper, xclip, xglm, xlm, xlm-prophetnet, xlm-roberta, xlm-roberta-xl, xlnet, xmod, yolos, yoso, zamba, zamba2, zoedepth
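For what it's worth, here is a small snippet I was planning to run in both environments to compare them. It just downloads config.json from the Hub and prints whether it contains a model_type key, plus the installed transformers version. This is only a debugging sketch I put together while writing this post, not part of my original notebook:

import json
import transformers
from huggingface_hub import hf_hub_download

# Fetch config.json for the model from the Hub and check for a model_type key
config_path = hf_hub_download("Dagriffpatchfan/tiny-sajar", "config.json")
with open(config_path) as f:
    config = json.load(f)
print("model_type:", config.get("model_type", "<missing>"))

# Colab and the JupyterLab Space may have different transformers versions installed
print("transformers version:", transformers.__version__)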
I found this very confusing. Does anyone know what I am experiencing, and why the same model loads fine in Colab but not in the JupyterLab Space?