Cannot export TFLite using Optimum for a fine-tuned Gemma 3 model for task: question answering

python3 -m optimum.commands.optimum_cli export tflite --model merged_gemma3_aie_finetuned_hf --task question-answering --sequence_length 1024 gemma_tflite/

We are using this command, but the error we are facing is that Transformers does not recognize the Gemma 3 text config.


Gemma 3 is probably not yet supported except in the dev versions of Transformers and Optimum, which can be installed from the GitHub source.

I think the conversion itself can be done by installing those dev versions, but there would probably still be a lot of problems.

If you have any questions about ONNX, the best way to get a reliable answer is to ask the ONNX community on Hugging Face.

pip uninstall transformers optimum
pip install git+https://github.com/huggingface/optimum git+https://github.com/huggingface/transformers

Thank you for your response. I fine-tuned a Gemma 3 model and now I need to convert it into .tflite. What are all the ways to do it? I used the dev versions of Transformers and Optimum, but it is still throwing an error like "Gemma3 Text config is not recognized by the transformer".

What are all the ways to convert it?

Also, if I convert it into ONNX, can I then convert that into .tflite? If yes, how do I convert it into ONNX?


This is the error I am getting:

Traceback (most recent call last):
  File "C:\Users\aiehy\OneDrive\Desktop\training1\tflite.py", line 46, in <module>
    tf_model = TFAutoModelForCausalLM.from_pretrained(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\aiehy\OneDrive\Desktop\training1\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 576, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.gemma3.configuration_gemma3.Gemma3TextConfig'> for this kind of AutoModel: TFAutoModelForCausalLM.
Model type should be one of BertConfig, CamembertConfig, CTRLConfig, GPT2Config, GPT2Config, GPTJConfig, MistralConfig, OpenAIGPTConfig, OPTConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, TransfoXLConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.


Hmm…

or perhaps:

#python3 -m optimum.commands.optimum_cli export tflite --model merged_gemma3_aie_finetuned_hf --task question-answering --sequence_length 1024 gemma_tflite/
python3 -m optimum.commands.optimum_cli export tflite --model merged_gemma3_aie_finetuned_hf --task text-generation --sequence_length 1024 gemma_tflite/

transformers.models.gemma3.configuration_gemma3.Gemma3TextConfig
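
As for the ONNX question: the Optimum ONNX exporter is invoked roughly like below (a sketch, not verified against Gemma 3; the model path and output directory are just taken from your command). Whether the resulting ONNX graph can then be converted to .tflite, e.g. via onnx2tf, is a separate step and not guaranteed to work for a model of this size.

python3 -m optimum.commands.optimum_cli export onnx --model merged_gemma3_aie_finetuned_hf --task text-generation gemma_onnx/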

Oh…

Hello, I was able to overcome this problem by making some changes to the code prepared by Google in Colab. You can convert it directly to .tflite, and later to .task format if desired. Instead of fine-tuning from the beginning, I used an already trained model.

import os
from google.colab import userdata
os.environ["HF_TOKEN"] = userdata.get('HF_TOKEN')

!pip3 install --upgrade -q -U bitsandbytes
!pip3 install --upgrade -q -U peft
!pip3 install --upgrade -q -U trl
!pip3 install --upgrade -q -U accelerate
!pip3 install --upgrade -q -U datasets
!pip3 install git+https://github.com/huggingface/transformers@v4.49.0-Gemma-3

! pip install git+https://github.com/google-ai-edge/ai-edge-torch
! pip install ai-edge-litert
! pip install mediapipe


!pip install huggingface_hub
from huggingface_hub import snapshot_download
import shutil

# 🔧 Configure settings
model_name = "username/model_repo"  # ← fine-tuned model
local_dir = "/content/merged_model"

# 💾 Download the fine-tuned model from Hugging Face
snapshot_download(
    repo_id=model_name,
    local_dir=local_dir,
    local_dir_use_symlinks=False  # Don't bother with symlinks, just copy the files directly
)

print(f"Model downloaded: {local_dir}")

!git clone https://github.com/google-ai-edge/ai-edge-torch.git

!pip uninstall -y numpy
!pip uninstall -y torch torchvision torchaudio
!pip uninstall -y ai-edge-torch ai-edge-litert ai-edge-quantizer torch-xla2 safetensors
!pip install numpy
!pip install torch torchvision torchaudio
!pip install -r https://raw.githubusercontent.com/google-ai-edge/ai-edge-torch/main/requirements.txt

!pip install --upgrade numpy
!pip install --upgrade --force-reinstall ai-edge-torch
!pip install --upgrade --force-reinstall ai-edge-litert
!pip install --upgrade --force-reinstall ai-edge-quantizer
!pip install --upgrade --force-reinstall torch-xla2
!pip install --upgrade --force-reinstall safetensors


from ai_edge_torch.generative.examples.gemma3 import gemma3
from ai_edge_torch.generative.utilities import converter
from ai_edge_torch.generative.utilities.model_builder import ExportConfig
from ai_edge_torch.generative.layers.experimental import kv_cache
import torch


def _create_mask(mask_len, kv_cache_max_len):
    # Causal attention mask: -inf above the diagonal, 0 elsewhere,
    # shaped [1, 1, mask_len, kv_cache_max_len].
    mask = torch.full((mask_len, kv_cache_max_len), float('-inf'), dtype=torch.float32)
    return torch.triu(mask, diagonal=1).unsqueeze(0).unsqueeze(0)

def _create_export_config(prefill_seq_lens: list[int], kv_cache_max_len: int) -> ExportConfig:
    # One prefill mask per supported prefill length, plus a single-row decode mask,
    # using the transposed KV cache layout from the experimental layers.
    export_config = ExportConfig()
    export_config.prefill_mask = [_create_mask(i, kv_cache_max_len) for i in prefill_seq_lens]
    decode_mask = torch.full((1, kv_cache_max_len), float('-inf'), dtype=torch.float32)
    export_config.decode_mask = torch.triu(decode_mask, diagonal=1).unsqueeze(0).unsqueeze(0)
    export_config.kvcache_cls = kv_cache.KVCacheTransposed
    return export_config


with torch.inference_mode(True):
    checkpoint_path = "/content/merged_model"
    pytorch_model = gemma3.build_model_1b(
        checkpoint_path, kv_cache_max_len=2048
    )

    export_config = _create_export_config([1024], 2048)

    # Export to TFLite with quantization; the prefill lengths must match the masks above.
    converter.convert_to_tflite(
        pytorch_model,
        output_path="/content/",
        output_name_prefix="gemma3_1b_finetune",
        prefill_seq_len=[1024],
        quantize=True,
        lora_ranks=None,
        export_config=export_config
    )
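
Before bundling, it may be worth a quick sanity check that the exported .tflite file loads at all. A minimal sketch with the LiteRT interpreter, assuming the converter wrote the file name used below (output prefix + quantization suffix + KV cache length):

from ai_edge_litert.interpreter import Interpreter

# Assumed output path; adjust to whatever convert_to_tflite actually produced.
tflite_path = "/content/gemma3_1b_finetune_q8_ekv2048.tflite"

interpreter = Interpreter(model_path=tflite_path)
# The generative converter exports separate prefill/decode signatures;
# listing them confirms the model loaded correctly.
print(interpreter.get_signature_list())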


from mediapipe.tasks.python.genai.bundler import llm_bundler

def build_task_bundle():
    config = llm_bundler.BundleConfig(
        tflite_model="/content/gemma3_1b_finetune_q8_ekv2048.tflite",
        tokenizer_model="/content/merged_model/tokenizer.model",
        start_token="<bos>",
        stop_tokens=["<eos>", "<end_of_turn>"],
        output_filename="/content/gemma3-1b-it.task",
        enable_bytes_to_unicode_mapping=False,
        prompt_prefix="<start_of_turn>user\n",
        prompt_suffix="<end_of_turn>\n<start_of_turn>model\n",
    )
    llm_bundler.create_bundle(config)

build_task_bundle()



