Fine-tuning on Qwen3

Hi,

I am not able to fine-tune the Qwen3-4B model using these parameters:

# Imports as used in the standard Unsloth vision fine-tuning notebooks
from unsloth import FastVisionModel, is_bf16_supported
from unsloth.trainer import UnslothVisionDataCollator
from trl import SFTTrainer, SFTConfig

FastVisionModel.for_training(model)  # Enable for training!

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    data_collator = UnslothVisionDataCollator(model, tokenizer),  # vision collator
    train_dataset = processed_dataset,
    args = SFTConfig(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 30,
        learning_rate = 2e-4,
        fp16 = not is_bf16_supported(),
        bf16 = is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",
        remove_unused_columns = False,
        dataset_text_field = "",
        dataset_kwargs = {"skip_prepare_dataset": True},
        dataset_num_proc = 4,
        max_seq_length = 2048,
    ),
)

This returns:
TypeError: Unsloth: UnslothVisionDataCollator is only for image models!

Can anyone help me?


Hi @orkungedik

It seems you’re using Unsloth to fine-tune the model, but I’m not sure why you’re using FastVisionModel. As far as I know, Qwen/Qwen3-4B is a text-only model.

I believe you can follow this notebook instead of your current script:
Colab Notebook
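
For reference, the text-only setup from that notebook looks roughly like this. This is only a minimal sketch: the 4-bit loading, LoRA settings, and the assumption that processed_dataset has a "text" column are mine, and the vision collator, skip_prepare_dataset, and empty dataset_text_field are dropped.

from unsloth import FastLanguageModel, is_bf16_supported
from trl import SFTTrainer, SFTConfig

# Load the text-only model (4-bit to fit on a small GPU)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Qwen/Qwen3-4B",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters before training
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = processed_dataset,  # text dataset, not image-text pairs
    args = SFTConfig(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 30,
        learning_rate = 2e-4,
        fp16 = not is_bf16_supported(),
        bf16 = is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",
        dataset_text_field = "text",
        max_seq_length = 2048,
    ),
)
trainer.train()

The key change is FastLanguageModel plus the default text collator; UnslothVisionDataCollator only accepts vision-language models, which is why it raised that TypeError.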


Hi @mahmutc,

Yes, it looks text-only, but the documentation says it’s multimodal. I fine-tuned a 7B VL model previously. I’m just trying to understand whether a Qwen3 model can be fine-tuned with images to extract data.
