Fine Tuning Llava 1.5 7b for Classification

kallyy · April 26, 2025, 6:29pm

Hello,
I'm working on fine-tuning LLaVA 1.5 7B for classification and VQA.

The typical input for the model is a question along with a list of images as the answer.
The model should return 1 if the task (the task being the question) has been well executed, and 0 otherwise.
I have a decent dataset of around 1,400 entries; each entry is a single object with a question, an answer, and a compliance label (1 or 0)…
My approach was to first fine-tune the model for classification, and then use processor.apply_chat_template to generate a justification with a prompt whenever the model returns 0.
To be completely honest, I'm still learning about fine-tuning models, so I don't understand 100% of the code. It's all put together through some basic knowledge, research, and the help of ChatGPT.
Here are the cells I'm facing an issue with:
import torch
from transformers import DataCollatorForSeq2Seq, Seq2SeqTrainer

# Pads input_ids / attention_mask / labels for the text part of each example
text_collator = DataCollatorForSeq2Seq(
    tokenizer=processor.tokenizer,
    model=model,
    label_pad_token_id=processor.tokenizer.pad_token_id,
)

def data_collator(features):
    # Pull the image data out so the text collator only sees text fields
    pixel_vals = [f.pop("pixel_values") for f in features]
    # This is the line that raises the TypeError shown in the traceback
    imgs = [torch.stack(x if isinstance(x[0], torch.Tensor) else x[0], dim=0)
            for x in pixel_vals]
    batch_pixels = torch.stack(imgs, dim=0)

    batch = text_collator(features)
    batch["pixel_values"] = batch_pixels
    return batch

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
trainer.train()

TypeError Traceback (most recent call last)
/tmp/ipykernel_31/2454287461.py in <cell line: 0>()

----> 2 trainer.train()

/usr/local/lib/python3.11/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
2243 hf_hub_utils.enable_progress_bars()
2244 else:
-> 2245 return inner_training_loop(
2246 args=args,
2247 resume_from_checkpoint=resume_from_checkpoint,

/usr/local/lib/python3.11/dist-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
2512 update_step += 1
2513 num_batches = args.gradient_accumulation_steps if update_step != (total_updates - 1) else remainder
-> 2514 batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches, args.device)
2515 for i, inputs in enumerate(batch_samples):
2516 step += 1

/usr/local/lib/python3.11/dist-packages/transformers/trainer.py in get_batch_samples(self, epoch_iterator, num_batches, device)
5241 for _ in range(num_batches):
5242 try:
-> 5243 batch_samples.append(next(epoch_iterator))
5244 except StopIteration:
5245 break

/usr/local/lib/python3.11/dist-packages/accelerate/data_loader.py in __iter__(self)
561 # We iterate one batch ahead to check when we are at the end
562 try:
--> 563 current_batch = next(dataloader_iter)
564 except StopIteration:
565 yield

/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py in __next__(self)
699 # TODO(https://github.com/pytorch/pytorch/issues/76750)
700 self._reset() # type: ignore[call-arg]
--> 701 data = self._next_data()
702 self._num_yielded += 1
703 if (

/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py in _next_data(self)
755 def _next_data(self):
756 index = self._next_index() # may raise StopIteration
--> 757 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
758 if self._pin_memory:
759 data = _utils.pin_memory.pin_memory(data, self._pin_memory_device)

/usr/local/lib/python3.11/dist-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
53 else:
54 data = self.dataset[possibly_batched_index]
---> 55 return self.collate_fn(data)

/tmp/ipykernel_31/4101348022.py in data_collator(features)
9 def data_collator(features):
10 pixel_vals = [f.pop("pixel_values") for f in features]
---> 11 imgs = [torch.stack(x if isinstance(x[0], torch.Tensor) else x[0], dim=0)
12 for x in pixel_vals]
13 batch_pixels = torch.stack(imgs, dim=0)

/tmp/ipykernel_31/4101348022.py in <listcomp>(.0)
9 def data_collator(features):
10 pixel_vals = [f.pop("pixel_values") for f in features]
---> 11 imgs = [torch.stack(x if isinstance(x[0], torch.Tensor) else x[0], dim=0)
12 for x in pixel_vals]
13 batch_pixels = torch.stack(imgs, dim=0)

TypeError: expected Tensor as element 0 in argument 0, but got list
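
The last frame suggests that at least one pixel_values entry reaches the collator as nested Python lists rather than as tensors, so torch.stack receives a list whose elements are themselves lists. This can happen, for example, when the examples come from a datasets.Dataset whose output format was never set to torch, though the thread does not say how train_ds was built. Here is a minimal sketch for checking that in isolation; nesting and try_stack are hypothetical helper names, not part of any library, and the sample data is made up.

```python
import torch

def nesting(x):
    """Return the chain of container types down to the first leaf."""
    parts = []
    while isinstance(x, (list, tuple)) and len(x) > 0:
        parts.append(type(x).__name__)
        x = x[0]
    parts.append(type(x).__name__)
    return " > ".join(parts)

def try_stack(x):
    """Call torch.stack and report either 'ok' or the TypeError message."""
    try:
        torch.stack(x, dim=0)
        return "ok"
    except TypeError as err:
        return str(err)

# Made-up stand-ins for one example's pixel_values
as_lists = [[0.0, 1.0], [2.0, 3.0]]          # nested lists: stack rejects this
as_tensors = [torch.zeros(2), torch.ones(2)]  # list of tensors: stack accepts this

print(nesting(as_lists), "->", try_stack(as_lists))
print(nesting(as_tensors), "->", try_stack(as_tensors))
```

Calling nesting(features[0]["pixel_values"]) on a real example, before the collator pops the key, should show which of the two cases applies.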

John6666 · April 27, 2025, 2:39am

Hmm… This part?

# imgs = [torch.stack(x if isinstance(x[0], torch.Tensor) else x[0], dim=0)
imgs = [torch.stack([x] if isinstance(x[0], torch.Tensor) else [x[0]], dim=0)
        for x in pixel_vals]
# or torch.tensor ?
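
One way to follow up on the torch.tensor idea is sketched below. It is an untested guess at the data layout, not a confirmed fix: it assumes each pixel_values entry is either a tensor, a list of tensors, or nested lists of numbers, and that every example in a batch has the same number of images so the results can be stacked. to_image_tensor and stack_pixel_values are hypothetical helper names.

```python
import torch

def to_image_tensor(x):
    """Convert one example's pixel_values to a single float tensor."""
    if isinstance(x, torch.Tensor):
        return x
    if len(x) > 0 and isinstance(x[0], torch.Tensor):
        # List of per-image tensors: stack into (num_images, C, H, W)
        return torch.stack(list(x), dim=0)
    # Nested Python lists of numbers: let torch build the tensor directly
    return torch.as_tensor(x, dtype=torch.float32)

def stack_pixel_values(pixel_vals):
    """Stack per-example tensors into one batch; shapes must all match."""
    return torch.stack([to_image_tensor(x) for x in pixel_vals], dim=0)

# Made-up example: 2 images of shape 3x4x4 stored as nested lists
nested = [[[[0.0] * 4 for _ in range(4)] for _ in range(3)] for _ in range(2)]
batch_pixels = stack_pixel_values([nested, nested])
print(batch_pixels.shape)
```

Inside data_collator, the two lines that build imgs and batch_pixels could then be replaced by batch_pixels = stack_pixel_values(pixel_vals). If examples carry different numbers of images, torch.stack will fail on mismatched shapes and the batching strategy would need to change.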
