Issue on Kosmos-2 model training on new dataset

Mit1208 · February 10, 2024, 3:39pm

I hope you are well. I am trying to train the Kosmos-2 model on DocLayNet data. I have prepared the data, but I am getting an error during training.

My training process is as follows:

Convert the data into Kosmos-2 format.
Convert it to tensors with the processor, using the code below:

inputs = processor(
    images=test2_df['image'].to_list(),
    text=test2_df['text'].to_list(),
    bboxes=test2_df['float_val'].to_list(),
    padding=True,
    return_tensors="pt",
).to(device)
dataset = Dataset.from_dict(inputs)

Split the dataset into train and test sets and train with Trainer like this:

Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    tokenizer=processor,
    data_collator=default_data_collator,
)

Train the model.

I get the following error:

/usr/local/lib/python3.10/dist-packages/transformers/models/kosmos2/modeling_kosmos2.py in forward_embedding(self, input_ids, inputs_embeds, image_embeds, img_input_mask, past_key_values_length, position_ids)
   1150         print(image_embeds)
   1151         if image_embeds is not None:
-> 1152             inputs_embeds[img_input_mask.to(dtype=torch.bool)] = image_embeds.to(inputs_embeds.device).view(
   1153                 -1, image_embeds.size(-1)
   1154             )

RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Half for the source.
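
For readers who want to see this error in isolation, here is a minimal sketch in plain PyTorch, independent of Kosmos-2. It assumes the failure comes from a masked assignment where the destination tensor is float32 and the source tensor is float16, as the message suggests. The helper `masked_assign` and the tensor shapes are hypothetical and chosen only for illustration; casting the source is shown to make the dtype requirement visible, not as a recommended change to the library code.

```python
import torch


def masked_assign(dst, mask, src, cast=False):
    """Write the rows of `src` into the rows of `dst` selected by `mask`.

    Hypothetical helper for illustration; `cast=True` converts `src`
    to the dtype of `dst` before the assignment.
    """
    if cast:
        src = src.to(dst.dtype)
    dst[mask] = src  # boolean-mask assignment, the same pattern as in the traceback
    return dst


dst = torch.zeros(4, 3, dtype=torch.float32)      # stands in for inputs_embeds
mask = torch.tensor([True, False, True, False])   # stands in for img_input_mask
src = torch.ones(2, 3, dtype=torch.float16)       # stands in for image_embeds

# Mixed dtypes: on the PyTorch version in the traceback this is expected to raise.
try:
    masked_assign(dst.clone(), mask, src)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# Matching dtypes: the assignment goes through.
fixed = masked_assign(dst.clone(), mask, src, cast=True)
print(fixed.dtype)
```

If the first call raises, it points to the two tensors reaching the assignment in different precisions, so the place to look is whatever sets the precision of the model or the training run.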

Mit1208 · February 25, 2024, 2:27pm

Hi @ydshieh

Can you please advise on how to resolve this error?

Thanks.

Mit1208 · February 25, 2024, 10:27pm

The problem was in my TrainingArguments(). I had set fp16=True, which was messing up the tensor dtypes. I removed it from the arguments and training worked.
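
As a rough sketch of that change, the arguments could be kept in a plain dictionary with mixed precision switched off and then passed to `TrainingArguments`. Everything except `fp16` is an assumption here: the output directory name, batch size, and epoch count are hypothetical placeholders, not the values used in this thread.

```python
# Keyword arguments for transformers.TrainingArguments.
# Only `fp16=False` reflects the fix described above; the other
# values are hypothetical placeholders.
training_kwargs = {
    "output_dir": "kosmos2-doclaynet",   # hypothetical path
    "per_device_train_batch_size": 1,    # hypothetical value
    "num_train_epochs": 1,               # hypothetical value
    "fp16": False,                       # mixed precision off, so dtypes stay float32
}


def build_training_args(kwargs):
    """Create TrainingArguments from the dictionary above.

    The import is done lazily so the dictionary can be inspected
    without transformers installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)


print(training_kwargs["fp16"])
```

Calling `build_training_args(training_kwargs)` gives an object that can be passed as `args=` to the `Trainer` shown earlier. Since `fp16` defaults to `False`, leaving the key out has the same effect; it is spelled out here only to make the setting visible.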

system · February 26, 2024, 10:28am

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.
