Hi @ydshieh,
I hope you are well. I am trying to train the Kosmos-2 model on the DocLayNet dataset. I have prepared the data, but I am getting an error during training.
My training process is as follows:
- Convert the data into Kosmos-2 format.
- Convert it to tensors with the processor, using the code below:
inputs = processor(images=test2_df['image'].to_list(), text=test2_df['text'].to_list(), bboxes=test2_df['float_val'].to_list(), padding=True, return_tensors="pt").to(device)
dataset = Dataset.from_dict(inputs)
- Split the dataset into train and test sets and pass them to Trainer like this:
Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    tokenizer=processor,
    data_collator=default_data_collator,
)
- Train the model.
I get the following error:
/usr/local/lib/python3.10/dist-packages/transformers/models/kosmos2/modeling_kosmos2.py in forward_embedding(self, input_ids, inputs_embeds, image_embeds, img_input_mask, past_key_values_length, position_ids)
1150 print(image_embeds)
1151 if image_embeds is not None:
-> 1152 inputs_embeds[img_input_mask.to(dtype=torch.bool)] = image_embeds.to(inputs_embeds.device).view(
1153 -1, image_embeds.size(-1)
1154 )
RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Half for the source.
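For reference, the same dtype mismatch can be reproduced outside the model. The sketch below is a minimal illustration in plain PyTorch, not the library's code: the tensor shapes are made up, and it assumes that image_embeds arrives in half precision while inputs_embeds stays in float32 (mixed-precision training would be one possible cause, which is an assumption here). Casting the source to the destination dtype is shown only to demonstrate what the masked assignment needs, not as a confirmed fix for Kosmos-2.

```python
import torch

# Illustrative stand-ins for the tensors in forward_embedding
# (shapes and values are invented for this sketch).
inputs_embeds = torch.zeros(1, 6, 4, dtype=torch.float32)  # destination: Float
image_embeds = torch.ones(1, 3, 4, dtype=torch.float16)    # source: Half
img_input_mask = torch.tensor([[0, 1, 1, 1, 0, 0]])

mask = img_input_mask.to(dtype=torch.bool)
source = image_embeds.view(-1, image_embeds.size(-1))

# Assigning a Half source into a Float destination through a boolean mask
# is expected to fail with the "Index put requires ..." message above.
try:
    inputs_embeds[mask] = source
except RuntimeError as err:
    print(f"RuntimeError: {err}")

# Casting the source to the destination dtype makes the assignment valid.
inputs_embeds[mask] = source.to(dtype=inputs_embeds.dtype)
print(inputs_embeds.dtype)
```

If this matches what happens in my run, the question is where the Half tensor comes from, since I do not cast anything to half precision explicitly in the code above.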