Convert a dataset for a different model. How to know the format?

Hello!

I’m trying to convert a dataset and use it with a different model, but I honestly cannot figure out how I should transform it. Specifically, I want to fine-tune hustvl/yolos-tiny with my custom dataset.

I loaded the dataset, and I wanted to use a transform to prepare the data, like this:

dataset = load_dataset(....)
dataset_t = dataset.with_transform(transforms)

And in the transform function I should restructure the examples so they can be fed to the model properly:

def transforms(examples):
    examples["pixel_values"] = .....
    examples["objects"] = .....
    return examples
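For what it’s worth, here is a minimal sketch of what such a transform could look like. It assumes each raw example has an `image` column (an array-like image) and an `objects` column holding `category` and `bbox` lists; those column names are assumptions for illustration, not something confirmed in this thread, and a real pipeline would normally use the model’s image processor for resizing and normalization:

```python
import numpy as np
import torch

def transforms(examples):
    # Convert each image to a CHW float tensor in [0, 1].
    # (In practice the model's image processor would resize/normalize here.)
    examples["pixel_values"] = [
        torch.as_tensor(np.array(img), dtype=torch.float32).permute(2, 0, 1) / 255.0
        for img in examples["image"]
    ]
    # One dict per image: 'class_labels' (LongTensor) and 'boxes' (FloatTensor),
    # which is the structure the model's `labels` argument expects.
    examples["labels"] = [
        {
            "class_labels": torch.as_tensor(obj["category"], dtype=torch.long),
            "boxes": torch.as_tensor(obj["bbox"], dtype=torch.float32),
        }
        for obj in examples["objects"]
    ]
    return examples
```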

However, I cannot figure out how the examples should be structured. When I add a column that doesn’t exist, I get an error like:

YolosForObjectDetection.forward() got an unexpected keyword argument 'XXXXX'

When I try to structure “objects” as shown in the object-detection guide, I get a torch error:

RuntimeError: Could not infer dtype of dict

I don’t know whether the approach is correct and I should focus on one of these specific errors, or whether I made a mistake earlier in the pipeline.

My question is: is there an example or documentation about this? Where can I find the exact format the dataset should have? (For instance, the fact that the image tensor must go in the pixel_values key; I couldn’t find official documentation about this, only some usage examples.)

Thank you!

Hi! You can find the documentation here: YolosForObjectDetection

In particular, the signature of the forward() method is:

def forward(
    pixel_values: FloatTensor,
    labels: Optional = None,
    output_attentions: Optional = None,
    output_hidden_states: Optional = None,
    return_dict: Optional = None
):

and the docstring for labels:

Labels for computing the bipartite matching loss. List of dicts, each dictionary containing at least the following 2 keys: 'class_labels' and 'boxes' (the class labels and bounding boxes of an image in the batch respectively). The class labels themselves should be a torch.LongTensor of len (number of bounding boxes in the image,) and the boxes a torch.FloatTensor of shape (number of bounding boxes in the image, 4).
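Concretely, for a batch of two images the `labels` argument described above could be built like this (the class IDs, box counts, and coordinates are made up for illustration; DETR-family models such as YOLOS expect normalized (cx, cy, w, h) boxes):

```python
import torch

# One dict per image in the batch, matching the docstring:
#   'class_labels' -> LongTensor of shape (num_boxes,)
#   'boxes'        -> FloatTensor of shape (num_boxes, 4)
labels = [
    {   # image 1: two boxes
        "class_labels": torch.tensor([0, 3], dtype=torch.long),
        "boxes": torch.tensor([[0.5, 0.5, 0.2, 0.3],
                               [0.1, 0.2, 0.1, 0.1]], dtype=torch.float32),
    },
    {   # image 2: one box
        "class_labels": torch.tensor([7], dtype=torch.long),
        "boxes": torch.tensor([[0.4, 0.6, 0.25, 0.15]], dtype=torch.float32),
    },
]
```

This also points at the likely cause of the `RuntimeError: Could not infer dtype of dict` above: labels must stay a plain Python list of per-image dicts, so letting a default collator try to tensorize them as a column will fail, and a custom collate_fn that passes the list through unchanged is the usual fix.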

