Image segmentation transfer learning on facebook/maskformer-swin-large-ade with a custom dataset


I’m trying to do transfer learning with the facebook/maskformer-swin-large-ade model on a custom dataset. I tried to train the model by following this tutorial, using the suggested dataset: notebooks/semantic_segmentation.ipynb at main · huggingface/notebooks · GitHub

Sadly, I ran into multiple errors and couldn’t get training to start. What steps do I need to take to train the selected model on the existing dataset?

After I succeed with the given dataset, what steps do I need to take to create my own?
The data consists of pictures of size 8192x7808, and only one class needs to be recognized.
I already have the labeled data: the original images, plus black-and-white masks that mark the object positions.
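For reference, here is a minimal NumPy sketch of how such data could be prepared (the threshold value and 512-pixel patch size are assumptions, not requirements): the black-and-white mask is thresholded into a 0/1 label map, and the full 8192x7808 image is cut into training-sized patches, since an image that large cannot be fed to the model directly. Any remainder at the border is simply dropped here; padding would be the alternative.

```python
import numpy as np

def mask_to_binary(mask: np.ndarray, threshold: int = 127) -> np.ndarray:
    """Turn a black-and-white (grayscale) mask into a 0/1 label map."""
    return (mask > threshold).astype(np.uint8)

def tile(image: np.ndarray, size: int = 512):
    """Split a large image into non-overlapping size x size patches.

    Rows/columns that don't fill a whole patch at the border are dropped.
    """
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield image[y:y + size, x:x + size]

# Toy stand-in for one 8192x7808 grayscale mask (white = object)
mask = np.zeros((7808, 8192), dtype=np.uint8)
mask[1000:2000, 3000:4000] = 255

binary = mask_to_binary(mask)
patches = list(tile(binary))
print(len(patches))        # 15 rows x 16 cols = 240 patches
print(np.unique(binary))   # [0 1]
```

The same tiling would be applied to the original images so that image and mask patches stay aligned.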

Thanks in advance!


That notebook is meant for models supported by the AutoModelForSemanticSegmentation API, like SegFormer. These models predict a label for each pixel of an image (also called “per-pixel classification”).
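As a toy illustration of per-pixel classification (the shapes here are made up for the example, not SegFormer’s actual ones): the model emits one score per class at every pixel, and the predicted segmentation map is the per-pixel argmax over those scores.

```python
import numpy as np

num_labels, height, width = 3, 4, 4
rng = np.random.default_rng(0)

# Per-pixel classification: one logit per class at every pixel location
logits = rng.normal(size=(num_labels, height, width))

# The predicted segmentation map is the argmax over the class axis
segmentation_map = logits.argmax(axis=0)   # shape (height, width)
print(segmentation_map.shape)              # (4, 4)
```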

However, MaskFormer (and related models like Mask2Former and OneFormer) don’t use per-pixel classification for semantic segmentation; instead, they adopt a “binary mask classification” paradigm. Hence we made separate notebooks for those models, which you can find here: Transformers-Tutorials/MaskFormer at master · NielsRogge/Transformers-Tutorials · GitHub.

To understand the difference between “per-pixel classification” and “binary mask classification”, I’d recommend checking out our blog post.
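To make the contrast concrete, a small NumPy sketch: mask classification represents a segmentation not as one label per pixel, but as a set of (binary mask, class label) pairs. A per-pixel map can be converted into that target format as below — this mirrors the kind of conversion MaskFormer’s image processor performs internally, but it is only an illustration, not the library’s actual code.

```python
import numpy as np

# A per-pixel semantic map with two regions: class 0 (background), class 1 (object)
semantic_map = np.zeros((4, 4), dtype=np.int64)
semantic_map[1:3, 1:3] = 1

# Binary mask classification: one binary mask + one class label per region present
class_labels = np.unique(semantic_map)                      # [0 1]
binary_masks = np.stack([semantic_map == c for c in class_labels])

print(binary_masks.shape)   # (2, 4, 4): one HxW binary mask per class present
```

For the single-class use case above, this yields just two targets per patch: a background mask and one object mask.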
