Modifying ViT to include 4th channel

pmoct · September 8, 2023, 10:34am

Hello!

I have a set of RGB images to which I have added a fourth channel containing a segmentation output that could help the model with the classification task I am working on. I am using the code from the image classification GitHub (link) to get started, but I notice that it takes 3-channel images. How would I begin to modify the code to take in a 4-channel image?

I appreciate any advice or help!

da2r-20 · October 15, 2024, 8:47am

Hi,
did you find a solution?

John6666 · October 15, 2024, 9:28am

I think you can do just the loading part of the image with a single character change, but I don’t know if the transformers library can handle this correctly…

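from PIL import Image

def pil_loader(path: str):
    # Open via a file handle so the file is closed once the image is converted.
    with open(path, "rb") as f:
        im = Image.open(f)
        # "RGBA" keeps an existing 4th channel; an image without one gets a constant 255 there.
        return im.convert("RGBA")  # 4 channels
        # return im.convert("RGB")  # 3 channels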
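Loading 4 channels is only half of it, since the model's first layer also has to accept them. Here is a rough sketch of one way that could work, written in plain PyTorch so it runs without downloading anything. It is not taken from the example script. The helper name widen_patch_projection is made up for illustration, and the nn.Conv2d below is only a stand-in for ViT's patch-embedding convolution. The idea is to copy the pretrained RGB weights and start the extra channel at zero, so the widened layer initially behaves like the original one.

```python
import torch
from torch import nn


def widen_patch_projection(conv: nn.Conv2d, new_in_channels: int = 4) -> nn.Conv2d:
    """Hypothetical helper: return a copy of `conv` that accepts more input channels.

    Weights for the existing channels are copied over. Weights for the added
    channels start at zero, so the extra input has no effect until training.
    """
    if new_in_channels < conv.in_channels:
        raise ValueError("new_in_channels must be >= conv.in_channels")
    widened = nn.Conv2d(
        new_in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        widened.weight.zero_()
        widened.weight[:, : conv.in_channels] = conv.weight
        if conv.bias is not None:
            widened.bias.copy_(conv.bias)
    return widened


# Stand-in for a ViT-Base style patch embedding (assumed sizes: 16x16 patches, 768-dim tokens).
rgb_projection = nn.Conv2d(3, 768, kernel_size=16, stride=16)
rgbs_projection = widen_patch_projection(rgb_projection, new_in_channels=4)

batch = torch.randn(2, 4, 224, 224)  # RGB + segmentation channel
tokens = rgbs_projection(batch)
print(tokens.shape)  # torch.Size([2, 768, 14, 14])
```

In transformers, I believe the matching layer on ViTForImageClassification is model.vit.embeddings.patch_embeddings.projection, and the configuration has a num_channels field that would need to be set to 4 as well, but please check that against the version you have installed. The image processor's normalization mean and std would presumably also need a fourth value.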
