Fine-tuning SAM with 256x256 input images

Hello all
I am trying to fine-tune the Segment Anything (SAM) model.
I would like to load a pretrained model such as "sam-vit-base"
so that the pretrained encoder is frozen and only the decoder is fine-tuned.
My input images are 256x256 and so are the segmentation labels.
I see that, by default, the model resizes the input to 1024x1024 and outputs 256x256 masks.
Is it possible to use a pretrained model for initialization but feed in the 256x256 images without resizing them to 1024x1024? If so, what code lines and parameters should I use?
Once I do:
model = SamModel.from_pretrained("facebook/sam-vit-base")
How can I set the image_size parameter to 256?
Also, I saw that once I change image_size to 256 without loading a pretrained model, the output mask shrinks to 64x64. I want to keep the output at 256x256.
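
For reference, this is roughly how I changed the image size without pretrained weights; a minimal sketch of what I mean, not necessarily the right approach:

from transformers import SamConfig, SamModel, SamVisionConfig

# Build SAM from scratch with a 256x256 vision encoder (no pretrained weights).
vision_config = SamVisionConfig(image_size=256)
config = SamConfig(vision_config=vision_config)
model = SamModel(config)
# With image_size=256 the predicted (low-resolution) masks come out 64x64
# rather than the 256x256 I get with the default 1024x1024 input size.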

In summary, I would like to start training from a pretrained model, use an input size of 256 (without resizing to 1024), and get an output size of 256.
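
To be clear, the freezing part is not my problem; the sketch below (using the sub-module names of the transformers SamModel) is what I have in mind, and my question is only about the input and output sizes:

from transformers import SamModel

model = SamModel.from_pretrained("facebook/sam-vit-base")

# Freeze the image encoder and prompt encoder so that only the mask decoder
# is updated during fine-tuning.
for name, param in model.named_parameters():
    if name.startswith("vision_encoder") or name.startswith("prompt_encoder"):
        param.requires_grad_(False)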

Thank you for your assistance

Oded

Hello,

I have successfully fine-tuned SAM from transformers with 256x256 images and masks. The masks were loaded, converted to grayscale, and turned into 256x256 numpy arrays. I used the functions below to get the numpy arrays into the format the fine-tune tutorial expects.

import os
import numpy as np
from skimage.io import imread
from skimage.transform import resize
from skimage.color import rgb2gray

def load_and_resize_and_grayscale_images_from_dir(directory, new_shape, threshold=0.0005):
    images = []
    for filename in os.listdir(directory):
        if filename.endswith(".png") or filename.endswith(".jpg"):
            img_path = os.path.join(directory, filename)
            # Remove the first two characters ("._") from the filename... somehow the mac version kept this
            #img_path = os.path.join(directory, filename[2:])
            img = imread(img_path)
            resized_img = resize(img, new_shape, preserve_range=True, anti_aliasing=True)
            grayscaled_img = rgb2gray(resized_img)
            # Apply thresholding to convert grayscale values to 0's and 1's
            thresholded_img = (grayscaled_img > threshold).astype(np.int32)
            images.append(thresholded_img)
    return np.array(images)

def load_and_resize_images_from_dir(directory, new_shape):
    images = []
    for filename in os.listdir(directory):
        if filename.endswith(".png") or filename.endswith(".jpg"):
            img_path = os.path.join(directory, filename)
            # Remove the first two characters ("._") from the filename... somehow the mac version kept this
            #img_path = os.path.join(directory, filename[2:])
            img = imread(img_path)
            resized_img = resize(img, new_shape, preserve_range=True, anti_aliasing=True).astype(np.uint8)
            images.append(resized_img)
    return np.array(images)
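
For completeness, this is roughly how I plugged the arrays into the processing step of the tutorial; the directory names and the bounding-box helper are placeholders from my own setup, so adapt them as needed:

import numpy as np
from transformers import SamProcessor

processor = SamProcessor.from_pretrained("facebook/sam-vit-base")

def bbox_from_mask(mask, jitter=5):
    # Derive a loose bounding-box prompt from a binary ground-truth mask.
    ys, xs = np.where(mask > 0)
    x_min, x_max = int(xs.min()) - jitter, int(xs.max()) + jitter
    y_min, y_max = int(ys.min()) - jitter, int(ys.max()) + jitter
    return [float(x_min), float(y_min), float(x_max), float(y_max)]

images = load_and_resize_images_from_dir("images/", (256, 256))
masks = load_and_resize_and_grayscale_images_from_dir("masks/", (256, 256))

# Prepare one example: the processor resizes the image to the model's
# expected input size and formats the box prompt.
inputs = processor(images[0], input_boxes=[[bbox_from_mask(masks[0])]], return_tensors="pt")
inputs["ground_truth_mask"] = masks[0]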

Thank you for your reply, but I don't see how this feedback relates to my query.
As I explained, I do not want to resize the images from 256x256; I want the ViT encoder to accept them as is.

Hey again, sorry for the delayed response. Is there a specific reason why you don't want the encoder to resize the images to 1024x1024? Since SAM was trained on 1024x1024 images, my understanding is that this resizing is necessary. If you're looking for a smaller model, I think TinySAM may help.

The interpolation affects edges, especially for the small objects I am trying to segment in the image. But if this is a limitation of the ViT architecture that I was not aware of, then I get it :).