SamModel output size different from the input

Hello all,
First of all, sorry if I have placed this in the wrong location/topic area or if I have not structured this correctly. I did check the guidelines but I wasn’t sure where to post questions/queries.

Unfortunately I can’t share the code, but essentially it’s very similar to the guides: basic, fine-tuning.

I have seen similar comments on the forums before, but they don’t resolve my issue.

Essentially I do the following:

  1. Load the pretrained SAM vit-base model
  2. Fine-tune it on custom data (images of size 1280 x 720 px with their corresponding masks), fed in as patches of 256 x 256 px.
  3. Create a model from my fine-tuned state_dict
  4. Create a processor from the pretrained vit-base checkpoint
  5. Create my inputs from a validation image (1280 x 720 px) using the processor. No prompt is given.
  6. Run the model on the generated input, which returns an output of size 256 x 256 px.
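For anyone hitting the same issue: SAM's mask decoder always emits low-resolution logits (256 x 256 here), so the output from step 6 has to be resized back to the input resolution yourself. A minimal sketch with torch, where the random tensor is just a stand-in for the model's low-res mask logits:

```python
import torch
import torch.nn.functional as F

# Stand-in for the model's low-res mask logits: SAM's mask decoder
# returns 256 x 256 logits regardless of the original image size.
low_res_logits = torch.randn(1, 1, 256, 256)

# Upsample back to the original image resolution (height, width).
full_res = F.interpolate(
    low_res_logits,
    size=(720, 1280),
    mode="bilinear",
    align_corners=False,
)
print(full_res.shape)  # torch.Size([1, 1, 720, 1280])

# Threshold the logits at 0 to obtain a binary mask.
binary_mask = (full_res > 0).squeeze().numpy()
```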

Here is my query. How do I get the output from my evaluation to be the same size as my input? From what I’ve seen of other people’s work, the masks stay the same shape (or at least the same ratio) as the image after evaluation. What is making my model output only 256 x 256 px?

Thank you in advance. Regards

P.S. I’ve also seen that the input (created by the processor) has shape 1024 x 1024.
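That 1024 x 1024 shape comes from the processor's preprocessing convention: the longest edge is scaled to 1024 and the result is zero-padded to a square. A sketch of that arithmetic (1024 is the default longest-edge target; the function name is mine):

```python
# How the processor arrives at a 1024 x 1024 input: scale the longest
# edge to 1024, then zero-pad the shorter edge up to a square.
def reshaped_size(height, width, longest_edge=1024):
    scale = longest_edge / max(height, width)
    return round(height * scale), round(width * scale)

h, w = reshaped_size(720, 1280)
print(h, w)  # 576 1024 -- then padded on the bottom to 1024 x 1024
```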


You can interpolate the masks to the same size as your input; see my reply here: SAM image size for fine-tuning - #2 by nielsr

Hi, thanks for the reply, but this doesn’t quite solve my issue, as it’s only interpolating to a larger size. The detail in the output has already been lost.