SegformerImageProcessor only supports uint8 masks

I have a semantic segmentation problem with several hundred classes. It appears that SegformerImageProcessor expects masks that convert trivially to an 8-bit PIL image. If I pass in an RGB PIL image as the mask, then I get pixel_values.shape = (3, 512, 512) and labels.shape = (512, 512, 3). That makes me think this isn’t an intended usage.

AFAIK the only transform I need to apply to the mask is a resize, so I can easily do that on my own, but it seems like an odd limitation.
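
For reference, here is a minimal sketch of doing the mask resize by hand, assuming the mask is a 2-D NumPy array of integer class IDs. The helper name `resize_mask_nearest` is made up for illustration and is not part of transformers. It uses nearest-neighbor sampling so that no new class IDs get interpolated into the result, and it keeps the input dtype so IDs above 255 survive.

```python
from typing import Tuple

import numpy as np


def resize_mask_nearest(mask: np.ndarray, size: Tuple[int, int]) -> np.ndarray:
    """Resize a 2-D class-ID mask to (height, width) with nearest-neighbor sampling.

    Hypothetical helper for illustration only; not a transformers API.
    """
    if mask.ndim != 2:
        raise ValueError(f"expected a 2-D mask, got shape {mask.shape}")
    out_h, out_w = size
    in_h, in_w = mask.shape
    # Map the center of each output pixel back to a source row / column index.
    rows = ((np.arange(out_h) + 0.5) * in_h / out_h).astype(np.int64)
    cols = ((np.arange(out_w) + 0.5) * in_w / out_w).astype(np.int64)
    # Guard against rounding past the last valid index.
    rows = np.minimum(rows, in_h - 1)
    cols = np.minimum(cols, in_w - 1)
    # Fancy indexing copies values as-is, so the dtype and the set of IDs are preserved.
    return mask[rows[:, None], cols[None, :]]


# Tiny example with class IDs that do not fit in uint8.
mask = np.array([[0, 300], [700, 65000]], dtype=np.int32)
resized = resize_mask_nearest(mask, (4, 4))
```

The resized array could then be passed to the model as labels directly, with the image processor called on the image alone. Whether that fits depends on which other preprocessing options (for example label reduction) are enabled, so treat this as a starting point.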

Am I misunderstanding something here?