Why does the TrOCR processor have a feature extractor?

Hi @nielsr,
I followed the step-by-step guide in the TrOCR documentation (TrOCR-Doc). However, I ran into a problem when running this line of code:

pixel_values = processor(images=image, return_tensors="pt").pixel_values

The error is:

Traceback (most recent call last):
  File "./trocr_test_base_printed.py", line 14, in <module>
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/models/trocr/processing_trocr.py", line 117, in __call__
    return self.current_processor(*args, **kwargs)
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/models/vit/feature_extraction_vit.py", line 141, in __call__
    images = [self.normalize(image=image, mean=self.image_mean, std=self.image_std) for image in images]
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/models/vit/feature_extraction_vit.py", line 141, in <listcomp>
    images = [self.normalize(image=image, mean=self.image_mean, std=self.image_std) for image in images]
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/image_utils.py", line 149, in normalize
    return (image - mean) / std
ValueError: operands could not be broadcast together with shapes (384,384) (3,) 

I suspect the problem is a mismatch between my transformers version and the feature extractor, but I couldn't find detailed version requirements. I'm currently using transformers 4.12.3.
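
Looking at the shapes in the error, (384,384) against (3,), it also looks as if a 2-D (single-channel) image is being normalized with a 3-element mean. Below is a minimal sketch in plain NumPy that produces the same kind of error. The `normalize` helper and the mean/std values are hypothetical stand-ins I wrote for illustration, not the actual transformers code or the processor's real configuration.

```python
import numpy as np

# Illustrative values only (assumption): not read from the real processor config.
mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.5, 0.5, 0.5])

gray = np.zeros((384, 384))      # 2-D array, same shape as in the traceback
rgb = np.zeros((3, 384, 384))    # channels-first array with 3 channels


def normalize(image, mean, std):
    """Hypothetical stand-in for a per-channel normalization step."""
    if image.ndim == 3 and image.shape[0] == mean.shape[0]:
        # Reshape the per-channel stats so they broadcast over height and width.
        return (image - mean[:, None, None]) / std[:, None, None]
    # A 2-D image falls through to here and cannot broadcast against shape (3,).
    return (image - mean) / std


try:
    normalize(gray, mean, std)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

normalized_rgb = normalize(rgb, mean, std)

print(error_message)
print(normalized_rgb.shape)
```

If this is what is happening in my case, then converting the image to three channels before calling the processor (for example with Pillow's `image.convert("RGB")`) might avoid the error, but I have not confirmed that this is the actual cause.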

Could you help me with this?
Many thanks