Hi @nielsr,
I followed the step-by-step TrOCR documentation (TrOCR-Doc). However, I ran into a problem when running this line of code:
pixel_values = processor(images=image, return_tensors="pt").pixel_values
The error message is:
Traceback (most recent call last):
  File "./trocr_test_base_printed.py", line 14, in <module>
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/models/trocr/processing_trocr.py", line 117, in __call__
    return self.current_processor(*args, **kwargs)
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/models/vit/feature_extraction_vit.py", line 141, in __call__
    images = [self.normalize(image=image, mean=self.image_mean, std=self.image_std) for image in images]
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/models/vit/feature_extraction_vit.py", line 141, in <listcomp>
    images = [self.normalize(image=image, mean=self.image_mean, std=self.image_std) for image in images]
  File "/data/***/anaconda3/envs/hug_face/lib/python3.6/site-packages/transformers/image_utils.py", line 149, in normalize
    return (image - mean) / std
ValueError: operands could not be broadcast together with shapes (384,384) (3,)
I suspect the problem is a version mismatch between transformers and the feature extractor, but I couldn't find detailed version requirements. I'm currently using transformers 4.12.3.
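Another possibility, judging only from the shapes in the error message, is that the image reaches the normalize step as a 2D single-channel array of shape (384, 384), while the mean and std have one entry per RGB channel. The sketch below reproduces the same broadcast failure with plain NumPy under that assumption. The array names, the 0.5 mean/std values, and the channels-last layout are placeholders chosen for illustration, not values taken from the library.

```python
import numpy as np

# Hypothetical stand-ins for the values in the traceback: a single-channel
# 384x384 image and a per-channel mean/std of length 3 (values are assumed).
gray = np.zeros((384, 384), dtype=np.float32)
mean = np.array([0.5, 0.5, 0.5], dtype=np.float32)
std = np.array([0.5, 0.5, 0.5], dtype=np.float32)


def normalize(image, mean, std):
    # Same arithmetic as the last frame of the traceback.
    return (image - mean) / std


try:
    normalize(gray, mean, std)
    broadcast_failed = False
except ValueError as err:
    # A (384, 384) array cannot broadcast against a (3,) array.
    broadcast_failed = True
    print(err)

# Replicating the single channel into three (channels-last, for this sketch
# only) lets the per-channel mean/std broadcast across the last axis.
rgb = np.stack([gray] * 3, axis=-1)
normalized = normalize(rgb, mean, std)
print(normalized.shape)
```

If the image is loaded with Pillow, calling `.convert("RGB")` on it before passing it to the processor would be the analogous step. I have not confirmed that this is the actual cause here.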
Could you help me with that?
Many thanks