Why TrOCR processor has a feature extractor?

MACong · November 25, 2021, 4:30am

Thanks for your reply.

I tried a local color image with 3 channels, and it works! Thanks!

However, when I tried the IAM image, I got the above-mentioned error. Even when I followed the step-by-step guide exactly, I got the same error. Have you tried the step-by-step code? Or do you have any idea how to handle a binary image input? I considered repeating the single channel to get 3 channels, but I'm not sure whether this is okay.

The step-by-step code is:

>>> from transformers import TrOCRProcessor, VisionEncoderDecoderModel
>>> import requests
>>> from PIL import Image

>>> processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
>>> model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

>>> # load image from the IAM dataset
>>> url = "https://fki.tic.heia-fr.ch/static/img/a01-122-02.jpg"
>>> image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

>>> pixel_values = processor(image, return_tensors="pt").pixel_values
>>> generated_ids = model.generate(pixel_values)

>>> generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
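
A minimal sketch of the channel-repeat idea mentioned above, using a synthetic grayscale array in place of a real IAM image so it runs offline. The helper name `to_three_channels` is hypothetical, and whether a channel-repeated image is an acceptable input for the model is an assumption here, not something confirmed in this thread:

```python
import numpy as np
from PIL import Image


def to_three_channels(image):
    """Return an RGB version of a PIL image (hypothetical helper).

    For a single-channel ("L") image, PIL's convert("RGB") copies the
    gray value into the R, G and B channels.
    """
    return image if image.mode == "RGB" else image.convert("RGB")


# Synthetic 8-bit grayscale image standing in for a single-channel scan
gray_array = (np.arange(32 * 128) % 256).astype(np.uint8).reshape(32, 128)
gray = Image.fromarray(gray_array)  # a 2-D uint8 array becomes mode "L"

rgb = to_three_channels(gray)

# The same result built by hand: repeat the one channel three times
manual = np.repeat(gray_array[:, :, None], 3, axis=2)

print(gray.mode, rgb.mode, np.asarray(rgb).shape)
print(np.array_equal(np.asarray(rgb), manual))
```

If the two arrays match, then calling `.convert("RGB")` on a grayscale image and repeating the channel manually give the same pixels, so either could be passed to the processor.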
