I see this error when trying to configure a few examples for this Donut-derived model:
Unrecognized feature extractor in ivelin/donut-refexp-combined-v1. Should have a `feature_extractor_type` key in its preprocessor_config.json of config.json, or one of the following `model_type` keys in its config.json: audio-spectrogram-transformer, beit, chinese_clip, clip, clipseg, conditional_detr, convnext, cvt, data2vec-audio, data2vec-vision, deformable_detr, deit, detr, dinat, donut-swin, dpt, flava, glpn, groupvit, hubert, imagegpt, layoutlmv2, layoutlmv3, levit, maskformer, mctct, mobilenet_v1, mobilenet_v2, mobilevit, nat, owlvit, perceiver, poolformer, regnet, resnet, segformer, sew, sew-d, speech_to_text, swin, swinv2, table-transformer, timesformer, unispeech, unispeech-sat, van, videomae, vilt, vit, vit_mae, vit_msn, wav2vec2, wav2vec2-conformer, wavlm, whisper, xclip, yolos
config.json does refer to donut-swin as the encoder model_type. Not sure what the issue is.
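To make the lookup described by the error message concrete, here is a minimal sketch in plain Python. It is not the transformers implementation. The function name `can_resolve_feature_extractor`, the trimmed list of supported types, and the example config are all hypothetical, and the example assumes (unverified for this repo) that the top-level `model_type` is `vision-encoder-decoder` with `donut-swin` nested under `encoder`. If that assumption holds, a check that only reads top-level keys would never see the nested value.

```python
# Illustrative only: mimics the lookup that the error message describes.
# A small subset of the model_type values listed in the error above.
SUPPORTED_MODEL_TYPES = {"donut-swin", "swin", "vit", "clip"}


def can_resolve_feature_extractor(config: dict, preprocessor_config: dict) -> bool:
    """Return True if the keys named in the error message are present."""
    # 1. An explicit feature_extractor_type key in either file is enough.
    if "feature_extractor_type" in preprocessor_config:
        return True
    if "feature_extractor_type" in config:
        return True
    # 2. Otherwise fall back to the TOP-LEVEL model_type in config.json.
    return config.get("model_type") in SUPPORTED_MODEL_TYPES


# Hypothetical config shaped like an encoder-decoder checkpoint:
# donut-swin appears only under "encoder", not at the top level.
nested_config = {
    "model_type": "vision-encoder-decoder",
    "encoder": {"model_type": "donut-swin"},
    "decoder": {"model_type": "mbart"},
}

print(can_resolve_feature_extractor(nested_config, {}))
print(can_resolve_feature_extractor(
    nested_config, {"feature_extractor_type": "DonutFeatureExtractor"}
))
```

Under these assumptions the first call fails the check and the second passes, which suggests looking at whether preprocessor_config.json in the repo carries a `feature_extractor_type` entry.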
The model loads and works fine for training and inference. Here is a working Gradio space with links to training and inference notebooks with more details.