Extract visual and contextual features from images

Felix92 · August 20, 2021, 12:58pm

Hi Niels,

Thanks for your answer, I will check this. Do you have any recommendation for rebuilding this with the transformers library, without the timm model:

github.com

roatienza/deep-text-recognition-benchmark/blob/master/modules/vitstr.py

'''
Implementation of ViTSTR based on timm VisionTransformer.

TODO: 
1) distilled deit backbone
2) base deit backbone

Copyright 2021 Rowel Atienza
'''

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import torch 
import torch.nn as nn
import logging
import torch.utils.model_zoo as model_zoo

from copy import deepcopy

This file has been truncated.

?
