How to feed transformers with Keypoints data?

Saugatkafley · August 21, 2024, 4:34pm

Hi, I am learning about transformers for images and videos. I would like to know how a sequence of keypoint data (facial and hand landmarks) can be fed into a transformer model. I want to train a transformer for sign language translation (automatic video-to-text translation).
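
One common way to frame this, as a minimal sketch rather than a definitive implementation: treat each video frame as one token, flatten that frame's keypoint coordinates into a vector, project it to the model width, add a positional embedding for frame order, and pass a padding mask so clips of different lengths can share a batch. The class name `KeypointEncoder`, the 42 keypoints (21 landmarks for each of two hands), the 2D coordinates, and all layer sizes are assumptions chosen for illustration. The PyTorch modules used are the standard `nn.TransformerEncoder` ones.

```python
import torch
import torch.nn as nn


class KeypointEncoder(nn.Module):
    """Hypothetical encoder: one token per frame, built from flattened keypoints."""

    def __init__(self, num_keypoints, coords=2, d_model=128, nhead=4,
                 num_layers=2, max_frames=512):
        super().__init__()
        # Project the flattened (num_keypoints * coords) frame vector to d_model.
        self.input_proj = nn.Linear(num_keypoints * coords, d_model)
        # Learned positional embedding so the model can use frame order.
        self.pos_embed = nn.Embedding(max_frames, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=num_layers, enable_nested_tensor=False
        )

    def forward(self, keypoints, padding_mask=None):
        # keypoints: (batch, frames, num_keypoints, coords)
        # padding_mask: (batch, frames) bool, True where the frame is padding
        batch, frames = keypoints.shape[:2]
        x = keypoints.reshape(batch, frames, -1)
        positions = torch.arange(frames, device=keypoints.device)
        x = self.input_proj(x) + self.pos_embed(positions)
        return self.encoder(x, src_key_padding_mask=padding_mask)


# Assumed example: 2 clips padded to 30 frames, 42 keypoints with (x, y) each.
model = KeypointEncoder(num_keypoints=42)
clips = torch.randn(2, 30, 42, 2)
mask = torch.zeros(2, 30, dtype=torch.bool)
mask[1, 20:] = True  # the second clip has only 20 real frames
out = model(clips, mask)
print(out.shape)  # torch.Size([2, 30, 128])
```

The output is one vector per frame. For translation, a text decoder (for example `nn.TransformerDecoder`) could attend over these vectors, though that part is not shown here.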

I am also looking for efficient keypoint extraction models that run on a CPU, which I could use to preprocess images and videos for dataset creation.
