Explicit inputs_embeds and vocab_size=1 in GPT2

Jeremy9959 · February 12, 2024, 6:42pm

Hi all,

If I plan to use a GPT2 base model and call its forward method with an explicit inputs_embeds, does that mean I can set vocab_size=1? For example, if by some preprocessing I’ve converted my length-L text T into a 1 x L x 768 tensor E, am I right that model(inputs_embeds=E) combines my explicit E with the positional embeddings in model.wpe but ignores model.wte?

If this is not right, what is the relationship between vocab_size and explicit embeddings?

Thanks!
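
One way to check this empirically is a minimal sketch like the one below. It assumes a recent `transformers` release with PyTorch, and uses a small randomly initialised `GPT2Model` (2 layers, chosen here only to keep it fast) rather than pretrained weights. The sequence length and the way the weights are perturbed are illustrative choices, not anything from the library docs. The idea: run the model on E, overwrite `wte`, run again, then overwrite `wpe` and run a third time.

```python
import torch
from transformers import GPT2Config, GPT2Model

torch.manual_seed(0)

# Assumption: a tiny, randomly initialised config; nothing is downloaded.
config = GPT2Config(vocab_size=1, n_embd=768, n_layer=2, n_head=12)
model = GPT2Model(config).eval()  # eval() disables dropout so runs are comparable

L = 5  # illustrative sequence length
E = torch.randn(1, L, config.n_embd)  # stands in for the preprocessed text

with torch.no_grad():
    base = model(inputs_embeds=E).last_hidden_state

    # Re-randomise the token embedding table. If wte is skipped when
    # inputs_embeds is given, the output should stay the same.
    model.wte.weight.normal_()
    after_wte = model(inputs_embeds=E).last_hidden_state

    # Re-randomise the positional embeddings. If wpe is added to E,
    # the output should change.
    model.wpe.weight.normal_()
    after_wpe = model(inputs_embeds=E).last_hidden_state

wte_ignored = torch.allclose(base, after_wte)
wpe_used = not torch.allclose(base, after_wpe)

print("output shape:", tuple(base.shape))
print("wte ignored:", wte_ignored)
print("wpe used:", wpe_used)
```

This only covers the base `GPT2Model`. With `GPT2LMHeadModel` the output projection has `vocab_size` rows, so there the vocabulary size still sets the size of the logits.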
