How does Codex, a descendant of GPT-3, allow a context length of 4096 tokens while GPT-3 allows only 2048, given that both use the Transformer architecture?
I have gone through the OpenAI Codex paper, but couldn't find any information about this. Could anyone explain how this token limit was increased and what technique was used?
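To clarify what I mean by "context length": my understanding is that in GPT-style models the limit is tied to the size of the learned positional-embedding table. Here is a minimal sketch using HuggingFace's GPT-2 config as a stand-in (the Codex/GPT-3 weights are not public, so the numbers below are only illustrative):

```python
# Sketch: the maximum context length of a GPT-style model corresponds to the
# number of rows in its learned positional-embedding table.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(n_positions=4096)   # positional-embedding table size
model = GPT2LMHeadModel(config)

# One embedding row per position, so a single forward pass can cover
# at most n_positions tokens.
print(model.transformer.wpe.weight.shape)  # torch.Size([4096, 768])
```

So my question is whether Codex simply trained with a larger positional-embedding table (and the quadratic attention cost that comes with it), or whether some other technique was used to extend the context window.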