Compute VRAM size for Text2Text text generation

Hello,

How would I go about calculating or estimating the VRAM needed to fine-tune models such as flan-t5-large or mt5-large for seq2seq text generation, taking the input and output sequence lengths into account? Let's assume an input length of 1000 tokens and a target length of 2000 tokens, with mt5-large as the model. How do I compute the amount of GPU memory needed with a minimal batch size of 1, and how does the requirement grow with each increment of the batch size?
I tried using mt5-small on a dual-GPU setup (2 × 20 GB) but ran out of memory despite mixed precision and a batch size of 1, apparently because of the number of prediction steps. When I reduced the target length to a few dozen tokens, everything worked perfectly, but a few dozen tokens are simply not enough…
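For reference, here is a back-of-envelope sketch of the kind of estimate I have in mind (not an exact formula, and the constants are guesses): Adam keeps fp32 master weights plus two moment buffers (~12 bytes per parameter), mixed precision adds fp16 weights and gradients (~4 bytes per parameter), and activations scale roughly with batch size × total sequence length × hidden size × number of layers. The activation multiplier `C` below is a made-up fudge factor, and the mt5-large figures in the usage example are approximate.

```python
def estimate_finetune_vram_gb(
    n_params: float,      # total model parameters
    batch_size: int,
    src_len: int,         # input sequence length in tokens
    tgt_len: int,         # target sequence length in tokens
    hidden: int,          # model dimension (d_model)
    n_layers: int,        # encoder + decoder layers combined
    mixed_precision: bool = True,
) -> float:
    """Rough VRAM estimate (GB) for Adam fine-tuning; ignores
    quadratic attention-score memory and framework overhead."""
    # fp32 master weights + Adam first/second moments: 3 * 4 bytes/param
    bytes_optimizer = n_params * 12
    # fp16 weights + fp16 grads under mixed precision, else fp32 copies
    bytes_weights_grads = n_params * (4 if mixed_precision else 8)
    # activations: crude per-layer multiplier for attention/FFN intermediates
    C = 16  # assumed fudge factor, not measured
    bytes_per_act = 2 if mixed_precision else 4
    seq = src_len + tgt_len
    bytes_activations = C * batch_size * seq * hidden * n_layers * bytes_per_act
    return (bytes_optimizer + bytes_weights_grads + bytes_activations) / 1024**3

# Approximate mt5-large shape: ~1.2B params, d_model=1024, 24+24 layers
print(estimate_finetune_vram_gb(1.2e9, 1, 1000, 2000, 1024, 48))
```

Under these assumptions the optimizer state alone (~14 GB) already dominates a single 20 GB card, and only the activation term grows (roughly linearly) with each increment of the batch size. Is this the right way to think about it, or am I missing a major term?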