Notebooks don’t clear memory very well, so re-running cells can leave old models and tensors allocated and eventually cause OOM errors.
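If you want to retrain in the same session, a minimal sketch for freeing GPU memory first (assuming PyTorch, and that `trainer` and `model` are the objects from the previous run):

import gc
import torch

# assumes `trainer` and `model` exist from the previous run
del trainer, model
gc.collect()
torch.cuda.empty_cache()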
You probably also need to add a logit preprocessing step: without one, the Trainer accumulates the embeddings for every token rather than a single pooled embedding per sample, which consumes hundreds of times more memory.
def preprocess_logits_for_metrics(logits, labels):
    if isinstance(logits, tuple):
        # Depending on the model and config, the output tuple may contain extra
        # tensors, like past_key_values, but the logits always come first
        logits = logits[0]
    # logits should be [batch_size, seq_len, hidden_size]
    # Return only the CLS embedding; for mean pooling you'd need more
    # complicated logic (see the sketch below)
    return logits[:, 0, :]
trainer = Trainer(
    ...,
    preprocess_logits_for_metrics=preprocess_logits_for_metrics,
)
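If you'd rather keep mean-pooled embeddings instead of the CLS token, note that `preprocess_logits_for_metrics` only receives the logits and labels, not the attention mask, so this sketch can only average over the full sequence (padding included). It's an illustrative variant, not necessarily the pooling your model was trained with:

def preprocess_logits_for_metrics_mean(logits, labels):
    if isinstance(logits, tuple):
        logits = logits[0]
    # Naive mean over the sequence dimension; a proper masked mean would
    # need the attention mask, which this hook does not receive
    return logits.mean(dim=1)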