I am using this script to get embeddings for the ~13,000 essays in PERSUADE 2.0:
from argparse import ArgumentParser

import pandas as pd
import torch
from datasets import Dataset
from transformers import AutoModel, AutoTokenizer, Trainer, TrainingArguments


def parse_args():
    parser = ArgumentParser()
    parser.add_argument("--model_name", type=str, default="BAAI/bge-base-en-v1.5")
    parser.add_argument("--csv_path", type=str)
    parser.add_argument("--text_col", type=str)
    parser.add_argument("--max_length", type=int, default=512)
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--output_path", type=str)
    return parser.parse_args()


def main():
    args = parse_args()

    tokenizer = AutoTokenizer.from_pretrained(args.model_name)
    model = AutoModel.from_pretrained(args.model_name, add_pooling_layer=False)
    model.eval()

    targs = TrainingArguments(
        ".",
        report_to="none",
        per_device_eval_batch_size=args.batch_size,
        fp16=True,
    )

    ds = Dataset.from_pandas(pd.read_csv(args.csv_path))

    # strip leading/trailing whitespace from the text column
    ds = ds.map(
        lambda x: {args.text_col: x[args.text_col].strip()}, num_proc=4
    )

    def tokenize(batch):
        return tokenizer(
            batch[args.text_col],
            padding=False,
            truncation=True,
            max_length=args.max_length,
            return_length=True,
        )

    with targs.main_process_first(desc="dataset map pre-processing"):
        ds = ds.map(tokenize, batched=True, num_proc=4)

    trainer = Trainer(model=model, args=targs, tokenizer=tokenizer)

    # take the first (CLS) token of the returned hidden states as the embedding
    embeddings = trainer.predict(ds).predictions[0][:, 0]
    embeddings = torch.nn.functional.normalize(
        torch.tensor(embeddings), p=2, dim=1
    ).cpu()

    torch.save(embeddings, args.output_path)


if __name__ == "__main__":
    main()
Source: @nbroad in his amazing notebook
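For context, I invoke it roughly like this (the script and file names below are just placeholders, not the exact ones I use):

python get_embeddings.py \
    --model_name BAAI/bge-base-en-v1.5 \
    --csv_path persuade_2.0.csv \
    --text_col full_text \
    --batch_size 64 \
    --output_path embeddings.pt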
I used 2x A5000 GPUs with a batch size of 64, but during prediction the GPU memory consumption keeps increasing, and after roughly 50% of the predictions I get an OOM error. The same thing happens even with small batch sizes. The reason suggested by @sgugger was that we are accumulating too many predictions, so I set eval_accumulation_steps=10 to transfer these predictions to the CPU periodically, but then it crashes after occupying 60 GB of RAM. The same happens whether I increase or decrease eval_accumulation_steps.
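For reference, this is roughly the only change I made for that attempt (the rest of the script stays the same as above):

targs = TrainingArguments(
    ".",
    report_to="none",
    per_device_eval_batch_size=args.batch_size,
    fp16=True,
    # move accumulated predictions from GPU to CPU every 10 eval steps
    eval_accumulation_steps=10,
)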
Question:
My question is: where is this 60 GB of data coming from? My prediction tensor of size (13000, 768) should be at most about 1 GB, so why are both RAM and GPU memory being overloaded?
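My rough back-of-the-envelope check for the expected size of the output, in case I'm miscounting:

import torch

# 13,000 essays x 768-dim embeddings in fp32
emb = torch.empty(13000, 768, dtype=torch.float32)
print(emb.element_size() * emb.nelement() / 1e9)  # ~0.04 GB, well under 1 GB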