Continuous increase in Memory usage

John6666 · December 1, 2024, 11:28am

From Python, the only way to do this is to move the torch model and tensors to the CPU, in as narrow a scope as possible, explicitly delete each of the objects, and then call gc.collect() and torch.cuda.empty_cache() once you have made sure the tensors are no longer referenced from anywhere. Be careful: tools such as tqdm can sometimes hold implicit references to them.
In other words, this is the standard approach at the moment. If it doesn’t work, something is wrong, and you should suspect a bug or some other problem in the library.
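As a rough sketch of the steps above (not a definitive recipe), something like the following could be used. The names `flush_memory`, `load_use_release` and `_StandInModel` are made up for illustration, the `torch.nn.Linear` layer is just a stand-in for a real model, and the guarded import is only there so the sketch also runs on a machine without PyTorch. The weak reference is one way to check that nothing still points at the model after `del`.

```python
import gc
import weakref

try:
    import torch
except ImportError:  # assumption: lets the sketch run even without PyTorch
    torch = None


class _StandInModel:
    """Hypothetical placeholder, used only when PyTorch is not installed."""

    def to(self, device):
        return self


def flush_memory() -> None:
    """Collect unreachable Python objects, then release PyTorch's cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


def load_use_release() -> bool:
    """Load a model, use it, release it, and report whether it was really freed."""
    use_cuda = torch is not None and torch.cuda.is_available()
    device = "cuda" if use_cuda else "cpu"

    # Keep the model in as narrow a scope as possible (here: this function).
    model = torch.nn.Linear(16, 4) if torch is not None else _StandInModel()
    model = model.to(device)
    probe = weakref.ref(model)  # does not keep the model alive

    if torch is not None:
        with torch.no_grad():  # no autograd graph, so no hidden tensor references
            output = model(torch.randn(2, 16, device=device)).cpu()
        del output

    # Offload to the CPU, drop the last reference, then collect and empty the cache.
    model.to("cpu")
    del model
    flush_memory()

    # True means no remaining reference is keeping the model alive.
    return probe() is None
```

If `load_use_release()` returned False in a real program, that would suggest something else (a progress bar, a list of cached outputs, a stored traceback) is still holding the model.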
Another option is to move the model execution into a separate script and run it in a subprocess. The OS then reclaims the memory when the process exits, which is more forceful than anything you can do from within Python. However, it is not clean, and it takes time.
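A minimal sketch of the subprocess approach might look like this. Everything here is an assumption for illustration: `run_in_subprocess` and `WORKER_CODE` are hypothetical names, the worker is passed inline with `-c` to keep the example in one piece (in practice it would be its own script file), and the "inference" is a placeholder where the real model loading and generation would go. Input and output are exchanged as JSON over stdin and stdout.

```python
import json
import subprocess
import sys

# Hypothetical worker. In real use this would be a separate script that loads
# the model, runs inference, and prints the result as JSON before exiting.
WORKER_CODE = """
import json, sys
payload = json.load(sys.stdin)
# Placeholder for model loading and inference.
result = {"n_chars": len(payload["prompt"])}
json.dump(result, sys.stdout)
"""


def run_in_subprocess(prompt: str, timeout: float = 60.0) -> dict:
    """Run the worker in a fresh Python process and return its JSON result."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_CODE],
        input=json.dumps({"prompt": prompt}),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the worker fails
    )
    # Once the worker has exited, the OS has reclaimed all of its memory.
    return json.loads(completed.stdout)
```

The trade-off is the one mentioned above: each call pays for starting a new process and loading the model again, so this probably only makes sense when the in-process cleanup really does not work.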

@not-lain This could be a tricky VRAM leak problem.
