RAG batch size on GPU

Hello, when I fine-tune my RAG-based model on my 4x V100 box, I run into OOM errors on my GPUs. I can only use a batch size of 1 for both train and eval and still fit the examples into GPU memory. The GPUs have about 16 GB of memory each, and a batch size of 1 uses between 11 and 15 GB depending on the other parameters I'm using. This could just be the nature of the model, but I want to make sure I'm not doing something wrong that is blowing up the memory. My knowledge dataset is much smaller than the default indexes. I am using the fine-tuning script at examples/research_projects/rag/finetune_rag.sh. Thank you for your help.
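
For reference, this is roughly how I'm invoking the script (the paths, model name, and accumulation value are placeholders from my setup, not a claim about the script's defaults):

```bash
# Sketch of my invocation; $DATA_DIR / $OUTPUT_DIR are placeholders for my paths.
# --fp16 and --gradient_accumulation_steps are my attempts to cut per-step memory
# while keeping an effective batch size above 1.
python examples/research_projects/rag/finetune_rag.py \
    --data_dir "$DATA_DIR" \
    --output_dir "$OUTPUT_DIR" \
    --model_name_or_path facebook/rag-sequence-base \
    --model_type rag_sequence \
    --gpus 4 \
    --do_train \
    --fp16 \
    --train_batch_size 1 \
    --eval_batch_size 1 \
    --gradient_accumulation_steps 8
```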