RAG model loss has no grad_fn

Dear authors of the RAG model,

Recently, I have been fine-tuning RAG on a QA dataset. However, the loss returned by the model's forward function has no grad_fn, which means it cannot be backpropagated.
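Concretely, printing the loss right after the forward pass shows the symptom. A minimal illustration, using the same variable names as the full snippet below:

out = model(input_ids=batch[0], attention_mask=batch[1], labels=batch[2])
print(out.loss.grad_fn)        # prints None on my side
print(out.loss.requires_grad)  # prints False, so loss.backward() raises a RuntimeError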

My full code snippet is as follows:

import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader
from transformers import RagConfig, RagRetriever, RagSequenceForGeneration, RagTokenizer

# rag_example_args, fine_tune_args, passages_path, index_path, cache_dir
# and load_and_cache_examples are defined earlier in my script
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# build the config, retriever, tokenizer and model with a custom index
config = RagConfig.from_pretrained(rag_example_args.rag_model_name, index_name="custom",
                                   passages_path=passages_path, index_path=index_path,
                                   cache_dir=cache_dir, reduce_loss=True)
retriever = RagRetriever.from_pretrained(rag_example_args.rag_model_name, config=config)
tokenizer = RagTokenizer.from_pretrained(rag_example_args.rag_model_name, config=config)
model = RagSequenceForGeneration.from_pretrained(rag_example_args.rag_model_name,
                                                 retriever=retriever, config=config).to(device)
model.train()

# load the fine-tuning dataset
dataset = load_and_cache_examples(fine_tune_args.max_source_length, fine_tune_args.max_target_length,
                                  fine_tune_args.train_set_path, tokenizer)
dataloader = DataLoader(dataset, batch_size=fine_tune_args.train_batch_size)

# training loop
optimizer = AdamW(model.parameters(), lr=fine_tune_args.learning_rate)

for _ in range(fine_tune_args.num_epochs):
    for batch in dataloader:
        optimizer.zero_grad()  # clears gradients, so an extra model.zero_grad() at the end is not needed
        batch = tuple(t.to(device) for t in batch)
        loss = model(input_ids=batch[0], attention_mask=batch[1], labels=batch[2]).loss

        loss.backward()  # this is where it fails: loss.grad_fn is None
        optimizer.step()
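To rule out a globally disabled autograd context, I believe a quick sanity check like the following should help (just a sketch, run outside the training loop):

print(torch.is_grad_enabled())                           # should be True outside any torch.no_grad() block
print(all(p.requires_grad for p in model.parameters()))  # should be True unless parameters were frozen

If either of these prints False, that would explain the missing grad_fn.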

Can you help me?
Thank you very much!