I’m trying to fine-tune a Longformer model (allenai/longformer-base-4096 · Hugging Face) on a single GPU (RTX 3090). However, my dataset is large, so training takes a very long time.
I’d like to train the model in FP16 precision (to reduce memory use and speed things up), but I only know how to enable that through the standard Trainer class, which I’m not using. Is it possible? How can I do that with a plain PyTorch training loop? This is my current loop:
for epoch in range(self.epochs):
    self.model.train()
    total_loss, total_val_loss = 0, 0
    for step, batch in enumerate(self.train_dataloader):
        self.model.zero_grad()
        outputs = self.model(batch[0].to(self.device),
                             attention_mask=batch[1].to(self.device),
                             token_type_ids=batch[2].to(self.device),
                             labels=batch[3].to(self.device))
        total_loss += outputs.loss.item()
        outputs.loss.backward()
        # Clip gradients to stabilize training
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), 1.0)
        self.optimizer.step()
        self.scheduler.step()
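
From what I’ve read, PyTorch’s automatic mixed precision API (torch.cuda.amp) should make this possible without the Trainer. Below is a rough sketch of what I think the FP16 version of my loop would look like: the forward pass runs under autocast, and a GradScaler scales the loss so small FP16 gradients don’t underflow. The self.optimizer and self.scheduler names are my own from the loop above, and I haven’t verified this is correct, so please point out any mistakes:

import torch
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()  # dynamically scales the loss to avoid FP16 gradient underflow

for epoch in range(self.epochs):
    self.model.train()
    total_loss = 0
    for step, batch in enumerate(self.train_dataloader):
        self.model.zero_grad()
        # Forward pass runs in mixed precision
        with autocast():
            outputs = self.model(batch[0].to(self.device),
                                 attention_mask=batch[1].to(self.device),
                                 token_type_ids=batch[2].to(self.device),
                                 labels=batch[3].to(self.device))
        total_loss += outputs.loss.item()
        # Backward pass on the scaled loss
        scaler.scale(outputs.loss).backward()
        # Unscale first so clipping applies to the true gradient values
        scaler.unscale_(self.optimizer)
        torch.nn.utils.clip_grad_norm_(self.model.parameters(), 1.0)
        # scaler.step() skips the optimizer step if gradients contain inf/NaN
        scaler.step(self.optimizer)
        scaler.update()
        self.scheduler.step()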