Output tensor must have the same type as input tensor

Hello everyone, I have 4 GPUs (RTX 3080 with 10 GiB VRAM each) and I'm trying to fine-tune Mistral 7B v0.2 locally. I've already tried to optimize as much as I can (Accelerate with DeepSpeed, 4-bit quantization, LoRA, and so on), but now training fails with an "output tensor must have the same type as input tensor" error.
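For context, the quantization and LoRA side of my setup corresponds roughly to the following (a minimal sketch, since AutoTrain builds this internally; the model ID and the LoRA/quantization values here are assumptions). It is launched through Accelerate with a DeepSpeed ZeRO-3 CPU-offload config:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# QLoRA-style 4-bit quantization (NF4 weights, fp16 compute)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",  # assumed model ID
    quantization_config=bnb_config,
)

# LoRA adapters so only a small fraction of parameters is trained
lora_config = LoraConfig(
    r=16,                                  # illustrative values
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```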
My CSV dataset has a single column called Text, which contains question-answer pairs, roughly like the sample below.
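For illustration, the file looks something like this (made-up rows, not my real data):

```csv
Text
"### Question: What is LoRA? ### Answer: LoRA adds small trainable low-rank matrices to a frozen base model."
"### Question: What does ZeRO stage 3 do? ### Answer: It partitions parameters, gradients, and optimizer states across GPUs."
```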
Can you suggest a fix?
This is the error:
"Loading extension module cpu_adam…
Time to load cpu_adam op: 2.2301530838012695 seconds
Parameter Offload: Total persistent parameters: 20189184 in 417 params
INFO | 2024-06-06 17:33:31 | autotrain.trainers.common:on_train_begin:231 - Starting to train…
0%| | 0/20 [00:00<?, ?it/s]
(myenv) rag@PC-RAG:~/finetune$ ERROR | 2024-06-06 17:34:26 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/autotrain/trainers/common.py”, line 117, in wrapper
return func(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/autotrain/trainers/clm/main.py”, line 28, in train
train_sft(config)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/autotrain/trainers/clm/train_clm_sft.py”, line 98, in train
trainer.train()
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/trl/trainer/sft_trainer.py”, line 361, in train
output = super().train(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/transformers/trainer.py”, line 1859, in train
return inner_training_loop(
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/transformers/trainer.py”, line 2203, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/transformers/trainer.py”, line 3147, in training_step
self.accelerator.backward(loss)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/accelerate/accelerator.py”, line 2007, in backward
self.deepspeed_engine_wrapped.backward(loss, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/accelerate/utils/deepspeed.py”, line 175, in backward
self.engine.step()
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/engine.py”, line 2169, in step
self._take_model_step(lr_kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/engine.py”, line 2075, in _take_model_step
self.optimizer.step()
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/utils/nvtx.py”, line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/zero/stage3.py”, line 2060, in step
self._post_step(timer_names)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/utils/nvtx.py”, line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/zero/stage3.py”, line 1986, in _post_step
self.persistent_parameters[0].all_gather(self.persistent_parameters)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/zero/partition_parameters.py”, line 1121, in all_gather
return self._all_gather(param_list, async_op=async_op, hierarchy=hierarchy)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/utils/nvtx.py”, line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/zero/partition_parameters.py”, line 1465, in _all_gather
self._allgather_params_coalesced(all_gather_nonquantize_list, hierarchy, quantize=False)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/runtime/zero/partition_parameters.py”, line 1769, in _allgather_params_coalesced
h = dist.all_gather_into_tensor(allgather_params[param_idx],
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/comm/comm.py”, line 117, in log_wrapper
return func(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/comm/comm.py”, line 305, in all_gather_into_tensor
return cdb.all_gather_into_tensor(output_tensor=output_tensor, input_tensor=tensor, group=group, async_op=async_op)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py”, line 451, in _fn
return fn(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/deepspeed/comm/torch.py”, line 213, in all_gather_into_tensor
return self.all_gather_function(output_tensor=output_tensor,
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/torch/distributed/c10d_logger.py”, line 75, in wrapper
return func(*args, **kwargs)
File “/home/rag/finetune/myenv/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py”, line 2948, in all_gather_into_tensor
work = group._allgather_base(output_tensor, input_tensor, opts)
TypeError: output tensor must have the same type as input tensor

ERROR | 2024-06-06 17:34:26 | autotrain.trainers.common:wrapper:121 - output tensor must have the same type as input tensor"
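For what it's worth, the final TypeError comes from torch.distributed's all_gather_into_tensor (the last frames of the traceback), which requires the output buffer and the input shard to have the same dtype. The same error can be triggered outside of training (a standalone sketch, not my actual code; run with e.g. `torchrun --nproc_per_node=2 repro.py`):

```python
# Standalone sketch that triggers the same TypeError (hypothetical repro,
# not my training code).
import torch
import torch.distributed as dist

dist.init_process_group("nccl")
rank = dist.get_rank()
torch.cuda.set_device(rank)

# Input shard is fp16, output buffer is fp32: the dtypes disagree.
inp = torch.ones(4, device="cuda", dtype=torch.float16)
out = torch.empty(4 * dist.get_world_size(), device="cuda", dtype=torch.float32)

# Raises: TypeError: output tensor must have the same type as input tensor
dist.all_gather_into_tensor(out, inp)
```

So somewhere in my setup, DeepSpeed ends up all-gathering parameter shards whose dtype doesn't match the buffer it allocated for them, and I'm not sure which part of the stack (quantization, DeepSpeed config, or AutoTrain defaults) is responsible.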