How big is each record? How big is it after tokenization?
Are you using a data loader? What batch size is it using?
What happens if you try to train using only 10 records?
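If it helps, here is a rough sketch of how you could check those three things. It is only an illustration: `records`, `tokenize`, `size_report`, and `batches` are hypothetical stand-ins I made up, the whitespace tokenizer is a placeholder for whatever tokenizer you actually use, and the batch size of 4 is arbitrary.

```python
# Hypothetical stand-in data; replace with your own records.
records = [f"example record number {i} with some text" for i in range(100)]


def tokenize(text):
    # Placeholder whitespace tokenizer (an assumption, not your real tokenizer).
    return text.split()


def size_report(records, tokenize):
    """Return (max characters, max tokens) over the given records."""
    max_chars = max(len(r) for r in records)
    max_tokens = max(len(tokenize(r)) for r in records)
    return max_chars, max_tokens


def batches(records, batch_size):
    """Yield consecutive slices of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# The 10-record smoke test: keep only the first 10 records.
small = records[:10]

max_chars, max_tokens = size_report(small, tokenize)
print(f"max chars: {max_chars}, max tokens: {max_tokens}")

# Arbitrary batch size, just to show how many records land in each batch.
for batch in batches(small, batch_size=4):
    print(f"batch of {len(batch)} records")
```

If the 10-record run works but the full run does not, the record sizes and batch size reported here are the first things I would compare against the memory you have available.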