I’d like some help with QARAC, my research project on creating language models that encode logic and consistency.
I’ve recently ported the code from TensorFlow to PyTorch, since I need to train three models jointly against a combination of four objectives, and PyTorch appears better suited to this than TensorFlow. I thought it would be sensible to test the training script on my own laptop before spending significant computing resources and money on full training. When I did so, I found that a single batch of data took over five minutes to process. This suggests that even with GPUs or TPUs, training this model would be intractable as it stands, and that there are likely significant inefficiencies in my code.
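As a first diagnostic step, I’ve been profiling a single batch to see where the time actually goes before guessing at fixes. A minimal sketch using Python’s built-in cProfile is below — the `train_step` here is just a hypothetical stand-in for one forward/backward pass, not my real code (PyTorch also ships `torch.profiler` for finer operator- and GPU-level detail, which would be the next step):

```python
import cProfile
import io
import pstats

def train_step(batch):
    # Hypothetical stand-in for one forward/backward pass;
    # in practice this would be the real QARAC training step.
    return sum(x * x for x in batch)

batch = list(range(100_000))

# Profile exactly one batch, then report the hot spots.
profiler = cProfile.Profile()
profiler.enable()
train_step(batch)
profiler.disable()

# Print the 10 most time-consuming functions, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

Running something like this on the real training step should show whether the five minutes are dominated by the model itself, by data loading, or by some incidental Python overhead.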
I’d really appreciate it if somebody could go over the code with me and help me spot any problems with it.