I'm using a GPTLMHead model in PyTorch.
Is it possible to add autocast() inside the forward function of GPTLMHead and change the training process to follow the Automatic Mixed Precision tutorial (PyTorch Tutorials 1.8.1+cu102 documentation)?
Yes, it's possible.
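Roughly like this, as a minimal sketch: wrap the body of forward in autocast() and use a GradScaler in the training step, as in the AMP tutorial. `ToyLMHead` below is a hypothetical stand-in for your GPTLMHead (the real model's forward can be wrapped the same way), and the snippet gates AMP on CUDA availability so it also runs on CPU:

```python
import torch
import torch.nn as nn

class ToyLMHead(nn.Module):
    # Hypothetical minimal stand-in for a GPTLMHead-style model.
    def __init__(self, vocab_size=100, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, input_ids):
        # autocast runs eligible ops in float16 on CUDA; disabled on CPU
        # here only so the sketch is runnable anywhere.
        with torch.cuda.amp.autocast(enabled=torch.cuda.is_available()):
            return self.head(self.embed(input_ids))

model = ToyLMHead()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
# GradScaler rescales the loss to avoid float16 gradient underflow.
scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())

input_ids = torch.randint(0, 100, (4, 16))
targets = torch.randint(0, 100, (4, 16))

logits = model(input_ids)
loss = nn.functional.cross_entropy(
    logits.view(-1, logits.size(-1)), targets.view(-1)
)
scaler.scale(loss).backward()  # backward on the scaled loss
scaler.step(opt)               # unscales grads, then opt.step()
scaler.update()
opt.zero_grad()
```

With `enabled=False` both autocast and GradScaler become no-ops, so the same loop works unchanged for a full-precision run.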
I trained the GPT model on a V100 with mixed precision training. But at inference time, the generated tokens are garbage with autocast() enabled; when I turn it off, everything is OK.
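One thing to watch out for: if autocast() is hard-coded inside forward with `enabled=True`, wrapping the call site in an outer `autocast(enabled=False)` will not disable it, because the inner context re-enables autocast. A common workaround is to gate the inner context on a flag you can flip for generation. In this sketch, `use_amp` is an assumed attribute, not part of any real GPTLMHead API:

```python
import torch
import torch.nn as nn

class ToyLMHead(nn.Module):
    # Hypothetical minimal model; `use_amp` is an assumed flag added
    # for illustration so autocast can be switched off at inference.
    def __init__(self, vocab_size=100, hidden=32, use_amp=True):
        super().__init__()
        self.use_amp = use_amp
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, input_ids):
        # Autocast only when the flag is set AND CUDA is present,
        # so generation can run in full float32.
        enabled = self.use_amp and torch.cuda.is_available()
        with torch.cuda.amp.autocast(enabled=enabled):
            return self.head(self.embed(input_ids))

model = ToyLMHead()
model.eval()
model.use_amp = False  # turn AMP off for generation
with torch.no_grad():
    input_ids = torch.randint(0, 100, (1, 8))
    logits = model(input_ids)
    next_token = logits[:, -1, :].argmax(dim=-1)
```

Since the forward now runs in float32 at inference, the logits (and hence the sampled tokens) match the non-autocast behavior you saw working.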