What is your use case for using Transformers without Transformers models? If you want to use the HF Trainer with your own PyTorch model, I recommend subclassing the relevant classes, such as PreTrainedModel, and pairing the model with your own PretrainedConfig subclass.
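
Here is a minimal sketch of what that could look like. The names MyConfig, MyModel, input_dim, hidden_dim, output_dim and the "my-mlp" model type are made up for illustration, and the small MLP just stands in for whatever your own PyTorch model is. It assumes torch and transformers are installed.

```python
import torch
from torch import nn
from transformers import PretrainedConfig, PreTrainedModel


class MyConfig(PretrainedConfig):
    # Hypothetical identifier for this custom architecture.
    model_type = "my-mlp"

    def __init__(self, input_dim=16, hidden_dim=32, output_dim=2, **kwargs):
        # Hyperparameters of the custom model (illustrative names).
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        super().__init__(**kwargs)


class MyModel(PreTrainedModel):
    # Links the model to its config class so saving and loading know about it.
    config_class = MyConfig

    def __init__(self, config):
        super().__init__(config)
        # Placeholder network; swap in your own PyTorch modules here.
        self.net = nn.Sequential(
            nn.Linear(config.input_dim, config.hidden_dim),
            nn.ReLU(),
            nn.Linear(config.hidden_dim, config.output_dim),
        )

    def forward(self, inputs, labels=None):
        logits = self.net(inputs)
        if labels is None:
            return {"logits": logits}
        # Returning the loss under the "loss" key lets Trainer pick it up.
        loss = nn.functional.cross_entropy(logits, labels)
        return {"loss": loss, "logits": logits}


config = MyConfig(input_dim=8, hidden_dim=4, output_dim=3)
model = MyModel(config)
batch = torch.randn(5, 8)
targets = torch.tensor([0, 1, 2, 0, 1])
outputs = model(batch, labels=targets)
```

A model built this way can be passed to Trainer(model=model, ...) as long as your dataset yields items whose keys match the forward arguments (here "inputs" and "labels"). You also get save_pretrained and from_pretrained from the base classes.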