ViT problem with GPU usage: image is required to be a NumPy array

I can share my GitHub code, privately for the moment, so that you can reproduce the environment with the data. If you agree, please let me know your email address.