Hi everyone,
I'm curious how the Transformers library works under the hood and how it is architected. Specifically, I'm looking for a way to run existing models hosted on HF directly on my special hardware architecture, without any retraining and without using runtimes like ONNX. Any documentation on this would be awesome.