I’m part of the team behind xThought, an upcoming AI accelerator for LLM inference. We’ve already built a working Transformers extension (`xthought_transformers`) that ships:
- `XThoughtConfig` + `XThoughtModel` classes (a simplified sketch is below)
- Basic safetensors export + runtime bindings
- Examples and tests
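For context, here is a minimal sketch of how the extension hooks into the Transformers `Auto*` registries. The class bodies, runtime dispatch, and constructor arguments shown are illustrative placeholders, not our actual implementation:

```python
from transformers import AutoConfig, AutoModel, PretrainedConfig, PreTrainedModel

class XThoughtConfig(PretrainedConfig):
    model_type = "xthought"  # key the Auto* factories dispatch on

    def __init__(self, hidden_size=4096, num_hidden_layers=32, **kwargs):
        # illustrative hyperparameters only
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        super().__init__(**kwargs)

class XThoughtModel(PreTrainedModel):
    config_class = XThoughtConfig

    def __init__(self, config):
        super().__init__(config)
        # the real extension binds to the xThought runtime here (elided)

# Register the classes so AutoConfig/AutoModel resolve "xthought" checkpoints
AutoConfig.register("xthought", XThoughtConfig)
AutoModel.register(XThoughtConfig, XThoughtModel)
```

Once registered, the standard `save_pretrained` / `from_pretrained` round-trip (including safetensors serialization) works through the usual `Auto*` entry points.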
We’d like to integrate xThought with the Hugging Face ecosystem and open-source the stack so a wider community can use it; hence my question about the correct path for upstreaming our backend into the main Hugging Face GitHub repositories.
I’ve studied the Optimum ecosystem and noticed that most vendor backends live in separate repos such as optimum-nvidia, optimum-intel, and optimum-executorch. That looks like the right pattern for us, so the plan is to open-source an optimum-xthought repo that mirrors their structure and then PR any Transformers-side hooks if required.
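To make the plan concrete, this is the user-facing surface I have in mind, mirroring the optimum-nvidia / optimum-intel pattern. Every xThought-side name here (the `optimum.xthought` namespace, `XThoughtModelForCausalLM`, the `export` flag) is hypothetical, not an existing API:

```python
from transformers import AutoTokenizer
from optimum.xthought import XThoughtModelForCausalLM  # hypothetical package/class

tokenizer = AutoTokenizer.from_pretrained("gpt2")
# export=True would compile the checkpoint for the xThought runtime,
# analogous to the export flows in other optimum-* backends (hypothetical flag)
model = XThoughtModelForCausalLM.from_pretrained("gpt2", export=True)

inputs = tokenizer("Hello from xThought!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```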
What’s the very first step? Should I open an RFC-style issue in the main Optimum repository, or is there another channel where I can submit my own optimum-xthought GitHub repo for an initial review?
If anyone has gone through this journey (e.g. the Optimum-NVIDIA, -ExecuTorch, or -TPU maintainers), I’d really appreciate your war stories and any gotchas you hit along the way.
If you would like to see new hardware and features supported in Optimum, or you are interested in joining us to work at the intersection of software and hardware, please reach out to us at hardware@huggingface.co.