Who’s running open-source LLMs in enterprise production, and how?

Hi everyone,

I wasn’t sure which category this best fits in, so I’m posting here since it’s about production deployment.

I’m trying to understand how enterprise teams are deploying open-source LLMs in real production environments.

If you’re running models internally or on your own infrastructure, I’d love to hear about your setup:

  • How you’re serving the model

  • The hardware or cloud configuration you’ve found viable

  • Key challenges you’ve hit (throughput, latency, cost, monitoring, compliance)

  • And what finally made your setup feel production-ready

I’m especially interested in enterprise use cases that have actually gone live, versus those that stayed at the prototype stage.

Feel free to share deep technical details or architecture notes if you can.

Thanks for taking the time to share your experience.


Hi @iknowjerome :waving_hand:

That’s a super interesting conversation opener and something I’ve worked with a lot, both at Hugging Face and before. So although I’m an HF employee, I think I can give some useful insights on the matter; I’ll try to be balanced :grinning_face_with_smiling_eyes:

From a technical & model perspective the options to do production deployments of LLMs has improved a ton in the last 2 years (at least). I’d say that using vLLM is the best option where possible. You’ll get continuous batching, chunked prefill, sota kernels and a lot more out of the box. That being said: if you’re model isn’t supported by vLLM, you’re facing a bigger challenge to implement the server engine yourself in most cases.

On the hardware side, GPU utilization is still super hard. Provisioning instances quickly enough for fast auto-scaling isn’t really there yet, and just running a model efficiently on several nodes is tricky. Even the big labs have outages constantly because of this. My recommended reading on this topic is from the vLLM Paris meetup a few months ago; it has some good technical explanations.
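As a rough illustration of the multi-GPU part (the model name and numbers below are just placeholders): within a single node, vLLM lets you shard the model with tensor parallelism, and that’s usually where I’d start before going multi-node:

```python
from vllm import LLM

# Hypothetical single-node setup: shard a larger model across 4 GPUs.
llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder checkpoint
    tensor_parallel_size=4,        # split the weights across 4 GPUs on this node
    gpu_memory_utilization=0.90,   # fraction of each GPU's memory vLLM may claim
)
```

Going beyond one node (pipeline parallelism over a cluster, autoscaling replicas behind a load balancer) is where most of the operational pain comes from.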

And lastly, as a small pitch: we of course offer Inference Endpoints as a managed service to help companies get AI models to production with as little hassle as possible. There are two case study blogs on it that you can read as well.

Hopefully this opens up the current state a little bit :grinning_face_with_smiling_eyes: :raising_hands:


@erikkaum This is incredibly helpful. Thank you.
