generate() without Python for inference

Hello everyone,
As always, a huge thank you in advance to the HF team and community for such an amazing set of resources. I am looking at doing inference in a relatively resource-constrained environment. I am familiar with the various model optimizations out there (OpenVINO, ONNX runtimes, etc.), and a huge shout-out again to HF for the really great resources and tutorials there. I would like to be able to run beam search (or potentially multinomial sampling); right now I am using the amazing generate() function and everything it provides. Since we can get the models “out of Python” using ONNX, OpenVINO, etc., I was wondering whether there are any best practices or documentation on getting the generate() function “out of Python” as well. I am looking at C and/or Rust for the current application.
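To make the question concrete, here is a minimal sketch in Rust of the kind of decoding loop that would need to live outside Python. It is not what generate() does internally: it leaves out length penalty, KV caching, early-stopping heuristics, and logits processors. The `step` closure is a hypothetical stand-in for one forward pass of the exported model (for example through an ONNX or OpenVINO runtime binding) that returns log-probabilities over the vocabulary. The names `Hypothesis` and `beam_search` are also made up for illustration.

```rust
/// One partial or finished sequence tracked by the search.
#[derive(Clone, Debug)]
pub struct Hypothesis {
    pub tokens: Vec<u32>,
    /// Sum of the log-probabilities of the generated tokens.
    pub score: f32,
    /// True once the sequence has emitted the end-of-sequence token.
    pub finished: bool,
}

/// Plain beam search over a caller-supplied `step` function.
///
/// ASSUMPTION: `step` takes the token ids so far and returns one
/// log-probability per vocabulary entry for the next token. In a real
/// application it would wrap the exported model's forward pass.
pub fn beam_search<F>(
    mut step: F,
    prompt: &[u32],
    num_beams: usize,
    max_new_tokens: usize,
    eos_token_id: u32,
) -> Vec<Hypothesis>
where
    F: FnMut(&[u32]) -> Vec<f32>,
{
    let mut beams = vec![Hypothesis {
        tokens: prompt.to_vec(),
        score: 0.0,
        finished: false,
    }];

    for _ in 0..max_new_tokens {
        if beams.iter().all(|b| b.finished) {
            break;
        }

        let mut candidates: Vec<Hypothesis> = Vec::new();
        for beam in &beams {
            // Finished hypotheses are carried over unchanged.
            if beam.finished {
                candidates.push(beam.clone());
                continue;
            }
            let log_probs = step(&beam.tokens);
            // Expand the beam with every vocabulary entry. This is simple
            // but wasteful; a real implementation would keep only the
            // top-k continuations per beam before merging.
            for (id, lp) in log_probs.iter().enumerate() {
                let mut tokens = beam.tokens.clone();
                tokens.push(id as u32);
                candidates.push(Hypothesis {
                    tokens,
                    score: beam.score + lp,
                    finished: id as u32 == eos_token_id,
                });
            }
        }

        // Keep the `num_beams` highest-scoring candidates.
        candidates.sort_by(|a, b| b.score.total_cmp(&a.score));
        candidates.truncate(num_beams);
        beams = candidates;
    }

    beams
}
```

Calling it would look like `beam_search(|tokens| my_model_forward(tokens), &prompt_ids, 4, 64, eos_id)`, where `my_model_forward` is whatever wrapper ends up around the runtime. Multinomial sampling would replace the sort-and-truncate step with a draw from the softmax of `log_probs`.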
Thank you as always!!