Hi,
I wonder whether it is possible to export an entire Optimum pipeline (e.g. generation) for serving on Triton Inference Server. Ideally, the exported pipeline would include tokenization and decoding.
Thanks!
Hey @changlan,
No, that's currently not possible. You would have to write the pre- and post-processing yourself using the Python backend of Triton.
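To give a rough idea of what that could look like, here is a minimal sketch of a `model.py` for a preprocessing (tokenization) step in the Python backend. The tensor names `TEXT`, `input_ids` and `attention_mask`, the `tokenizer_path` parameter, and the `decode_text_input` helper are assumptions made for illustration; they have to match whatever you declare in your own `config.pbtxt` and what your exported model expects.

```python
# model.py for a Triton Python backend model (illustrative sketch, not an official Optimum export).
import json

try:
    # These imports only succeed inside the Triton Python backend runtime.
    import numpy as np
    import triton_python_backend_utils as pb_utils
except ImportError:
    np = None
    pb_utils = None


def decode_text_input(raw):
    """Hypothetical helper: Triton passes TYPE_STRING elements as bytes, so normalise to str."""
    return raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)


class TritonPythonModel:
    def initialize(self, args):
        # Imported lazily so the module can be inspected without transformers installed.
        from transformers import AutoTokenizer

        config = json.loads(args["model_config"])
        # Assumption: config.pbtxt defines a string parameter named "tokenizer_path".
        tokenizer_path = config["parameters"]["tokenizer_path"]["string_value"]
        self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)

    def execute(self, requests):
        responses = []
        for request in requests:
            # Assumption: the input tensor is called "TEXT" and has type TYPE_STRING.
            text_tensor = pb_utils.get_input_tensor_by_name(request, "TEXT")
            texts = [decode_text_input(t) for t in text_tensor.as_numpy().reshape(-1)]

            # padding=True assumes the tokenizer has a pad token configured.
            enc = self.tokenizer(texts, padding=True, return_tensors="np")

            outputs = [
                pb_utils.Tensor("input_ids", enc["input_ids"].astype(np.int64)),
                pb_utils.Tensor("attention_mask", enc["attention_mask"].astype(np.int64)),
            ]
            responses.append(pb_utils.InferenceResponse(output_tensors=outputs))
        return responses
```

A postprocessing model would mirror this: read the generated token IDs from an input tensor and return `self.tokenizer.batch_decode(...)` as a string tensor. The pre-processing, exported model, and post-processing steps can then be chained with a Triton ensemble.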