`serving` signature in TensorFlow Serving blogpost

christopher · August 5, 2021, 6:02pm

Hi everyone!

I am currently working through @jplu's blog post on serving a HuggingFace model with TF-Serving, in which he overrides the model’s `serving` method to change the input signature of the traced graph so that it accepts embeddings.

That, in turn, led me to discover that this serving signature is part of all TF models (Models — transformers 4.7.0 documentation)

Can someone explain to me how exactly this `serving` method is used by the model server? I can’t find it referenced in the rest of the tutorial, and I wasn’t successful in finding my way around the codebase.

Is that redefined signature used at all in the tutorial? I might be mistaken, but it seems to me that the requests to the TF server (both REST and gRPC) send the output of the tokenizer, not the output of an embedding layer.
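To make the question concrete, here is a minimal sketch of what a token-id based REST request to TF Serving looks like. It only builds the URL and JSON body and sends nothing. The model name `bert`, the host and port, the helper `build_predict_request`, and the token ids are all made up for illustration, and the input names `input_ids` / `attention_mask` assume the exported signature still takes tokenizer outputs.

```python
import json


def build_predict_request(input_ids, attention_mask,
                          model_name="bert", host="localhost", port=8501):
    """Build the URL and JSON body for a TF Serving REST predict call.

    model_name, host and port are placeholders, not values from the tutorial.
    """
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = {
        # Name of the exported signature to call.
        "signature_name": "serving_default",
        # Row format: one dict per example, keyed by signature input name.
        "instances": [
            {"input_ids": ids, "attention_mask": mask}
            for ids, mask in zip(input_ids, attention_mask)
        ],
    }
    return url, json.dumps(body)


# Hypothetical token ids, standing in for real tokenizer output.
url, payload = build_predict_request([[101, 7592, 102]], [[1, 1, 1]])
print(url)
print(payload)
```

If the exported signature expected embeddings instead, the keys and value shapes in `instances` would have to match those inputs (floats of shape sequence length × hidden size rather than integer ids).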

lewtun · August 5, 2021, 8:03pm

cc @Rocketknight1

christopher · August 9, 2021, 8:17am

I missed the fact that the `serving` method is explicitly exported as a metagraph in the SavedModel for all TF models (see here).
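For anyone who wants to check this on their own export, here is a small sketch of how a `serving`-style function ends up as a named signature in a SavedModel and how to read its inputs back. `TinyModel` is a hypothetical stand-in built on `tf.Module`, not a transformers model, and its computation is a dummy; only the export and inspection pattern is the point.

```python
import tempfile

import tensorflow as tf


class TinyModel(tf.Module):
    """Hypothetical stand-in for a TF model that has a `serving` method."""

    @tf.function(input_signature=[{
        "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
        "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
    }])
    def serving(self, inputs):
        # Dummy computation: sum of the unmasked token ids per example.
        masked = inputs["input_ids"] * inputs["attention_mask"]
        return {"logits": tf.cast(tf.reduce_sum(masked, axis=-1), tf.float32)}


model = TinyModel()
export_dir = tempfile.mkdtemp()

# Export the traced `serving` function under the signature name that
# TF Serving calls by default.
tf.saved_model.save(model, export_dir,
                    signatures={"serving_default": model.serving})

# Reload and inspect what a model server would see.
loaded = tf.saved_model.load(export_dir)
serving_fn = loaded.signatures["serving_default"]
input_names = sorted(serving_fn.structured_input_signature[1].keys())
print(input_names)

# Signature functions are called with keyword arguments.
result = serving_fn(
    input_ids=tf.constant([[1, 2, 3]], dtype=tf.int32),
    attention_mask=tf.constant([[1, 1, 0]], dtype=tf.int32),
)
print(result["logits"].numpy())
```

The same information is available from the command line via `saved_model_cli show --dir <export_dir> --tag_set serve --signature_def serving_default`.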

EDIT:
One thing that’s still not completely clear to me is the redefined inputs. The tutorial changes the `serving` method to accept token embeddings, yet the requests to the model use the output of the tokenizer, i.e. token ids. What am I not seeing?
