Instances in TensorFlow Serving DialoGPT-large model

Hi. I have managed to successfully use the pretrained DialoGPT model in TensorFlow Serving (thanks to Merve). The REST API is up and running as it should. The issue occurs when I try to send data to it: when I pass in the example input (available in Hugging Face's API documentation under the conversational model section), I get the error "error": "Missing 'inputs' or 'instances' key". This is the part where I get confused.
In the tutorial I watched on YouTube it's stated that "instances" are basically inputs (video: tf serving tutorial | tensorflow serving tutorial | Deep Learning Tutorial 48 (Tensorflow, Python) - YouTube), but something else is written in the TensorFlow Serving documentation (RESTful API | TFX | TensorFlow). So my question is: how do I get the values that need to be passed under the "instances" key, and what are they? Are they in some file in the model's Hugging Face repository? Thank you in advance for answering. (And sorry that you have to copy-paste URLs, but the platform wouldn't let me post more than two links.) This is the code I've written in Node.js to test the REST API:

const axios = require('axios');

const ai_url = "http://localhost:8601/v1/models/dialogpt:predict";

const payload = {
    "instances": [],
    "inputs": {
        "past_user_inputs": ["Which movie is the best ?"],
        "generated_responses": ["It's Die Hard for sure."],
        "text": "Can you explain why ?"
    }
}
console.log(JSON.stringify(payload));

// note: axios sends the second argument directly as the JSON request body
axios.post(ai_url, payload)
    .then(function (response) {
        console.log("data: " + JSON.stringify(response.data));
    })
    .catch(function (error) {
        console.log(error);
        console.log(error.response.data.detail);
        console.log(error.response);
    });
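
For reference, this is how I understand the two accepted request body formats from the TF Serving REST docs, sketched here as Python dicts (the tensor names input_ids and attention_mask are just my guess, since I don't know this model's actual signature):

# row format: a list of per-example dicts under "instances"
row_payload = {
    "instances": [
        {"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1]}  # guessed tensor names
    ]
}

# column format: a single dict of named tensors under "inputs"
col_payload = {
    "inputs": {
        "input_ids": [[1, 2, 3]],
        "attention_mask": [[1, 1, 1]],
    }
}

I guess the real tensor names could be checked with saved_model_cli show --dir <model_dir> --all, but I'm not sure what to look for.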


Hello :wave:
Here you can find a neat tutorial on how to use Hugging Face models with TF Serving. As you guessed, the instances are the examples you want your model to run inference on:

# tokenize one sentence, turn the encoding into a plain dict,
# and wrap it in a list to build the "instances" payload
batch = tokenizer(sentence)
batch = dict(batch)
batch = [batch]
input_data = {"instances": batch}
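
You can then send input_data as the raw JSON body of the request. A minimal sketch with requests, reusing the URL from your post (note that the payload itself is the body, not nested under another key):

import requests

url = "http://localhost:8601/v1/models/dialogpt:predict"

# the payload itself is the request body; TF Serving looks for
# "instances" (or "inputs") at the top level of the JSON
response = requests.post(url, json=input_data)
print(response.json())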

Your payload works just fine in the Inference API, by the way. My guess is that you could move your inputs into the instances part and it would work (maybe try it as a list if it doesn't). Something like:

batch = [{"inputs": {
        "past_user_inputs": ["Which movie is the best ?"],
        "generated_responses": ["It's Die Hard for sure."],
        "text": "Can you explain why ?"
    }}]
input_data = {"instances": batch}

Let me know if it doesn’t work.


Hi :wave:
Thanks for helping me out. I figured I would use Hugging Face's Node.js tokenizers library to tokenize the input. Unfortunately, my Node.js code throws "TypeError: failed downcast to function" before I even get to try your solution. I get the error even when I'm using the example code from the Node.js tokenizers library:

import { BertWordPieceTokenizer } from "tokenizers";

const wordPieceTokenizer = await BertWordPieceTokenizer.fromOptions({ vocabFile: "./vocab.txt" });
const wpEncoded = await wordPieceTokenizer.encode("Who is John?", "John is a teacher");

Also, is the vocabFile in the example the vocab.json file or the merges.txt file from the model repository? And in what order should I tokenize the inputs: in conversation order, or "past_user_inputs" first, then "generated_responses", and lastly the "text" input? This is the part of the code where I get the error at tokenization:

let { BPETokenizer } = require("tokenizers");
let merges = 'path_to_file';
const ai_url = "http://localhost:8601/v1/models/dialogpt:predict";
async function api_request() {
    const sentence = "Can you explain why ?";
    const wordPieceTokenizer = await BPETokenizer.fromOptions({ vocabFile: merges });
    const wpEncoded = await wordPieceTokenizer.encode(sentence); // the error occurs on this line
    console.log(wpEncoded);
    const batch = wpEncoded;
    console.log("Encoded sentence: " + wpEncoded);
}

Thank you in advance for your help.

Update: for the tokenization issue I figured the best solution would be to just use Python, so I'm going to build a small REST API for tokenization and call it from my Node.js code; a rough sketch of what I have in mind is below, and I'll paste the final code here when it's done. I still need help with the question from my previous post about the order in which to tokenize the messages. Thanks again.
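
A minimal sketch of the tokenization service (FastAPI; the route, model name, and field name are just my placeholder choices, and I haven't run this yet):

from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoTokenizer

app = FastAPI()
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")

class TokenizeRequest(BaseModel):
    text: str

@app.post("/tokenize")
def tokenize(req: TokenizeRequest):
    # returns {"input_ids": [...], "attention_mask": [...]}
    return dict(tokenizer(req.text))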

Hello! From what I understand, looking at the documentation of the DialoGPT implementation in Hugging Face:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch


tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Let's chat for 5 lines
for step in range(5):
    # encode the new user input, add the eos_token and return a tensor in Pytorch
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)

    # pretty print the last output tokens from the bot
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))

You can just tokenize the user input, append the tokenized input to the end of chat_history_ids (on the first turn of the conversation there is no history yet), and keep appending it on each turn.
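
As for the ordering question: my reading of the convention above is that the turns are concatenated oldest-first, alternating user and bot, ending with the new message, and each turn is terminated by the EOS token. A minimal sketch (I haven't verified this against a TF Serving deployment):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")

past_user_inputs = ["Which movie is the best ?"]
generated_responses = ["It's Die Hard for sure."]
text = "Can you explain why ?"

# interleave the history oldest-first: user turn, then bot turn
turns = []
for user, bot in zip(past_user_inputs, generated_responses):
    turns.extend([user, bot])
turns.append(text)

# every turn, including the new message, ends with the EOS token
input_ids = tokenizer.encode(
    tokenizer.eos_token.join(turns) + tokenizer.eos_token,
    return_tensors="pt",
)

And about the vocabFile question: GPT-2-style BPE tokenizers (which DialoGPT uses) are defined by two files, vocab.json and merges.txt, so both are needed rather than one or the other.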