Two texts inputs for Text Classification in Inference API?

Hello!

I’ve trained a model to classify whether two texts are related, so I need to pass two independent strings as input.
Using the tokenizer in code I can pass two strings to the model, but I cannot reproduce that behaviour in the “Hosted Inference API” widget.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("my-model")
model = AutoModelForSequenceClassification.from_pretrained("my-model")

# Passing two separate strings makes the tokenizer encode them as a sentence pair
inputs = tokenizer('I love you', 'I like you', return_tensors='pt')
model(**inputs)

When the tokenizer is called with two separate strings, the token_type_ids distinguish between text 1 and text 2:

{'input_ids': tensor([[ 101, 1045, 2293, 2017,  102, 1045, 2066, 2017,  102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 1, 1, 1, 1]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1]])}

However, when passing a single string containing [SEP], this is not the case, and the behaviour matches what the “Inference API” produces:

{'input_ids': tensor([[ 101, 1045, 2293, 2017,  101, 1045, 2066, 2017,  102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1]])}

Would it be possible to add a second input text to the “Hosted Inference API” widget, in order to reproduce the behaviour that is possible in code?

Thanks!

Hello,

You’re right: the hosted text classification widget only takes one text input, while some text classification models take multiple text inputs. I opened a feature request for a more flexible behaviour. In the meantime, if you want this for demonstration purposes, you can create a Space based on your model.
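As a local workaround, recent versions of transformers accept a dict with "text" and "text_pair" keys in the text-classification pipeline, which is tokenized as a sentence pair (the same token_type_ids behaviour shown above). A minimal sketch, assuming such a transformers version; the make_pair_input helper is hypothetical and only illustrates the input shape:

```python
import json

def make_pair_input(text, text_pair):
    # Hypothetical helper: package two texts in the dict format that
    # transformers' text-classification pipeline can consume locally, e.g.
    #   classifier = pipeline("text-classification", model="my-model")
    #   classifier({"text": "I love you", "text_pair": "I like you"})
    return {"text": text, "text_pair": text_pair}

pair = make_pair_input("I love you", "I like you")
print(json.dumps(pair))
# → {"text": "I love you", "text_pair": "I like you"}
```

With this input shape the tokenizer receives the two texts as separate segments, rather than one concatenated string with [SEP] in it.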