Text Length FinBert - Serverless Inference Endpoint

Hi guys, I’m trying to send a long text (longer than 512 tokens) to a FinBERT model deployed on a serverless inference endpoint on AWS.
I’m receiving the following error: “The size of tensor a (639) must match the size of tensor b (512) at non-singleton dimension 1”.

I have a list of texts that I would like to classify without splitting them. How can I fix this?

Thank you in advance

The model has a max_sequence_length of 512. You can provide truncation=True as a parameter, e.g.

{
  "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
  "parameters": {
   "truncation": True
  }
}
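
For reference, this is roughly how that payload can be sent from a Lambda or any other boto3 client. It is a minimal sketch: the endpoint name is a placeholder, and json.dumps takes care of serializing the boolean correctly.

import json
import boto3

runtime = boto3.client("sagemaker-runtime")

payload = {
    "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
    "parameters": {"truncation": True},
}

response = runtime.invoke_endpoint(
    EndpointName="finbert-serverless-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))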


Hi @philschmid
I am testing several pre-trained models that I find on the Hub for text classification. Many have a max_length of 512. I deploy them to SageMaker Serverless Endpoints and invoke them from a Lambda.

Months ago you suggested that I use the truncation parameter… Now I was wondering:
if the text is longer, is it truncated? Do I then lose the information in the “excess” part, or is each chunk evaluated and a result produced for the whole document?

Is there a way to define a preprocessing operation to chunk the sentence in order to get a better evaluation?

Something like:

import torch
from transformers import BertTokenizer

# MODEL_ID and text are defined elsewhere
tokenizer = BertTokenizer.from_pretrained(MODEL_ID)
tokens = tokenizer.encode_plus(text, add_special_tokens=False, return_tensors='pt')

# split into chunks of 510 tokens so that [CLS] and [SEP] still fit within 512
input_id_chunks = list(tokens['input_ids'][0].split(510))
mask_chunks = list(tokens['attention_mask'][0].split(510))
chunksize = 512

for i in range(len(input_id_chunks)):
    # add the [CLS] (101) and [SEP] (102) special tokens around each chunk
    input_id_chunks[i] = torch.cat([
        torch.tensor([101]), input_id_chunks[i], torch.tensor([102])
    ])
    mask_chunks[i] = torch.cat([
        torch.tensor([1]), mask_chunks[i], torch.tensor([1])
    ])

    # pad the (last) chunk up to the full chunk size
    pad_len = chunksize - input_id_chunks[i].shape[0]
    if pad_len > 0:
        input_id_chunks[i] = torch.cat([
            input_id_chunks[i], torch.zeros(pad_len, dtype=torch.long)
        ])
        mask_chunks[i] = torch.cat([
            mask_chunks[i], torch.zeros(pad_len, dtype=torch.long)
        ])

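In case it helps, here is a hedged sketch of how the chunks built above could be scored and combined into a single document-level prediction. It assumes the model is loaded locally with AutoModelForSequenceClassification (rather than called through the endpoint) and simply averages the per-chunk probabilities.

# sketch (assumption): score each chunk locally and average the probabilities
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)

input_ids = torch.stack(input_id_chunks).long()       # (num_chunks, 512)
attention_mask = torch.stack(mask_chunks).long()      # (num_chunks, 512)

with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits

probs = torch.nn.functional.softmax(logits, dim=-1)   # per-chunk probabilities
doc_probs = probs.mean(dim=0)                         # simple mean over chunks
predicted_class = model.config.id2label[int(doc_probs.argmax())]
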
Can I also ask where I can find the parameters that can be passed to my inference endpoint as input? Is there a resource that you can link?

Hi @thanksfinance. Yes, if you use the truncation parameter the text will be truncated and you will lose the “excess” part.

However, in text classification this is rarely a problem because the model is often able to determine the class using just the first 512 tokens. Do you see a significant deterioration in your metrics when using the truncation parameter?

If so, you might indeed want to do some preprocessing. I’m not entirely sure what your code does, but this thread discusses a similar issue and potential options for solving this challenge.
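
One option for doing that preprocessing on the endpoint itself (this is my own sketch, not something discussed above) is to deploy the model with a custom code/inference.py and override the handler functions of the SageMaker Hugging Face Inference Toolkit, chunking the text with the tokenizer’s overflow support and averaging the chunk predictions:

# code/inference.py -- hedged sketch of a custom handler; the chunking and
# aggregation strategy are illustrative assumptions, not the toolkit defaults
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def model_fn(model_dir):
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    model.eval()
    return model, tokenizer

def predict_fn(data, model_and_tokenizer):
    model, tokenizer = model_and_tokenizer
    text = data.pop("inputs")
    # tokenize with overflow so long documents are split into overlapping chunks
    encoded = tokenizer(
        text,
        truncation=True,
        max_length=512,
        stride=50,
        return_overflowing_tokens=True,
        padding="max_length",
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(
            input_ids=encoded["input_ids"],
            attention_mask=encoded["attention_mask"],
        ).logits
    # average the per-chunk probabilities into one document-level prediction
    probs = torch.nn.functional.softmax(logits, dim=-1).mean(dim=0)
    label_id = int(probs.argmax())
    return {"label": model.config.id2label[label_id], "score": float(probs[label_id])}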