Multiple texts as inputs to Transformers models

I would like to use multiple texts as inputs to a model. Say I have a dataset with 10 columns, where each column is a text (a sentence or two). How can I feed all of these inputs to the model and do classification, for example?
I can see it’s possible to just concatenate all the texts into one, but it seems to me that I would need a very large dataset to be able to achieve good accuracy.
Maybe use multiple models (BERT) in parallel, take the last hidden state of each, concatenate them, and classify? But the problem is that there are so many values, on the order of 30 texts.

Any ideas on how to tackle this?

You should take the same approach as extractive text summarization:

Concatenate all your sentences, separated with a special token (CLS for example), then use the CLS token representation to do classification.

From the PreSumm paper
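
To make the input layout concrete, here is a minimal sketch of how the texts could be joined, with each one wrapped as `[CLS] ... [SEP]`. The function name is hypothetical, and whitespace splitting only stands in for a real subword tokenizer; this illustrates the layout and is not the paper's code.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"  # BERT's special tokens


def build_multi_cls_input(texts: List[str]) -> Tuple[List[str], List[int]]:
    """Join several texts into one sequence, each wrapped as [CLS] ... [SEP].

    Returns the tokens plus the index of every [CLS], so the encoder output
    at those indices can serve as one vector per text.
    Assumption: str.split() is a placeholder for a real subword tokenizer.
    """
    tokens: List[str] = []
    cls_positions: List[int] = []
    for text in texts:
        cls_positions.append(len(tokens))  # where this text's [CLS] will sit
        tokens.append(CLS)
        tokens.extend(text.split())
        tokens.append(SEP)
    return tokens, cls_positions


# Two example "columns" from one row of the dataset (made-up data).
columns = ["the product arrived late", "support was helpful"]
tokens, cls_positions = build_multi_cls_input(columns)
```

After encoding, the hidden states at `cls_positions` give one vector per column, which can be pooled or concatenated before the classifier. Keep in mind that standard BERT checkpoints accept at most 512 tokens, so long or numerous columns may need truncating.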

Hi @colanim, thank you for your reply.
I understand what you suggested. The problem is that my inputs are not only texts: I also have some float values. Would converting these values to text be sufficient?

I see…

I have never encountered this case myself, but maybe you can feed the float values directly into the final classifier?
Since they are not text, there is no need for BERT to encode them (?)
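
A rough sketch of that idea in PyTorch, assuming the text has already been encoded into a fixed-size vector (e.g. a 768-dimensional `[CLS]` embedding). The class name, layer sizes, and random tensors are placeholders for illustration, not a tested recipe.

```python
import torch
from torch import nn


class TextPlusNumericClassifier(nn.Module):
    """Classifier head over [text embedding ; float features] (hypothetical name)."""

    def __init__(self, text_dim: int, num_numeric: int, num_classes: int, hidden: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + num_numeric, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, text_embedding: torch.Tensor, numeric_features: torch.Tensor) -> torch.Tensor:
        # Concatenate along the feature dimension; BERT never sees the floats.
        fused = torch.cat([text_embedding, numeric_features], dim=-1)
        return self.head(fused)


torch.manual_seed(0)
text_embedding = torch.randn(4, 768)   # placeholder for BERT [CLS] vectors, batch of 4
numeric_features = torch.randn(4, 3)   # placeholder for 3 float columns per example
model = TextPlusNumericClassifier(text_dim=768, num_numeric=3, num_classes=2)
logits = model(text_embedding, numeric_features)  # one row of class scores per example
```

In practice it probably helps to standardize the float columns first, so their scale is comparable to the embedding values.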

Hi @Zack

I don’t know whether you’ve tried or considered the multimodal toolkit (blog post, GitHub). It takes in tabular data (text, numbers, categorical data) and can use them as inputs to develop models.

I haven’t tried it myself, but it looks quite promising.