What are some recommended pretrained models for extracting semantic features from a single sentence?

Hi, I am more of a CV guy and recently got interested in doing an NLP project.

In this project, one part might involve extracting sentence-level semantic representations from a pretrained model.

In computer vision, a standard way to extract features from an image or a video snippet is to use a ResNet pretrained on ImageNet or an I3D pretrained on Kinetics, respectively.

I want to do a similar thing in the NLP domain. I wonder if there are recommended models, pretrained on specific datasets, for me to try?

To my limited understanding, models trained on datasets that aim to tell whether two sentences are semantically equivalent could be a direction (e.g. QQP, STS-B). But those need a pair of sentences, while my case is feeding just one sentence (or one block of sentences), not a pair. Any suggestions? Thanks!


Hi! IMO, BERT could be comparable to ResNet as the baseline: you can pool the last_hidden_state output of BertModel just like the global-pooled features of a ResNet. Then newer models like RoBERTa and many more could be comparable to EfficientNet, etc. A minimal sketch is below.
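For instance, something like this (assuming the bert-base-uncased checkpoint and mean pooling over tokens, both of which are just my own choices here):

```python
import torch
from transformers import BertModel, BertTokenizer

# bert-base-uncased is assumed here as a common baseline checkpoint
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("A single sentence to embed.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, seq_len, hidden);
# mean-pooling over tokens is analogous to global average pooling in a ResNet
sentence_embedding = outputs.last_hidden_state.mean(dim=1)  # (1, 768)
```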


Seems like you are looking for the Sentence Transformers library, which trains Siamese BERT (etc.) networks on NLI data. That means you can indeed pass a single sentence to get a sentence embedding. They also have a few finetuned models that use cross-encoders instead; those are obviously slower but lead to better performance on downstream tasks such as STS-B.
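A minimal usage sketch (bert-base-nli-mean-tokens is just one example checkpoint from that library, not the only option):

```python
from sentence_transformers import SentenceTransformer

# One example bi-encoder checkpoint from the Sentence Transformers library;
# other pretrained models from the library work the same way
model = SentenceTransformer("bert-base-nli-mean-tokens")

# encode() takes single sentences directly, no pair format needed
embeddings = model.encode([
    "This is a single sentence.",
    "Or a whole block of sentences at once.",
])
print(embeddings.shape)  # e.g. (2, 768)
```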


Thanks for the reply. It also seems Sentence-BERT, LaBSE, and Universal Sentence Encoder are some other choices for sentence embeddings.

Benchmark-wise, I have a new idea: SuperGLUE is one of the most difficult (multi-task) benchmarks for language understanding, and T5 is the current SOTA on it, so we could also try embedding vectors from T5.

Previously this was not straightforward to extract (since T5 is an encoder-decoder model), but the latest master version of Hugging Face Transformers now contains a T5 encoder-only model, from which we can directly extract the vectors of the pretrained model (thanks to @agemagician). So this is an interesting choice IMO 🙂
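A minimal sketch with the encoder-only class (the t5-base checkpoint and mean pooling are my own choices here, not prescribed by the thread):

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

# t5-base is assumed here; larger checkpoints should work the same way
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5EncoderModel.from_pretrained("t5-base")  # loads only the encoder stack
model.eval()

inputs = tokenizer("A single sentence to embed.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pool the encoder hidden states into one fixed-size sentence vector
sentence_embedding = outputs.last_hidden_state.mean(dim=1)
```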