Open-sourcing better cross-encoders for STILTS and better IR?

Hi @nreimers,

I find your research on bi-encoders and models on sbert.net super helpful. Based on your research I understand that cross-encoders generally perform better than bi-encoders, while their main disadvantage is computational speed.

I’m very interested in deepening my research on cross-encoders, but I noticed that you’ve published comparatively few of them here: cross-encoder (Sentence Transformers - Cross-Encoders).

My question: Would you consider publishing improved cross-encoders, trained either on your paraphrase data or on the ‘all’ data from the FLAX event (‘all-mpnet…’ etc.)?

I feel like this would add a lot of value for the HF and research communities, because:
- Improved cross-encoders trained on more diverse data could serve as strong STILTS models for sequential transfer learning applications (see https://arxiv.org/pdf/1811.01088.pdf).
- Your bi-encoders are probably already good STILTS, but I imagine that cross-encoders would be even better. Using these intermediate models for task-specific fine-tuning would probably be a super easy way for people to get improved performance on many tasks - just by taking your cross-encoder as the base model instead of BERT-base etc.
- Having high-performance cross-encoders would also be useful for implementing BM25 & cross-encoder reranking for information retrieval applications etc.
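
To make the last point concrete, here is a minimal sketch of what I mean by BM25 plus cross-encoder reranking. It is only an illustration under assumptions: `bm25_scores`, `retrieve_and_rerank` and `overlap_scorer` are names I made up, the BM25 is a small hand-written Okapi variant with default-looking `k1`/`b` values, and `overlap_scorer` is a toy stand-in for a real cross-encoder.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use something better.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a small Okapi BM25 variant."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve_and_rerank(query, docs, pair_scorer, top_k=3):
    """Stage 1: cheap BM25 retrieval. Stage 2: rerank the top_k candidates
    with a scorer that sees (query, document) pairs jointly."""
    first_stage = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: first_stage[i], reverse=True)[:top_k]
    pairs = [(query, docs[i]) for i in candidates]
    rerank_scores = pair_scorer(pairs)
    order = sorted(zip(candidates, rerank_scores), key=lambda x: x[1], reverse=True)
    return [docs[i] for i, _ in order]


def overlap_scorer(pairs):
    """Hypothetical stand-in for a cross-encoder: Jaccard overlap of tokens."""
    scores = []
    for query, doc in pairs:
        q, d = set(tokenize(query)), set(tokenize(doc))
        scores.append(len(q & d) / len(q | d))
    return scores


docs = [
    "cross encoders score a query and document jointly",
    "bi encoders embed each text independently",
    "bm25 is a lexical ranking function",
    "cats sleep most of the day",
]
query = "how do cross encoders score a document"
reranked = retrieve_and_rerank(query, docs, overlap_scorer, top_k=2)
```

In a real setup I would expect `pair_scorer` to be something like the `predict` method of a `CrossEncoder` from sentence-transformers, which takes a list of text pairs; the toy scorer is only there so the sketch runs without downloading a model.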

Would you consider publishing improved cross-encoders?
(Maybe there are technical reasons why your paraphrase or ‘all’ data cannot be used for cross-encoders, and that’s why none have been published with this data?)

Best,
Moritz

Hi,
Happy to hear that :)

Better cross-encoders trained on larger datasets are on my agenda. However, training is not so straightforward. For bi-encoders, you can use the other examples in a batch as negatives.
For cross-encoders, you have to create the negative pairs explicitly, and how those pairs are created plays an extremely important role.
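
To illustrate the difference, here is a rough sketch (not the actual training code; `in_batch_labels` and `build_cross_encoder_pairs` are hypothetical helper names). The first function shows the implicit labels a bi-encoder gets for free from a batch; the second shows that a cross-encoder needs an explicit list of labeled pairs. Random sampling is assumed here only because it is the simplest strategy to show, and the choice of strategy is exactly the part that needs care.

```python
import random


def in_batch_labels(batch_size):
    """Bi-encoder view: row i is anchor i, column j is positive j.
    Only the diagonal is a match; every other column in the batch
    acts as a negative without any extra sampling."""
    return [[1 if i == j else 0 for j in range(batch_size)] for i in range(batch_size)]


def build_cross_encoder_pairs(positives, num_negatives=2, seed=0):
    """Cross-encoder view: every training example is an explicit
    (text_a, text_b, label) triple, so negatives must be constructed.
    Here they are drawn at random from the other positives (an assumption)."""
    rng = random.Random(seed)
    candidates = [passage for _, passage in positives]
    examples = []
    for anchor, positive in positives:
        examples.append((anchor, positive, 1))
        # Never use the anchor's own positive as a negative.
        pool = [c for c in candidates if c != positive]
        for negative in rng.sample(pool, min(num_negatives, len(pool))):
            examples.append((anchor, negative, 0))
    return examples


positives = [
    ("what is bm25", "bm25 is a lexical ranking function"),
    ("what is a bi-encoder", "a bi-encoder embeds each text independently"),
    ("what is a cross-encoder", "a cross-encoder scores a text pair jointly"),
    ("what is stilts", "stilts is intermediate-task fine-tuning"),
]
labels = in_batch_labels(len(positives))
examples = build_cross_encoder_pairs(positives, num_negatives=2)
```

With four positive pairs and two negatives each, this yields twelve explicit examples, whereas the bi-encoder batch needs only the four pairs themselves.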

I hope I will soon be able to train these models. But setting up the training etc. takes some effort.

Best
Nils

Great, happy to hear that this is on your agenda. This will be a great addition to the hub!