The Category of Model-based Translation Evaluation Methods

Hi there!

I'm trying to find out whether there is an existing (or closely related) model category that fits model-based metric and quality estimation (QE) methods such as COMET and TransQuest.

The model architecture has two main parts: a fine-tuned pretrained language model such as BERT or XLM-R, and a custom multi-layer perceptron (MLP) on top of it. The final output of the metric/QE model is a single scalar value.

I noticed that BLEURT uses a BERTForSequenceClassification model as its initialization. However, that implementation has only one linear layer in its MLP module. In other approaches such as COMET, this module may contain several linear layers, with an activation applied between each adjacent pair.
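
To make the difference concrete, here is a minimal, framework-free sketch of the two kinds of head. It is only an illustration under my own assumptions, not the actual BLEURT or COMET code: the names `build_head` and `score`, the tanh activation, the layer sizes, and the toy pooled vector are all hypothetical. In a real model the layers would be framework modules (e.g. `torch.nn.Linear`) and the input would be the pooled output of the encoder.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return (weights, bias) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def apply_linear(layer, x):
    """Compute W @ x + b for a single input vector x."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def build_head(input_size, hidden_sizes, seed=0):
    """Build a scalar regression head (hypothetical helper).

    hidden_sizes=[] gives a single linear layer;
    a non-empty list gives several stacked linear layers.
    """
    rng = random.Random(seed)
    sizes = [input_size] + list(hidden_sizes) + [1]
    return [make_linear(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def score(head, pooled):
    """Map a pooled sentence embedding to a single scalar value."""
    x = pooled
    for i, layer in enumerate(head):
        x = apply_linear(layer, x)
        if i < len(head) - 1:  # activation only between adjacent linear layers
            x = [math.tanh(v) for v in x]
    return x[0]


# Toy stand-in for the pooled encoder output (assumed, not real model output).
pooled = [0.5, -0.2, 0.1, 0.9]

single_layer_head = build_head(4, [])      # one linear layer
stacked_head = build_head(4, [8, 4])       # three linear layers, two activations

print(len(single_layer_head), score(single_layer_head, pooled))
print(len(stacked_head), score(stacked_head, pooled))
```

Both heads take the same input and return one scalar; the only structural difference is the number of linear layers and the activations between them, which is exactly the part I can't map onto an existing model category.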

Anyone got a clue? Thanks!