Request for a Text Regression Pipeline/Application

Hi there,

Recently I’ve been working on publishing my model to the Hugging Face Hub. However, I found that my model doesn’t fit the existing model hub, as there is no pipeline for the text regression task.

Specifically, the text regression task takes a text sequence as input and outputs a scalar value. The model architecture consists of a pretrained language model (e.g. BERT), a pooling layer (e.g. average pooling), and a multi-layer perceptron (e.g. one or more linear submodules with tanh activations in between).
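A minimal PyTorch sketch of this kind of architecture might look like the following. The class name `TextRegressor`, the layer sizes, and the toy `nn.Embedding` encoder are assumptions for illustration only; the embedding layer is a stand-in for a pretrained language model such as BERT so that the snippet runs without downloading any weights.

```python
import torch
from torch import nn


class TextRegressor(nn.Module):
    """Encoder -> masked average pooling -> MLP -> one scalar per sequence."""

    def __init__(self, encoder: nn.Module, hidden_size: int, mlp_size: int = 64):
        super().__init__()
        self.encoder = encoder
        # Two linear layers with a tanh activation in between (sizes are illustrative).
        self.head = nn.Sequential(
            nn.Linear(hidden_size, mlp_size),
            nn.Tanh(),
            nn.Linear(mlp_size, 1),
        )

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids)  # (batch, seq_len, hidden_size)
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
        # Average only over real tokens; padded positions are masked out.
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.head(pooled).squeeze(-1)  # (batch,)


torch.manual_seed(0)
# Hypothetical stand-in encoder; a real model would plug in a pretrained LM here.
toy_encoder = nn.Embedding(num_embeddings=100, embedding_dim=32)
model = TextRegressor(toy_encoder, hidden_size=32)

input_ids = torch.tensor([[5, 8, 13, 0], [7, 21, 0, 0]])
attention_mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
scores = model(input_ids, attention_mask)
print(scores.shape)  # one score per input sequence
```

The pooling step uses the attention mask so that padding tokens do not influence the predicted score.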

The most closely related pipeline is Text Classification; the main difference lies in the output. Text classification methods output a value in a discrete space (e.g. positive or negative for binary classification in sentiment analysis).

For my use case, I need to predict translation quality by outputting a scalar value (e.g. as BLEURT does):

  1. The output is in a continuous space;
  2. The multi-layer perceptron consists of multiple linear modules, whereas the existing Text Classification pipeline/model generally contains only one linear module.
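The two differences can be illustrated side by side with a small PyTorch sketch. The tensor `pooled`, the head sizes, the target values, and the choice of mean squared error as the training loss are all assumptions made for the example, not details of any existing pipeline.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Hypothetical pooled sentence vectors for a batch of 3 inputs.
pooled = torch.randn(3, 32)

# Classification: a single linear layer, then argmax over class logits,
# so the prediction lives in a discrete space.
clf_head = nn.Linear(32, 2)
labels = clf_head(pooled).argmax(dim=-1)  # integer class ids, each 0 or 1

# Regression: several linear layers with tanh in between,
# so the prediction is one real-valued score per input.
reg_head = nn.Sequential(nn.Linear(32, 16), nn.Tanh(), nn.Linear(16, 1))
scores = reg_head(pooled).squeeze(-1)  # continuous floats

# Illustrative quality targets and an MSE loss (assumed, not prescribed).
targets = torch.tensor([0.1, 0.7, 0.4])
loss = nn.functional.mse_loss(scores, targets)
print(labels.dtype, scores.dtype)
```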

Related topics can be found here.