BERT (CamemBERT) for Sequence Classification maps any sequence to the exact same encoding

Hi,

I’m trying to fine-tune a sequence regression model using BERT for French, i.e. CamembertForSequenceClassification.

I simply load the model and tokenizer, passing num_labels=1 to run a regression task (my labels are actually either 0 or 1; I tried both a classification task with 2 labels and a regression task with 1 label, with the same issue):

import transformers

# num_labels=1 -> a single regression output
model = transformers.CamembertForSequenceClassification.from_pretrained("camembert-base", num_labels=1)
tokenizer = transformers.AutoTokenizer.from_pretrained("camembert-base")

def tokenize_bert(texts):
    tokenizer_kwargs = {
        "truncation": True,
        "max_length": 512,
        "padding": True,
        "return_tensors": "pt",
    }
    return tokenizer(
        texts,
        **tokenizer_kwargs,
    )
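
For example (the two sentences here are just placeholders), the helper returns a padded batch of tensors:

batch = tokenize_bert(["Ceci est une phrase.", "Une autre phrase, un peu plus longue."])
print(batch["input_ids"].shape)       # torch.Size([2, length of the longer sequence])
print(batch["attention_mask"].shape)  # same shape, with 0s over the padding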

But at inference time all output logits are exactly the same, and I figured out that the encoder outputs the exact same representation for any sequence (the snippet after the list below shows roughly how I check each point):

  1. I check that the tokenized sequences are not the same.

  2. I check that the embeddings are actually different.

  3. But then I notice that the encoder output is exactly the same for all inputs.
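
Here is roughly how I run these checks (a minimal sketch; the two example sentences are placeholders, and I go through the model internals, model.roberta, to get at the embeddings and the encoder output):

import torch

texts = ["Le film était excellent.", "Quel horrible service, je suis très déçu."]
batch = tokenize_bert(texts)

model.eval()
with torch.no_grad():
    # 1. The token ids differ between the two sequences.
    print(batch["input_ids"])

    # 2. The embedding outputs differ as well.
    embeddings = model.roberta.embeddings(batch["input_ids"])
    print(torch.allclose(embeddings[0], embeddings[1]))  # False, as expected

    # 3. But the encoder output (last hidden state) is identical for both inputs...
    hidden = model.roberta(**batch).last_hidden_state
    print(torch.allclose(hidden[0], hidden[1]))  # True -- this is the problem

    # ...so the classification head produces identical logits too.
    print(model(**batch).logits)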

Any idea how I could fix my training (more than 50k steps, with mostly default parameters) to make it work?
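
For completeness, this is roughly the training setup (a sketch: train_texts, train_labels, the RegressionDataset wrapper, and the batch size are placeholders; apart from max_steps I keep the Trainer defaults, e.g. learning rate 5e-5 with AdamW):

import torch
from transformers import Trainer, TrainingArguments

# Placeholder dataset wrapper: pairs the tokenized texts with float labels
# (float targets + num_labels=1 make the model use an MSE regression loss).
class RegressionDataset(torch.utils.data.Dataset):
    def __init__(self, texts, labels):
        self.encodings = tokenize_bert(texts)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        item = {key: val[idx] for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx], dtype=torch.float)
        return item

train_dataset = RegressionDataset(train_texts, train_labels)  # placeholders

training_args = TrainingArguments(
    output_dir="camembert-regression",
    max_steps=50_000,               # the ~50k steps mentioned above
    per_device_train_batch_size=8,  # placeholder; everything else left at the defaults
)

trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()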