SST-2 dataset labels look wrong

Hello all,

I feel like this is a stupid question, but I can't figure it out.

I was looking at the GLUE SST-2 dataset through the Hugging Face datasets viewer, and all the labels for the test set are -1.

They are 0 and 1 for the training and validation sets, but -1 for every example in the test set.

Shouldn’t the test labels match the training labels? What am I missing?

GLUE is a benchmark, so the true test labels are hidden and known only to its creators. The -1 values are placeholders, not real labels.

One can submit predictions for the test set to the official website, where they are scored against the hidden labels. That is how the leaderboard of the best-performing models is built.
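If you want to confirm this yourself rather than rely on the viewer, here is a minimal sketch. The helper names (`label_counts`, `is_unlabeled`, `inspect_glue_sst2`) are made up for illustration, and it assumes the `datasets` library is installed and that the dataset loads under the name `"glue"` with config `"sst2"`.

```python
from collections import Counter


def label_counts(labels):
    """Count how often each label value occurs in a split."""
    return dict(Counter(labels))


def is_unlabeled(labels, placeholder=-1):
    """Return True if the split is non-empty and every label is the placeholder."""
    labels = list(labels)
    return len(labels) > 0 and all(label == placeholder for label in labels)


def inspect_glue_sst2():
    """Print label statistics per split (hypothetical helper; needs network access)."""
    # Imported here so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("glue", "sst2")  # assumed dataset/config names
    for split in ("train", "validation", "test"):
        labels = list(dataset[split]["label"])
        print(split, label_counts(labels), "unlabeled:", is_unlabeled(labels))
```

Calling `inspect_glue_sst2()` should show a mix of 0 and 1 for train and validation and only -1 for test. For local evaluation, use the validation split, since the test split cannot be scored offline.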

Thank you!