Fine-tuning a model for Stack Exchange

Hi everyone, I am trying to fine-tune a BERT model that takes Stack Exchange answers, along with their scores, as training data and predicts at test time whether an answer is good or bad. Any ideas on how to build this model, or any useful articles that might help? The first problem I would like to solve is how to train on the numeric scores directly instead of on class labels. Thank you :smile:
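
To make the last point concrete, here is a rough sketch of what I mean by training on scores rather than labels. It only covers the target preparation, not the BERT part. Everything in it is an assumption for illustration: the example answers and scores are made up, the function names `score_to_target` and `target_to_label` are hypothetical, and the signed log transform and the threshold of 0 are just one possible choice.

```python
import math

def score_to_target(score: int) -> float:
    """Compress a raw answer score into a regression target.

    Uses a signed log transform, sign(s) * log(1 + |s|), so that a few
    very highly voted answers do not dominate the loss. This transform
    is an assumption, not something Stack Exchange prescribes.
    """
    return math.copysign(math.log1p(abs(score)), score)

def target_to_label(target: float, threshold: float = 0.0) -> str:
    """Turn a (predicted) target back into 'good' or 'bad'.

    The threshold is a hypothetical cut-off that would need tuning.
    """
    return "good" if target > threshold else "bad"

# Made-up (answer text, score) pairs standing in for real data.
answers = [
    ("Use a virtual environment and pin your dependencies.", 42),
    ("Just reinstall your OS.", -3),
    ("Try installing with the --user flag.", 0),
]

# Float targets the model would be trained to predict.
targets = [score_to_target(score) for _, score in answers]

# At test time, the predicted float would be thresholded like this.
labels = [target_to_label(t) for t in targets]
```

If I understand the Hugging Face Transformers docs correctly, loading `AutoModelForSequenceClassification` with `num_labels=1` gives a single-output regression head trained with mean squared error, so float targets like these could be passed as the labels. I have not tried this yet, so corrections are welcome.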