How do I finetune Llama-3-8B to predict a float value?

Hi, my task is to train a model on statements that each contain a masked token. For each statement I have sample answers, each with a “rating” from -1 to 1 indicating how good the answer is for that statement. I want to finetune Llama-3-8B to predict this rating when given a statement and a sample answer. Since I am trying to predict a float value, do I have to change anything in the model? All the examples I’ve seen so far are completion tasks that output strings. Should I just instruct the model to give a rating from -1 to 1 as part of the prompt, or should I replace the last layer with a regression head? If the latter, how would I do that?
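
To make the first option concrete, here is a rough sketch of what I mean by instructing the model to output the rating as text. It only covers the data side: building the prompt, formatting the training target, and parsing the number back out of a completion. The template wording and the helper names (`build_prompt`, `format_target`, `parse_rating`) are my own assumptions, not anything from a library, and the model call itself is left out.

```python
import re
from typing import Optional

# Hypothetical prompt template; the wording is an assumption, not a tested recipe.
PROMPT_TEMPLATE = (
    "Rate how well the answer fills the masked token in the statement.\n"
    "Reply with a single number between -1 and 1.\n\n"
    "Statement: {statement}\n"
    "Answer: {answer}\n"
    "Rating:"
)

# Optional sign followed by a decimal number such as 0.7, -1, or .25
_NUMBER = re.compile(r"[-+]?(?:\d+\.\d*|\.\d+|\d+)")


def build_prompt(statement: str, answer: str) -> str:
    """Fill the template with one statement/answer pair."""
    return PROMPT_TEMPLATE.format(statement=statement, answer=answer)


def format_target(rating: float) -> str:
    """Completion string to train on: fixed two decimals for a consistent format."""
    return f" {rating:.2f}"


def parse_rating(completion: str) -> Optional[float]:
    """Return the first number in the completion, clamped to [-1, 1].

    Returns None when the completion contains no number at all.
    """
    match = _NUMBER.search(completion)
    if match is None:
        return None
    value = float(match.group())
    return max(-1.0, min(1.0, value))


if __name__ == "__main__":
    prompt = build_prompt("The sky is [MASK].", "blue")
    print(prompt + format_target(0.9))
    print(parse_rating(" -0.35"))
```

For the second option, my understanding is that loading the model through Hugging Face `AutoModelForSequenceClassification` with `num_labels=1` gives a single-output linear head trained with MSE loss, but I have not confirmed how well that works with Llama-3-8B or whether the output needs to be squashed to stay within -1 to 1.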
