Llama 2 fine-tuning general questions (tokenizer, compute_metrics, labels)

I was working with Flan-T5 for weeks and everything was fine. Now I am switching to Llama 2, which brings many unfamiliar changes; I know one difference is that Flan-T5 is a seq2seq (encoder-decoder) model while Llama 2 is decoder-only. I have three general questions, and thanks in advance for any advice:

  1. I have seen two different ways to prepare the data. One way is to wrap each sample in special tokens and build a single text field, like this:
        instruction = f"[INST] {sample['Instruction']}: Question: {sample['Question']} [/INST]"
        response = f"Answer: {cleaned_response}"
        sample["text"] = instruction + response + tokenizer.eos_token
     with no explicit tokenization, passing the result to SFTTrainer. The other way is to do the text preprocessing myself, producing 'input_ids' and 'attention_mask', and to use Trainer instead of SFTTrainer. Are both approaches valid? How does one differ from the other? (I sketched both setups below.)
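     To make the comparison concrete, here is a minimal sketch of both setups as I understand them (model name, dataset contents, and hyperparameters are placeholders; the SFTTrainer keyword arguments follow the TRL API from around the Llama 2 release and may have moved to SFTConfig in newer versions):
        from datasets import Dataset
        from transformers import (AutoModelForCausalLM, AutoTokenizer,
                                  DataCollatorForLanguageModeling, Trainer,
                                  TrainingArguments)
        from trl import SFTTrainer

        model_name = "meta-llama/Llama-2-7b-hf"    # placeholder checkpoint
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token
        model = AutoModelForCausalLM.from_pretrained(model_name)
        args = TrainingArguments(output_dir="out", per_device_train_batch_size=2)

        # toy dataset with a "text" column, built the same way as above
        dataset = Dataset.from_dict(
            {"text": ["[INST] Summarize: Question: ... [/INST] Answer: ..."
                      + tokenizer.eos_token]})

        # Way 1: hand SFTTrainer the raw "text" column; it tokenizes internally.
        sft_trainer = SFTTrainer(
            model=model,
            tokenizer=tokenizer,
            train_dataset=dataset,
            dataset_text_field="text",
            max_seq_length=512,
            args=args,
        )

        # Way 2: tokenize myself and use the plain Trainer; the collator copies
        # input_ids into labels (the causal shift happens inside the model).
        tokenized = dataset.map(
            lambda s: tokenizer(s["text"], truncation=True, max_length=512),
            remove_columns=dataset.column_names,
        )
        collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
        trainer = Trainer(model=model, train_dataset=tokenized,
                          data_collator=collator, args=args)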

  2. Using the default loss seemed fine, but when I tried to add compute_metrics (e.g., a custom metric like ROUGE), it always raised errors. When I printed the predictions and labels from predictions, labels = eval_pred, I got decimal and negative values (see below), which seem wrong (doing the same thing with Flan-T5 gave integers):
        [[[ -5.887479   7.898579   3.145651  ... -2.376532   -2.9236283 -4.825788 ]
         [ -5.887479   7.898579   3.145651  ... -2.376532   -2.9236283 -4.825788 ]
         [ -6.1026587  6.603288   2.057882  ... -6.1850896  -5.1317277 -2.7486215]
         ...
         [ -3.0410564 -0.925742   1.1579411 ... -2.830701   -3.4802294 -2.2513897]
         [ -3.037336  -0.927284   1.1546779 ... -2.8302677  -3.476878  -2.2484875]
         [ -2.997214  -0.9113009  1.1302392 ... -2.8121407  -3.4593928 -2.2332683]]
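     For reference, here is roughly the compute_metrics I am trying (reusing the tokenizer from the sketch under question 1; the metric choice is just an example). I added an argmax because the predictions look like raw logits of shape (batch, seq_len, vocab_size), but I am not sure that is the intended fix:
        import numpy as np
        import evaluate

        rouge = evaluate.load("rouge")

        def compute_metrics(eval_pred):
            predictions, labels = eval_pred
            # predictions arrive as raw logits (batch, seq_len, vocab_size),
            # so take the argmax over the vocab axis to recover token ids
            if predictions.ndim == 3:
                predictions = np.argmax(predictions, axis=-1)
            # labels use -100 for ignored positions; swap those out before decoding
            labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
            decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
            decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
            return rouge.compute(predictions=decoded_preds, references=decoded_labels)
     I have also seen preprocess_logits_for_metrics mentioned as a way to do the argmax before the full logits are accumulated in memory; is that the better route here?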

  3. In Flan-T5 we defined the question and answer separately, so input_ids were the tokenized question and labels were the tokenized answer. For Llama 2, the examples I have seen put everything together in one string. Does that mean that during fine-tuning the model learns by itself (self-supervised next-token prediction), with input_ids and labels being the same? And when the model is evaluated on a test dataset, are input_ids and labels different? (A small sketch of my understanding follows.)
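     To check my understanding, here is what I think happens to the labels in the decoder-only setup, reusing the collator and tokenized dataset from the sketch under question 1; please correct me if this is wrong:
        # With DataCollatorForLanguageModeling(tokenizer, mlm=False), labels are
        # a copy of input_ids, with positions equal to the pad token set to -100;
        # the model shifts them one step internally, so every token is trained
        # to predict the next token of the same concatenated string.
        batch = collator([tokenized[0]])
        print(batch["input_ids"][0])
        print(batch["labels"][0])  # same ids as input_ids (pad positions -> -100)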

Thanks again; lots of confusion on my side!