Inquiry about adding a new layer to a transformer model

Hi there,

I am new to transformers.
While fine-tuning bloom-560m on phishing emails, I am trying to assign one label to the whole email:

def tokenizeInputs(inputs):
    tokenized_inputs = tokenizer(inputs["email"], max_length=512, truncation=True)
    word_ids = tokenized_inputs.word_ids()
    label = inputs["label"]  # phishing or not
    tokenized_inputs["labels"] = [label]
    return tokenized_inputs

So each example should end up as the whole email paired with a single label, right?
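
In case it helps, this is roughly how I run the tokenization over the data (just a sketch; my_dataset is a placeholder name for my actual datasets.Dataset, which has "email" and "label" columns):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")

# my_dataset is a placeholder; in reality it is a datasets.Dataset
# with "email" (str) and "label" (int) columns.
tokenized_dataset = my_dataset.map(tokenizeInputs, remove_columns=["email"])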

But after training, when I try to get the output:

import torch

inputs = tokenizer(
    # "HuggingFace is a company based in Paris and New York",
    "Thank you Katie.\nI will be with David as well.\n",
    add_special_tokens=False,
    return_tensors="pt",
)
# inputs = tokenizer(example["email"])
with torch.no_grad():
    logits = model_tuned(**inputs).logits
print(logits)
predicted_token_class_ids = logits.argmax(-1)
print(predicted_token_class_ids[0])
# Note that tokens are classified rather than input words, which means
# there might be more predicted token classes than words.
# Multiple token classes might account for the same word.
predicted_tokens_classes = [model_tuned.config.id2label[t.item()] for t in predicted_token_class_ids[0]]
predicted_tokens_classes

The result looks like:

[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

So I get one prediction per token, but what I want is a single label for the whole email.
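
What I expected is one prediction per email, something like the sketch below (this assumes a sequence-classification head, which pools the sequence into a single prediction; I am not sure this is the right setup, which is part of my question):

import torch
from transformers import AutoModelForSequenceClassification

# Assumption on my part: a sequence-classification head instead of a
# token-classification head, so the model emits one prediction per email.
model = AutoModelForSequenceClassification.from_pretrained(
    "bigscience/bloom-560m", num_labels=2
)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2): one prediction per email
predicted_class = logits.argmax(-1).item()
print(model.config.id2label[predicted_class])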
I have tried searching for this topic but found little that helps.

Could anyone advise me on this? Thanks. :grinning: