Meta Llama-3 prompt sample

I am trying to get the Llama-3 model to read a document and then answer my questions, but my code does not seem to generate any output. Can someone tell me what's wrong with the code? I appreciate it.

Code:
from huggingface_hub import login
login(token='my_token')
import transformers
import torch

model_id = "Meta-Llama-3-8B"
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",
)

notes_path = 'note1.txt'
with open(notes_path, 'r', encoding='utf-8') as file:
    cal_notes = file.read()

prompt = f"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{cal_notes}<|eot_id|><|start_header_id|>user<|end_header_id|>
My question.<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

outputs = pipeline(prompt, max_new_tokens=512)
generated_text = outputs[0]['generated_text']

print(generated_text)

Can you switch to doing question answering, like in the example on this page? What 🤗 Transformers can do
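
For illustration, a minimal sketch of that question-answering pipeline. The checkpoint deepset/roberta-base-squad2 and the question string are just example choices, not something from your post:

import transformers

# An extractive QA pipeline: the model returns a span of the context as the answer.
qa_pipeline = transformers.pipeline(
    "question-answering",
    model="deepset/roberta-base-squad2",  # example extractive QA checkpoint
)

with open('note1.txt', 'r', encoding='utf-8') as file:
    cal_notes = file.read()

# The document goes in as `context`; no model-specific prompt tokens are needed.
result = qa_pipeline(question="My question.", context=cal_notes)
print(result['answer'], result['score'])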

I think if you set the pipeline task as mentioned above and then provide the document as context, it will be easier than having to specify all of the separator tokens, which can be model-specific. It will also make sure that the model gets a Q&A head rather than an LM head, which will probably give you better results overall. If you do want to stay with text generation, the tokenizer's chat template can insert those model-specific tokens for you, as in the sketch below.
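
Something like this, assuming the instruct variant of the model (meta-llama/Meta-Llama-3-8B-Instruct is my assumption; the base model has no chat template):

import transformers

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed instruct checkpoint
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)

with open('note1.txt', 'r', encoding='utf-8') as file:
    cal_notes = file.read()

messages = [
    {"role": "system", "content": cal_notes},
    {"role": "user", "content": "My question."},
]

# apply_chat_template inserts the model's own special tokens
# (<|begin_of_text|>, <|start_header_id|>, <|eot_id|>, ...) for you.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)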