I’m using the Llama 3 8B Instruct model (`meta-llama/Meta-Llama-3-8B-Instruct`) through the serverless Inference API, like this:
import requests

API_URL = "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct"
headers = {"Authorization": "Bearer mytoken"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "Can you please let us know more details about your ",
})
print(output)
Everything works great, but I have a few questions:

1. How do I use a system prompt? Do I need to send it with every query, or only once?
2. Is it possible for the model to remember past conversations, or do I have to resend the previous turns with every query?
3. How do I make it do plain completion versus chat, like on the Hugging Face Chat UI?
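For reference, here is my current guess at how the chat formatting would work — a sketch that assumes the special tokens from the Llama 3 model card (`<|begin_of_text|>`, `<|start_header_id|>`, `<|eot_id|>`); I'm not sure this is the right way to pass the system prompt and history:

```python
# Sketch (my guess, not verified): build the raw prompt string for the
# Llama 3 instruct format, with a system prompt plus the full chat history,
# and send the whole thing as "inputs" on every request.
def build_prompt(system, history):
    # history is a list of (role, content) pairs, role in {"user", "assistant"}
    prompt = "<|begin_of_text|>"
    prompt += f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
    for role, content in history:
        prompt += f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
    # leave an open assistant header so the model generates the reply
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_prompt(
    "You are a helpful assistant.",
    [("user", "Hello"), ("assistant", "Hi! How can I help?"), ("user", "Tell me a joke.")],
)
print(prompt)
```

If this is right, I assume the API itself is stateless, so the system prompt and all previous turns would have to be re-sent inside `inputs` on every call — is that correct?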