Hi all, this is my first time here, so please bear with me if I have missed any rules or recommendations.
I have been learning to build projects using Ollama with the local model mistral:7b-instruct v3. In this particular use case, I am asking the model to act as an accessibility auditor and generate a report. I have used the model in other contexts as well.
Mistral keeps including the full context in the response, as thousands and thousands of lines of "[3,1027,781,2744… ", no matter how I ask in the prompt to exclude it. I can't tell whether this is related to the model or to the Ollama API. This context increases the size of any logs by tens of MB and complicates troubleshooting.
I would appreciate any help understanding how I can ask the model or Ollama to leave the context out of its responses to my prompts.
ai.config.ts
export const AI_CONFIG: AIConfig = {
  api: {
    baseUrl: "http://localhost:11434",
    endpoints: {
      generate: "/api/generate"
      //embeddings: "/api/embeddings",
    },
  },
  model: {
    name: "mistral:7b-instruct",
    parameters: {
      chunkSize: 6000,
      promptTimeout: 60000
    },
  },
  prompts: {
  },
  // ...
  retry: {
    attempts: 3,
    backoff: {
      initial: 1000,
      multiplier: 1.5,
      maxDelay: 10000,
    },
  },
};
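As a rough sketch of what the retry settings above would mean in practice, assuming the usual exponential backoff reading (delay = initial × multiplier^n, capped at maxDelay). The names BackoffConfig, backoffDelay and retryDelays are made up for illustration and are not part of the project or of Ollama:

```typescript
// Hypothetical types and helpers, for illustration only.
interface BackoffConfig {
  initial: number;    // first delay, in milliseconds
  multiplier: number; // growth factor applied after each failed attempt
  maxDelay: number;   // upper bound for any single delay, in milliseconds
}

// Delay before retry number `retryIndex` (0-based), capped at maxDelay.
function backoffDelay(cfg: BackoffConfig, retryIndex: number): number {
  const raw = cfg.initial * Math.pow(cfg.multiplier, retryIndex);
  return Math.min(raw, cfg.maxDelay);
}

// Waits that occur between `attempts` tries (there is no wait after the last try).
function retryDelays(attempts: number, cfg: BackoffConfig): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts - 1; i++) {
    delays.push(backoffDelay(cfg, i));
  }
  return delays;
}

// Values copied from the retry section of the config above.
const retryCfg: BackoffConfig = { initial: 1000, multiplier: 1.5, maxDelay: 10000 };
console.log(retryDelays(3, retryCfg));
```

Under that reading, three attempts would involve two waits, and the 10000 ms cap would only matter if the number of attempts were raised considerably.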
Ollama call:
private static async callOllama(prompt: string): Promise<string> {
  const body = {
    prompt,
    model: AI_CONFIG.model.name,
    options: {
      num_ctx: 8192,
    },
    stream: false,
  };
  const url = AI_CONFIG.api.baseUrl + AI_CONFIG.api.endpoints.generate;
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    const errorText = await response.text();
    throw new Error(
      `Ollama request failed [${response.status}]: ${errorText}`
    );
  }
  // Returns the whole HTTP body as text, not only the generated answer.
  return await response.text();
}
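Since the function above returns the whole HTTP body, one thing that might be worth trying is to parse that body and keep only the generated text. This is a minimal sketch, assuming the body is a JSON object with the fields visible in the raw output later in this post ("model", "created_at", "response", "context"). GenerateEnvelope and extractResponseText are hypothetical names, not part of Ollama:

```typescript
// Assumed shape of a non-streaming reply, based only on the fields
// visible in the raw output in this post. Other fields may exist.
interface GenerateEnvelope {
  model?: string;
  created_at?: string;
  response?: string;
  context?: number[];
}

// Hypothetical helper: returns only the generated text from the raw body.
function extractResponseText(rawBody: string): string {
  const parsed: unknown = JSON.parse(rawBody);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Ollama reply is not a JSON object");
  }
  const text = (parsed as GenerateEnvelope).response;
  if (typeof text !== "string") {
    throw new Error("Ollama reply has no string 'response' field");
  }
  // The sample output starts with a leading space, so trim it.
  return text.trim();
}
```

With something like this, callOllama could end with `return extractResponseText(await response.text());`, so the "context" array would never reach the caller or the logs.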
Prompt:
You are an expert accessibility auditor. Based on the aggregated axe-core results provided below, generate a comprehensive accessibility analysis report.
YOU MUST OUTPUT ONLY A SINGLE, STRICTLY VALID JSON OBJECT AND NOTHING ELSE. Do not include any extra text, commentary, or keys (such as “model”, “created_at”, “done”, “context”, or markdown formatting like code fences).
The JSON object MUST follow EXACTLY this structure:
{
  "aggregatedAnalysis": {
    "contextualSummary": "",
    "prioritizedIssues": [
      {
        "issue": "",
        "wcagReference": "<guideline reference(s)>",
        "remediation": ""
      }
    ]
  }
}
ONLY OUTPUT THE JSON. NOTHING ELSE.
If you cannot produce output in this format exactly, output nothing.
Below is the aggregated axe-core accessibility report data:
<>
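Because the prompt asks for one exact JSON structure, it may help to check the model's text against that structure in code instead of relying on the instructions alone. A minimal sketch, assuming the generated text has already been separated from the rest of the reply. The names AnalysisReport, PrioritizedIssue, isRecord and parseReport are hypothetical:

```typescript
// Types mirroring the structure requested in the prompt above.
interface PrioritizedIssue {
  issue: string;
  wcagReference: string;
  remediation: string;
}

interface AnalysisReport {
  aggregatedAnalysis: {
    contextualSummary: string;
    prioritizedIssues: PrioritizedIssue[];
  };
}

// True for plain (non-null, non-array) objects.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: returns the parsed report, or null when the text
// is not valid JSON or does not match the requested structure.
function parseReport(text: string): AnalysisReport | null {
  let data: unknown;
  try {
    data = JSON.parse(text);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;
  const analysis = data["aggregatedAnalysis"];
  if (!isRecord(analysis)) return null;
  if (typeof analysis["contextualSummary"] !== "string") return null;
  const issues = analysis["prioritizedIssues"];
  if (!Array.isArray(issues)) return null;
  const issuesOk = issues.every(
    (item: unknown) =>
      isRecord(item) &&
      typeof item["issue"] === "string" &&
      typeof item["wcagReference"] === "string" &&
      typeof item["remediation"] === "string"
  );
  return issuesOk ? (data as unknown as AnalysisReport) : null;
}
```

A null result could then trigger a retry. Ollama's generate endpoint also accepts a `format: "json"` field in the request body, which may be worth trying alongside the prompt wording; it is not used in the code in this post.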
[LOG] Running AI Enhanced Analysis…
Raw AI response: {"model":"mistral:7b-instruct","created_at":"2025-02-20T01:11:46.979168Z","response":" It seems like you have provided a JSON object that contains a list…"context":[3,1027,781,2744,1228,1164,8351,3503,3800,5554,2610,29491,17926,1124,1040,15322,1369,6824,29474,29501,3059,3671,4625,4392,29493,9038,1032,16081,3503,3800,6411,3032,29491,4372,4593,1119,11848,1115,1032,3460,29493,20238,4484,10060,2696,1163,1476,4978,5285,1396,29493,1989,15526,29493,1210,8916,29491,29473,781,781,21966,4593,1032,10060,2696,1137,5436,1384,1431,1042,1066,1224,546…
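In the raw output above, "context" appears as a key next to "response" in the outer JSON object, not inside the text the model wrote, which suggests it is added by the API and not by the model. For the log size problem specifically, here is a minimal sketch of shortening that array before logging, assuming the logged string is the full JSON body. redactContextForLog is a hypothetical helper name:

```typescript
// Hypothetical helper: returns a log-friendly copy of the raw reply in
// which the `context` array is replaced by a short placeholder string.
// Input that is not a JSON object is returned unchanged.
function redactContextForLog(rawBody: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return rawBody;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return rawBody;
  }
  // Shallow copy so the parsed object itself is not modified.
  const copy: Record<string, unknown> = { ...(parsed as Record<string, unknown>) };
  const ctx = copy["context"];
  if (Array.isArray(ctx)) {
    copy["context"] = `[${ctx.length} entries omitted]`;
  }
  return JSON.stringify(copy);
}
```

Logging `redactContextForLog(raw)` in place of the raw string would keep "model", "created_at" and "response" visible for troubleshooting while dropping the long number list.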