Llama 3.2 3B instruct model giving wrong answer

Hi,
I am trying the Llama 3.2 3B Instruct model on a simple radiology question, but most of the time it gives a wrong answer to the same question. Am I doing something wrong?
In summary, the correct answer is that no organs require dose adjustment, since the dose criteria are already fulfilled.
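
For reference, the expected reasoning is just two threshold checks. A minimal sketch of the comparison the model should be making (the values and limits are the ones from the prompt below; the variable names are mine, purely for illustration):

# Plan values and clinical limits taken from the question below.
ctv_d95 = 30.0                               # Gy; must be >= 25 Gy
oar_d2cc = {"bladder": 2.0, "rectum": 1.0}   # Gy; each must be <= 4 Gy

ctv_ok = ctv_d95 >= 25.0
organs_needing_adjustment = [organ for organ, dose in oar_d2cc.items() if dose > 4.0]

print("CTV coverage OK:", ctv_ok)                                     # True
print("Organs needing dose adjustment:", organs_needing_adjustment)   # [] -> none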

Code is here:
from transformers import pipeline
import torch

# Raw string avoids backslash escapes (a trailing \" would otherwise escape the closing quote)
modelPath = r"C:\codes\llama1B"

pipe = pipeline(
    "text-generation",
    model=modelPath,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    temperature=0.1,
)

prompt = """
In the case of our current patient, her D95 for the CTV is 30 Gy. The D2cc for the bladder is 2 Gy, and the D2cc for the rectum is 1 Gy. Which organ(s) require dose adjustments, if needed?
To assess the above patient plan, please note that

  1. Clinically, the D95 for the Clinical Target Volume (CTV) should be greater than or equal to 25 Gy.
  2. Further, the D2cc for all Organs at Risk (bladder and rectum) must be less than or equal to 4 Gy.
"""

messageStructure1 = [
    {"role": "system", "content": "You are a medical physicist, and you should answer as a medical physicist would."},
    {"role": "user", "content": prompt},
]

response = pipe(
    messageStructure1,
    max_new_tokens=512,
)

# The pipeline returns the full chat history; the last message is the assistant's reply
assistant_response = response[0]['generated_text'][-1]['content']
print("Assistant's Response:\n", assistant_response)


First of all, please be aware that the problem may be due to a lack of knowledge in the model itself, or to a bug in the model or library.
I also think that temperature = 0.1 is not a good setting for this case; at that value the model can end up playing a very free-association game.
Let's try setting it to around 0.6 or 0.7.
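
If it helps, the sampling parameters can also be passed per call instead of in the pipeline constructor. A minimal sketch using the value suggested above (note that do_sample=True is what actually enables temperature; top_p=0.9 is just a common default, not something from the original script):

response = pipe(
    messageStructure1,
    max_new_tokens=512,
    do_sample=True,   # sampling must be on for temperature to have any effect
    temperature=0.7,
    top_p=0.9,
)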

Thanks for your reply. I tried temperature settings of 0.6 to 0.7 and the problem remained the same. I downloaded this model from Hugging Face.

Over repeated runs on the same question, it sometimes gives a correct answer and sometimes a wrong one.

PS: The question I asked the model only requires comparing two quantities (current vs. target values). However, the model is failing at even that simple comparison.

For example, one of the model outputs is “2 Gy is greater than the clinically desired 4 Gy, so dose adjustments are needed for the bladder”, which is not true: 2 is less than 4.
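
One way to separate randomness from capability might be to switch off sampling entirely, so every run produces the same output. A sketch, assuming the same pipe and messageStructure1 as in the original script; if the comparison is still wrong under greedy decoding, the temperature setting is not the issue:

# Greedy (deterministic) decoding: removes run-to-run randomness.
response = pipe(
    messageStructure1,
    max_new_tokens=512,
    do_sample=False,
)
print(response[0]['generated_text'][-1]['content'])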


From the pattern of errors, it looks like the model itself just isn't capable enough.
3B parameters is very small for an LLM, so that's not surprising. (I think ChatGPT was over 1000B at the beginning.)
Even a small model can still be used for use cases that don't require much knowledge.
There are generally three ways to deal with this kind of situation: simply use a larger model; find and use a model that has already been trained on the relevant specialized knowledge; or fine-tune this 3B model yourself, on your own GPU or with a paid online service, to specialize it. A sketch of the first option is below.
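
For the first option, the only change to the original script is the model argument. A minimal sketch, assuming the larger checkpoint is pulled from the Hub (meta-llama/Llama-3.1-8B-Instruct is just one example of a bigger instruct model, and it is gated, so an access token is required):

from transformers import pipeline
import torch

# Same pipeline as before, but pointing at a larger instruct model on the Hub.
pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)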

It sounds like you’re dealing with some challenges in getting consistent, high-quality outputs for radiology-related questions. Adjusting temperature settings can help reduce randomness, but without domain-specific training, even a 3B model like Llama 3.2 might struggle to deliver the accuracy you need in such a specialized area.

An alternative to consider would be exploring larger, specialized models, especially if precision is essential. Platforms like PepperMill Beta could be a game-changer for you. They offer access to over 200 different LLMs, allowing you to evaluate and deploy the best fit for your specific needs. This kind of tailored approach can make a real difference in achieving reliable, domain-focused results in fields like radiology.