Mistral or LLaMA?

Which model is better for a chatbot fine-tuned on healthcare data?

  • Meta-Llama-3-8B-Instruct
  • Mistral-7B-Instruct-v0.2

We have been getting great results with Mistral and were about to initiate our final training, but now Meta has released this new version, and so I am hoping people can offer their two cents to aid our decision.

Thank you in advance!

I'm in the same position. How has your experience been with Mistral?

@singhay, Mistral has been really great.

We’ve been using the Mixtral-8x7B-Instruct-v0.1 model to preprocess our training samples through the fireworks.ai API at $0.50/1M tokens. It’s cheaper than GPT-4, the API has been reliable, and the model is excellent at formatting JSON.
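One practical note on the JSON step: models sometimes wrap their output in Markdown code fences, so it helps to strip those before parsing. A rough sketch of that kind of cleanup (hypothetical helper, not our exact pipeline code):

```python
import json
import re

def parse_json_response(text: str) -> dict:
    """Strip optional ```json ... ``` fences a model may add, then parse.

    Raises json.JSONDecodeError if the remainder still isn't valid JSON,
    which is a useful signal to re-queue the sample for another pass.
    """
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    return json.loads(cleaned)

# Handles both fenced and bare model output:
parse_json_response('```json\n{"diagnosis": "none"}\n```')
parse_json_response('{"diagnosis": "none"}')
```

Catching the parse failure and retrying the sample is cheaper than discovering malformed rows at fine-tuning time.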

We’ve then been taking our preprocessed samples and fine-tuning Mistral-7B-Instruct-v0.2 using together.ai, and our first two test trainings blew us away. We’re almost finished preprocessing our entire dataset and are about to fine-tune a model using 1M samples, so we’re really excited!

I plan on doing a smaller test fine-tune with LLaMA 3 8B Instruct after our full training. I can’t see any major benefit that justifies changing our strategy, since we’re really focused on just releasing our version 1 model at this point. But we do plan to run some side-by-side tests on a smaller dataset of around 50K samples so we can consider LLaMA for our version 2 model.

One thing I will say about Mistral is that, as far as I’m aware, they don’t have a designated syntax for a system prompt. We emulate one by including two messages (user and assistant) at the start of our messages array, where the role and system-prompt-style instructions can be established, and it works great.
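To illustrate, here’s roughly what that emulated system prompt looks like (the prompt text and helper name are made up for illustration, not our production code):

```python
# Mistral-7B-Instruct-v0.2's chat format has no dedicated "system" role,
# so we fake one with an opening user/assistant exchange.
SYSTEM_PROMPT = (
    "You are a careful healthcare assistant. "
    "Answer concisely and recommend seeing a clinician when appropriate."
)

def build_messages(user_question: str) -> list[dict]:
    """Prepend a synthetic exchange that establishes system-level behavior."""
    return [
        {"role": "user", "content": SYSTEM_PROMPT},
        {"role": "assistant", "content": "Understood. I'll follow those instructions."},
        {"role": "user", "content": user_question},
    ]

messages = build_messages("What are common symptoms of dehydration?")
```

Because the roles still strictly alternate user/assistant, this stays compatible with Mistral's instruct chat template while giving the model persistent instructions up front.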

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.