If the appropriate configuration (a `chat_template` entry in `tokenizer_config.json`) is present in the model repository, `apply_chat_template()` should work fine. If not, the fallback will probably be the ChatML-equivalent template.
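For reference, ChatML wraps each message in `<|im_start|>`/`<|im_end|>` markers. A minimal pure-Python sketch of that rendering (a hand-rolled illustration, not the actual Jinja template the tokenizer uses):

```python
def render_chatml(messages, add_generation_prompt=False):
    """Render a list of {"role", "content"} dicts in ChatML style.

    Illustrative only: the real fallback in transformers is a Jinja
    template that produces the same <|im_start|>/<|im_end|> structure.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
]
print(render_chatml(messages))
```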
Hello, I’m implementing a framework for fine-tuning various LLMs using the TRL library’s SFTTrainer. I have a question about how chat templates work:
When using SFTTrainer with datasets in the standard formats (with “messages” array or “prompt”/“completion” fields), does the trainer automatically apply the tokenizer’s chat_template? The documentation suggests it does.
For models whose tokenizers don’t have a chat_template attribute set (or it’s empty), what template does SFTTrainer apply by default?
*(GitHub issue opened 16 Jan 24, closed 17 Jan 24 UTC)*
Hi! I am interested in using the `SFTTrainer` for instruction-tuning. Following … [the docs](https://huggingface.co/docs/trl/main/en/sft_trainer#dataset-format-support), I can see that I can provide examples in the following format to have the trainer format things for me:
```json
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
```
The docs also say:
> The [SFTTrainer](https://huggingface.co/docs/trl/main/en/trainer#trl.SFTTrainer) will then format the dataset for you using the defined format from the model’s tokenizer with the [apply_chat_template](https://huggingface.co/docs/transformers/main/en/chat_templating#templates-for-chat-models) method.
My question and confusion is, what does the trainer do if the tokenizer has no `chat_template`, as is the case with the [base llama model](https://huggingface.co/meta-llama/Llama-2-13b-hf/blob/main/tokenizer_config.json)?
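For a base model with no `chat_template`, one workable option (an assumption on my part, not something the docs mandate) is to format the prompt/completion pairs into plain text yourself before handing them to the trainer. A hypothetical sketch:

```python
def format_prompt_completion(example, eos_token="</s>"):
    """Join a {"prompt", "completion"} pair into one training string.

    eos_token here is illustrative; for a real model, pass
    tokenizer.eos_token so the sequence terminates correctly.
    """
    return {"text": example["prompt"] + example["completion"] + eos_token}

row = {"prompt": "Translate to French: cat ->", "completion": " chat"}
print(format_prompt_completion(row)["text"])
```

This sidesteps `apply_chat_template` entirely, which is often what you want for base (non-chat) checkpoints.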
A related snippet using `DataCollatorForCompletionOnlyLM`:
```python
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM
from transformers import AutoTokenizer
from datasets import load_dataset
# Load Dataset and tokenizer
dataset = load_dataset('prince-canuma/tinyOrca', split='train')
tokenizer = AutoTokenizer.from_pretrained("prince-canuma/Damysus-2.7B-Chat")
```

*(snippet truncated in the original post)*
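For context, `DataCollatorForCompletionOnlyLM` masks out everything up to and including a response-template marker, so the loss is computed only on completion tokens. A minimal pure-Python sketch of that label-masking idea (the token ids are made up, and this is not TRL's internal implementation; `-100` is the label value PyTorch's cross-entropy loss ignores):

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy

def mask_before_response(input_ids, response_template_ids):
    """Copy input_ids into labels, masking every position up to and
    including the first occurrence of the response template."""
    labels = list(input_ids)
    n = len(response_template_ids)
    for i in range(len(input_ids) - n + 1):
        if input_ids[i:i + n] == response_template_ids:
            for j in range(i + n):
                labels[j] = IGNORE_INDEX
            break
    return labels

# tokens: [prompt..., response marker (=9), completion...]
print(mask_before_response([5, 6, 9, 7, 8], [9]))  # → [-100, -100, -100, 7, 8]
```

The real collator does this on batched tensors and warns when the template is not found; the idea is the same.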