Subject: Seeking Insights on Zephyr Chatbot Training with Ebooks – Your Thoughts?
Hi everyone,
I’m currently working on training a Zephyr chatbot using multiple ebooks and would love to get your insights or suggestions on my approach. Here’s what I’ve done so far:
- Base Model Choice: I’m using HuggingFaceH4/zephyr-7b-beta as my foundation.
- Training Parameters: I’ve enabled merge_adapter and auto_find_batch_size for efficiency.
- Training Data Format: My data is in a train.csv file with a single column, ‘text’. Each row contains up to 2048 tokens taken from various books, in reading order from one page to the next.
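To make the data format concrete, here is a minimal sketch of how such a train.csv could be produced. It is an illustration under stated assumptions, not the exact script used: the function names are hypothetical, and whitespace splitting stands in for the model's real tokenizer, so the counts will not match true Zephyr token counts.

```python
import csv
import io


def chunk_tokens(tokens, max_tokens=2048):
    """Split a token list into consecutive chunks of at most max_tokens."""
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


def build_rows(book_text, max_tokens=2048, tokenize=str.split, detokenize=" ".join):
    """Turn one book's text into rows that preserve reading order.

    tokenize/detokenize default to whitespace stand-ins (an assumption);
    in practice they would be swapped for the base model's tokenizer.
    """
    tokens = tokenize(book_text)
    return [detokenize(chunk) for chunk in chunk_tokens(tokens, max_tokens)]


def write_train_csv(rows, fileobj):
    """Write rows to a CSV with a single 'text' column."""
    writer = csv.writer(fileobj)
    writer.writerow(["text"])
    for row in rows:
        writer.writerow([row])


if __name__ == "__main__":
    # Tiny demonstration with a 2-token limit instead of 2048.
    buffer = io.StringIO()
    write_train_csv(build_rows("one two three four five", max_tokens=2), buffer)
    print(buffer.getvalue())
```

Note that whitespace detokenization collapses the original line breaks; a real tokenizer round trip would not have that side effect.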
I’m curious about a couple of things:
- Would incorporating the book title in each row of my training data be beneficial for context, or might it introduce unwanted bias or contaminate the data?
- Are there any common pitfalls or challenges I should be aware of in this kind of training setup?
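On the book-title question, one way to compare both variants is to generate a second copy of the data with the title prepended to each row. The sketch below is only an assumption about how that could look: the `with_title` helper and the `[Book: ...]` template are made up for illustration, not a format the model expects. Keep in mind that the prefix also counts toward the 2048-token limit per row.

```python
def with_title(rows, title, template="[Book: {title}]\n{text}"):
    """Return new rows with the book title prepended to each chunk.

    The template is a hypothetical format chosen for this example.
    """
    return [template.format(title=title, text=row) for row in rows]


if __name__ == "__main__":
    chunks = ["Call me Ishmael.", "Some years ago..."]
    for row in with_title(chunks, "Moby-Dick"):
        print(row)
        print("---")
```

Training once on the plain rows and once on the titled rows would make it possible to check on held-out prompts whether the title actually helps.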
Any advice or feedback based on your experiences would be greatly appreciated!
Thanks in advance!
Ujjwal Singh