Autotrain Zephyr Chatbot with multiple ebooks

ujjwalsinghw · December 14, 2023, 5:52pm

Subject: Seeking Insights on Zephyr Chatbot Training with Ebooks – Your Thoughts?

Hi everyone,

I’m currently working on training a Zephyr chatbot using multiple ebooks and would love to get your insights or suggestions on my approach. Here’s what I’ve done so far:

Base Model Choice: I’m using HuggingFaceH4/zephyr-7b-beta as my foundation.
Training Parameters: I’ve enabled merge_adapter and auto_find_batch_size for efficiency.
Training Data Format: My data is in a train.csv file with a single column, ‘text’. Each row contains up to 2048 tokens taken sequentially from the books, one page after the next.
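
For anyone reading along, the data format described above could be produced roughly like this. It is only a sketch, not my exact script: the function names are made up for illustration, and whitespace splitting stands in for real token counting, which would normally use the base model's tokenizer.

```python
import csv
import io

def chunk_tokens(tokens, max_tokens=2048):
    """Yield consecutive, non-overlapping slices of at most max_tokens tokens."""
    for start in range(0, len(tokens), max_tokens):
        yield tokens[start:start + max_tokens]

def book_to_rows(book_text, max_tokens=2048, tokenize=str.split):
    """Split one book into rows, preserving reading order.

    ASSUMPTION: `tokenize` defaults to whitespace splitting as a stand-in;
    a real run would count tokens with the base model's tokenizer.
    """
    tokens = tokenize(book_text)
    return [" ".join(chunk) for chunk in chunk_tokens(tokens, max_tokens)]

def write_train_csv(books, out_file, max_tokens=2048):
    """Write a single-column CSV with header 'text', one chunk per row."""
    writer = csv.writer(out_file)
    writer.writerow(["text"])
    for book_text in books:
        for row in book_to_rows(book_text, max_tokens):
            writer.writerow([row])

# Usage with an in-memory buffer; pass an open file handle to write train.csv.
buffer = io.StringIO()
write_train_csv(["first book text here", "second book text"], buffer, max_tokens=2)
```

Because chunks never cross from one book into the next, every row comes from a single source, which matters if a title is later attached to each row.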

I’m curious about a couple of things:

Would including the book title in each row of my training data provide useful context, or might it introduce unwanted bias or contaminate the data?
Are there any common pitfalls or challenges I should be aware of in this kind of training setup?

Any advice or feedback based on your experiences would be greatly appreciated!

Thanks in advance!
Ujjwal Singh

latestgbapps · March 2, 2024, 6:01am

Incorporating book titles in your training data could provide valuable context for the chatbot, potentially improving response relevance. However, be cautious of potential biases. Common pitfalls include overfitting to specific book patterns, inconsistency across diverse sources, and poor data quality. Experimenting with training parameters and evaluating continuously are crucial for optimizing the chatbot’s performance.
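
One simple way to watch for overfitting is to keep a few chunks from every book out of training and evaluate on them. The sketch below is a rough illustration under stated assumptions: `holdout_split` and the `rows_by_title` mapping are hypothetical names, and taking the tail of each book as the held-out part is just one possible choice.

```python
def holdout_split(rows_by_title, holdout_fraction=0.1):
    """Split each book's rows into train and validation lists.

    ASSUMPTION: `rows_by_title` maps a book title to its ordered list of
    text chunks. The last `holdout_fraction` of each book (at least one
    chunk) is held out; books with a single chunk stay entirely in train.
    """
    train, valid = [], []
    for title, rows in rows_by_title.items():
        if len(rows) > 1:
            n_valid = max(1, int(len(rows) * holdout_fraction))
        else:
            n_valid = 0
        cut = len(rows) - n_valid
        train.extend(rows[:cut])
        valid.extend(rows[cut:])
    return train, valid

# Usage: two toy "books" with pre-chunked rows.
books = {
    "Book A": [f"a{i}" for i in range(1, 11)],
    "Book B": ["b1"],
}
train_rows, valid_rows = holdout_split(books)
```

If loss keeps falling on the training rows while it rises on the held-out rows, the model is probably memorizing book-specific patterns.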
