Hello Hugging Face Community,
I am working on an ambitious project involving a large and intricate English novel. The narrative is complex: elements on one page are often intricately linked to content in distant chapters. My goal is to enhance a large language model (e.g., GPT-3.5/GPT-4 or Llama 2) so that it understands this text well enough to accurately answer detailed queries, particularly those involving nuanced interrelationships.
My initial approach was a Retrieval-Augmented Generation (RAG) setup using LlamaIndex, a vector database, and a knowledge graph. While this proved somewhat effective, it was also time-consuming and resource-intensive, since each query required scanning multiple text chunks.
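To make that cost concrete, here is a toy sketch of the retrieval step I mean (plain Python with naive word-overlap scoring, not my actual LlamaIndex/embedding pipeline; the sample text and chunk size are made up):

```python
# Toy illustration of the retrieval step in a RAG setup: every query is
# scored against every stored chunk, which is where the per-query cost
# comes from. Real pipelines use embeddings, not word overlap.

def chunk_text(text, chunk_size=8):
    """Split text into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def retrieve(query, chunks, top_k=2):
    """Rank all chunks by word overlap with the query, return the best."""
    q = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:top_k]

# Made-up miniature "novel" where the answer lives far from the setup.
novel = ("Chapter one introduces Ada the detective. " * 10
         + "Chapter nine reveals Ada hid the letter years earlier.")
chunks = chunk_text(novel)
hits = retrieve("who hid the letter", chunks, top_k=1)
```

Even in this toy version, every chunk must be scored for every question; with a full novel and embedding lookups, that overhead is what I am trying to reduce.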
I am now considering fine-tuning or pre-training a model specifically with my novel to improve its contextual understanding and recall. My queries are as follows:
- Fine-Tuning vs. Pre-Training for Novel-Specific Adaptation: How effective is fine-tuning a model such as GPT-3.5/GPT-4, Llama 2, or Mixtral for understanding and recalling detailed plot elements and their connections within my novel? Alternatively, would continued pre-training be a more appropriate approach, despite its higher resource demands?
- Effectiveness of Pre-Training Smaller LLMs: Would pre-training smaller language models be an effective strategy for this purpose? If so, what are the trade-offs compared to using larger models?
- Focused Learning on Specific Chapters: If I aim to have the model learn a specific chapter of about 10,000 tokens, would fine-tuning enable the model to precisely memorize and recall details from this chapter?
- Limitations and Expectations: Considering the context-window and memorization limits of current LLMs, to what extent can fine-tuning help in accurately answering questions that require tracking complex interrelations throughout the novel?
- Alternative Strategies: Are there other approaches or combinations, such as merging fine-tuning with a retrieval method, that I should consider to enhance the model’s understanding and question-answering accuracy?
- Practical Considerations: What are the practical aspects (such as computational resources and time investment) of fine-tuning versus pre-training a model for this kind of task?
I seek your insights, experiences, and advice on the most effective approach to achieve profound understanding and efficient question-answering capabilities for my novel. Any guidance or suggestions you can provide would be immensely valuable.
Thank you in advance for your assistance and insights.