Building Own Knowledge Base LLM

somesh1 · April 5, 2024, 9:29am

Hello

I want to build my own knowledge-base Large Language Model (LLM) using over 40GB of data, including books and research papers. I’d appreciate your suggestions and insights on how to approach this.

Specifically, I’m seeking guidance on:

Approaches for constructing the LLM: What methodologies or frameworks would you recommend for building a robust LLM using my dataset?
Data preprocessing techniques: How should I preprocess the data to ensure optimal performance and efficiency in training the model? Any specific tools or libraries you suggest for this task?
Fine-tuning or RAG models: Would fine-tuning existing models or implementing RAG (Retrieval-Augmented Generation) models be beneficial for this project? If so, what are some best practices or resources to consider?
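
For reference, the retrieval half of the RAG option mentioned above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not a recommended pipeline. It uses the Python standard library and a simple TF-IDF bag-of-words score in place of a real embedding model and vector store. The function names (`chunk`, `build_index`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical, chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, size=50):
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_index(chunks):
    """Return (idf, vectors): smoothed IDF weights and one TF-IDF dict per chunk."""
    counts_per_chunk = [Counter(tokenize(c)) for c in chunks]
    doc_freq = Counter(term for counts in counts_per_chunk for term in counts)
    n = len(chunks)
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in counts.items()}
               for counts in counts_per_chunk]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, idf, vectors, k=2):
    """Return the k chunks most similar to the query."""
    # Terms never seen in the corpus get weight 0 and do not affect ranking.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    ranked = sorted(range(len(chunks)),
                    key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [chunks[i] for i in ranked[:k]]


def build_prompt(query, passages):
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a real setup the returned prompt would be passed to an existing pretrained model, and the TF-IDF scoring would typically be replaced by dense embeddings. The flow stays the same: chunk, index, retrieve, then generate.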

lofti1 · April 6, 2024, 3:31am

Hi! I'm currently looking for a solution to the same problem.
