Have you considered using a QA model?
Prompt: "Tell me about X."
The QA model retrieves relevant text chunks based on "Tell me about X".
The retrieved chunks are then folded into a new prompt that is fed to the LLM: f'{original_prompt} using this context: {relevant_text_chunks}'
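Here's a minimal sketch of that retrieve-then-prompt flow. The retriever is a toy word-overlap scorer just to make the example self-contained; in practice you'd use embeddings and a vector index. `call_llm` is a hypothetical stand-in for whatever LLM client you actually use.

```python
def score(query: str, chunk: str) -> int:
    """Toy relevance score: count query words that appear in the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most relevant to the query."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(original_prompt: str, relevant_chunks: list[str]) -> str:
    """Fold the retrieved chunks into a new prompt for the LLM."""
    context = "\n".join(relevant_chunks)
    return f"{original_prompt} using this context:\n{context}"

corpus = [
    "X is a distributed key-value store released in 2019.",
    "Bananas are rich in potassium.",
    "X supports replication and automatic failover.",
]

original_prompt = "Tell me about X."
prompt = build_prompt(original_prompt, retrieve(original_prompt, corpus))
print(prompt)
# prompt would then be sent to the model, e.g. call_llm(prompt)
```

Swapping the toy scorer for embedding similarity (and the corpus for a real document store) gets you the standard retrieval-augmented generation setup without changing the flow above.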