Best free options if you want to train a language model on a small set of private documents?

I have about a hundred PDF files. Can I download a model that understands English and then feed it the docs, so I can talk to the model about them?

I would look at a vector database with RAG (retrieval-augmented generation) for this.

Basically, you turn all your docs into embeddings (vectors of numbers that capture their meaning) and store them. When you ask something, you embed the query the same way and retrieve the documents most similar to it. Finally, you take an off-the-shelf model such as Mistral 7B, give it the retrieved documents as context, and ask it your question about them.
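To make the steps concrete, here is a rough sketch in plain Python. It is only an illustration: `embed` is a toy word-count stand-in for a real embedding model, the three sample docs are made up, and all the function names are hypothetical. The resulting prompt is what you would hand to a local model such as Mistral 7B; that last call is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(docs):
    """Step 1: embed every document and keep (text, vector) pairs in memory."""
    return [(doc, embed(doc)) for doc in docs]


def retrieve(index, query, k=2):
    """Step 2: return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, context_docs):
    """Step 3: put the retrieved documents in front of the question."""
    context = "\n\n".join(context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Made-up sample documents, standing in for text extracted from the PDFs.
docs = [
    "The warranty covers parts and labour for two years.",
    "Invoices are due within thirty days of delivery.",
    "Support is available by email on weekdays.",
]

index = build_index(docs)
question = "How long does the warranty last?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Here the "store" is just a Python list. For around a hundred PDFs that is probably enough to start with; a dedicated vector database mainly helps once the collection gets large.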

Not sure I understand this. Do I need to install a database?

you may find this useful