Conceptual question about model training

I’m trying to understand the AI ecosystem, and I have a general conceptual question about the models themselves.

First, my understanding is that training an LLM from scratch requires a data center and an amount of computing resources that is out of reach for most individuals and organizations. Is this correct?
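As a rough sanity check on that intuition, I tried the commonly cited back-of-envelope approximation that pre-training takes about 6 × parameters × tokens floating-point operations. The model size, token count, and throughput below are assumptions I picked just for illustration:

```python
# Back-of-envelope estimate of pre-training compute, using the
# widely cited approximation: total FLOPs ≈ 6 * parameters * tokens.
params = 7e9    # a 7B-parameter model (assumed, Mistral-7B class)
tokens = 2e12   # ~2 trillion training tokens (assumed)

flops = 6 * params * tokens
print(f"{flops:.1e} FLOPs")  # ~8.4e+22 FLOPs

# Assuming a machine sustaining 1e15 FLOP/s (roughly one modern
# multi-GPU node, an assumed figure), training would take:
seconds = flops / 1e15
print(f"{seconds / 86400:.0f} days on one such node")  # ~970 days
```

If that arithmetic is in the right ballpark, it explains why pre-training is done on large clusters rather than single machines.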

If so, are all the models on Hugging Face derivatives of other pre-trained models like Mistral or Llama? That is, large models adapted with fine-tuning, RAG, quantization, or other techniques? (See the sketch below for what I mean by a derivative.)
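To make the question concrete, here is a hypothetical sketch of the kind of "derivative" I have in mind: loading a pre-trained base checkpoint and attaching a small LoRA adapter for parameter-efficient fine-tuning. The model name and hyperparameters are just illustrative assumptions on my part:

```python
# Hypothetical sketch: deriving a new model from a pre-trained base
# by attaching a LoRA adapter (parameter-efficient fine-tuning).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "mistralai/Mistral-7B-v0.1"  # an assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains only small adapter matrices, not all 7B weights,
# which is why fine-tuning is feasible without a data center.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the weights
```

Is this roughly what most Hugging Face models are, or are there many models out there trained entirely from scratch?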