I've read a lot about foundation models and large language models.
However, I haven't found a clear definition of what exactly a foundation model is. Are large language models and foundation models the same thing?
From "On the Opportunities and Risks of Foundation Models":
We introduce the term foundation models to fill a void in describing the paradigm shift we are witnessing; we briefly recount some of our reasoning for this decision. Existing terms (e.g., pretrained model, self-supervised model) partially capture the technical dimension of these models, but fail to capture the significance of the paradigm shift in an accessible manner for those beyond machine learning. In particular, foundation model designates a model class that are distinctive in their sociological impact and how they have conferred a broad shift in AI research and deployment. In contrast, forms of pretraining and self-supervision that technically foreshadowed foundation models fail to clarify the shift in practices we hope to highlight.
Note that the term “foundation model” is a bit controversial, or at least ambiguous. By contrast, a large language model is “just” a language model that’s large, so that term is rather unambiguous in my opinion =)
Relevant links, courtesy of @lewtun:
https://twitter.com/tdietterich/status/1558256704696905728?s=20&t=u1dEPJrWFex9p7MdZLjOYA
https://twitter.com/ylecun/status/1558395878980861952?s=20&t=u1dEPJrWFex9p7MdZLjOYA
A large language model (LLM) is any statistical model of language trained on “large” swaths of data and with “a lot” of parameters. “Large” and “a lot” are relative to the approaches of just a few years ago; think terabytes of training data rather than megabytes.
“Foundation model” is a term introduced by researchers at the Stanford Center for Research on Foundation Models (CRFM). Although the paper isn’t entirely clear, characteristics of a foundation model as introduced in this paper include:
Other terms to consider in this family include “Large Neural Network”, “Base Model”, “Pretrained Model”, “Parent Model”, “Large self-supervised model”, “Self-supervised multimodal model” – and more!