Petals dropped. Now you can use large models on a single GPU, it says.
Reminds me of the “more expensive setup is better” argument that makes people buy new graphics cards and CPUs.
But what we need to define is application, not numbers.
Just like “With iPod, Apple has invented a whole new category of digital music player that lets you put your entire music collection in your pocket and listen to it wherever you go” and not “we invented an X GB mp3 player with Y KB cache”.
Application focus, not specs focus.
My goal is to have a model that gives accurate answers to questions about a document and that has the “decency” to admit when the answer cannot be found in that document.
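To make that goal concrete, here is a rough sketch of the behaviour I mean, not a finished implementation. It assumes a plain word-overlap check stands in for real retrieval, and every name in it (`answer`, `best_passage`, `min_overlap`, the tiny stopword list, the `llm` callable) is made up for illustration. The point is only that the “I can’t find that” path is decided before the model gets a chance to make something up.

```python
import re

NOT_FOUND = "The answer cannot be found in the provided document."

# Tiny illustrative stopword list (an assumption, not a real linguistic resource).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "in", "to",
             "who", "what", "when", "where", "which", "how", "did", "does"}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_passage(question, document, min_overlap=0.5):
    """Return the paragraph sharing the most question terms, or None.

    min_overlap is a hypothetical threshold: the fraction of non-stopword
    question terms that must appear in a paragraph for it to count.
    """
    q_terms = tokenize(question) - STOPWORDS
    if not q_terms:
        return None
    best, best_score = None, 0.0
    for passage in document.split("\n\n"):
        score = len(q_terms & tokenize(passage)) / len(q_terms)
        if score > best_score:
            best, best_score = passage.strip(), score
    return best if best_score >= min_overlap else None


def answer(question, document, llm):
    """Ask the model only if the document plausibly contains the answer.

    llm is any callable taking a prompt string and returning a string;
    it is a placeholder for whatever local model you actually run.
    """
    passage = best_passage(question, document)
    if passage is None:
        return NOT_FOUND  # refuse instead of letting the model guess
    prompt = (
        "Answer using ONLY the passage below. If the passage does not "
        f"contain the answer, reply exactly: {NOT_FOUND}\n\n"
        f"Passage:\n{passage}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)
```

Word overlap is crude (it misses synonyms and paraphrases), so treat the threshold as something you would tune for your application, which is exactly the kind of knob I am missing in the existing tools.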
There are several options out there with local LLMs, yet none of them can be configured.
If a reply is bad, all you can do is choose another model. People hope that bigger is better, so they try to stuff huge and even bigger (Petals) models into their computers, but what do those models really do?
They are trained on VAST corpora covering all kinds of topics. I assume you won’t need 99% of those billions of parameters in your entire lifetime.
THIS is where you need to start: limit models by application. If you only search in English, don’t get a model that also contains Urdu.
If you only talk about computer science, don’t get a model that also contains psychology.
Now the problem is that there are no models available that are both specific and good at what they specialize in.
To summarize, we need models that can be specialized: you define your requirements and create a new model based on them, one that is much faster, much smaller, and gives you the right results.
EDIT: Another issue: if the answers don’t satisfy you, would a human give better answers? Is it the model, or is it something else? All you get to tweak is a few parameters like temperature, and that’s it. Besides, a huge issue is model entitlement, the dreaded “This is not morally okay, and I will not”… Shut up, you are living on MY computer, I tell you what you do and don’t do.
Yet I feel like I am tilting at windmills: nobody debates this anywhere, and people just report on the great new model with 10 times more parameters as if that saved humanity.