Hi folks,
I have a custom LLaVA-style architecture that I'd like to run inference on using the generate interface.
I looked around for docs or HOWTOs on what to implement but came up empty. I did look at the LLaVA codebase and tried to imitate some of its functions, but I feel like I'm copying code rather than actually understanding what needs to be implemented and why.
I'm looking for implementation cookbooks or docs, or a better-documented implementation that I can reason through.
Thanks!