RuntimeError: The size of tensor a (48) must match the size of tensor b (64) at non-singleton dimension 0

Is this still an unresolved issue in TGI?

rdaya
In case anyone runs into this, the trick (a bad one) is to set USE_FLASH_ATTENTION=false
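
For anyone launching the server from a script, here is a minimal sketch of that workaround. The helper name `env_without_flash_attention` is hypothetical; only the variable name and value come from the post above. How you start TGI (Docker, the launcher binary, etc.) is left out on purpose, since it depends on your setup.

```python
import os


def env_without_flash_attention(base_env=None):
    """Return a copy of the environment with USE_FLASH_ATTENTION set to "false".

    Hypothetical helper for illustration; it does not touch os.environ itself.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["USE_FLASH_ATTENTION"] = "false"
    return env


# Example: build the environment for a child process.
env = env_without_flash_attention({"PATH": "/usr/bin"})
print(env["USE_FLASH_ATTENTION"])
```

The returned dict can be passed as `env=` to `subprocess.run` with whatever command you normally use to start the server. With Docker, the equivalent would be passing the variable through `-e USE_FLASH_ATTENTION=false`. Expect slower inference with this set, which is why it is a bad trick rather than a fix.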