While running the Llama and Falcon pipelines, I found that Llama-2 is over 30x slower than Falcon at the same size (7B).
Is this expected?
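
For anyone who wants to reproduce the comparison, here is a minimal sketch of how generation speed could be timed in a like-for-like way. It is not the exact code behind the numbers above. `tokens_per_second` and `fake_generate` are hypothetical names used only for illustration, and `fake_generate` stands in for a real pipeline call that returns the generated token IDs.

```python
import time
from typing import Callable, List


def tokens_per_second(
    generate: Callable[[str], List[int]], prompt: str, runs: int = 3
) -> float:
    """Return the average number of generated tokens per second over `runs` calls."""
    generate(prompt)  # warm-up call, excluded from the timing
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def fake_generate(prompt: str) -> List[int]:
    # Hypothetical stand-in for a model call: sleeps briefly, returns 20 token IDs.
    time.sleep(0.01)
    return list(range(20))


if __name__ == "__main__":
    rate = tokens_per_second(fake_generate, "Hello", runs=3)
    print(f"{rate:.1f} tokens/s")
```

Measuring tokens per second rather than wall-clock time per call keeps the comparison fair when the two models generate different numbers of tokens for the same prompt.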