I finally got an RTX 6000 Ada and am trying to compare it with a 3090. I downloaded a 12-billion-parameter language model from Open Assistant, and I get the following results:
7.8 tokens/sec - 3090
5.3 tokens/sec - 6000 Ada
However, if I use only PyTorch with a simple model, the 6000 Ada is much faster.
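For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how a tokens/sec figure can be measured. It is not the exact script behind the numbers above. The names `tokens_per_second`, `generate`, `sync` and `fake_generate` are made up for illustration, and the model call is replaced by a stand-in so the snippet runs without a GPU. On a real GPU run, `sync` would typically be `torch.cuda.synchronize`, so that queued kernels have finished before the clock is read.

```python
import time


def tokens_per_second(generate, n_runs=3, sync=None):
    """Return overall throughput (total tokens / total seconds) over n_runs.

    generate: hypothetical callable that runs one generation and returns
              the number of new tokens it produced.
    sync:     optional callable run right before each clock reading; on a
              GPU this is assumed to be something like torch.cuda.synchronize.
    """
    total_tokens = 0
    total_seconds = 0.0
    for _ in range(n_runs):
        if sync is not None:
            sync()  # make sure earlier work is done before starting the timer
        start = time.perf_counter()
        n_tokens = generate()
        if sync is not None:
            sync()  # make sure this run's work is done before stopping the timer
        total_seconds += time.perf_counter() - start
        total_tokens += n_tokens
    return total_tokens / total_seconds


def fake_generate():
    # Stand-in for a real model call: pretend 20 tokens took about 50 ms.
    time.sleep(0.05)
    return 20


print(f"{tokens_per_second(fake_generate):.1f} tokens/sec")
```

Running the same function on both cards, with the same prompt, the same number of new tokens and a warm-up run first, should make the two figures directly comparable.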
My hardware:
-GIGABYTE LGA1151-v2 H370 Aorus
-MSI 3090 Ventus 24 GB
-PNY RTX 6000 Ada 48 GB
My software setup: