CUDA version conundrum

Hello,

Transformers relies on PyTorch, TensorFlow, or Flax. I typically use the first.

In any case, the latest versions of PyTorch and TensorFlow are, at the time of this writing, compatible with CUDA 11.8.

Lucky me, for CUDA 11.8 is supposed to be the first version to support the RTX 4090 cards.

Well, not fully, apparently:

MapSMtoCores for SM 8.9 is undefined.  Default to use 128 Cores/SM
MapSMtoCores for SM 8.9 is undefined.  Default to use 128 Cores/SM
MapSMtoArchName for SM 8.9 is undefined.  Default to use Hopper
GPU Device 0: "Hopper" with compute capability 8.9

I believe the 4090 to be an Ada Lovelace, not a Hopper.

Will the fact that the card is not correctly identified by CUDA have any effect on resource utilisation and/or performance?

Is there anything we could do about that?

Does anyone know if PyTorch works with a more recent CUDA? Or can the MapSMtoCores and MapSMtoArchName mappings somehow be hard-coded? Or is this completely irrelevant?
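For what it's worth, I believe those two messages come from the helper_cuda.h header shipped with the CUDA samples, where the SM-version-to-cores and SM-version-to-architecture mappings are simple lookup tables with a fallback. A rough Python sketch of that idea (the table entries and function name below are my own reconstruction from published per-architecture specs, not the actual header):

```python
# Sketch of the SM -> cores-per-SM lookup that the CUDA samples
# implement in C in helper_cuda.h. An unknown compute capability
# falls through to a default, which is what produces the
# "MapSMtoCores for SM 8.9 is undefined" warning.

SM_TO_CORES = {
    (7, 0): 64,    # Volta
    (7, 5): 64,    # Turing
    (8, 0): 64,    # Ampere (GA100)
    (8, 6): 128,   # Ampere (GA10x)
    (8, 9): 128,   # Ada Lovelace -- the entry missing for the RTX 4090
    (9, 0): 128,   # Hopper
}

SM_TO_ARCH = {
    (8, 0): "Ampere",
    (8, 6): "Ampere",
    (8, 9): "Ada",
    (9, 0): "Hopper",
}

def sm_to_cores(major: int, minor: int, default: int = 128) -> int:
    """Cores per SM for a compute capability, with a samples-style fallback."""
    try:
        return SM_TO_CORES[(major, minor)]
    except KeyError:
        print(f"MapSMtoCores for SM {major}.{minor} is undefined. "
              f"Default to use {default} Cores/SM")
        return default
```

If that reading is right, hard-coding the 8.9 entries would only fix the samples' reporting; and since Ada's SM 8.9 does, as far as I can tell, have 128 FP32 cores per SM, the 128 fallback happens to be correct anyway and the warning is cosmetic.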

Best,

Ed

Update: for PyTorch, the nightly version is compatible with CUDA 12.1, which fully supports the card and simplifies things considerably.