Difference between enable_model_cpu_offload and device_map

What's the difference between the two? My understanding is that with device_map="auto", the GPU is used first if one is available, and if the model is too large to fit, the rest is offloaded to the CPU once the GPU fills up. Does enable_model_cpu_offload achieve the same goal?
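
For context, here is a minimal sketch of the two calls in question. It assumes the `transformers`, `diffusers`, and `accelerate` packages are installed; the helper function names and the `model_id` argument are hypothetical and only there for illustration.

```python
def load_with_device_map(model_id: str):
    """Load a transformers model with device_map="auto" (placement decided at load time)."""
    # Imported lazily so this sketch can be read and imported without the libraries;
    # device_map="auto" additionally requires the accelerate package.
    from transformers import AutoModelForCausalLM

    # model_id is a placeholder for whichever checkpoint is being loaded.
    return AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")


def load_with_model_cpu_offload(model_id: str):
    """Load a diffusers pipeline and turn on model CPU offload."""
    from diffusers import DiffusionPipeline

    # The pipeline is loaded normally, then offloading is enabled as a separate step.
    pipe = DiffusionPipeline.from_pretrained(model_id)
    pipe.enable_model_cpu_offload()
    return pipe
```

Note that the first option is an argument passed to `from_pretrained`, while the second is a method called on a pipeline that has already been loaded.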
