Using loaded model with accelerate for inference

You can’t use disk offload on CPU; this is only supported on GPU for now.
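
If it helps, here is a rough way to check a device map for this before loading. It is only a sketch: `uses_disk_offload_without_gpu` is a hypothetical helper, not part of accelerate, and the module names are made up. It assumes the usual device map layout, where each module name maps to a GPU index (or a "cuda..." string), "cpu", or "disk".

```python
def uses_disk_offload_without_gpu(device_map):
    """Return True if the map sends modules to "disk" but none to a GPU.

    Hypothetical helper for illustration; accelerate does not ship this.
    """
    devices = set(device_map.values())
    has_disk = "disk" in devices
    # Treat integer indices and "cuda..." strings as GPU devices.
    has_gpu = any(
        isinstance(d, int) or (isinstance(d, str) and d.startswith("cuda"))
        for d in devices
    )
    return has_disk and not has_gpu


# Made-up module names, only to show the two cases.
cpu_only_map = {"model.embed_tokens": "cpu", "model.layers": "disk"}
gpu_map = {"model.embed_tokens": 0, "model.layers": "disk"}

print(uses_disk_offload_without_gpu(cpu_only_map))
print(uses_disk_offload_without_gpu(gpu_map))
```

A map like `cpu_only_map` is the case described above: weights offloaded to disk with no GPU to run them on.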
