Accelerate Utilities
I think that’s the correct workaround. If you want VRAM, RAM, and disk to be treated as a single pool of memory, you should use the Accelerate library.
Setups with less RAM than VRAM are uncommon, so this is rarely reported, but excessive RAM consumption during model loading does occasionally cause problems.
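For reference, here is a rough sketch of how that could look when loading through Transformers with Accelerate installed. The helper names (`build_max_memory`, `load_with_offload`), the memory figures, and the model ID are placeholders I made up for illustration; the `device_map`, `max_memory`, `offload_folder`, and `low_cpu_mem_usage` arguments are the ones `from_pretrained` accepts, but the right values depend on your hardware.

```python
def build_max_memory(gpu_gib, cpu_gib, gpu_index=0):
    """Build a per-device memory budget (hypothetical helper).

    Keys are a GPU index (int) and the string "cpu"; values are size
    strings such as "20GiB". Weights that do not fit in either budget
    are spilled to disk when an offload folder is given.
    """
    return {gpu_index: f"{gpu_gib}GiB", "cpu": f"{cpu_gib}GiB"}


def load_with_offload(model_id, max_memory, offload_folder="offload"):
    """Load a model across GPU, CPU RAM, and disk (hypothetical helper).

    Requires `transformers` and `accelerate` to be installed. The import
    is done lazily so the budget helper above works without them.
    """
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",              # let Accelerate place the layers
        max_memory=max_memory,          # cap VRAM and RAM usage per device
        offload_folder=offload_folder,  # overflow weights go to disk here
        low_cpu_mem_usage=True,         # avoid a full extra copy in RAM while loading
    )


if __name__ == "__main__":
    # Example budget for a machine with more VRAM than RAM (assumed numbers).
    budget = build_max_memory(gpu_gib=20, cpu_gib=8)
    print(budget)
    # Placeholder model ID; replace with the model you are actually loading.
    # model = load_with_offload("your-org/your-model", budget)
```

Setting the `"cpu"` budget below your physical RAM is the part that matters for the low-RAM case, since it pushes anything that does not fit on the GPU to disk rather than into system memory.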