I’ve recently been working with large models and discovered that passing device_map="auto"
is an easy way to spread a large model across GPU devices at load time, without having to
call .to() myself or write much other configuration.
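For context, my understanding is that a device map is just a dictionary mapping module-name prefixes to devices. Here’s a minimal sketch of a hand-written one — the module names below follow GPT-2’s layout and are my assumption; substitute the names from your own model:

```python
# A device_map assigns each top-level submodule to a device:
# an integer GPU index, "cpu", or "disk" for offloading.
device_map = {
    "transformer.wte": 0,      # embeddings on GPU 0
    "transformer.h.0": 0,      # first block on GPU 0
    "transformer.h.1": "cpu",  # overflow block offloaded to CPU
    "lm_head": "disk",         # rarely-touched weights offloaded to disk
}

# Sketch of how it would be passed in place of "auto"
# (commented out since it downloads weights):
# model = AutoModelForCausalLM.from_pretrained("gpt2", device_map=device_map)
```

If that’s roughly right, then "auto" presumably just computes such a dictionary for you from the available memory — but that’s exactly the part I’d like to see documented.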
I want to learn a bit more about how to design device maps, but the documentation doesn’t seem to cover it (I tried following the hyperlink from “Handling Big Models for Inference”, but it leads to a 404 page).
I also tried reading the source code for modeling_utils.PreTrainedModel.from_pretrained,
but I’m having trouble getting a good grasp of it from the code alone.
Is there anywhere else I could look to learn more about how device mapping works?