Why are embedding / pooler layers excluded from pruning comparisons?

Thanks for the tip about torch.sparse: from the docs, it seems to use the COO format, which should also work well :grinning_face_with_smiling_eyes:

And thanks for clarifying the reason for encoding the CSR format by hand - when I find a solution to the torch > 1.5 issue, I’ll expand the text accordingly!
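
For anyone following along, here is a rough sketch of how the two layouts differ on a small pruned matrix. It is plain Python rather than torch.sparse, and the helper names `to_coo` and `to_csr` and the toy `pruned` matrix are made up for illustration, not taken from the original text:

```python
def to_coo(dense):
    """COO: one (row, col, value) triple per non-zero entry."""
    rows, cols, values = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                values.append(v)
    return rows, cols, values


def to_csr(dense):
    """CSR: column indices and values, plus one row pointer per row boundary."""
    indptr, indices, values = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                indices.append(j)
                values.append(v)
        # indptr[i + 1] marks where row i ends in `indices` / `values`
        indptr.append(len(indices))
    return indptr, indices, values


# Hypothetical 3x4 weight matrix after pruning (most entries zeroed out)
pruned = [
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0, -2.0],
]

coo = to_coo(pruned)
csr = to_csr(pruned)
print("COO:", coo)
print("CSR:", csr)
```

COO stores an explicit row index for every non-zero value, while CSR replaces those with a single pointer array of length `n_rows + 1`, so CSR tends to be more compact when a matrix has many non-zeros per row.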