GPT-J-6B - Fine Tuning

Hello,

I am trying to understand the GPT-J architecture and have a question about the fine-tuning code.
During fine-tuning, are any specific layers in the network unfrozen explicitly, or is that handled internally? Could anyone briefly explain how fine-tuning works for GPT-J? Thank you.

Regards,
Balaji