Convert Conv1D to nn.Linear

IdoAmit198 · April 16, 2023, 4:45pm

Hi everyone,

I’m doing research on GPT2 and want to test a few ideas that apply to linear layers.

What’s the problem? The GPT2 model consists of Conv1D layers instead of nn.Linear.

I know that Conv1D is said to be just like a linear layer with the weight transposed, but I’d still love it if someone could help me convert the Hugging Face pre-trained GPT2 model (the one with Conv1D) into an equivalent GPT2 model that uses nn.Linear instead.

Thanks in advance for any help!

#transformers #models #research

NamburiSrinath · January 20, 2024, 6:34pm

Hi @IdoAmit198,

Did you figure out a way? I also want to apply a function I’ve written to this model, but the function currently works only on Linear classes!

Any help is appreciated.

meztech · May 12, 2024, 6:39pm

This is as simple as it seems: set the weight of a linear layer to the transposed weight of the Conv1D layer, and set the bias of the linear layer to the bias of the Conv1D layer.

For example:

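import torch

test_in = torch.rand((1, 1024))
test_c1d = <some pretrained Conv1D layer of size (1024,1024)>
test_lin = torch.nn.Linear(1024, 1024)
# Conv1D stores its weight as (in_features, out_features); nn.Linear expects (out_features, in_features)
test_lin.weight = torch.nn.Parameter(test_c1d.weight.detach().T.contiguous())
# Copy the bias so the two layers do not share the same Parameter object
test_lin.bias = torch.nn.Parameter(test_c1d.bias.detach().clone())

test_out_c1d = test_c1d(test_in)
test_out_lin = test_lin(test_in)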

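Check that the results are equivalent (with a small tolerance, since exact floating-point equality can fail):

torch.allclose(test_out_c1d, test_out_lin, atol=1e-5)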
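
To extend the single-layer trick above to a whole model, one option is to walk the module tree and swap every Conv1D for an nn.Linear. The following is only a sketch under some assumptions: the `Conv1D` class here is a small stand-in written for this example that mirrors the layout described in this thread (weight stored as (in_features, out_features), output = x @ weight + bias), and `conv1d_to_linear` and `replace_conv1d_with_linear` are hypothetical helper names, not part of any library. The demo uses a toy `nn.Sequential` rather than an actual GPT2 checkpoint.

```python
import torch
from torch import nn


class Conv1D(nn.Module):
    """Stand-in for the GPT2-style Conv1D (assumption: weight is (in, out))."""

    def __init__(self, nf, nx):
        super().__init__()
        self.nf = nf  # number of output features
        self.weight = nn.Parameter(torch.empty(nx, nf))
        self.bias = nn.Parameter(torch.zeros(nf))
        nn.init.normal_(self.weight, std=0.02)

    def forward(self, x):
        size_out = x.size()[:-1] + (self.nf,)
        x = torch.addmm(self.bias, x.view(-1, x.size(-1)), self.weight)
        return x.view(size_out)


def conv1d_to_linear(conv):
    """Build an nn.Linear that computes the same function as `conv`."""
    in_features, out_features = conv.weight.shape
    linear = nn.Linear(
        in_features,
        out_features,
        device=conv.weight.device,
        dtype=conv.weight.dtype,
    )
    with torch.no_grad():
        # nn.Linear stores weight as (out, in), so copy the transpose.
        linear.weight.copy_(conv.weight.T)
        linear.bias.copy_(conv.bias)
    return linear


def replace_conv1d_with_linear(module):
    """Recursively replace every Conv1D child of `module`, in place."""
    for name, child in list(module.named_children()):
        if isinstance(child, Conv1D):
            setattr(module, name, conv1d_to_linear(child))
        else:
            replace_conv1d_with_linear(child)
    return module


# Toy demo: a small MLP built from Conv1D layers.
torch.manual_seed(0)
block = nn.Sequential(Conv1D(8, 4), nn.GELU(), Conv1D(4, 8))
x = torch.rand(2, 3, 4)

expected = block(x)
replace_conv1d_with_linear(block)
converted_out = block(x)

no_conv1d_left = all(not isinstance(m, Conv1D) for m in block.modules())
outputs_match = torch.allclose(expected, converted_out, atol=1e-6)
```

For a real Hugging Face GPT2 model, the same idea should apply by dropping the stand-in class and checking against the library's own Conv1D class instead (I believe it can be imported from `transformers.pytorch_utils`, but please verify that for your installed version), then calling the helper on the loaded model and comparing outputs before and after on a sample input.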
