How to Reshape Multi-Head Self-Attention Output into a Shape That Can Be Fed to a Convolution Layer

Hello, I'm running into an error with the following setup:

The output of MHSA (multi-head self-attention) is as follows:

torch.Size([20, 197, 768])
  • 20 for batch size
  • 197 for sequence length (originally 196; it became 197 after adding the class token)
  • 768 for embedding dimension
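
For reference, the shapes involved can be reproduced with a dummy tensor (a minimal sketch; the variable names are mine, not from the model):

```python
import torch

batch_size, seq_len, embed_dim = 20, 197, 768
x = torch.randn(batch_size, seq_len, embed_dim)  # stand-in for the MHSA output
print(x.shape)  # torch.Size([20, 197, 768])
```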

I want to reshape it to fit the format below in order to feed it to a convolutional layer:

torch.Size([batch_size, channels, height, width])
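
For context, this is the layout nn.Conv2d consumes; a minimal sketch (the channel counts and kernel size here are placeholders, not from my actual model):

```python
import torch.nn as nn

# Conv2d expects input of shape [batch_size, in_channels, height, width]
conv = nn.Conv2d(in_channels=768, out_channels=256, kernel_size=3, padding=1)
```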

I’ve attempted to achieve this by adding a new dimension using the following approach:

x = x.unsqueeze(1).transpose(1, 3)  # [20, 197, 768] -> [20, 1, 197, 768] -> [20, 768, 197, 1]

This successfully allows feeding to the convolutional layer (see the sketch below). However, I'm unsure whether this approach is correct, so please correct me if it's not.
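
Here is that approach written out step by step, assuming x is the MHSA output from above and a 1x1 convolution just for demonstration:

```python
import torch
import torch.nn as nn

x = torch.randn(20, 197, 768)  # MHSA output
x = x.unsqueeze(1)             # [20, 1, 197, 768]
x = x.transpose(1, 3)          # [20, 768, 197, 1]

conv = nn.Conv2d(in_channels=768, out_channels=768, kernel_size=1)
out = conv(x)
print(out.shape)               # torch.Size([20, 768, 197, 1])
```

It runs, but it treats the sequence as a 197x1 "image", which is part of why I'm unsure it's meaningful.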

Currently, I’m trying a different approach:

import math

new_size = int(math.sqrt(sequence_length))
x = x.transpose(1, 2).view(batch_size, embed_dim, new_size, new_size)

This resulted in an error stating that the shape is invalid for an input of size (some_number). My analysis: the sequence length (197) is not a perfect square, so math.sqrt(197) ≈ 14.04, which int() truncates to 14. Since view requires the element counts to match, and batch_size * 768 * 14 * 14 does not equal batch_size * 197 * 768, the call fails.
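
The element counts confirm the mismatch (a quick arithmetic check using my actual sizes):

```python
import math

batch_size, seq_len, embed_dim = 20, 197, 768
new_size = int(math.sqrt(seq_len))                   # math.sqrt(197) ≈ 14.04 -> 14
print(new_size)                                      # 14
print(batch_size * embed_dim * new_size * new_size)  # 3010560 <- what view() wants
print(batch_size * seq_len * embed_dim)              # 3025920 <- what the tensor has
```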

Is my analysis correct? How can I resolve this issue? And is there a better approach?