Implementing a custom Attention Transformer

iakarshu · September 3, 2021, 4:54am

Hello everyone, I am trying to implement a custom attention transformer whose attention mechanism is described on page 4 of this link. The authors used Hugging Face for their implementation, but I am not sure how to approach the problem or how to implement custom attention with Hugging Face. Can anybody guide me on how to go about this? Thanks.

lewtun · September 3, 2021, 8:24pm

Hey @iakarshu my best guess is that the authors implemented DocFormer from scratch, so as far as I can tell you can’t do some clever subclassing of an existing model to tweak the attention layers.

Having said that, you could look at the implementation of LayoutLMv2, which seems to share a similar approach, and you can use this template to get all the basic modeling files.
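As a rough illustration of what a from-scratch attention layer could look like in PyTorch, here is a minimal sketch of multi-head self-attention with an extra additive term on the attention scores. This is not DocFormer's attention and not code from the LayoutLMv2 implementation; the class name `BiasedSelfAttention` and the `attention_bias` argument are hypothetical names chosen for the example, and the bias simply stands in for whatever custom term a paper adds.

```python
import math

import torch
from torch import nn


class BiasedSelfAttention(nn.Module):
    """Multi-head self-attention with an optional additive bias on the scores.

    Hypothetical example module, not taken from any published model.
    """

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        if hidden_size % num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, hidden_size)

    def _split_heads(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        batch, seq, _ = x.shape
        return x.view(batch, seq, self.num_heads, self.head_dim).transpose(1, 2)

    def forward(self, hidden_states, attention_bias=None):
        q = self._split_heads(self.query(hidden_states))
        k = self._split_heads(self.key(hidden_states))
        v = self._split_heads(self.value(hidden_states))

        # Scaled dot-product scores: (batch, heads, seq, seq)
        scores = q @ k.transpose(-1, -2) / math.sqrt(self.head_dim)
        if attention_bias is not None:
            # The custom term goes here; it must broadcast to the scores' shape.
            scores = scores + attention_bias

        probs = scores.softmax(dim=-1)
        context = probs @ v  # (batch, heads, seq, head_dim)

        # Merge the heads back: (batch, seq, hidden)
        batch, _, seq, _ = context.shape
        context = context.transpose(1, 2).reshape(batch, seq, -1)
        return self.out(context)


layer = BiasedSelfAttention(hidden_size=64, num_heads=4)
x = torch.randn(2, 10, 64)
bias = torch.zeros(2, 1, 10, 10)  # placeholder bias, shared across heads
out = layer(x, attention_bias=bias)
print(out.shape)
```

The output keeps the input shape, so a layer like this can sit inside a standard encoder block in place of the usual attention module. The only part that would change for a specific paper is how the scores are computed before the softmax.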

Do you know if AWS open-sourced the pretrained weights of DocFormer? Without them, you might need a lot of compute to build a useful model.

Hope that helps!

iakarshu · September 4, 2021, 3:02am

Hey @lewtun, thanks a lot for sharing this. I will focus on implementing it from scratch and learn from the LayoutLMv2 implementation. As for compute, I have access to some resources (an NVIDIA DGX). I have been searching for open-source DocFormer code but have not found any. I emailed the author and they declined to share the code, so I don’t think they have open-sourced it. Again, thanks a lot for replying.

lewtun · September 6, 2021, 8:35am

Hey @nielsr, is DocFormer currently on your roadmap for transformers?

@iakarshu is thinking about having a go at implementing and pretraining it (because the authors didn’t release code or weights), so I thought it would be good to double-check that the two of you don’t end up doing the same work twice.

nielsr · September 6, 2021, 9:48am

No, it’s not on my list, but it seems interesting.

However, if there are no pre-trained weights available (and even no code), then there’s a low chance for me to add it to the library.

iakarshu · September 6, 2021, 10:14am

@nielsr @lewtun thanks a lot. I will do it then, and I will ask the community if I get stuck. I shall begin coding now.
