I agree with Iz here.
As long as the code a model inherits lives in a single file, it stays readable, and personally copy-pasting doesn't make much sense to me.
And if the models share code, then adding new functionality (gradient checkpointing, new heads) to the base model gives me the same functionality for free in the sub-classed models. This actually helps a lot with experimentation. For example, I wanted to try MBart for sequence classification, and since it inherits completely from BART, all I had to do was subclass BartForSequenceClassification.
Without this, I would have had to copy-paste the head and test it again, which slows things down.
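To make the point concrete, here's a minimal toy sketch of the pattern (stand-in classes, not the real transformers code): a task head defined on the base model is inherited unchanged by the derived model, so no head needs to be copy-pasted or re-tested.

```python
# Toy illustration of the inheritance pattern -- the class names mirror
# transformers, but the bodies are stand-ins, not the real implementation.

class BartModel:
    def forward(self, x):
        return [v * 2 for v in x]  # stand-in for the real encoder/decoder


class BartForSequenceClassification(BartModel):
    num_labels = 3

    def forward(self, x):
        hidden = super().forward(x)
        return sum(hidden) % self.num_labels  # stand-in classification head


class MBartModel(BartModel):
    pass  # inherits the full BART implementation


# Because MBart inherits BART completely, reusing the classification head
# is a one-line subclass: the head (and its tests) come for free.
class MBartForSequenceClassification(BartForSequenceClassification, MBartModel):
    pass


print(MBartForSequenceClassification().forward([1, 2, 3]))
```

The MRO routes the head's `super().forward()` through `MBartModel`, so the derived model gets the head's behavior without any duplicated code.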
Another example: I wanted to experiment with Camembert in EncoderDecoder, and since it inherits from Roberta, which already worked with EncoderDecoder, it was a very simple and fast change without requiring much extra code or tests, since the tests also came free with the base model.
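The same idea in a toy sketch (hypothetical stand-in classes, not the real transformers API): a wrapper written against the parent class only relies on the parent's interface, so any subclass plugs in unchanged.

```python
# Toy illustration: the wrapper targets RobertaModel's interface, so a
# subclass like CamembertModel works with it out of the box.

class RobertaModel:
    def encode(self, text):
        return len(text)  # stand-in for the real encoder


class CamembertModel(RobertaModel):
    pass  # inherits everything from RobertaModel


class EncoderDecoder:
    def __init__(self, encoder):
        # The wrapper only assumes the RobertaModel interface.
        if not isinstance(encoder, RobertaModel):
            raise TypeError("encoder must provide the RobertaModel interface")
        self.encoder = encoder

    def run(self, text):
        return self.encoder.encode(text) + 1  # stand-in decoding step


# CamembertModel needs no extra code to be used here.
print(EncoderDecoder(CamembertModel()).run("abc"))
```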
And IMO in some cases such refactoring might even introduce unintended effects: IIRC, after Longformer was refactored to remove the Roberta abstraction, a major slowdown was introduced; @beltagy might remember this.