Transformer-based language models such as BERT and GPT-x have pushed the state of the art on many tasks, including language modeling. However, closer examination of these models has revealed biases toward specific genders, events, etc. For instance, in the sentence “David is driving a car”, attention is spread almost evenly across the tokens. This does not hold for “Mary is washing the dishes”, where the model focuses heavily on the name “Mary”. We believe that refining the loss function and modifying the training schedule might resolve this artifact. We expect a nearly bias-free model to assign roughly equal probabilities to equally plausible candidates for a masked token.
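To make this concrete, here is a minimal probe sketch, assuming the HuggingFace `transformers` library; the templates and candidate names are our own illustrative choices. It compares the probabilities that bert-base-uncased assigns to two plausible names at a masked position:

```python
# A minimal probing sketch (assumes the HuggingFace `transformers` library).
# Templates and candidate names below are illustrative choices.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def candidate_probs(template: str, candidates: list[str]) -> dict[str, float]:
    """Return P(candidate | context) for each single-token candidate at [MASK]."""
    inputs = tokenizer(template, return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    probs = logits.softmax(dim=-1)
    return {c: probs[tokenizer.convert_tokens_to_ids(c)].item() for c in candidates}

# A nearly bias-free model should assign roughly equal probability to both names.
print(candidate_probs("[MASK] is driving a car.", ["david", "mary"]))
print(candidate_probs("[MASK] is washing the dishes.", ["david", "mary"]))
```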
Related work: Do Neural Language Models Overcome Reporting Bias? (https://aclanthology.org/2020.coling-main.605.pdf)
The model will be trained on English text.
We intend to use the bert-base-uncased model.
We aim to use the datasets commonly used for pre-training BERT: BookCorpus and English Wikipedia (a minimal loading sketch follows the links below).
Possible links to publicly available datasets include:
- BookCorpus: https://github.com/soskek/bookcorpus
- English Wikipedia: link omitted due to new-user posting limits.
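As a rough sketch of how we would load the model and data, assuming the HuggingFace `datasets` and `transformers` libraries (the hub dataset names and the Wikipedia snapshot identifier below are our assumptions, not fixed choices):

```python
# A loading sketch (assumes HuggingFace `datasets` and `transformers`; the
# dataset names and the Wikipedia snapshot identifier are our assumptions).
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForMaskedLM

bookcorpus = load_dataset("bookcorpus", split="train")
wikipedia = load_dataset("wikipedia", "20220301.en", split="train")

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
```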
We have prior experience fine-tuning a BERT model for sentiment analysis and language modeling, and we hope that small modifications to our existing code will make training for the new objective possible.
There are two central challenges in this project. First, we need to devise a suitable loss function that effectively penalizes bias during training. Second, if redefining the loss alone does not help, we may need to train the model from scratch with newly added tokens in addition to the new loss function.
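One possible direction for the loss, sketched below purely as an assumption on our part rather than a settled design, is to keep the standard MLM cross-entropy and add a regularizer that penalizes unequal probabilities over a set of equally plausible candidate tokens at the masked position:

```python
# A sketch of one candidate bias-aware loss (our own assumption, not a settled
# design): standard MLM cross-entropy plus a term pushing the model toward
# equal probabilities for a set of equally plausible candidate tokens.
import torch
import torch.nn.functional as F

def bias_aware_loss(logits, labels, candidate_ids, mask_positions, lam=0.1):
    """
    logits:          (batch, seq_len, vocab) masked-LM logits
    labels:          (batch, seq_len) with -100 on unmasked positions
    candidate_ids:   (num_candidates,) ids of equally plausible fillers
    mask_positions:  (batch,) index of the [MASK] token in each sequence
    """
    # Standard masked-LM cross-entropy.
    mlm = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1),
                          ignore_index=-100)

    # Renormalized distribution over the candidate set at the masked position.
    batch_idx = torch.arange(logits.size(0))
    cand_logits = logits[batch_idx, mask_positions][:, candidate_ids]
    cand_logp = F.log_softmax(cand_logits, dim=-1)

    # KL(uniform || candidate distribution): zero exactly when all candidates
    # receive equal probability.
    uniform = torch.full_like(cand_logp, 1.0 / cand_logp.size(-1))
    fairness = F.kl_div(cand_logp, uniform, reduction="batchmean")

    return mlm + lam * fairness
```

The KL term vanishes exactly when all candidates receive equal probability, which matches the expectation stated above; the weight `lam` would need tuning.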
We will test the model on the task of masked language modeling and hope to see bias-free predictions for masked tokens; a sketch of a simple bias score follows.
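To make “bias-free” measurable, one option (again our own illustrative choice, not an established metric) is to aggregate the absolute log-probability gap between paired candidates over a set of templates, reusing the `candidate_probs` probe sketched earlier:

```python
# A sketch of a simple bias score (our own illustrative metric): the mean
# absolute log-probability gap between paired candidates across templates.
# Reuses the `candidate_probs` function from the probing sketch above.
import math

def bias_score(candidate_probs, templates, pair):
    gaps = []
    for t in templates:
        p = candidate_probs(t, list(pair))
        gaps.append(abs(math.log(p[pair[0]]) - math.log(p[pair[1]])))
    return sum(gaps) / len(gaps)

templates = ["[MASK] is driving a car.", "[MASK] is washing the dishes."]
print(bias_score(candidate_probs, templates, ("david", "mary")))  # 0 == unbiased
```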