Hello,
What is the best way to fine-tune or pre-train a model (BERT, T5, BART, or something else) so that it converts text into extracted JSON, e.g.:
My name is John
→ { name: John }
Assuming my data comes in the form of:
# train.tsv
text,extraction
name: Jill,name:Jill
My name is Jack,name:Jack
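For context, the preprocessing that produces the tokenized_datasets used further down is roughly the following (a simplified sketch; the max_length values are placeholders, and the file is actually comma-separated despite the .tsv name, so I load it with the csv builder):

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")  # same checkpoint as below

# Despite the .tsv extension, the file above is comma-separated, so load it as csv.
raw = load_dataset("csv", data_files="train.tsv")["train"]
raw_datasets = raw.train_test_split(test_size=0.1)  # produces "train" and "test" splits

def preprocess(examples):
    # Tokenize the input sentences and the target extraction strings.
    model_inputs = tokenizer(examples["text"], max_length=64, truncation=True)
    labels = tokenizer(text_target=examples["extraction"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized_datasets = raw_datasets.map(
    preprocess, batched=True, remove_columns=raw_datasets["train"].column_names
)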
So far I’ve tried:
from transformers import (
    AutoModelForSeq2SeqLM, AutoTokenizer,
    Seq2SeqTrainingArguments, DataCollatorForSeq2Seq, Seq2SeqTrainer,
)

model_checkpoint = "google/mt5-small"
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

# ... preprocessing (sketched above) and compute_metrics skipped for brevity

args = Seq2SeqTrainingArguments(...)
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
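After training, my assumption is that inference would look roughly like this (untested sketch):

inputs = tokenizer("My name is John", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# hoped-for output: name:John  (which I would then parse/wrap into JSON downstream)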
Is it a good idea to use an existing Seq2Seq model? Or is there a way to transfer the weights of the Transformer layers and discard the heads?
The reason I ask is that while the encoder weights are clearly valuable, I suspect the decoder weights and LM head may actually be counterproductive, since they were pre-trained on natural language rather than JSON (or any structured format). My intuition is to re-initialize them randomly and train those weights from scratch.
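If that intuition is worth testing, my rough idea (untested; the attribute names are just what mT5 exposes) is to build a randomly initialized model from the config and copy over only the pretrained encoder:

from transformers import AutoConfig, MT5ForConditionalGeneration

checkpoint = "google/mt5-small"
pretrained = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Fresh model with the same architecture but randomly initialized weights.
config = AutoConfig.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration(config)

# Copy only the pretrained encoder. Its embed_tokens is the shared embedding
# matrix, so the decoder's input embeddings come along too, while the decoder
# blocks stay randomly initialized. mT5 doesn't tie lm_head to the embeddings,
# so the output head stays random as well.
model.encoder.load_state_dict(pretrained.encoder.state_dict())

# `model` would then go into the same Seq2SeqTrainer setup as above.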