Hi,
Apologies if this subthread is the wrong place for this kind of suggestion.
I am trying to get the posted example to work, but it seems there are a few issues.
Example:
from transformers import TapasTokenizer, TapasForMaskedLM
import pandas as pd
tokenizer = TapasTokenizer.from_pretrained('google/tapas-base')
model = TapasForMaskedLM.from_pretrained('google/tapas-base')
data = {
    'Actors': ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"],
    'Age': ["56", "45", "59"],
    'Number of movies': ["87", "53", "69"]
}
table = pd.DataFrame.from_dict(data)
inputs = tokenizer(table=table, queries="How many [MASK] has George [MASK] played in?", return_tensors="pt")
labels = tokenizer(table=table, queries="How many movies has George Clooney played in?", return_tensors="pt")["input_ids"]
outputs = model(**inputs, labels=labels)
last_hidden_states = outputs.last_hidden_state
In particular, there seems to be a mismatch in dimensionality between the inputs and the labels, causing the following ValueError:
ValueError: Expected input batch_size (32) to match target batch_size (34).
I think this arises from the special [CLS]/[SEP] tokens, or possibly because the masked words ("movies", "Clooney") tokenize into a different number of word pieces than the single [MASK] tokens, but this is a quick guess.
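For reference, this is roughly how I made the two encodings the same length, simply by padding both to the model's maximum length; I am not sure this is the intended usage:
# Workaround attempt: pad both encodings to the same fixed length so the
# sequence lengths match (padded label positions would presumably also need
# to be set to -100 so the loss ignores them).
inputs = tokenizer(table=table,
                   queries="How many [MASK] has George [MASK] played in?",
                   padding="max_length", return_tensors="pt")
labels = tokenizer(table=table,
                   queries="How many movies has George Clooney played in?",
                   padding="max_length", return_tensors="pt")["input_ids"]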
Additionally, if I adjust the labels/inputs to have equal dimensions as above, I then get a second error when retrieving the last hidden state:
AttributeError: 'MaskedLMOutput' object has no attribute 'last_hidden_state'
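Based on the MaskedLMOutput documentation, I would have expected something like the following instead (assuming the hidden states are what the example meant by last_hidden_state), but I am not sure this is what was intended:
outputs = model(**inputs, labels=labels, output_hidden_states=True)
logits = outputs.logits                          # prediction scores over the vocabulary
last_hidden_states = outputs.hidden_states[-1]   # last layer's hidden states, only returned when output_hidden_states=True is set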
Could you please advise on what the expected behaviour is?
Thanks