Proper use of TAPAS model?

I’m trying to use the TAPAS model (specifically, google/tapas-base) to generate an embedding for a table from Wikipedia. I have data that looks like this:

table_data = {
    'column_header': ['name',
                      'nationality',
                      'birth_date',
                      'article_title',
                      'occupation'],
    'content': ['walter extra',
                'german',
                '1954',
                'walter extra\n',
                'aircraft designer and manufacturer'],
}

And I’m generating a representation like this:

import pandas as pd
import torch
import transformers

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = transformers.TapasTokenizer.from_pretrained("google/tapas-base")
model = transformers.TapasModel.from_pretrained("google/tapas-base")
model.to(device)
# One row whose columns are the headers; TapasTokenizer expects string cells.
df = pd.DataFrame([table_data['content']], columns=table_data['column_header']).astype(str)
inputs = tokenizer(
     table=df,
     padding="max_length",
     return_tensors="pt"
)
with torch.no_grad():
    # Move the tokenized inputs to the same device as the model.
    output = model(**{k: v.to(device) for k, v in inputs.items()})
encoding = output.pooler_output.squeeze(dim=0).cpu()

Also I’m getting the following warning:

TAPAS is a question answering model but you have not passed a query. Please be aware that the model will probably not behave correctly.

but I think that’s ok, since I just want to use TAPAS to generate an embedding for a table; I’m not doing question answering.

However, I’m observing that the generated representation (`encoding` in my code) is not very useful; it’s actually significantly less useful for my task than just serializing the table to a string and passing it to bert-base. That doesn’t seem right, so I suspect I’m using TAPAS incorrectly, but I can’t see what I’m doing wrong. Can someone familiar with TAPAS take a look at my usage and let me know if anything looks out of order?
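For context, the bert-base baseline just flattens the table into one string, roughly like this (the exact "header: value" format and separator are arbitrary choices on my part):

```python
# Serialize the table into a single flat string for a plain BERT encoder.
table_data = {
    'column_header': ['name', 'nationality', 'birth_date',
                      'article_title', 'occupation'],
    'content': ['walter extra', 'german', '1954',
                'walter extra\n', 'aircraft designer and manufacturer'],
}

# "header: value" pairs joined with "; "; strip() removes stray newlines.
text = "; ".join(
    f"{header}: {value.strip()}"
    for header, value in zip(table_data['column_header'], table_data['content'])
)
print(text)
```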

cc @nielsr might be able to help here

@nielsr do you have any idea what I’m doing wrong?