For GPT-3, there’s the logit_bias parameter. It allows you to control how likely or unlikely the model is to pick a particular token for the output.
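(For reference, in the OpenAI Python library that looks roughly like this; the token id and bias value below are just made-up examples:)

```python
import openai

# logit_bias maps token ids (as strings) to a bias between -100 and 100.
# The id here is a placeholder; real ids would come from the GPT-3 tokenizer.
response = openai.Completion.create(
    model="text-davinci-003",
    prompt="Get all news stories from 3 days ago.",
    logit_bias={"50256": -100},  # strongly discourage this particular token
)
```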
Can I do something similar with transformers, particularly a T5 model?
I am trying to make the T5 translate human language into the syntax used by an API.
For example:
Human: Get all news stories from 3 days ago.
API: stories['D-3']
My problem is that, even after training with 9 million examples, the model keeps making up the slug part of the output.
For example:
Correct output: stories['D-3']
Actual output: news stories['D-3']
I think I could solve that problem if I could effectively reduce the model’s output vocabulary to the tokens allowed by the API. In the above example, neither a space nor the word “news” would be allowed, so the model would have to pick the next-best tokens and would, I guess, come up with the correct stories slug.
Now, I cannot limit the actual vocabulary or train the model from scratch, because I rely on the model’s vast vocabulary from its pretraining to understand my human-language inputs. I need the model to understand human language; it’s only the output that I want to restrict to the simple syntax my API supports.
So what I envision doing is…
- Run all of my training data’s labels through the tokenizer, to build a set of tokens allowed in the output.
- Subtract this set of allowed tokens from the set of tokens in the model’s vocabulary, to get a set of tokens that should be suppressed in the output.
- Tell the model to apply a logit_bias of, say, -10 to any of the tokens in the suppressed_tokens set (see the sketch after this list).
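Here is an untested sketch of what I’m imagining, using a custom LogitsProcessor with transformers. The t5-base checkpoint, the train_labels list, and the -10 value are just placeholders for my real setup:

```python
import torch
from transformers import (
    T5ForConditionalGeneration,
    T5Tokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # placeholder checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Step 1: run the training labels through the tokenizer to collect the
# allowed output tokens. `train_labels` stands in for my real label strings.
train_labels = ["stories['D-3']", "stories['D-7']"]
allowed_ids = set()
for label in train_labels:
    allowed_ids.update(tokenizer(label).input_ids)

# Step 2: every other token in the vocabulary goes into the suppressed set.
suppressed_ids = [i for i in range(len(tokenizer)) if i not in allowed_ids]

# Step 3: a logits processor that applies a fixed negative bias to the
# suppressed tokens at every decoding step (a soft version of logit_bias).
class SuppressedTokensBias(LogitsProcessor):
    def __init__(self, token_ids, bias=-10.0):
        self.token_ids = torch.tensor(token_ids, dtype=torch.long)
        self.bias = bias

    def __call__(self, input_ids, scores):
        scores[:, self.token_ids.to(scores.device)] += self.bias
        return scores

inputs = tokenizer("Get all news stories from 3 days ago.", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    logits_processor=LogitsProcessorList([SuppressedTokensBias(suppressed_ids)]),
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The -10 soft bias (rather than setting the suppressed logits to -inf) is deliberate, so the model could still fall back to a suppressed token if nothing in the allowed set fits at all.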
Any ideas on how to do this, or whether this is the right approach?