Implementation of Stopping Criteria List

I was trying to use code from different replies and it did not work for me, so I had to inspect the tensors and figure out why the if check never fired. The tokenized stop sequence was a tensor like [[259, 13, 13]] (note the extra batch dimension and the extra leading token), while the slice of input_ids it was compared against was only [13, 13], so the two never matched. The length was also being taken from the wrong dimension: you have to use the first element of stop_ids and skip its leading token. The sketch just below shows what the tokenizer actually returns, followed by the code that worked for me.
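A minimal sketch of that sanity check (assuming the same TheBloke/Llama-2-7B-GPTQ tokenizer mentioned in the note at the end of this post):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TheBloke/Llama-2-7B-GPTQ")
ids = tok(" \n\n", return_tensors='pt', add_special_tokens=False)['input_ids']
print(ids.shape)  # torch.Size([1, 3]): a batch dimension plus an extra leading token
print(ids[0])     # tensor([259, 13, 13]); ids[0][1:] is the part worth matching

That is why the working code below indexes stop_ids[0] and slices off its first token.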

from torch import LongTensor, FloatTensor, eq
from transformers import StoppingCriteria, StoppingCriteriaList

# tokenizer and device are assumed to be defined already
stop_list = [" \n\nQuestion:", " \nHuman:", " \n\n"]
stop_token_ids = [tokenizer(x, return_tensors='pt', add_special_tokens=False)['input_ids'] for x in stop_list]
stop_token_ids = [LongTensor(x).to(device) for x in stop_token_ids]


class StopOnTokens(StoppingCriteria):
    def __call__(self, input_ids: LongTensor, scores: FloatTensor, **kwargs) -> bool:
        for stop_ids in stop_token_ids:
            # Compare the tail of the generated sequence against the stop
            # sequence, skipping the stop sequence's leading token.
            print(f"Testing {input_ids[0][-len(stop_ids[0])+1:]} against {stop_ids[0][1:]}")
            if eq(input_ids[0][-len(stop_ids[0])+1:], stop_ids[0][1:]).all():
                return True
        return False


stopping_criteria = StoppingCriteriaList([StopOnTokens()])

Here is the output:
Testing tensor([13, 13], device='cuda:0') against tensor([13, 13], device='cuda:0')
and on the double newline my model stops generating.
Note: I am using TheBloke/Llama-2-7B-GPTQ for both text generation and tokenization.
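
For completeness, here is a minimal sketch of how the criteria plug into generation (assuming model and tokenizer are already loaded; the prompt is just a placeholder):

prompt = "Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors='pt').to(device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,
    stopping_criteria=stopping_criteria,  # the StoppingCriteriaList defined above
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))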
