Increasing the token limit for long strings with knkarthick/MEETING_SUMMARY

Cross-posting from the Discord since I'm not exactly sure where this question should go.

knkarthick/MEETING_SUMMARY · Hugging Face link for context

I've been trying to work with this model, but I can't figure out how to increase the token limit for a single input string. I have very limited experience with the high-level pipelines and customizing them, so this probably has a simple solution that's going over my head.

Whenever I try to run the example that begins with "Hi, I'm David and I'm supposed to be an industrial designer. Um, I just got the…" on my local machine using the summarization pipeline object, I get a warning that says:

Token indices sequence length is longer than the specified maximum sequence length for this model (3398 > 1024). Running this sequence through the model will result in indexing errors

and then the code fails with `IndexError: index out of range in self`.
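For reference, here's roughly what I'm running, plus the truncation workaround I've seen mentioned elsewhere. `summarize_long` is just my own wrapper name, and I'm not certain that `truncation=True` is the right call-time parameter for the pipeline, so treat this as a sketch:

```python
from typing import Callable


def summarize_long(text: str, summarizer: Callable, **gen_kwargs):
    """Call a summarization pipeline with truncation enabled.

    The idea (unverified): truncation=True at call time should make the
    pipeline's tokenizer cut the input at the model's maximum length
    (1024 tokens for this BART-based model) instead of producing an
    over-long id sequence that later crashes with
    'IndexError: index out of range in self'.
    """
    return summarizer(text, truncation=True, **gen_kwargs)
```

which I'd then use with the actual model like:

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="knkarthick/MEETING_SUMMARY")
print(summarize_long(long_meeting_transcript, summarizer))
```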

What confuses me, though, is that when I run the same example through the inference widget on the actual model page, it is able to generate a summary. Is there some way of dynamically increasing the maximum sequence length past 1024 tokens? Is it done through custom parameters, a custom tokenizer, or a custom model? If it's parameters, when do I pass them: when I initialize the summarizer object, or every time I call the summarizer on text? Any help or guidance here would be greatly appreciated.
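The only other approach I can think of is splitting the input into chunks that fit under the limit, summarizing each chunk, and joining the results. A rough sketch of what I mean (`chunk_text` is a hypothetical helper, and word counts are only a crude proxy for token counts; a real version would measure lengths with the model's tokenizer):

```python
def chunk_text(text: str, max_words: int = 700):
    """Split text into word-based chunks.

    700 words is a guess at what stays under 1024 BART tokens;
    counting with the model's own tokenizer would be more accurate.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]
```

Each chunk could then be passed to the summarizer separately and the per-chunk summaries concatenated, though I don't know if that's how the model-page widget handles long inputs.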