kun1017
January 13, 2025, 12:48pm
1
I want to use a Stable Diffusion model, but prompts are limited to 77 tokens. How can I enter a longer prompt?
There are two ways to use long prompts: a community pipeline or prompt embeddings. There are several community pipelines, but for long prompts you can use `lpw_stable_diffusion` or `lpw_stable_diffusion_xl`.
The embedding method is a little more involved, so I recommend the community pipeline.
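A minimal sketch of the community-pipeline route. The model ID and dtype are assumptions (any SDXL checkpoint should work); the heavy imports are deferred into the function so the sketch can be defined without `diffusers` installed, and the download/GPU steps are left as comments:

```python
def load_long_prompt_pipeline(model_id="stabilityai/stable-diffusion-xl-base-1.0"):
    """Load an SDXL pipeline via the lpw_stable_diffusion_xl community
    pipeline, which splits prompts longer than 77 tokens into chunks and
    concatenates their embeddings before they reach the UNet.
    """
    # Deferred imports: this is just a sketch, so defining the function
    # does not require diffusers/torch to be present.
    import torch
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(
        model_id,
        custom_pipeline="lpw_stable_diffusion_xl",
        torch_dtype=torch.float16,
    )

# Usage (downloads the checkpoint and needs a GPU, so not run here):
# pipe = load_long_prompt_pipeline().to("cuda")
# image = pipe(prompt=some_prompt_longer_than_77_tokens).images[0]
```

The `custom_pipeline` argument tells `DiffusionPipeline.from_pretrained` to fetch the pipeline class from the community pipelines folder instead of using the default one; for SD 1.x models you would pass `custom_pipeline="lpw_stable_diffusion"` instead.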
GitHub issue in `huggingface/diffusers` (opened 04:38AM, 27 Jan 23 UTC; closed 03:03PM, 06 Mar 23 UTC; labeled stale):
## Description of the problem
CLIP has a 77 token limit, which is much too small for many prompts.
Several GUIs have found a way to overcome this limit, but not the `diffusers` library.
## The solution I'd like
I would like `diffusers` to be able to run longer prompts and overcome the 77 token limit of CLIP for any model, much like the [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui/) already does.
## Alternatives I've considered
* I tried reverse-engineering the prompt interpretation logic from one of the other GUIs out there (not sure which one), but I couldn't find the code responsible.
* I tried running [BAAI/AltDiffusion](https://huggingface.co/BAAI/AltDiffusion/tree/main) in `diffusers`, which uses [AltCLIP](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP) instead of CLIP. Since AltCLIP's text encoder has a `max_position_embeddings` value of 514 instead of 77, I had hoped I could simply replace the text encoder and tokenizer of my models with those of BAAI/AltDiffusion to overcome the 77 token limit, but I [couldn't get BAAI/AltDiffusion to work in diffusers](https://github.com/huggingface/diffusers/issues/2135).
## Additional context
This is how AUTOMATIC1111's web UI overcomes the token limit, according to their documentation:
> Typing past standard 75 tokens that Stable Diffusion usually accepts increases prompt size limit from 75 to 150. Typing past that increases prompt size further. This is done by breaking the prompt into chunks of 75 tokens, processing each independently using CLIP's Transformers neural network, and then concatenating the result before feeding into the next component of stable diffusion, the Unet.
>
> For example, a prompt with 120 tokens would be separated into two chunks: first with 75 tokens, second with 45. Both would be padded to 75 tokens and extended with start/end tokens to 77. After passing those two chunks through CLIP, we'll have two tensors with shape of (1, 77, 768). Concatenating those results in a (1, 154, 768) tensor that is then passed to the Unet without issue.
>
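The chunk-and-concatenate scheme quoted above can be sketched in plain Python/NumPy. The encoder below is a zero-returning stand-in for CLIP's text model, and the BOS/EOS/pad token IDs are CLIP's, used here only for illustration:

```python
import numpy as np

CHUNK = 75   # prompt tokens per chunk
CTX = 77     # CHUNK plus start/end tokens
DIM = 768    # hidden size of CLIP ViT-L/14 (as in the quoted example)

def fake_clip_encode(token_ids):
    """Stand-in for the CLIP text encoder: 77 ids -> a (1, 77, 768) tensor."""
    assert len(token_ids) == CTX
    return np.zeros((1, CTX, DIM))

def encode_long_prompt(token_ids, bos=49406, eos=49407, pad=49407):
    """Split a long prompt into 75-token chunks, pad each, wrap with
    start/end tokens, encode independently, and concatenate along the
    sequence axis (illustrative only)."""
    chunks = [token_ids[i:i + CHUNK] for i in range(0, len(token_ids), CHUNK)]
    outputs = []
    for chunk in chunks:
        padded = chunk + [pad] * (CHUNK - len(chunk))          # pad to 75
        outputs.append(fake_clip_encode([bos] + padded + [eos]))  # 77 total
    return np.concatenate(outputs, axis=1)

# A 120-token prompt splits into chunks of 75 and 45, each encoded to
# (1, 77, 768), then concatenated:
emb = encode_long_prompt(list(range(120)))
print(emb.shape)  # (1, 154, 768)
```

The key point is that each chunk is encoded independently, so CLIP never sees more than 77 positions; only the UNet's cross-attention, which accepts any sequence length, sees the concatenated result.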
system
Closed
January 22, 2025, 12:37pm
4
This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.