Hi! I would like to use the speculative decoding available in generate() with the assistant_model parameter, for editing a piece of text. However, rather than using another model, I want to use a static string: the text that I am editing. This way, I can add the strong assumption that the output text will be very similar to the input text, and the model’s generations will only be edits of portions of the original text. Is this supported in generate()?
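To make the idea concrete, here is a rough sketch of the drafting step I have in mind. It is plain Python over token ids and does not call any generate() API. The helper name `propose_draft` and its parameters `ngram` and `num_draft` are made up for illustration. The assumption is that the model would then verify the proposed tokens in a single forward pass, the same way it verifies an assistant model's candidates, and keep only the prefix it agrees with.

```python
from typing import List


def propose_draft(
    generated: List[int],
    reference: List[int],
    ngram: int = 2,
    num_draft: int = 5,
) -> List[int]:
    """Hypothetical helper: copy candidate tokens from the text being edited.

    Looks for the last `ngram` generated tokens inside `reference` (the
    tokenized original text) and returns up to `num_draft` tokens that
    follow the first match. Returns an empty list when there is no match,
    in which case decoding would fall back to one token at a time.
    """
    if ngram <= 0 or len(generated) < ngram:
        return []

    tail = generated[-ngram:]
    for start in range(len(reference) - ngram + 1):
        if reference[start:start + ngram] == tail:
            follow = reference[start + ngram:start + ngram + num_draft]
            if follow:
                return follow
    return []


# Toy token ids standing in for the tokenized original text.
reference_ids = [10, 11, 12, 13, 14, 15]
# The output so far ends with tokens that also appear in the original.
generated_ids = [99, 11, 12]

draft = propose_draft(generated_ids, reference_ids, ngram=2, num_draft=2)
print(draft)
```

When the output has diverged from the original (an edited span), no match is found and nothing is drafted; once the output rejoins the original text, the drafts should be accepted often, which is where I would expect the speedup to come from.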