Hi!
Speculative decoding is a hot topic right now!
Here is a blog post I wrote proposing mentored decoding, a new variant of speculative decoding. It maximizes the probability of accepting the draft tokens while keeping the Kullback-Leibler divergence between the resulting distribution and the target distribution below a constant D. Speculative decoding can be seen as the special case of mentored decoding with D = 0.
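To make the D = 0 baseline concrete, here is a minimal sketch of the standard speculative sampling acceptance rule for a single token (not the mentored decoding algorithm itself, which is described in the blog post). The function names and the toy distributions are made up for illustration, and the direction of the KL divergence (output relative to target) is an assumption on my part.

```python
import math


def acceptance_probability(target, draft):
    """Chance that a token sampled from `draft` is accepted.

    Standard speculative sampling accepts token x with probability
    min(1, target[x] / draft[x]), so the overall chance is sum(min(p, q)).
    """
    return sum(min(p, q) for p, q in zip(target, draft))


def speculative_output_distribution(target, draft):
    """Distribution of the emitted token under standard speculative sampling."""
    accepted = [min(p, q) for p, q in zip(target, draft)]
    reject_mass = 1.0 - sum(accepted)
    # On rejection, resample from the normalized residual max(0, p - q).
    residual = [max(0.0, p - q) for p, q in zip(target, draft)]
    total = sum(residual)
    if total == 0.0:
        return accepted  # draft equals target, nothing is ever rejected
    return [a + reject_mass * r / total for a, r in zip(accepted, residual)]


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical toy distributions over a 3-token vocabulary.
target = [0.6, 0.3, 0.1]
draft = [0.3, 0.5, 0.2]

output = speculative_output_distribution(target, draft)
print("acceptance probability:", acceptance_probability(target, draft))
print("output distribution:", output)
print("KL(output || target):", kl_divergence(output, target))
```

The output distribution matches the target up to floating-point error, so the divergence is zero. Mentored decoding relaxes this constraint to a budget D > 0 in exchange for a higher acceptance probability.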
Feedback welcome!
- Blog post
- Summary in a Twitter thread
Thanks to @joaogante, who got me interested in speculative decoding with his blog post.