Chapter 6 questions

In our case, we consider “hug” because it is a strict substring of “hugs”. The notion of strict substring is only used here to select the initial tokens for this toy example (in a real use case, we would use an algorithm such as BPE to build the initial vocabulary). We then count how often each token appears in the corpus, regardless of whether a given occurrence is a strict substring or not.
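To make the two steps concrete, here is a minimal sketch in plain Python. The toy word counts and the helper name `strict_substrings` are assumptions made for illustration, not the course's code: the first step selects candidates using strict substrings only, and the second counts every occurrence of a candidate, including the occurrences where it is the whole word.

```python
from collections import Counter

# Toy corpus as word -> count (assumed values, for illustration only).
word_freqs = {"hug": 10, "pug": 5, "pun": 12, "bun": 4, "hugs": 5}


def strict_substrings(word):
    """Return every substring of `word` except the word itself."""
    return {
        word[i:j]
        for i in range(len(word))
        for j in range(i + 1, len(word) + 1)
        if word[i:j] != word
    }


# Step 1: the initial vocabulary is the set of all strict substrings.
# "hug" gets in because it is a strict substring of "hugs".
vocab = set()
for word in word_freqs:
    vocab |= strict_substrings(word)

# Step 2: count each candidate in every word that contains it,
# whether or not that occurrence is a strict substring.
token_freqs = Counter()
for word, freq in word_freqs.items():
    for i in range(len(word)):
        for j in range(i + 1, len(word) + 1):
            sub = word[i:j]
            if sub in vocab:
                token_freqs[sub] += freq

print(token_freqs["hug"])  # 10 from "hug" + 5 from "hugs"
print("pug" in vocab)      # "pug" is not a strict substring of any word
```

Under these assumed counts, “hug” ends up with a frequency of 15: the 10 occurrences of the word “hug” are counted even though “hug” is not a strict substring of itself.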

How do I solve the “✏️ Try it out!” exercise, “Compute the start and end indices for the five most likely answers”?

Hi @SaulLu,

I agree with @dipetkov on his comment about including “hug” in the frequencies.

I didn’t get the point of your comment: why do we count “hug” in the frequencies from both “hugs” and “hug”?

Thanks a lot