How do I know which tokenizer to choose?
Example 1.
"The dog’s ran into the church. "
model 1: [The, dog’s, ran, into, the, church]
model 2: [The, dog, 's, ran, into, the, church]
These two tokenizations give the model two different representations of the same sentence. How do I decide between a tokenizer that keeps the whole word as a single token and one that splits a word into its parts?
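
To make the difference concrete, here is a minimal sketch in plain Python (standard library only) that reproduces both behaviours on the example sentence. The function names `whitespace_tokenize` and `clitic_tokenize` are made up for illustration and are not any library's API; real tokenizers handle many more cases than the possessive "'s" covered here.

```python
import re

def whitespace_tokenize(text):
    """Split on whitespace and strip surrounding sentence punctuation.

    Keeps a word such as "dog's" as a single token (like model 1).
    """
    stripped = (tok.strip(".,!?") for tok in text.split())
    return [tok for tok in stripped if tok]

def clitic_tokenize(text):
    """Same as above, but split a trailing possessive 's into its own token.

    Hypothetical rule for illustration only (like model 2); it accepts both
    the straight and the curly apostrophe.
    """
    tokens = []
    for tok in whitespace_tokenize(text):
        match = re.fullmatch(r"(\w+)([’']s)", tok)
        if match:
            tokens.extend([match.group(1), match.group(2)])
        else:
            tokens.append(tok)
    return tokens

sentence = "The dog’s ran into the church. "
print(whitespace_tokenize(sentence))
print(clitic_tokenize(sentence))
```

The first function produces one token for "dog’s", so a model sees it as a distinct vocabulary entry from "dog". The second produces "dog" plus a separate possessive token, so the model can share what it knows about "dog" across both forms, at the cost of a longer token sequence.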