I don't see his point. Doesn't renormalizing token counts essentially eliminate ...

lopuhin · on Nov 27, 2019

Good point, assuming some extent of collapse is crucial, and the question is if different perplexities due to tokenization can happen in principle. You are right that in "Alice" vs. "A|lice" example we get the same perplexity after re-normalization, I can't come up with an example where it would be different right now.

kwrobel · on Nov 27, 2019

I agree. Perplexities (probability of a text) can be compared using different tokenization after normalization.