Hacker News new | past | comments | ask | show | jobs | submit login

Somewhat related, I recently read about the undocumented tokenizer that comes with the python re module. https://www.reddit.com/r/ProgrammerTIL/comments/4tpt03/pytho...



Word of warning: The Scanner class does not adhere to the maximal munch rule, so you shouldn't use it to match keywords the way that it's done in the code in the linked reddit thread. If you replace the word "foobar" in the input with "truebar", it will then be tokenized as the keyword true followed by the identifier "bar", which is obviously not what you want.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: