
The matches must be non-overlapping, because Tarzan is contained in "Tarzan". If overlapping matches were allowed, the input

  "Tarzan", she said.
contains two matches for the regex

  "Tarzan"|Tarzan.
The first match is at character 0, for the "Tarzan" branch of the regex. The second match is at character 1, for the Tarzan branch of the regex.

If matches can overlap, then the inner Tarzan is matched in spite of being surrounded by quotes, and the capture register is bound and all.

So this works not as a property of the regex alone (what it matches), but as a property of the regex combined with a scanning algorithm that extracts non-overlapping matches.
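To make the difference concrete, here is a rough Python sketch. The capture group around the second branch is my assumption (it stands in for the "capture register" above), and the helper names non_overlapping and overlapping are made up for illustration. re.finditer does the usual non-overlapping scan; the second helper retries an anchored match at every offset to simulate overlapping matches.

```python
import re

# Assumed pattern: the quoted branch is matched and thrown away,
# the bare branch is captured in group 1.
PATTERN = re.compile(r'"Tarzan"|(Tarzan)')
TEXT = '"Tarzan", she said.'


def non_overlapping(pattern, text):
    """Standard scan: after each match, resume at the end of that match."""
    return [(m.start(), m.group(0), m.group(1)) for m in pattern.finditer(text)]


def overlapping(pattern, text):
    """Hypothetical overlapping scan: try an anchored match at every offset."""
    found = []
    for i in range(len(text)):
        m = pattern.match(text, i)
        if m:
            found.append((i, m.group(0), m.group(1)))
    return found


print(non_overlapping(PATTERN, TEXT))
print(overlapping(PATTERN, TEXT))
```

With the non-overlapping scan, the quoted branch consumes characters 0 through 7, so the scanner never gets a chance to start a match at character 1 and group 1 stays unbound. With the overlapping scan, the match at character 1 shows up with group 1 bound to Tarzan, which is exactly the case the trick is supposed to exclude.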



