It's true, "sure" and "shore" are not pronounced exactly the same, and accents absolutely can vary, which is part of why Beider-Morse produces multiple encodings for each word. But the goal of Soundex-style phonetic encoding systems isn't to perfectly encode a word with a precise alphabet like the IPA. Rather, they intentionally introduce fuzziness so that words (really, names) that are pronounced similarly will be encoded the same way.
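
To make the "intentional fuzziness" concrete, here is a minimal sketch of classic American Soundex in Python. To be clear, this is not Beider-Morse, which is far more elaborate and emits several candidate encodings per name; it is just the simplest member of the same family. The function name `soundex` and the `CODES` table are my own for illustration, and the sketch assumes plain ASCII input.

```python
# Classic American Soundex: keep the first letter, map the remaining
# consonants to digit classes, and pad or truncate to four characters.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = first.upper()
    prev = CODES.get(first, "")  # the first letter's code still suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(ch, "")  # vowels (and y) have no code
        if code and code != prev:
            out += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (out + "000")[:4]


for word in ["Robert", "Rupert", "Smith", "Smyth", "sure", "shore"]:
    print(word, soundex(word))
```

Under these rules "Robert" and "Rupert" both come out as R163, and "sure" and "shore" both come out as S600, which is exactly the kind of deliberate collision described above.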

Perhaps "sure" and "shore" were a bad example; it's tricky to come up with these! And you're right that the encodings that happen to overlap for those words are technically "incorrect" pronunciations. Again, these Soundex-style encoders are designed for surnames, not general English words. Some Storyteller users are testing a version of Storyteller that uses this encoder to see if it improves anything (so far it seems no worse, but not necessarily better!), but I won't be surprised if it doesn't make it into Storyteller long term.

Mostly I wrote this piece not to advocate for using BMPM to support forced alignment, but as a way to express the emotional journey that I found myself on as I learned more about these systems and where they came from.





