Hacker News new | past | comments | ask | show | jobs | submit login

I did mean hamming. I group by length as insertions and deletions are not expected. I convert to a one-hot vector so I can xor to get the differences.

Thanks for the suggestions. I hadn't thought about splitting the the strings into 3 roughly equal sizes though, that would definitely help narrow the search pool. FWIW, the strings use the protein alphabet (20 values instead of 4).




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: