
The value in BLAST wasn't in its (very fast) alignment implementation but in the scoring function, which produced calibrated E-values that could be used directly to decide whether matches were significant or not. As a postdoc I did an extremely careful comparison of E-values to true, known similarities, and the E-values were spot on. Apparently, NIH ran a ton of evolution simulations to calibrate those parameters.

For the curious, BLAST is very much like pairwise alignment, but it uses an index to speed things up by not even attempting to align regions that would score poorly.
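To make the index idea concrete, here is a minimal seed-and-extend sketch. It is a toy under stated assumptions, not BLAST's actual implementation: it uses exact k-mer matches only, ungapped extension, and a simple match/mismatch score, and the function names and default parameters are made up for illustration. Real BLAST adds things this leaves out, such as neighborhood words for proteins and gapped extension.

```python
from collections import defaultdict


def build_kmer_index(subject, k):
    """Map every k-mer in the subject to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def find_seeds(query, index, k):
    """Return (query_pos, subject_pos) pairs that share an exact k-mer."""
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], ()):
            seeds.append((q, s))
    return seeds


def _extend(query, subject, q, s, step, match, mismatch, x_drop):
    """Ungapped extension in one direction; stop once the running score
    falls more than x_drop below the best score seen so far."""
    score = best = best_len = n = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += match if query[q] == subject[s] else mismatch
        n += 1
        if score > best:
            best, best_len = score, n
        elif best - score > x_drop:
            break
        q += step
        s += step
    return best, best_len


def seed_and_extend(query, subject, k=4, match=1, mismatch=-1, x_drop=2):
    """Return hits as (score, query_start, subject_start, length), best first.
    Only regions that share a k-mer with the query are ever scored."""
    index = build_kmer_index(subject, k)
    hits = set()  # seeds on the same diagonal often extend to the same hit
    for q, s in find_seeds(query, index, k):
        r_score, r_len = _extend(query, subject, q + k, s + k, 1,
                                 match, mismatch, x_drop)
        l_score, l_len = _extend(query, subject, q - 1, s - 1, -1,
                                 match, mismatch, x_drop)
        hits.add((k * match + l_score + r_score,
                  q - l_len, s - l_len, k + l_len + r_len))
    return sorted(hits, reverse=True)
```

With `seed_and_extend("ACGTACGT", "TTACGTACGTTT", k=4)`, the top hit is `(8, 0, 2, 8)`: the whole query matches the subject starting at position 2. A subject with no shared k-mer is never extended at all, which is where the savings come from.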




BLAST estimates are derived from extreme value theory and large deviations, which is a very elegant area of probability and statistics.

That's the key part, I think: being able to estimate how surprising each alignment is without having to simulate the null distribution, as was done before with FASTA.

The index also helps, but most of the speedup comes from that analytic estimate rather than from the index.
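For anyone who wants to see what "no simulation needed" looks like, here is a rough sketch of the standard Karlin-Altschul formulas: E = K·m·n·exp(-λS), the bit score S' = (λS - ln K)/ln 2, and P = 1 - exp(-E). The function names are mine. The λ and K values in the usage note are the ones commonly quoted for gapped BLOSUM62 searches, so treat them as illustrative assumptions, and note that this ignores the edge-effect corrections to m and n that real BLAST applies.

```python
import math


def bit_score(raw_score, lam, K):
    """Normalize a raw alignment score so it is comparable across
    scoring systems: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value(raw_score, m, n, lam, K):
    """Expected number of chance hits scoring at least raw_score in a
    search space of size m * n: E = K * m * n * exp(-lambda * S)."""
    return K * m * n * math.exp(-lam * raw_score)


def p_value(e):
    """Probability of at least one chance hit, assuming the number of
    chance hits is Poisson with mean E: P = 1 - exp(-E)."""
    return -math.expm1(-e)
```

Usage would look like `e_value(raw_score=80, m=250, n=1_000_000, lam=0.267, K=0.041)`, where `m` is the query length and `n` the total database length. The point is that this is a closed-form calculation per hit, with no shuffled sequences or simulated null distribution involved.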



