
You are doing regex matching in the Cascading code, but splitting on a character in the Pangool code. The latter is obviously much faster. I don't know that that's the reason for the difference you observe, but it certainly can't hurt to fix that and make the user-supplied code more comparable.



Indeed, that regex was problematic because it had a bug of its own. We replaced that line with RegexSplitter and updated the benchmark page. Please shout if you notice anything else wrong. Thanks.


Just to clarify, Java's String.split() also uses a regex to do the split. The code of String.split() is:

return Pattern.compile(regex).split(this, limit);

The benchmark seems fair to me.
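
To make that concrete, here is a minimal sketch, assuming the String.split() implementation quoted above (other JDK versions may differ). The class and method names are made up for illustration and are not taken from the benchmark code. Both methods go through the regex engine and return the same tokens; the only difference is whether the Pattern is compiled on every call or once up front.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Cascading or Pangool.
class SplitDemo {
    // Compiled once and reused for every line.
    private static final Pattern TAB = Pattern.compile("\t");

    // Under the implementation quoted above, this compiles the
    // pattern again on each call before splitting.
    static String[] splitPerCall(String line) {
        return line.split("\t");
    }

    // Same regex-based split, but with the precompiled pattern.
    static String[] splitPrecompiled(String line) {
        return TAB.split(line);
    }

    public static void main(String[] args) {
        String line = "a\tb\tc";
        String[] perCall = splitPerCall(line);
        String[] precompiled = splitPrecompiled(line);
        System.out.println(Arrays.toString(perCall));      // prints [a, b, c]
        System.out.println(Arrays.equals(perCall, precompiled)); // prints true
    }
}
```

If the split runs once per record, reusing a precompiled Pattern is the usual way to keep the per-call compilation out of the measurement.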



