
Well, extracting links etc. is super fast with lxml's XPath. lxml is written in C, and I don't think writing your own parser would be faster.

For example, to extract links from the Hacker News homepage, you would just do

    doc.xpath('//tr/td[@class="title"]/a/@href')
This will be really fast. You can make it even faster with a more specific XPath. I extracted about 10k links per second from documents this way and was still network-bound. Usually you are primarily limited by websites throttling you.
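
For context, a minimal runnable sketch of that approach (HN's markup has changed over the years, so the exact XPath may need adjusting; requests is just one choice of HTTP client):

    # Fetch the page, parse it with lxml, and run the XPath above against it.
    import requests
    from lxml import etree, html

    resp = requests.get('https://news.ycombinator.com/')
    doc = html.fromstring(resp.text)

    # Compiling the expression once lets you reuse it cheaply across many
    # documents, which matters at ~10k extractions per second.
    extract_links = etree.XPath('//tr/td[@class="title"]/a/@href')
    print(extract_links(doc))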



I was using BeautifulSoup with the lxml backend, I believe; I should have mentioned that earlier. There was some other graph-manipulation work too, like favoring links with more inlinks, and keeping the web crawler polite but still busy by looking at other domains (roughly the scheduling idea sketched below). That is more expensive than extracting links, I guess. I had a submission deadline, and whatever I tried in that time with Python didn't work. It was just easier to write faster code in Go (except maybe where regexes are involved; now I remember I used some Go markup parser instead, one that is now in their library).
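
Roughly, the polite-but-busy scheduling looks like this (a sketch in Python for clarity, not my actual code, which was Go; `fetch` is a hypothetical callback that downloads a URL and returns its outlinks, and the 2-second delay is an arbitrary placeholder):

    # Keep a heap of domains keyed by the earliest time each may be hit
    # again, so the crawler only sleeps when *every* known domain is
    # cooling down.
    import heapq
    import time
    from collections import defaultdict, deque
    from urllib.parse import urlparse

    CRAWL_DELAY = 2.0  # seconds between requests to one domain (assumed)

    def crawl(seed_urls, fetch):
        queues = defaultdict(deque)  # domain -> pending URLs
        in_heap = set()              # domains currently scheduled
        ready = []                   # heap of (next_allowed_time, domain)
        seen = set()

        def enqueue(url):
            if url in seen:
                return
            seen.add(url)
            domain = urlparse(url).netloc
            queues[domain].append(url)
            if domain not in in_heap:
                in_heap.add(domain)
                heapq.heappush(ready, (0.0, domain))

        for url in seed_urls:
            enqueue(url)

        while ready:
            next_time, domain = heapq.heappop(ready)
            # Only blocks if even the soonest-allowed domain is cooling down.
            time.sleep(max(0.0, next_time - time.time()))
            for link in fetch(queues[domain].popleft()):
                enqueue(link)
            if queues[domain]:  # more work for this domain: reschedule it
                heapq.heappush(ready, (time.time() + CRAWL_DELAY, domain))
            else:
                in_heap.discard(domain)

Favoring links with more inlinks would just change how URLs are ordered within each per-domain queue; the heap only decides which domain gets hit next.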



