
Interesting. How would a self-hosted open source version make money, though, in order to support itself and keep getting upgrades?

Is this even a realistic business model? Seems like this is what Scrapy is doing and what Import.io is doing. Make the tool free in order to get free marketing and then charge people willing to pay money to extract data.

Meanwhile I see Mozenda charging like 5 cents for each page extracted, do you think this is a fair model or does it not matter?




So for Scrapy and Portia, they are both free as in beer, specifically because we believe in the power of open source. Scrapy actually precedes Scrapinghub and was certainly not developed as a marketing tool.

Charges come with large-scale crawls (above certain limits on our platform), additional products like Crawlera (our smart downloader that routes requests from a crawl through a pool of IP addresses to avoid bans), datasets, and complex crawls that we handle for companies outsourcing to us.

Our model is that there is something for everyone, whether you are looking to dip your toes into web scraping (free), use it occasionally (usually journalists), or depend on web crawling for your business.


>Scrapy actually precedes Scrapinghub

Right. I first came across Scrapy some years ago, while browsing the web for Python software tools, on the site of a company in Uruguay called Insophia. It was in the list of products they had developed and were working on. Scrapinghub came later.


By offering paid hosting and support for companies that don't want the burden of managing it themselves? The paid version could also include some additional features.


I'm wondering about this, but how realistic is it to expect someone to pay when it's already free? I would imagine only large enterprise users would, so essentially you are supporting free users by charging enterprise users that may pay you for support.

Horrible industry imho when you have to give away things for free just to be competitive. I just don't get why people would expect software to be free.


We don't expect anyone to pay for Scrapy or Portia!

We provide the best Platform (as a Service) to run Scrapy or Portia spiders, and will soon be supporting most standard web scraping technologies. This is free for light users, but we charge for people who need extra or dedicated computing or network resources.

We also help startups and enterprise orgs that want to build a web data harvesting system (more than just parsing pages!), either by building it ourselves or by helping our partners train their engineers in using our technologies.

This has worked so far, and we're very healthy from a revenue perspective – more than doubling every year for a few years now, and good enough to grow to become the largest fully distributed company outside of the US.

We're pretty happy with being a brand that gives to the community; it tends to get repaid 10x in the long run.


How much are you generating in terms of revenue? Are you venture funded?


Would need to Kickstart or Patreon it.



