I'm beginning to think that there is a niche for a peculiar kind of search engine: one for static pages with little to no JavaScript. It would penalize pages for ad-network usage.
I would really like search results to exclude most sites that try to monetize my attention. I want raw facts and opinions. No click-bait to grab my attention or feed my inner caveman with rage. No ad networks or data-extraction operations. Just pages put there by people who want to share knowledge and ideas. I mostly find that on pages that lack ads and are often pure HTML: no CSS and no JS. At least in the areas that interest me.
Maybe there is a place for a search engine that would index only pages like that? It certainly would be easier than competing with Google on indexing the whole of the attention-whoring Internet.
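To make the idea concrete, here is a rough sketch of how such an engine might score a page. Everything in it is an assumption for illustration: the `AD_HOSTS` list is a tiny hand-picked sample rather than a real filter list, and the `plainness_score` name, the `max_scripts` threshold, and the scoring formula are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative sample only; a real engine would use a maintained filter list.
AD_HOSTS = {"doubleclick.net", "googlesyndication.com", "adnxs.com"}


class PageAudit(HTMLParser):
    """Counts <script> tags and references to known ad-network hosts."""

    def __init__(self):
        super().__init__()
        self.scripts = 0
        self.ad_refs = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            self.scripts += 1
        # Check any src/href attribute against the ad-host sample.
        for name in ("src", "href"):
            host = urlparse(attrs.get(name) or "").hostname or ""
            if any(host == h or host.endswith("." + h) for h in AD_HOSTS):
                self.ad_refs += 1


def plainness_score(html, max_scripts=2):
    """Hypothetical score: 0.0 means 'do not index', 1.0 means no scripts at all."""
    audit = PageAudit()
    audit.feed(html)
    if audit.ad_refs > 0 or audit.scripts > max_scripts:
        return 0.0
    return 1.0 / (1 + audit.scripts)
```

A page with no scripts scores 1.0, one inline script halves that, and any reference to a listed ad host drops the page to 0.0. Whether a hard cutoff or a softer ranking penalty works better is an open design question.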
I had that feeling of discovering the Internet again when I used Tor and surfed hidden websites for the first time and read beginners' wikis, opinion pieces such as The Matrix, etc.
I am not interested in most of the "deep web", but what you say sounds interesting. Could you please provide a link to that Matrix thing? And to other pieces you found interesting?
http://zqktlwi4fecvo6ri.onion/wiki/index.php/Main_Page is the wiki I stumbled upon when I first accessed hidden websites. The Matrix rant is the first link, but it's not in the form I remember (PS: I do not endorse the content; it's mostly a critique of our society's mechanisms).
> It certainly would be easier than competing with Google on indexing the whole of the attention-whoring Internet.
Probably not, actually; the kind of pages you describe would almost always be leaf nodes on the web graph, so your spider would need to walk "through" the attention-whoring parts to get to them, whether you kept records of doing so or not. (And it'd be very inefficient to not.)
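A toy sketch of that point, with loud assumptions: the `WEB` link graph and the `PLAIN` set are invented, and `crawl` is a plain breadth-first walk over an in-memory dict rather than a real spider.

```python
from collections import deque


def crawl(graph, is_plain, seed, follow_only_plain=False):
    """Breadth-first walk; index only plain pages, optionally refusing to
    follow links found on non-plain pages."""
    seen, queue, indexed = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        plain = is_plain(url)
        if plain:
            indexed.append(url)
        if follow_only_plain and not plain:
            continue  # skip the outgoing links of ad-heavy pages
        for nxt in graph.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return indexed


# Hypothetical web: essay.example is only linked from the ad-heavy news site.
WEB = {
    "blog.example": ["news.example", "notes.example"],
    "news.example": ["essay.example"],
    "notes.example": [],
    "essay.example": [],
}
PLAIN = {"blog.example", "notes.example", "essay.example"}

full = crawl(WEB, PLAIN.__contains__, "blog.example")
restricted = crawl(WEB, PLAIN.__contains__, "blog.example", follow_only_plain=True)
```

Here `full` finds all three plain pages, while `restricted` misses `essay.example` because its only inbound link sits on a page the spider refused to walk through. Both runs index only plain pages; the difference is purely in what gets traversed.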
I don't know about that - I find that I get a lot of my information from sites that have user-generated content, such as Medium, reddit, and of course HN. I think it would be extremely hard to fit sources like that into your search engine without letting in what I will admit is garbage. Would be very cool if it did manage to, though!