Hacker News

It's a natural monopoly. It has already de facto prohibited everyone else's own crawling, as the linked article demonstrates.

When it had a monopoly, AT&T was forbidden from selling software.



I disagree on the "natural" part. Robots.txt files that put other search engines at a disadvantage aren't the norm; like the sites in the early years that supported only Netscape and MSIE, they're a direct consequence of Google's current market share, and that might change once there is a good reason (like DDG growing into a significant player).

If a collection like commoncrawl with bulk downloads was more useful and thus used more often, even Google would have a good reason to use it.
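The robots.txt pattern under discussion, allowing Googlebot while shutting out every other crawler, can be sketched with Python's standard-library parser. The policy below is an illustrative example, not any particular site's real file, and the bot names other than Googlebot are just placeholders:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may fetch everything,
# all other user agents are disallowed entirely.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Googlebot is allowed; any other crawler falls through to the
# catch-all "User-agent: *" rule and is blocked.
for agent in ("Googlebot", "Bingbot", "SomeOtherBot"):
    print(agent, rp.can_fetch(agent, "https://example.com/page"))
```

Running this shows `True` only for Googlebot, which is the kind of asymmetry the parent comment attributes to Google's market share rather than to any natural property of the web.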


> Robots.txt that put other search engines at a disadvantage aren't the norm, they're, just like in the early years

It's not just robots.txt; it's also Cloudflare and IP-based throttling. And it is very, very commonplace: http://gigablast.com/blog.html




