Blocking aggressive crawlers - whether or not they have anything to do with AI - makes complete sense to me. There are growing numbers of badly implemented crawlers out there that can rack up thousands of dollars in bandwidth expenses for sites like SourceHut.
Framing that as "you cannot have our users' data" feels misleading to me, especially when they presumably still support anonymous "git clone" operations.
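For the crawler-blocking part, the throttling usually happens at the reverse proxy, but just to make the idea concrete, here's a minimal sketch of a per-IP sliding-window limiter. The threshold and window values are made up for illustration; this isn't anything SourceHut actually runs.

```python
import time
from collections import defaultdict, deque

# Illustrative values only -- not SourceHut's real limits.
MAX_REQUESTS = 60       # allowed requests per client IP...
WINDOW_SECONDS = 60     # ...per rolling window

_history = defaultdict(deque)  # client IP -> timestamps of recent requests


def allow_request(client_ip: str, now: float | None = None) -> bool:
    """Return True if this request fits within the per-IP budget."""
    now = time.monotonic() if now is None else now
    window = _history[client_ip]
    # Drop timestamps that have fallen out of the rolling window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return False  # over budget: throttle or block this client
    window.append(now)
    return True
```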
I still maintain that since we already have this system (it's called "looking up your ISP and emailing them") where if you send spam emails, we contact your ISP, and you get kicked off the internet...
And the same system will also get you banned by your ISP if you port scan the Department of Defense...
why are we not doing the same thing against DoS attackers? Why are ISPs willing to cut people off over spam mail, but not over DoS?
> why are we not doing the same thing against DoS attackers?
The first D in DDoS stands for "distributed", meaning the traffic comes from many different origins, usually hacked devices. If we started cutting off every compromised network, we'd only have a few (secure) networks left. Network equipment vendors would probably have to redo their security pretty quickly so it actually protects people.
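To put a toy number on the "distributed" part: the same request volume that would instantly trip a per-source limit from a single machine looks like normal traffic when spread across a botnet. Every figure below is made up purely for illustration.

```python
TOTAL_REQUESTS = 600_000     # attack volume per minute (made-up figure)
PER_SOURCE_LIMIT = 60        # what a per-IP throttle might tolerate per minute
BOTNET_SIZE = 100_000        # compromised devices participating (made-up figure)

from_one_machine = TOTAL_REQUESTS            # 600,000 req/min: trivial to spot and cut off
from_botnet = TOTAL_REQUESTS / BOTNET_SIZE   # 6 req/min per device: looks like a normal client

print(from_one_machine > PER_SOURCE_LIMIT)   # True
print(from_botnet > PER_SOURCE_LIMIT)        # False
```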
I agree; blocking crawlers that are aggressive, badly behaved, etc., makes sense. The files that are public are public, and I should expect that anyone who wants a copy can get them and do what they want with them.