Hacker News | frogperson's comments

I think it would be really cool if someone built a reverse proxy just for dealing with these bad actors.

I would really like to easily serve some Markov chain nonsense to AI bots.
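For anyone wondering what "Markov chain nonsense" amounts to, here is a minimal sketch, assuming a word-level chain trained on whatever text you have lying around. The names `CORPUS`, `build_chain`, and `babble` are made up for illustration and have nothing to do with how any particular tool does it.

```python
import random
from collections import defaultdict

# Hypothetical training text; in practice this would be a larger corpus.
CORPUS = (
    "the quick brown fox jumps over the lazy dog and the quick red fox "
    "runs past the lazy cat while the brown dog sleeps under the old tree"
)

def build_chain(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    if len(words) <= order:
        raise ValueError("text is too short for the requested order")
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return dict(chain)

def babble(chain, length=50, seed=None):
    """Random-walk the chain and return `length` words of plausible-looking filler."""
    rng = random.Random(seed)
    states = sorted(chain)          # sorted so a fixed seed gives a fixed result
    state = rng.choice(states)
    order = len(state)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (the state only occurred at the end of the text): restart.
            state = rng.choice(states)
            out.extend(state)
            continue
        out.append(rng.choice(followers))
        state = tuple(out[-order:])
    return " ".join(out[:length])

if __name__ == "__main__":
    print(babble(build_chain(CORPUS), length=30, seed=1))
```

A reverse proxy would call something like `babble` only for requests it has already classified as bots; that classification is the hard part and is not shown here.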


Perhaps Iocaine [1] is what you're looking for. See the demo page [2] for what it serves to AI crawlers.

1. https://iocaine.madhouse-project.org/

2. https://poison.madhouse-project.org/



This site blocked me right away; it seems quite aggressive.

Seems like a good way to waste tons of your bandwidth. Almost every serious data pipeline has some quality filtering in there (even open-source ones like FineWeb and FineWeb-Edu). And the stuff Iocaine generates instantly gets filtered.

Feel free to test this with any classifier or cheapo LLM.


When a new truck is $80k, it has to do everything because it's the only vehicle. If they made $20k-$30k trucks, then it's a lot easier to justify it as a second vehicle that isn't required for long trips.


That's very true. I bought a quad-cab midsize truck and it feels like the ultimate compromise:

- Not amazing at hauling people

- Only OK payload

- Not the best gas mileage

- Too expensive (but still cheaper than other midsize trucks -- $36k)

With how much everything costs this truck really _had_ to be a compromise. It had to be able to do everything. I'd have much rather had an old crappy truck and then a normal family car, but those seem to have all been priced out.


We need a crowdsourced list like AdGuard, but for bots. I'd love to block all those IPs at the firewall.


The only way you can block these "AI" scrapers is a combination of IP filtering (https://spur.us/) and Fingerprinting (https://abrahamjuliot.github.io/creepjs/).

Things like Browserbase are easy to block with this. It's a losing battle though; I personally moved entirely to real environments for https://browser.cash/developers


So that would be at least: GCP, Azure, Alibaba, AWS, Huawei, AT&T, BT, Cox... it's a long list.

User Agents then? No, because that would be: Chrome and Safari.

It's an uphill battle, because the bot authors do not give a shit. You can now buy bot networks from actual companies, who embed proxies in free phone games. Anthropic was caught hiding behind Browserbase, and neither of the companies seems to see a problem with that.


Block GCP, AWS, Azure, and various datacenter prefixen, and you're pretty much golden. There are scant few legitimate reasons a human being's traffic would originate from those hosts.
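A rough sketch of the prefix check this implies, assuming you already have a list of datacenter CIDR blocks from somewhere. The entries in `BLOCKED` are reserved documentation ranges used as placeholders, not real cloud prefixes, and `is_blocked` is a made-up helper name.

```python
import ipaddress

# Placeholder prefixes (reserved documentation ranges), NOT real cloud ranges.
# A real list would have to be built from the ranges each provider publishes.
BLOCKED = [
    ipaddress.ip_network(prefix)
    for prefix in ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")
]

def is_blocked(addr, networks=BLOCKED):
    """Return True if `addr` falls inside any of the listed prefixes."""
    ip = ipaddress.ip_address(addr)
    # Compare only within the same IP version (IPv4 vs IPv6).
    return any(ip.version == net.version and ip in net for net in networks)

if __name__ == "__main__":
    for candidate in ("192.0.2.17", "203.0.113.5", "2001:db8::1"):
        print(candidate, is_blocked(candidate))
```

This is a linear scan, which is fine for a handful of prefixes. With the thousands of ranges the big providers announce, you would more likely load them into the firewall's own set type than check them in application code.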


You can run virtual desktops in the cloud, like AWS's Workspaces, sold as a business rather than developer offering. AWS does publish the IP range those clients use, and I assume other similar offerings out there do the same.


I am working from a cloud desktop, but I only visit corporate-approved resources from it, and I believe that is the case for most cloud desktop users, as the whole point is to have a clear separation of duties.


Correct, but I don't think it's a safe assumption that approved resources wouldn't have a reason to block requests from the cloud.


I'm sure people who can afford to run virtual desktops in the cloud can also afford a phone/laptop/desktop to access sites that block those virtual desktops in the cloud.


I'm thinking more along the lines of people using virtual desktops assigned by their job, and those sites are part of their work. I don't feel like punting to BYOD is a good solution.



A large portion of those addresses will be valid residential IP addresses running malware on compromised Windows machines.


I'm still shocked they have any sales at all. You couldn't pay me to drive around in one of Musk's Hitler-mobiles. There is no way I would pay money for one of his cars or support him in any way.


Used Model 3s are a steal though


They are a relative steal compared to previous prices. If I were going to buy a used EV as a commuter vehicle, I'd buy an even cheaper one to minimize risk from parts and repair costs. I don't know if it's deserved, but I see a lot of Tesla service horror stories.


What shocks me even more is that TSLA is still about 10% up YoY. What path to (hyper-)profitability are investors seeing?


Musk is shilling the robots.


This is fascism, the rules couldn't be clearer. Enemies bad, friends good. You don't need laws or logic for that.


Can you point me to any examples of Russia doing something good or helping anyone except billionaires? No? Then their reputation is well deserved.


Could this be solved with an EULA and some language that non-human readers will be billed at $1 per page? Make all users agree to it. They either pay up or they are breaching contract.

Is this viable?


Say you have identified a non-human reader, you have a (probably fake) user agent and an IP address. How do you imagine you'll extract a dollar from that?


Most of my scraper traffic came from China and Brazil. How am I going to enforce that?


> Is this viable?

no

for many reasons


Trump wasn't satisfied bankrupting his own casinos, he's determined to bankrupt the entire city.


https://liberux.net/ looks promising as well.


WARNING: This is still a Kickstarter device, and it needed funding to even create a proof-of-concept device last time it was discussed (extensively). It's a flagship phone at a flagship price, but with only the oldest of plans on how it's actually going to deliver on some of the promises.


This is straight up fascism. The United States is a fascist country. I'm disgusted at how easily it turned.

