Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I'll try to provide my thoughts on each of the issues you've mentioned, let me know if there's something I missed.

On using blake2b: I chose blake2b as I was looking to use a hash function that is small in implementation, readily available and already optimized. With WebAssembly the solver can achieve (close to native) speeds and be least be an order of magnitude or two closer to optimized GPU algorithms.

Using specialized hardware, image tasks (and even more so audio tasks which must be present for accessibility reasons) have the same issue that they can be solved by GPU algorithms (i.e. machine learning, in which even a low percentage success rate would already be enough). If you search on GitHub you will find there are more ML captcha cracking repos than captcha implementations - they are probably even easier to get started with than adapting GPU miner code.

Image/Audio Captcha vs ML is an arms race that can be beat for split seconds of compute (even on CPU) or cheap human labeling: it's just as broken. FriendlyCaptcha optimizes for the end user (privacy + effort + accessibility) by not engaging in the arms race - I think it makes a better trade-off. Like the sibling comment pointed out the captcha solving can happen entirely in the background so that hopefully it doesn't even make the user wait.

As for rate limiting/difficulty adjustment: it's not perfect and it could lead to problems if you share the IP with a spammer (and let's be realistic: even with a million users on one IP there won't be tens of users signing up to some forum per minute). Also normal captchas have problems here though: users from these locales already get presented with much more difficult+frequent recaptcha tasks (I also doubt they are localized: American sidewalks are harder to label if you've never seen one in real life). Setting a reasonable upper limit to difficulty may be good enough here.

On not using blake2b: I have considered mutating the hashing algorithm every day randomly to make writing an optimized solver for it all that more difficult - but that would mean one could no longer self-serve the JS+WASM and be done with it. I won't rule it out for FriendlyCaptcha v2 if this does ever become a real problem.

Swapping out the hash function should be easy (the puzzles are versioned to allow for this). If you have a different function in mind and someone implements it in Assemblyscript (so we also have a JS fallback) then we can definitely consider it.



Thanks for your detailed response.

I've seen all the projects claiming to have broken ReCAPTCHA—often using Google's own ML services, hilariously—but it's unclear to me how broken image/audio CAPTCHAs are in practice (and the number of GitHub repos doesn't seem like a good measure to me). If they really are completely broken, then why are they still so widely used? If they really are completely broken by ML, how do human CAPTCHA-solving services stay in business?

FriendlyCaptcha optimizes for the end user (privacy + effort + accessibility)

Good point. I am concerned though that burning CPU cycles on the proof-of-work uses battery life if the end-user is on mobile, without getting their getting any choice in the matter. What if, given an informed choice, they would have preferred an image CAPTCHA? (On the other hand, that could use more cellular data. Might be good to run the numbers on this too.)

even with a million users on one IP there won't be tens of users signing up to some forum per minute

I think this is a bad choice of threat model. "Some forum" would likely be better off with simpler measures, like a hidden honeypot textbox: https://dev.to/felipperegazio/how-to-create-a-simple-honeypo...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: