This is exactly why, when we wrote the CAPTCHA code that protects the creation of SpiderOak 2 GB free backup accounts (well-commented GPLv3 code here -- https://spideroak.com/code ), we took the approach of not making the images 100% human readable.
Supposedly it greatly increases the cracking difficulty at the relatively small expense that humans may have to click Next a small percentage of the time. If we notice abuse, it will be easy to tweak the code (it's just a Python GIMP script) to produce a radically different captcha with a few hours' work.
Last week, I was going to register a new web service (forgot which one) and after the third captcha I couldn't get right, I just quit and decided to use a competitor instead.
What I'm trying to say is that ideally you shouldn't use a solution that is less user friendly and possibly infuriating for your future customers/users.
Indeed. It's a balance. Three failures is way too hard. Ours is a fairly simple captcha -- 5 letters/numbers, and our logs show that somewhere around 94% of attempts are correct, and we see few abandoned signups during the captcha answering phase. We could probably improve that further, though.
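To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The function names are made up for illustration, and it assumes each attempt is independent and that the 5 characters come from a case-insensitive set of 36 letters and digits, which the comment above does not actually state.

```python
def consecutive_failure_probability(success_rate, failures):
    """Chance that a user misses `failures` captchas in a row,
    assuming every attempt is independent (an assumption)."""
    return (1.0 - success_rate) ** failures


def blind_guess_probability(length=5, alphabet_size=36):
    """Chance that a uniformly random guess matches the answer.
    alphabet_size=36 assumes case-insensitive a-z plus 0-9."""
    return 1.0 / alphabet_size ** length


if __name__ == "__main__":
    # 0.94 is the per-attempt success rate quoted from the logs.
    print(consecutive_failure_probability(0.94, 3))
    print(blind_guess_probability())
```

Under those assumptions, three misses in a row should be rare at a 94% success rate, so a site where that happens routinely is probably running a much harder captcha.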
The primary defense against OCR is to make the segmentation attack hard -- pushing the characters together somewhat. With more tweaking we could probably get closer to a sweet spot of just enough overlap. Not even all of the characters would have to overlap to be effective.
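A minimal sketch of the "push the characters together" idea, in plain Python rather than the actual GIMP script. `glyph_offsets` and its parameters are hypothetical names; it only computes where each character would start, and leaves the drawing to whatever image library is in use.

```python
def glyph_offsets(widths, overlaps):
    """Return the left x-coordinate of each glyph.

    widths   -- width of each glyph in pixels
    overlaps -- one value per gap between neighbours, each a fraction
                (0.0 to just under 1.0) of the narrower neighbour's
                width by which the two glyphs should overlap
    """
    if len(overlaps) != max(len(widths) - 1, 0):
        raise ValueError("need exactly one overlap value per gap")
    offsets = []
    x = 0.0
    for i, width in enumerate(widths):
        if i > 0:
            fraction = overlaps[i - 1]
            if not 0.0 <= fraction < 1.0:
                raise ValueError("overlap must be in [0.0, 1.0)")
            # Pull this glyph left so it runs into the previous one.
            x -= fraction * min(widths[i - 1], width)
        offsets.append(x)
        x += width
    return offsets


if __name__ == "__main__":
    # Only the first pair overlaps; the second gap is left alone.
    print(glyph_offsets([10, 10, 10], [0.2, 0.0]))
```

Passing 0.0 for some gaps matches the point that not every pair has to overlap. Tuning the fractions is where the "sweet spot" search would happen.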
But it's a moot point, since anyone who really wants to defeat captchas en masse can just use Mechanical Turk or, even better, set up their own 'porn/warez' sites to show your captchas and have random internet users solve them.
There's no defense against that... Which makes captchas just a big irritating bag of fail.
I think of it as similar to a home security system. Of course there are ways around it. Chances are that the effort involved means that a burglar will go rob a neighbor's house instead though.
Perhaps captchas make more sense in capital-intensive industries with clear avenues for abuse. In the case of SpiderOak, we'd prefer to avoid making the free backup accounts an attractive prospect for warez distribution. YMMV.
I don't follow. Even the best spam bots don't solve every CAPTCHA. If it's a miss (either because they got it wrong or because it's actually unsolvable), they'll just try again, no?
I would think actual humans have time that is more valuable than zombified Windows boxes.
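The retry argument can be made concrete with a small Python sketch. `expected_attempts` is a hypothetical helper, and the model is deliberately simple: attempts are independent, and an unsolvable captcha always counts as a miss for whoever gets it.

```python
def expected_attempts(solve_rate, unsolvable_fraction=0.0):
    """Average number of tries until one captcha is passed.

    solve_rate          -- chance of passing a captcha that is solvable
    unsolvable_fraction -- share of captchas nobody can pass
    Both values are illustrative inputs, not measured data.
    """
    per_attempt = solve_rate * (1.0 - unsolvable_fraction)
    if per_attempt <= 0.0:
        raise ValueError("success is impossible with these inputs")
    return 1.0 / per_attempt


if __name__ == "__main__":
    # Made-up rates, for comparison only.
    for rate in (0.94, 0.30):
        print(rate, expected_attempts(rate), expected_attempts(rate, 0.1))
```

In this model, making a fraction u of the images unsolvable multiplies everyone's expected attempts by the same factor, 1 / (1 - u), whether the solver is a bot or a person. The difference is only in whose time those extra attempts cost.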