Captchas To Keep Idiots Out Of Comment Threads (dangerousminds.net)
233 points by Udo on Aug 10, 2011 | 202 comments



Here’s my test for whether you’re qualified to post to Hacker News.

Pair the following five comments with their corresponding definitions:

1. “Raganwald is a blowhard whose pathetic attempts to score karma reveal him as an insecure dweeb who can’t get over being kicked around in grammar school. He and his mumblings should be flushed down the toilet bowl like the turds that they are.”

2. “What, raganwald is talking about beauty in code? Have you seen some of his code? Ignore him.”

3. “Raganwald sounds a lot like a Ruby fanboy, and we all know how those people think.”

4. “Anyone who has spent that much time on Java clearly has no taste in software and cannot be relied upon for sound reasoning. Ignore raganwald.”

5. “Of course raganwald would say that there’s something wrong with Waterfall, he’s a Certified Scrum Master, he’s just pimping his own credentials.”

And the definitions:

A. Fallacy: Ad Hominem Abuse

B. Fallacy: Circumstantial Ad Hominem

C. Fallacy: Tu Quoque (“You Also”)

D. Fallacy: Guilt by Association

E. Not a Fallacy: Insults

Example: 1-E, This is an insult, but not fallacious.


I know people smarter than the vast majority of HN users (how can this really be judged, sure, but I'd place money on it, certainly far, far smarter than me) who wouldn't know those, and I know many, many people who are damn smart (just not fitting them into the "smarter than most people here", but definitely smart enough to belong here if they wanted to) who also wouldn't know them.


I beg your pardon, I was trying to crack a funny. It seems it is impossible to parody an extremist belief without somebody mistaking it for the general article. I was attempting to parody the fundamentalist debating pedant point of view.

Poe’s Law strikes again: http://rationalwiki.org/wiki/Poes_Law


The depressing thing is I can imagine people genuinely making the exact suggestion you made, with no humour at all. Sorry!


Er, HN already does limit access via arbitrarily-chosen tests of user quality. They just chose different tests. :)

I don't think it'd be a 100% bad idea to implement some sort of knowledge-based testing, wire it up to varying levels of initial access/privs on a site, and see what happens, as an experiment. User privs could then evolve once the user had actually posted or commented or rated things.


What are those tests? I haven't joined HN in ages (and then, only once), so I'm not up to date on the current procedures.


You have to come up with a username AND password!


Or remember your Facebook/Twitter/etc username/password.


If it's any consolation, he got me last week.

http://news.ycombinator.com/item?id=2846306


A sign of great writing: Half the readers think it's serious, the other half think it's a joke. Well done.


I'm disappointed - I think your idea is brilliant (although I wouldn't go with those examples)


Although my tongue was in my cheek, I think there’s at least a grain of truth in each of those examples.


It's possible to recognize that a fallacy is a fallacy without knowing the name. Asking "Is this a valid criticism?" kicks in the part of the brain that thinks about things.

If the person answers "Duh," you put them in some kind of post jail and make them earn their way out. I couldn't have named all those, but I recognized them as false criticisms immediately.


Oh crap. I caught your joke about pope-approved condoms the other day, but this time you had me going.

I found this puzzle both entertaining and educational. I guess that means I'm not qualified to post here ;-}


I laughed - don't worry.


They could look it up....

I have a really dumb spam filter on my blog which simply asks that the user "Type the word 'humour', but with American spelling". I've had a couple of complaints that this request is unfair to non-native speakers. My response is that you don't have to know it, you just have to know how to look it up.

(And yes, I saw that this stuff wasn't serious, but I thought it was a worthwhile point to make in any case. CAPTCHAs don't need to be things that users can pass without outside help.)


Never mind if people are smart or stupid. The important thing for a forum is if the participants are willing and able to provide thoughtful and interesting comments. Simple quizzes like the above (joke or not) might filter out a lot of lazy trolls up front. Everybody with a genuine interest in the forum should easily be able to find the answers using Google, and might even find it fun to learn a bit.

More importantly, after having made a bit of effort to be allowed in, the users might value the forum more.


Really I think you could just get away with testing to see if the user knows the difference between an Ad Hominem and a standard insult.

There is little more annoying than someone who thinks they see logical fallacies where none exist and take the conversation from bad (real arguments intermixed with old fashioned insults) to worse (nothing but an argument from fallacy)

"I don't claim to know what an Ad Hominem is." would also be a passing response in my book...


Showing a person's argument to be fallacious doesn't show their position is wrong. Many, it seems, fail to grasp this point.


<3 it. Reminds me of all the good reads on the multitudes of logical fallacies out on the net:

http://duckduckgo.com/?q=list+of+logical+fallacies


1E 2C 3D 4A 5B Do I pass?


I thought you were trying to be really clever by responding in hex.


    42 75 74 20 79 6f 75 20 73 68 6f 75 6c 64 20 68
    61 76 65 20 73 65 65 6e 20 74 68 61 74 20 73 6f
    6d 65 20 6f 66 20 74 68 65 20 63 6f 64 65 73 20
    61 72 65 20 6e 6f 6e 2d 70 72 69 6e 74 61 62 6c
    65 20 63 68 61 72 61 63 74 65 72 73 20 3b 2d 29
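For anyone who doesn't read hex fluently, here is a minimal Python sketch that decodes the dump above, assuming the bytes are plain ASCII. It also reads the earlier answer "1E 2C 3D 4A 5B" as byte values and lists which of them are non-printable. The variable names are only for illustration.

```python
# Hex dump copied from the comment above; assumed to be plain ASCII text.
hex_dump = """
42 75 74 20 79 6f 75 20 73 68 6f 75 6c 64 20 68
61 76 65 20 73 65 65 6e 20 74 68 61 74 20 73 6f
6d 65 20 6f 66 20 74 68 65 20 63 6f 64 65 73 20
61 72 65 20 6e 6f 6e 2d 70 72 69 6e 74 61 62 6c
65 20 63 68 61 72 61 63 74 65 72 73 20 3b 2d 29
"""

# Collapse newlines into single spaces before handing the text to fromhex.
message = bytes.fromhex(" ".join(hex_dump.split())).decode("ascii")
print(message)

# The earlier answer "1E 2C 3D 4A 5B", read as hex byte values.
answer = bytes.fromhex("1E 2C 3D 4A 5B")
non_printable = [f"{b:02X}" for b in answer if not chr(b).isprintable()]
print(non_printable)
```

Only 0x1E (a control character) turns out to be non-printable; the other four are ordinary punctuation and a letter.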




I think he's referring to your spelling. ('Tu' vs. 'To')


WRT 1. you're saying that

>"He and his mumblings should be flushed down the toilet bowl //

So arguably you're making the case to ignore his [implied] argument based on, in the antecedent sentence, his character. Sounds ad hominem to me.

Just saying.


It's possibly an ad hominem, but not likely to me. http://plover.net/~bonds/adhominem.html


is this a trick question?


I can't know but, after reading the comment, it occurred to me that identifying "idiots" with bad spellers is fallacious.


Anyone else think it is someone else posting, before seeing it is self-referenced?

Very measured writing about yourself.

Copacetic shield.


So people who are detail-oriented grammarians are necessarily good commenters?

Since when?

The assumption here is that you can tell a comment is going to be stupid by testing whether or not the person understands homonyms. The only thing you check with that is language ability. I'd argue you'd be better off searching for common internet memes and cliches instead of original thinking -- since that's a sign of a lazy or uninteresting mind. The author is kind enough to inadvertently supply us with one "...If Fox Nation implemented something like this, they’d have zero commenters..." But you could test stuff like this based on any preconceived viewpoint, such as Obama-socialist, Obamacare, etc. The use of shortcuts and blindly-repeated jokes and phrases is a good sign that you're not going to be getting much from the comment.

Not sure how you'd code it, though, but I'm certain you could come up with some semantic magic given enough input text. I'd imagine you'd use n-grams and some Bayesian logic. You'd have to have a pre-existing corpus of the person's writing, though.
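A rough idea of what "n-grams and some Bayesian logic" could look like: a tiny naive Bayes classifier over word bigrams with add-one smoothing. This is only a sketch under heavy assumptions. The class name, the labels "cliche" and "original", and the six training comments are all made up for illustration, and a real filter would need the large pre-existing corpus mentioned above.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Split text into lowercase word n-grams (bigrams by default)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes over n-grams with add-one smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.counts = {}       # label -> Counter of n-gram frequencies
        self.docs = Counter()  # label -> number of training comments
        self.vocab = set()     # every n-gram seen during training

    def train(self, text, label):
        grams = ngrams(text, self.n)
        self.counts.setdefault(label, Counter()).update(grams)
        self.docs[label] += 1
        self.vocab.update(grams)

    def log_score(self, text, label):
        # log P(label) + sum of log P(gram | label), smoothed so that
        # unseen n-grams do not drive the probability to zero.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        total = sum(self.counts[label].values())
        for gram in ngrams(text, self.n):
            score += math.log((self.counts[label][gram] + 1) / (total + len(self.vocab)))
        return score

    def classify(self, text):
        return max(self.docs, key=lambda label: self.log_score(text, label))


# Hypothetical training data, far too small to be useful in practice.
model = NaiveBayes()
for comment in ["i see what you did there",
                "this is why we can not have nice things",
                "you must be new here"]:
    model.train(comment, "cliche")
for comment in ["the parser fails on nested quotes because the lexer is greedy",
                "i measured the latency before and after the change",
                "the index helps only when the query filters on the first column"]:
    model.train(comment, "original")

print(model.classify("i see what you did there again"))
print(model.classify("i measured the latency of the parser"))
```

Whether meme density actually predicts comment quality is a separate question; the sketch only shows that the mechanics are simple once labeled text exists.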


They are homophones not homonyms - normally I wouldn't be so pedantic, but in a thread around the idea of people having to prove they know meanings/spellings/etc, it seems suitible.


It's suitable, not suitible - normally I wouldn't be so pedantic, but in a thread around the idea of people having to prove they know meanings/spellings/etc, it seems apropriate.


I see what you did there... genius.


No you didn't, it was supposed to be turtles all the way down and you fucked it up!


Don't you think it's ironic then that you misspelled "suitable"?


See Muphry's Law: http://www.editorscanberra.org/muphrys-law/ :)

edit: Why can't I use an apostrophe in a URL? HN seems to be stripping it out, so couldn't link to the Wikipedia page without a crufty "%27" showing up...


> Why can't I use an apostrophe in a URL? HN seems to be stripping it out, so couldn't link to the Wikipedia page without a crufty "%27" showing up...

Interesting. It looks like ASCII character 0x27 (the apostrophe; the "27" in "%27" is hexadecimal, i.e. decimal 39) is supposed to be valid within URIs according to both RFC 3986 and RFC 1738 (via [0]). Maybe it's a simple component of a system for preempting SQLi on other sites via links from HN?

[0] http://stackoverflow.com/questions/1547899/which-characters-...
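To make the "%27" concrete, here is a small Python sketch using the standard library's `urllib.parse`. The `title` string is a hypothetical path segment, not the actual link from the comment above.

```python
from urllib.parse import quote, unquote

apostrophe = "'"
code_point = ord(apostrophe)        # decimal value of the character
percent_form = quote(apostrophe)    # percent-encoding uses the hex value

title = "Muphry's_law"              # hypothetical path segment
encoded = quote(title)              # the apostrophe becomes %27
decoded = unquote(encoded)          # and decodes back to the original

print(code_point, hex(code_point), percent_form)
print(encoded, decoded)
```

So the "crufty" form and the apostrophe form name the same resource; which one a site displays is up to that site's link handling.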


Pretty ironic, yeah. Though I'm generally terrible with misspellings, because once I've thought of what to write my mind drifts off, I usually gaze out of the window, and my hands often follow their muscle memories rather than what I want to type. So in fact, I actually often type their/there wrong, although any time I do I catch it as soon as I've done it.


No harm, no foul. :) I was amused by it actually, and a thought crossed my mind that you did it on purpose.

Anyway, it just reiterates that, while funny, this method of captcha is pretty much useless.


Gosh, a correct use of the term "ironic", can the ghost of Morissette be being laid to rest?


Thanks, I was very careful with that one.


Also ironic that the grandparent used a hyphen instead of the proper em dash. (- vs. —)


I actually doubted that was the correct word -- but didn't take time to look it up. Thanks.


I see an unfair prejudice against the vocal and valuable Indian and Eastern-European HN contingent.


Good grammar and basic wordsmithery make idiocy much easier to manage.

Say: George F. Will vs. Rush Limbaugh.


As is often the case with these "I'm so smart and other people aren't" posts, the author makes what I bet is an unintended mistake. I assume they want each option to have only one solution. The last example ("It is poor form to ____ you temper in a discussion group"), either option (lose or loose) works. The verb form of 'loose' means 'to set free', and temper has come to mean an agitated state of mind. So you can say, "It is poor form to lose (be deprived of) your temper (calm state of mind) in a discussion group."

But you can also say, "It is poor form to loose (let free from restraint) your temper (agitated state of mind) in a discussion group."


One would loose one's temper \on\ a discussion group, not \in\ it. I enjoy the perspectives of the ESL folks on HN, though, so I don't want to see any filtering by grammar nitpicking.


I like the ESL comment: that's cute. But to your broader point I ask, "Why?" If I can comment in an online discussion group or I can post on an online discussion group, why is the relation of my temper any different than the relation of my post?

The OED is useful here: it literally defines in as equal to on in the second definition of the prepositional form of 'in'. There's an extended history of the relation between 'in' and 'on', which includes the Latinate source of the Saxon term, its evolution in Old and Middle English, and its use today. Suffice it to say, the distinction is not a trivial one, and 'in' and 'on' are often interchangeable.


While "on" sounds much more natural to me, I think "in" works as well. But definitely not because on and in are interchangeable.

You can simply say: "He let loose his temper." so...

It is poor form to sleep in a meeting. It is poor form to scream in a meeting. It is poor form to loose your temper in a meeting.


Actually, I realize I squeezed in let without noticing. Try this...

---

"on" sounds much more natural to me, and I think that's enough for me. Idiom counts in a language, too. In and on might seem technically interchangeable, but (sometimes) if it sounds wrong, it's wrong.

You can say, "I lose my temper." So like you can say "It is poor form to sleep in a meeting.", you can also say, "It is poor form to lose your temper in a meeting."

As someone else commented, you definitely can say, "I let loose my temper." So, "It is poor form to let loose your temper in a meeting." is fine.

But, can you say, "I loose my temper."? I think you can, but it is very very unnatural.


My thought process was that the online discussion group is the target or the victim, not the venue, of the loosing. In which case "on", or "at" for some other direct objects (like "arrows") would be the appropriate preposition. As in, I loose my temper (the hounds) on the unfortunate souls of Hacker News.


I don't think anyone would ever use loose in that manner without preceding it with "let"? Feel free to correct me, but to me that sounds very wrong.


Turning and turning in the widening gyre

The falcon cannot hear the falconer;

Things fall apart; the centre cannot hold;

Mere anarchy is loosed upon the world,

The blood-dimmed tide is loosed, and everywhere

The ceremony of innocence is drowned;

The best lack all conviction, while the worst

Are full of passionate intensity.


A good poet knows when and how to defy grammar.


Except he isn't defying grammar. It's correct to use loose that way.


I agree with jmilloy, actually; regardless of whether or not that usage is correct, given that poems frequently flout grammar "rules" I don't think using a poem as evidence of a particular rule is a strong argument.


It's just that being in a poem doesn't have anything to do with being correct or not, whether it is correct in this poem or not.


OP said "I don't think anyone would ever use loose in that manner without preceding it with 'let'". I provided a counterexample, so the OP's statement is refuted. Now, granted anyone could write a nonsense sentence to refute any rule of grammar. But this is a very important poem by a prominent poet.


I think that my comment came across as an attack. Rather, I just find the ways in which poems can interact with and affect existing, seemingly fixed aspects of language are beautiful and interesting! And in particular, it means that poetry has a unique place in discussions about both proper grammar and common usage.

You make a good point that you were furthering discussion about common usage and not necessarily grammar.


It's correct without "let", but a little archaic. You're unlikely to see that sort of usage outside slightly pretentious plays, but it's correct nonetheless.


The mistake of writing "you temper" instead of "your temper" made it hard for me to see either one as correct for a moment.


Am I to understand from the title that we are equating “poor spelling/grammar in English” with “idiot”? Or is there some more subtle mechanism at work, such as equating “can’t be bothered to stare at hard-to-read-text just to post lulz, wat?” with “idiot”?


We are equating that after years of education, with the subject of English (albeit American English) being a core class for EVERY YEAR of attendance, and with spell checkers on most forum communities, that if someone still goes out of their way to transpose "lose" with "loose", then, yes, they are probably an idiot.


A part of this phenomenon, is that those of middle age or greater miss the days when there was good quality control over widely published text.

The Internet has democratized widely published writing. There is no quality control over comments, except through communities. At the same time, there are large populations who are undereducated and place little value on education, while they do prize the ability to get a rise out of others.

An unfortunate consequence of this: reading more used to make me feel smarter and I'd sound more literate. Nowadays, through reading I tend to absorb things like "loose" for "lose" and other atrocious sounding quirks.

There is a clear market for quality text, and there is power in being the provider of it.


*then


An excellent example of Muphry's Law.

http://en.wikipedia.org/wiki/Muphrys_law


I know this is not appropriate, but it made me laugh so hard.

I'm trying to write as correct as I can, but the GP just proved that these mistakes happen to everyone, even the people that think that it might make sense.

What about non-native speakers? Their point might be valuable, their participation in a discussion really helpful and meaningful, but still - these now/know, there/their, you're/your mistakes are common. And not an indication of intelligence.

Edit: Just to be sure here and kind of answering the posts below: Guys, I'm a non-native speaker as well (in fact, my accent is terrible. I try to write as decent as possible, but that's a different thing). So - my data point, without any backing but my past experience, is that this is indeed an error that non-native speakers do just as well. I didn't want to imply that non-native speakers are idiots per se according to the criteria of this blog.


I will agree with the others. Non-native speakers (such as myself) tend to pronounce the words more carefully, thus they don't confuse them as often.

In my head, "then" and "than" sound totally different. Same with "they're", "their", and "there", "loose" and "lose", etc.


Interesting. I kind of replied to the first line in an edit above, but the second one intrigues me.

Probably this just proves that my accent is crap, but for me 'then' and 'than' sound about ~the same~, ditto for 'their'/'there' and 'loose'/'lose'. Again - probably I'm just missing something here. Improving my spoken English is definitely on the list of things I want to do..


Maybe it depends on where you're from? In my Greek head-accent "e" and "a" are very different. Same with "their" and "there" (the-ir, the-r).

When I speak, of course, they sound identical, but my spoken accent is not the accent I think in.


I don't think it's a question of sound. (I'm french) 'they're', 'their' and 'there' sound the same (spoken or in my head), but their meaning has nothing in common so I never mistake one for another. How you acquire a language plays here. I learned english fully conscious of what I was doing rather than my organic absorption of french.


Oh, of course. I'm just saying that the most common reason for conflation (the fact that they're homophones) doesn't exist for me.


For non-native speakers these captchas would be easy. These are very basic things we tend to learn very well when we first start learning english.


These are not mistakes I typically see from non-native speakers. (YMMV)


FYI, you write "correctly". The only thing, grammatically, that you can do "correct" is "come correct", but that's an idiom.


Using adjectives as adverbs is incorrect in British English, but extremely common in American English. For example:

"I'm doing good" instead of "I'm doing well" "He eats real fast" instead of "He eats really quickly"

If enough people speak that way, I don't think you can call it "wrong" so much as a dialect in its own right.


I wouldn't correct a native speaker. I was just offering, in case they cared.

Fast and good are de facto adverbs. Maybe in some American dialects it's common to use other adjectives as adverbs, but not in mine.

Agree that language changes.


Thanks for the correction. I should've caught that..

You just triggered an image of my first english teacher. 'To be plus adjective, adjective plus noun, verb plus adverb' was her favorite. ;)


Is 'correct' even working as an adverb in that case, or is it more like "come ready" or "come hungry"?


upvoted for calling me on my BS


"Then" is correct, isn't it? The statement is an if-then statement. Am I wrong?


Yes it is, but the post was edited. It used 'than' before.


silverbax88 originally used "than"


Ah. Thanks. I thought I was losing my mind!


Maybe they're smart, but suck at spelling. Not everyone is wired the same.


> they are probably an idiot.

Did you mean "He is probably an idiot." or "They are probably idiots."?


I didn't downvote you myself, but the use of the singular 'they' to denote someone of unknown gender has been in use in English since at least the time of Shakespeare (he used it in places). There was a movement a long time ago to try to make English more like Latin, removing things like the plural they, the split infinitive and so on. Nowadays, with Latin no longer regarded as the highest form of language we really don't have any reason to look down on traditional usage.


Thanks. It's good to know that this usage is correct. However, it's also good to know that it might not be safe to use it in a language test because of the dispute ;)


Except the people who create those captchas will, by Murphy's Law or similar, be precisely the people most opposed to split infinitives, ending sentences with prepositions, and other matters of usage that are objectively correct grammatically but are historically declasse (and is leaving out the acute accents in that word a misspelling in English?).



There's no excuse not to know common homonyms and misspellings. A smart person who has trouble with them knows they have trouble with them, knows what they are, and knows to double check himself. If you don't do at least that, you probably aren't that bright.

We'll forgive the less common words like your "grammer".


I'm not an English native speaker. Their/they're/there also stings my eyes when incorrectly used, but I must admit that I often do the lose/loose mistake. But that's fair, whoever uses a grammar that is loose, lose.


whomever. loses.

:)


Are you sure "whomever"? Isn't that the objective form and, in this case, "whoever" is the subject?


I'll go with "whoever" too. "He uses grammar", not "him uses".


On reflection, I'll buy that as well.


I spent 10 minutes on grammar links before picking one...

Originally I wanted to write something along the line "whoever's grammar is loose lose" but I am almost sure this is not a correct form


This sounds like a social signal, not a cause-and-effect relationship. You are not establishing that some quality of being an idiot causes someone to have poor grammer, you are saying that they ought to know their grammar “just because.”

Your argument reminds me of the argument that someone ought to wear a suit and tie to work: There is no excuse for not knowing that a suit and tie is what people wear to work, and anyone who doesn’t do so is not fit to work here.


You don't always need causation, sometimes correlation is enough.

EDIT: To clarify, it doesn't have to be that stupidity causes bad grammar, but if bad spellers are likely to be bad commenters, that's all you need.


Thanks for a good point.

Establishing correlation without causation is a minefield, IMO, although each of us may come to different conclusions. Filtering people on the basis of correlation and not the primary attribute is discrimination. I doubt filtering a forum is illegal discrimination, but I’m uneasy about where that might lead me personally if it were my forum.

Giving a ridiculous example, at a certain time in history, correlation might have established that 90% of the people who used AOL to get onto the Internet posted idiotic things. Should all AOL users have been banned from forums? I suspect that many forums would have had a net positive benefit from such a policy, however the 10% false negatives disturb me greatly.

I would morally prefer other mechanisms for filtering the 90% out, preferably mechanisms that directly test users’ propensity to “post idiotic things.” This is a personal view, and I can accept that others may not agree.

p.s. And then there’s the question of gathering actual data and not anecdotes. The worst way to proceed would be to claim that since we seem to remember that 90% of all idiotic comments contained grammar mistakes, there’s a 90% chance that a comment containing a grammar mistake is idiotic.
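The p.s. is the classic confusion between P(mistake | idiotic) and P(idiotic | mistake). A quick worked example with Bayes' rule in Python; all three input numbers are invented purely for illustration and are not data about any forum.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    joint_h = prior * p_evidence_given_h
    joint_not_h = (1 - prior) * p_evidence_given_not_h
    return joint_h / (joint_h + joint_not_h)


# All three numbers are made up for illustration.
p_idiotic = 0.10                # share of comments that are idiotic
p_mistake_given_idiotic = 0.90  # idiotic comments containing a grammar mistake
p_mistake_given_fine = 0.50     # perfectly good comments containing one too

p_idiotic_given_mistake = posterior(
    p_idiotic, p_mistake_given_idiotic, p_mistake_given_fine
)
print(f"{p_idiotic_given_mistake:.3f}")
```

Under these assumed numbers, a comment with a grammar mistake has roughly a 17% chance of being idiotic, not 90%, because mistakes are also common in the much larger pool of fine comments.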


> Establishing correlation without causation is a minefield, IMO...

Those who miss nuances of thought and logic are often the same folks who miss nuances of grammar. It's a skill one can learn. Using that as a membership filter would be no different than forming a programming group and restricting it to programmers.


Ah, I agree about not banning people, judging the entire group by a few (or even most) members is racism. I'm just saying that correlations are useful, and we shouldn't ignore that.


>There's no excuse not to know common homonyms and misspellings.

Would you consider dyslexia an excuse, or English as a second language? Or even semi-illiteracy? I think it's definitely possible to be very clever and still not a good speller, and not bother to double-check on an Internet forum.

Einstein was notorious as a poor speller, which suggests he didn't double-check himself.


English as a second language is not an explanation for the "your/you're" errors which are generally sheer ignorance.

Ignorance and IQ are certainly different. But IQ is generally measuring "problem solving ability", and when to use a contraction is a problem. If one is very clever, one tends to have that particular problem solved. But, whether the misuse is from inability to solve the problem, or a choice to not solve the problem ("can't be bothered"), either of those is a reasonable predictor of job performance, health, and various other life outcomes.

On "Einstein was notorious as a poor speller" – after moving to the US, Einstein became completely bilingual but could never recall how to spell words correctly in both German and English. This is not the same issue as when to use a contraction, and isn't the same as "not bothering".

The ESL student learns, understands the theory of, and tends to be careful with contractions. ESL errors look and sound quite different than most of these errors encountered online.

The illiterate "I'm too cool for your TL;DR grammar-nazi wall-of-text" types have failed to prime they're Baysian neural networks by reading or doing homework, and generally have no idea their "doing it wrong".


> The illiterate "I'm too cool for your TL;DR grammar-nazi wall-of-text" types have failed to prime they're Baysian neural networks by reading or doing homework, and generally have no idea their "doing it wrong".

I see what you did there.


Some good points towards using the Captcha, but I'm still not convinced that "there's no excuse not to know common homonyms and misspellings".

As far as using common homonyms and misspellings as predictors of "job performance, health, and various other life outcomes", well, I'm not quite sure that's reasonable. For example, xd's comment below:

http://news.ycombinator.com/item?id=2868142

has an error in it "Too many people will spill there emotions" but I don't think it's reasonable to predict too much based on that. There's several reasons these errors are common, not just inability or indolence.

There's often a slip between mind and the keyboard - I know I often simply type the wrong thing even when I know full well what the correct thing is. Muscle memory or something.

Edit: 'their "doing it wrong"'. Ah yes, very good :)


"Because you are not Einstein, it is not OK for you to make mistakes"?


I hate to do this, but it's "grammar".

Not that I'm calling you an idiot, you have a very fair point.


Thanks!


Excellent point. Mathematics would be better. Simple questions or sequences.


> If Fox Nation implemented something like this, they’d have zero commenters (unless, of course, they made the wrong answers “right”).

I wholeheartedly disagree. I am pretty much a total idiot (based on many years of evidence) and yet I can easily use the words listed correctly. The fact that I espouse bizarre and unworkable methods of government (see my username) certainly doesn't prevent me from spelling correctly. In fact, unworkable and impractical ideas may be a significant indicator of post-secondary education.

I'm surprised to see such a thinly veiled insult directed at one set of political beliefs sit at #1 on HN.


  I'm surprised to see such a thinly veiled insult directed at one
  set of political beliefs sit at #1 on HN.
Things tend to sit at #1 on HN because they're interesting and engaging. I didn't post the link because I wholeheartedly agree with it, and I suspect that's not the reason why people upvoted it either.

That being said, I think the not-so-thinly veiled insult directed at the average Tea Party clientele is backed up by casual observation. And that's saying something, coming from a foreigner who has no stake in the US party system (=me).

  I am pretty much a total idiot (based on many years of evidence) 
  and yet I can easily use the words listed correctly.
You're obviously not an idiot. You can articulate yourself, bring points across in an eloquent manner, thus enriching the conversation. Compare that to someone who only communicates in lulz or derrrp-speak. The point of having an online discussion is not to surround yourself with people who agree with you 100% of the time, it's about having a stimulating discussion in the first place.


Not that that is a place where idiots are likely to venture, but I remember this "captcha" for the Arch Linux forums:

  What is the output of "date -u +%W$(uname)|sha256sum|sed 's/\W//g'"?
(https://bbs.archlinux.org/register.php)
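For readers without a shell handy, here is a rough Python equivalent of that pipeline, as a sketch. It assumes `uname` prints the bare kernel name (e.g. "Linux") and that `sed` is GNU sed, where `\W` matches non-word characters. The function names are hypothetical.

```python
import hashlib
import platform
import time


def captcha_answer(week, kernel):
    """Mimic: date -u +%W$(uname) | sha256sum | sed 's/\\W//g'."""
    # date prints e.g. "32Linux" followed by a newline, and sha256sum
    # hashes that trailing newline along with the text.
    data = f"{week}{kernel}\n".encode("ascii")
    # sha256sum prints "<digest>  -"; stripping every non-word character
    # leaves only the 64 hex digits of the digest.
    return hashlib.sha256(data).hexdigest()


def current_answer():
    # %W is the zero-padded week of the year with Monday as the first day,
    # evaluated in UTC to match "date -u".
    week = time.strftime("%W", time.gmtime())
    return captcha_answer(week, platform.system())


print(current_answer())
```

The answer changes once a week and differs per kernel, so it can't simply be copied from an old forum post.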


That's a nice one. A bit like "how many were going to St. Ives" or "what color are the bus driver's eyes."

EDIT: whoops, I interpreted it as 's/\S//g', which I assumed would discard everything.


If you don't want to hear from anyone who isn't a pedantic, highly literate writer in English, that's your prerogative.

I, however, wish my Web had more input from illiterate people. There are a lot of them -- the majority of the world, an overwhelming majority when you restrict yourself to the English language. They have lives, thoughts, and stories too. And if it weren't for recorded music, oral historians, and the occasional documentary they'd be completely invisible in our media.

Of course, the average comment thread on the web is a terrible way to interact with the literate and the illiterate alike. ;)


Love the idea, but wouldn't it be simpler and more effective to just have them fill in the missing word without the images.


It would take much more proofreading to make sure the questions aren't ambiguous. Look over ______. What am I thinking? U = {these, there, this,....}


I thought that as well. Each question would need a good degree of context.


Agreed. If you're going the grammar route, why distort the words at all?


I was thinking of giving 5 or so sentences and having checkboxes for whether they're grammatically correct or not. I feel that even people that don't exactly know the correct usage can still get the OP's test right because one "looks" better than the other, whereas with mine there's only one choice presented.


This seems to weaken the captcha for bots, since a simple Google search for (for instance) 'do you know we are going' completes the sentence. No need to defeat the obfuscated OCR test.

If the goal is to keep out some idiots (since these tests will admit 33-50% of idiots who can at least complete the OCR test but answer the question randomly) but permit moderately sophisticated bots, mission accomplished.

Providing more context to the captcha solution in general strikes me as helping the bots, even if it confounds idiots.
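The 33-50% figure is what random guessing gives on a single question with three or two choices. A small sketch of how that pass rate would shrink if several independent questions had to be answered correctly; the function name is hypothetical, and independence between questions is an assumption.

```python
from fractions import Fraction


def random_pass_rate(choices, questions):
    """Chance of passing by pure guessing, assuming independent questions
    that each have the same number of equally likely choices."""
    return Fraction(1, choices) ** questions


for questions in (1, 3, 5):
    two = random_pass_rate(2, questions)
    three = random_pass_rate(3, questions)
    print(questions, two, three)
```

More questions hurt random guessers quickly, though they do nothing against a bot that can look the sentences up.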


Seems to be a lot of serious comments about this when I believe the content was created with tongue planted firmly in cheek. That said, a valid moron captcha would be a wonderful thing in some places. It comes down to the old saying:

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rich Cook


I am very good at failing standard captchas on many sites.

I always blamed captchas for being too hard and not user-friendly.

This title got me wondering - maybe it was by design, and I am an idiot.


In my experience simple, domain-specific CAPTCHAs work pretty well. They don't even need to include a frustrating image with hard-to-parse words. Just ask commenters a somewhat randomized question that any visitor to the site would be able to answer easily. It can be really, really simple - like for example asking them to enter the abbreviation for something:

> What's the abbreviation commonly used for the Hypertext Transfer Protocol?

This weeds out bots (because they generally don't bother with site-specific stuff) and many trolls (because they are either too stupid or too lazy to go through this process).
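
A minimal sketch of what such a question-based check might look like. Only the first question comes from the comment above; the other two, the `QUESTIONS` pool, and the function names are hypothetical, and a real site would write questions tied to its own subject matter.

```python
import random

# Hypothetical question pool: question text -> set of accepted answers (lowercase).
QUESTIONS = {
    "What's the abbreviation commonly used for the Hypertext Transfer Protocol?": {"http"},
    "What's the abbreviation commonly used for the HyperText Markup Language?": {"html"},
    "What's the abbreviation commonly used for Cascading Style Sheets?": {"css"},
}

def pick_question(rng=random):
    """Pick one question at random so the challenge is not always the same."""
    return rng.choice(list(QUESTIONS))

def check_answer(question, answer):
    """Accept the answer regardless of case or surrounding whitespace."""
    return answer.strip().lower() in QUESTIONS.get(question, set())
```

The server would remember which question it showed (in the session, say) and call `check_answer` with the submitted text; an unknown question is always rejected.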


That would work well provided your site isn't a high traffic site.


Yes. The lower traffic your site is, the easier you can make the captcha. The truth is, most of us have pretty low traffic sites with no more than 100k hits per month, blogs for example. The key here is to have a simple but different captcha compared to the standard stuff. Pairing this strategy with a little bit of site-specific domain knowledge goes a long way towards discouraging trolls and idiots.

I think overall, bots would be negatively impacted by a more diverse captcha ecosystem even if the individual implementations are easier to crack. Spammers don't invest 10 or 20 minutes in cracking a domain-specific captcha system just so they can spam one single blog - they'd rather move on to thousands of sites that are protected by the same standard word reading puzzle. Of course, if your site rises into the upper strata of the web, you're also quickly entering a zone where the standard, hard-to-decipher image strategy isn't working so great anymore either.


Jeff from coding horror for a long time used a very very simple "CAPTCHA". You just had to enter 42 (if I recall correctly), and it was always just that, nothing random. It served him well, and he has a high traffic site (http://www.codinghorror.com/blog/2006/10/captcha-effectivene...)


And that the question randomizes pretty well. Otherwise posts could be automated.


The other use-case is defeating CAPTCHA-solving farms, often run out of locations with low labor costs. I'm not entirely sure a specific cultural / intellectual / domain knowledge question would be more effective than simple character recognition, but it might be an interesting test.


This is an interesting concept, but won't keep out comments from the reasonably educated, but devious. Those often are the worst kind. It's usually possible to simply down vote puerile comments into oblivion, but one intelligent troll can waste considerable bandwidth.


At first take this seems like it would be easy to circumvent, (given a dictionary of homonyms) but...

google "look over there." == 1.5M hits

google "look over their." == 16.2M hits

google "look over they're." == 1.0M hits

Google doesn't support periods, apparently. But other examples from the article are a little easier to game:

google "do you know were we are going?" = 8 hits

google "do you know we're we are going?" = 0 hits

google "do you know where we are going?" = 6.1M hits
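
The hit-count trick can be sketched as follows. The counts are the ones quoted in this comment, hard-coded; no real search API is called, and the `HITS` table and `guess` function are made-up names for illustration.

```python
# Hit counts copied from the comment above, keyed by sentence template.
HITS = {
    "look over {}.": {"there": 1_500_000, "their": 16_200_000, "they're": 1_000_000},
    "do you know {} we are going?": {"were": 8, "we're": 0, "where": 6_100_000},
}

def guess(template):
    """Pick the candidate word whose completed phrase has the most hits."""
    counts = HITS[template]
    return max(counts, key=counts.get)

for template in HITS:
    print(template.format(guess(template)))
```

With these numbers the second sentence comes out as "where", while the first picks "their", which matches the observation that the period is ignored and that short phrases are harder to game this way.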


I know ESR isn't really popular here, but:

"Being a native English-speaker does not guarantee that you have language skills good enough to function as a hacker. If your writing is semi-literate, ungrammatical, and riddled with misspellings, many hackers (including myself) will tend to ignore you. While sloppy writing does not invariably mean sloppy thinking, we've generally found the correlation to be strong and we have no use for sloppy thinkers. If you can't yet write competently, learn to."

I still try to read through poor writing, but the comment has to contain something fairly insightful for me to get over a "then/than" mistake. That said, I think a Naive Bayes classifier for "sloppy" and "not sloppy" would work better than this captcha system. Not to mention that the captcha is super easy for a bot to get around, and getting rid of bots is the point of a captcha in the first place.
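
For the curious, a toy sketch of the Naive Bayes idea: a multinomial model with add-one smoothing over whitespace-separated tokens. The six training sentences are invented purely for illustration; a usable classifier would need a large labeled corpus and better features.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

class NaiveBayes:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of tokens seen with that label
        self.doc_counts = Counter()  # label -> number of training comments
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of log likelihoods with add-one (Laplace) smoothing
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokenize(text):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy data, not drawn from any real comment corpus.
clf = NaiveBayes()
for s in ["u r rong lol", "its better then yours lol", "their going to loose"]:
    clf.train(s, "sloppy")
for s in ["you are wrong about that", "it is better than yours", "they are going to lose"]:
    clf.train(s, "not sloppy")

print(clf.classify("lol u r rong"))
print(clf.classify("you are wrong"))
```

Unlike the captcha, this scores what people actually write rather than what they can pick from a list.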


"If you can't yet write competently, learn to."

Excellent advice.

Ayn Rand wrote: "If you don't know, the thing to do is not to get scared, but to learn."


So, i can be a d*ck as long as i'm smart?

Edit: I'm disturbed by the number of grammar nazis here who think the quality of someone's grammar reflects their personality...

If the goal is to keep the conversation civilized, why not do something along the lines of "Doing X is a more nice thing to do".


I don't think it's really about being smart. Having a basic grasp of language so you can adequately articulate yourself in a conversation is critical to discussion. If you can't, then you probably shouldn't be taking part in the discussion anyway.

Too many people will spill there emotions into a discussion with sub standard spelling and grammar which makes it very difficult for others to understand them, which leads to inevitable misunderstandings.


The grammar issues presented on this page are picayune distinctions with very little potential for major misunderstandings. Getting them correct is more of a social signal than anything else, like eating salad with the correct fork in some places and not eating food with your left hand in others.

I have real trouble imagining that a good point made with substitution errors is somehow less critical to the discussion than a poor point that doesn’t confuse it’s/its.


No, but it's more trouble to parse.

More importantly, I think it's fair to ask that people learn basic grammar, and if we don't take it upon ourselves to insist on seeing correct grammar, then that will never happen. That's why I think being a so-called grammar nazi is important. Now, being a dick about it is a different matter, but somehow people think you're a dick regardless if you try to correct their spelling/grammar, so there's only so much that can be done there.

That said, there are definitely some levels of pedantry that can be varied. For example, I tend not to be too put off by it's/its. There/their/they're, however, is annoying for me, because the words themselves have such monumentally different meanings. Same deal with fair/fare. Then/than is annoying to me because I actually pronounce them (subtly) differently, so as I'm internally reading the words, it throws me even more.

Overall, I'm really agreeing with you -- I don't think there is serious potential for misunderstandings. And indeed, a good point made with grammatical mistakes is more important than a correctly-made bad one. But a good point made in `proper' English is, I think, better than a good point made in `bad' English.


In simple social situations that may be true. But not being able to grasp the basics of language will have an effect on how someone communicates complicated thoughts and ideas.

How would you expect someone taking on a programming challenge of say, an operating system, without the basic understanding of boolean logic to fair?


"fair", or fare? Those damn homophones ;)


Thanks for pointing that out, now I'm one step closer to infallibility ;)


> there emotions

Another good Muphry's law example.


I guess smart dicks are slightly better than stupid dicks?


Still a 33% failure rate. And oh, http://www.youtube.com/watch?v=N4vf8N6GpdM


This makes me wonder how much AI development is done for nefarious purposes. Someone's going to crack this and then they're going to have to make a more intelligent captcha system. Some day we're going to have strong AI and spammers will be thanked for it.


If you haven't read this, please check out Cory Doctorow's story on exactly this point.

First link I could come up with: http://singularityblog.singularitysymposium.com/pester-power...


Highly relevant --

http://xkcd.com/810/


I like this idea! I am planning to open-source my CAPTCHA research; if anyone is interested, please contact me. You can find my research at http://dndcaptcha.blogspot.com/2010/04/textareaid.html

You can find presentation of the same concept here: http://www.slideshare.net/desaiguddu/drag-and-drop-captcha-a...

If any of you are researching CAPTCHAs, or preparing a survey presentation or educational research on them, feel free to use the content from the presentation.


This made me smile, but I don't think it would work. I suspect a lot of people could use the right form when asked outright like this (and as a last resort, they could Google it). The bigger problem is probably when people don't care whether they get it right or not when they're writing their comments.

[Pre-submit verification of 'there' homophones . . . . 100%].


Via via via via...

The original source is here

http://www.defectiveyeti.com/iacaptchas/


The problem is there's a good chance of measuring the wrong thing.

If you need to pass a test to post, sure you'll keep out a lot of bad comments. But you'll also limit the pool of acceptance. There could be plenty of people who fail the test having something positive to add. Standardized tests prove this. There are those who are lousy test takers but end up being very successful. Also, certain topics might be easy to answer for someone with sufficient experience in the subject without requiring even average intelligence.

So this limitation might increase the quality of posts, but also adds a kind of tunnel vision to the site.


Idiocy is not the privilege of bad spellers.


The best Captcha IMO is just to write up a set of questions about the article. Actually reading the whole article and absorbing the details is a reasonably good predictor of comment quality.


The better idiot screen on any discussion forum with a well defined subject scope would be a factual knowledge test. But the factual knowledge test wouldn't maintain the existing community if the existing community already has "hivemind" about factual issues central to the forum's subject, contrary to fact. I wonder what we would all consider the subject scope of HN? Are there any issues on which the HN consensus about the facts of the real world might be contrary to "objective" fact?


The funny thing is that the smart captchas are easier for a computer to guess (if it wasn't a joke).

Too bad that the current implementation of captchas ever saw the light; they are pushing a security issue onto the end user. Seems like a good opportunity :).

The acronym is "Completely Automated Public Turing test to tell Computers and Humans Apart" but it isn't really that automated, so do captchas actually exist?


This IMO is something that Disqus or Quora or similar systems should jump on - automated techniques for knowledge base quality management. I can see taking some of the existing essay grading software, combined with something like these captchas, being used to produce customer support knowledge bases from input from technicians and discussion forums.


Might be effective at keeping the idiots out, but not the bots. The challenge sentence should not be in straight-up HTML.


This would also seem to discriminate against people who aren't native English speakers (or whatever other language).


In my personal experience, non-native English speakers are far more well-versed in grammar than native speakers.


It might discriminate in their favour. On average, foreign students I studied with had much better basic grammar than native English speakers.


Not necessarily, especially if you've learned the language at school, analytically rather than organically.

These words/phrases are wildly different in other languages, and thus well distinct in the mind of non-native folks.

For example, in French:

were -> étais; étions, étiez, étaient (2nd person, singular; 1st, 2nd and 3rd person, plural)

we're -> nous sommes

where -> où


I think native English speakers are more likely to make such mistakes. Non-native speakers usually know the difference between those words, because they have the ability to translate the words back into their native language to see the difference.


Reminds me of a common beginner's problem if you're starting with German:

Who is 'Wer' in German

Where is 'Wo' in German

Trivial, but I see lots of people (and sometimes myself) falling for this trap.


It happens the other way around too (German to English), and sometimes you end up with non-words like "somewhen". It's an interesting exercise to determine a writer's native language by the oddities of their writing.


Exactly. I now live by the credo that someone making these mistakes may actually just not know better, or may be struggling with a foreign language. So far (on HN at least) this assumption has held in many cases. It also makes me far less angry when reading the internet; regardless of whether the assumption is true, that's hardly a bad thing.


These people often have better written grammar, because they've had to take the trouble to learn it as a foreign language.


I've the feeling that when you're speaking in a foreign language, you are more careful than in your own language. Maybe it's a language thing, but I make a ton of typos in French (double consonants, en/an, or/hors), and I can correct native English-speaking people on stuff like your/you're.


I was talking about people who just don't really know English very well (but well enough to communicate on some level)


I am a non-native speaker, and to me it looks like the second captcha is wrong.

According to what they've taught me, it should be "Do you know _____ are we going?" instead of the proposed "Do you know _____ we are going?", shouldn't it (notice the "we are/are we" swapping)?


The sentence already has a VS inversion: "Do you". It is not correct standard English to add a second VS inversion, such as "are we", although it does sometimes occur in nonstandard native speech. The "we are" is in a subordinate clause, and the subordinate clause is not structured for the interrogative.


No, "Do you know where we are going?" is correct; however, without the "Do you know", "Where are we going?" would be correct.


Thanks, it totally makes sense now. "Where we are going" was an affirmative (sub)sentence because the interrogative is already expressed in the "Do you know" part.


(non native speaker)

In 20% of my HN comments I need to correct "there"/"their", "buy"/"by" [...], which I write correctly when re-reading my comments but subconsciously write incorrectly the first time, when I am thinking about the thoughts rather than the words.


Does noticing whether a link is the original link

http://www.buzzfeed.com/chrismenning/captchas-for-keeping-id...

or blogspam count as a useful screen?


I think this seems to weaken the captcha for bots, since a simple Google search for (for instance) 'do you know we are going' completes the sentence. No need to defeat the obfuscated OCR test.


a cloze test (i.e. some text with a gap for a word) would do very poorly in filtering out bots, as they're much easier to do automatically than captchas.


The page loaded a 500 and I thought that it was some sort of gag or that my understanding of HTTP response codes was lacking.


I liked the method the USENET group alt.hackers uses---it's moderated, but there's no moderator. Good luck in posting.


(Non native english speaker) isn't the verb "to go TO" and the whole sentence: "Do you know where we're going TO?"


"Do you know where we're going?" would be what most UK speakers would say, and "Do you know to where we are going?" would be an archaic variant which implies a location is requested rather than an action. The verb used is "to go"; the answer might be "[we are going] to London" or "[we are going] shopping".

There's some potential for confusion in that the verb "to go" can be used for the future imperfect (I think, I was only taught my tenses in French and German lessons) as well as the present tense: note the difference between "We are going to London" or "We are going shopping" and "We are going to go to London" or "We are going to go shopping", or even "We are going to shop". The former is talking about an action we are currently undertaking (to go: implying motion), the latter about a future intent (to go: expressing an intent to do something in the future) to carry out the action in the former case. Using a different verb: "We are riding a bicycle", compared to "we are going to ride a bicycle"


Thanks, good one with the shopping.


"Where" can substitute for phrases of the form "at PLACE" and "to PLACE" as well as bare placenames, so, "Where are we going? We're going to the market," is perfectly standard.

"Where are you at?" is considered nonstandard, but it (and the form "Where you at?") are standard in AAVE.


No, TO is a preposition, but not part of the verb. No more than "by car" would be part of a "to go by car" verb.


To my (non-native) ears, that sounds overly formal.


Unfortunately, grammar captchas like this would keep out a lot of second language learners as well.


Several problems need to be addressed. Non-native speakers would be arbitrarily penalized. This is very easy for a computer to crack. (The choice selection is small, and systems that do grammatical checking have been around for a long time.)


Do yourself a favor and read the comment thread on that post.


Also keeps (some of) the non-mother-tongue speakers out.


I don't get it. How come this article has not been flagged to death? How come this article has over 100 comments? Am I missing something?


those would be pretty easy to solve with a bit of code


Hands up if you were especially careful when commenting on this post?


He's missing this one

Fill in the gap: a) it's b) its

"The hacker got access to the server and killed ... processes"

lolz


The problem with this one is that at one point in the English language's history (17th/18th century I believe), the correct possessive form of "it" was "it's", which is arguably more correct^.

^You make nouns possessive by adding a saxon genitive to signify the distinct possessive-izing "s" sound. The word "it" is also made possessive with this same sound when spoken, but for the sake of a contraction (read: abomination ;)), we have decided to arbitrarily remove the saxon genitive and replace it with a simple "s". The excuse for this inconsistency, "well, it is a pronoun, not a noun, so this is not an inconsistency", does not take into account that English is primarily a spoken language. Written English, where reasonably possible, should approximate the spoken constructs.

Demonstration:

  He stepped on the cat*'*s tail.
  He stepped on its tail.
Notice that although these two sentences are expressing the same idea and are of the same approximate form (they would be spoken similarly), the second has dropped the saxon genitive. This is clearly, if you look at it ignoring what you were taught in primary school thanks to Webster worship, an absurd change.

At the very least, people who use saxon genitives with the word "it" do not deserve the ridicule people like to heap on them. "it's"/"its" is not a "there"/"their"/"they're" scenario.


I disagree, I think it/it's is completely comparable. Sure, there's a long history to it, but it has evolved, and there are two reasons why it is comparable.

First, we weren't alive back when "it's" was used this way. Nobody currently alive was. There's no argument of "they're doing it the old way", because it's not even in living memory. As of the current standards, and those of the past 200 years, they are simply doing it the wrong way.

Second, it can't even be argued that they are confused because of the old way. Nobody accidentally writes her's, their's or your's. The reason people write it's is nothing to do with the possessive apostrophe, it's purely because of confusion with the "it is" contraction.


You seem to misunderstand me: I'm not suggesting that people use "it's" because they aren't up on the new conventions, or even that "it's" is better because it is the old way. Rather, I am simply suggesting that the old way is better (for the reasons stated above).

I object to the assertion that people are accidentally inserting "it is" where they mean possessive "it". Certainly people such as myself who are strongly verbally oriented type the saxon genitive reflexively to reflect the spoken possessive-ization construct. In a similar way, I also very commonly "improperly" insert commas in my sentence to signify pauses instead of using them merely syntactically.

Furthermore, I maintain that "its"/"it's" is not equivalent to "they're"/"their"/"there". The difference between the first is purely convention, while the second set of words are only related to each other incidentally.

"Nobody accidentally writes her's, their's or your's."

Actually, I do quite often.


My way of remembering its/it's is to associate it with his/hers rather than the typical 's.

`The cat stepped on his foot.`

`The cat stepped on its foot.`

Still, I don't get annoyed at its/it's because it's such an easy mistake to make. I still do it occasionally without meaning to. I don't quite get how people consistently get their/there/they're or loose/lose wrong, but that's just me.


Totally. The "loose" thing drives me nuts for some reason. Other types of misspellings don't really bother me unless there's a ton of them, but for some reason I always hear that one out loud in my head and it destroys my opinion of the writer every time.


interesting...


What would an anti-hipster Captcha look like?

http://www.dangerousminds.net/contributors


What does that link have to do with your comment or the story?


That's why the Smart CAPTCHAs are needed!



