Mailcheck.js - How we decreased sign up confirmation email bounces by 50% (kicksend.com)
561 points by skyfallsin on March 20, 2012 | 100 comments



I think there's a general purpose need for "optimal" form components. Let me elaborate:

- Fields for typing-in a credit card

- Fields for typing-in an e-mail address

- Fields for typing-in a U.S. street address

There are widely known techniques for optimizing data entry for these fields, yet these techniques aren't widely adopted, even though they're known to increase conversion rates.

Someone should build a (subscription) service where you can embed a bunch of fields (and labels) onto a form with a single line of Javascript.

Then, the fields would render on the page. The performance of the fields (their effect on conversion rates) could be measured continuously. New variants of the labels and fields could be A/B tested continuously as well. That is, the performance of the fields would improve over time.

If there's interest I'll elaborate in a blog post (with mockups).

made a few edits


"Someone should build a (subscription) service ..."

I think you misspelled "open-source widget library".


Embedding a credit card field using third-party Javascript sounds scary.


Stripe does this really well using a js lib. The sensitive data thus never hits your servers and it's a much better experience for devs who don't want/have to deal with PCI compliance.


I'm not convinced using Stripe or similar services gets you out of PCI compliance.

The vast majority of PCI compliance requirements - those relating to server security - would seem to still apply. If someone hacks your server, they could easily add an additional bit of JavaScript that sends the CC field to a second, malicious server.


That's a worst case scenario and things could be worse if they hacked your production code base.

All data should be 256-bit SSL encrypted for point-to-point security and asset tampering protection. After that, I doubt Stripe's JS lib is much of a problem; it communicates in a secure tunnel from the client to Stripe.

They also say you don't have to worry about PCI compliance then, because you are never handling financially sensitive data directly, only indirectly.


The GP is referring to doing client side Luhn validation on the credit card number, giving the user feedback if they mistype a number.

It's a technique I use, along with only allowing digits and doing some format masking (i.e. spaces or dashes to split the number into 4-digit blocks) to make the number more readable.
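
For anyone curious what the digit filtering and masking might look like, here is a minimal sketch. The function names `digitsOnly` and `formatCardNumber` are made up for illustration and do not come from any library mentioned in this thread. It always groups in blocks of four, as described above, even though some card brands are conventionally printed with other groupings.

```javascript
// Hypothetical helpers for illustration only.

// Keep only the digits the user typed, dropping spaces, dashes and stray characters.
function digitsOnly(input) {
  return String(input).replace(/\D/g, "");
}

// Split the digits into blocks of up to four, joined by a separator, for display.
function formatCardNumber(input, separator) {
  var sep = separator === undefined ? " " : separator;
  var blocks = digitsOnly(input).match(/.{1,4}/g) || [];
  return blocks.join(sep);
}
```

Calling `formatCardNumber` on every input event and writing the result back to the field is one way to apply the mask, although caret handling would need extra care in a real form.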


Actually, the GP is talking about somehow building a startup based on... form widgets. He specifically said a "(subscription) service" - which would invariably require server-side support for these apparently startup-backed form fields.


Why? Many of us use Spreedly / Chargify / Recurly. Imagine embedding a Spreedly form on your site. Still scary?


> Still scary?

As someone who's currently looking over PCI compliance stuff: YES.


Actually, these third-party services can be a lifesaver when it comes to PCI compliance. If you use a service like Stripe in its entirety then you don't have to worry anywhere near as much about PCI compliance yourself.


Has there been any actual confirmation of that? i.e. an official ruling from the PCI standards council?

Other than the long-term storage of credit card numbers stuff, most of it focuses on security measures I'd want someone using Stripe to have implemented.

After all, if someone hacks the server, they're going to have an easy time adjusting the flow to save the CC# that's supposed to only be going to Stripe.


But on the other hand, if anyone hacks any server, they're going to have an easy time putting up a page with a form asking for credit card numbers. For example say somebody hacked HN and added a "pay for premium membership" form, that just mailed off credit card numbers to the hacker...

Just thinking out loud really, I know nothing about PCI compliance or law.


I've written more on this elsewhere, but you're generally right.

PCI compliance rules are designed to test the serious risks that can be tested.

If the credit card data doesn't pass through your servers, then it doesn't pass through your servers.

Stripe complies the same way as PayPal; it just looks different to the user.

It's also worth noting that there's a big difference between data at rest and data in motion. If your site is hacked tomorrow and starts redirecting CC info, then all of the CC info between tomorrow and the day someone stops the hack is compromised.

If CC data is stored on your server in any way -- and if it passes through your server, this may be the case even if you aren't putting it in a database -- then when someone hacks your server tomorrow, it's quite possible that all CC info entered since 2003 (or whenever you started) is compromised. That's a much greater risk.


I think you would want to inspect the source code before you use a third-party library in a case like that


The whole point is that third-party source code can change without your knowledge (or permission, etc). Or, worse, the third-party can simply be hacked.


Well, only if you're linking it off someone else's servers.

jQuery is third-party source code too, but I can host it myself.


Obviously. This was a response to OP putting forward a hosted solution.


Nothing a hosted solution for more $$$ can't solve


Great idea, but the problem here is that what works somewhere, doesn't work elsewhere and it's all about testing.

But testing multiple fields, with multiple variables and experiments requires lots of traffic.

And what works for one source of traffic, PPC for example, won't duplicate to other sources of traffic like TV / Radio / PR / SEO / etc.

We recently ran a ton of traffic just around an inbound-email newsletter pop-up that lists out what we found to work best here:

http://www.conversionvoodoo.com/blog/2012/01/opt-in-email-ne...

Those are some "tried and true tips" that we'd stand behind for a popular affiliate marketing blog getting most of their traffic via SEO and referral, but unfortunately the entire addressable universe that we can guarantee those results for is the client we undertook the project with ;)


Which is why I think it would be an interesting startup.

The base case alone would be enough to get many to buy your component but for a subscription (where the real money is) you need to provide continuous value.

So your pitch could be that using your product would instantly get you the best industry standard input forms but over time they'll get better because you're going to apply a/b testing and machine learning on them continuously.


The credit card one is especially important. We run a hosted online ticket system for some of our customers and noticed they were getting a bunch of 'Invalid CC' responses from their payment gateway (and paying for each invalid attempt).

We implemented the Luhn[1] algo credit card check on the checkout page. Invalid CCs would trigger a little warning but still allow the form to be submitted. Invalid CC transactions dropped ~90% immediately. Even better we were able to get rid of the 'select your card type' field since that was detected by Luhn. A little JS was a win all around.

[1] http://en.wikipedia.org/wiki/Luhn_algorithm
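
A minimal sketch of the Luhn check, for reference. The function name `passesLuhn` is hypothetical, and this is not the code from the checkout page described above. It only tells you whether the digits are self-consistent, so, as in the comment, it is probably best used to show a warning and not to block submission.

```javascript
// Hypothetical helper: returns true when the number passes the Luhn checksum.
function passesLuhn(cardNumber) {
  // Ignore the spaces and dashes people commonly type between blocks.
  var digits = String(cardNumber).replace(/[\s-]/g, "");
  if (!/^\d+$/.test(digits)) return false;

  var sum = 0;
  var doubleIt = false; // the rightmost digit is not doubled
  for (var i = digits.length - 1; i >= 0; i--) {
    var d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9; // same as adding the two digits of the product
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}
```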


"Even better we were able to get rid of the 'select your card type' field since that was detected by Luhn."

FYI: Card Type is not determined by Luhn algorithms, but rather (broad brush strokes, see http://en.wikipedia.org/wiki/Bank_card_number#Issuer_Identif... for more detail):

3 - American Express

4 - Visa

5 - Mastercard / Diners

6 - Discover
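
As a rough sketch, that broad-brush mapping could be coded like this. The names `BRAND_BY_FIRST_DIGIT` and `guessCardBrand` are made up for illustration, and the table is only the first-digit approximation listed above; real issuer ranges are finer grained, so treat the result as a hint.

```javascript
// Broad-brush first-digit mapping taken from the list above (an approximation).
var BRAND_BY_FIRST_DIGIT = {
  "3": "American Express",
  "4": "Visa",
  "5": "Mastercard / Diners",
  "6": "Discover"
};

// Hypothetical helper: guess the brand from the first digit, or null if unknown.
function guessCardBrand(cardNumber) {
  var digits = String(cardNumber).replace(/\D/g, "");
  if (digits.length === 0) return null;
  return BRAND_BY_FIRST_DIGIT[digits.charAt(0)] || null;
}
```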


> Invalid CCs would trigger a little warning but still allow the form to be submitted.

Out of curiosity, why allow the form to submit anyways?


Rule #1: never assume your code covers 100% of all cases.


What richthegeek said. We put it in as a rough helper but didn't want to run the risk of denying something valid. The middle ground seems to work well.


"Someone should build a (subscription) service where you can embed a bunch of fields (and labels) onto a form with a single line of Javascript.[...] The performance of the fields (their effect on conversion rates) could be measured continuously."

Not quite the same, but sounds a bit like inForm[1]. These guys presented at a recent HN London meetup[2] and this was the meetup.com description which summarizes it better than I could:

"Ever wondered how users engage with your site, where they get stuck or how long they spend making choices? Forward Technology's upcoming form analytics service inForm can help you answer all these questions and more. Without any configuration Inform allows you to quickly build up a strong picture of what happens when real visitors interact with your forms."

Beta sign-up for HNers here[3]

[1] http://inform.forwardtechnology.co.uk

[2] http://vimeo.com/32617520

[3] http://inform.forwardtechnology.co.uk/users/sign_up



Very similar to what I had in mind. Just no need to read or understand how it all works. Just embed some Javascript, and assume 1) that it works great and 2) it will improve over time (based on A/B tests on your site and other sites).


You're probably interested in http://wufoo.com/


HTML5 and tablets are moving towards this. You can set the disposition of the on-screen keyboard based on the input metadata, making it easier to type stuff in. I think email is included in the spec.



It's a cool idea, but I think they're doing it wrong.

We experimented with doing something like this on Quizlet, but didn't actually launch anything. We first looked at a lot of the data, and doing it based on string distance is the wrong approach.

For example, if you type hotmail.de into that checker, it suggests hotmail.fr. Another is ymail.com --> gmail.com. The more valid domains you add, the more (correct) permutations get marked as invalid. We have 20k users with ymail accounts.

I think a blacklist approach is much more solid than a whitelist approach, I just haven't gotten around to building it.
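
To make the failure mode concrete, here is a minimal sketch of a whitelist-plus-distance checker. It uses plain Levenshtein distance, which may not be the metric Mailcheck itself uses, and `levenshtein`, `suggestDomain` and the `maxDistance` parameter are hypothetical names.

```javascript
// Classic dynamic-programming edit distance between two strings.
function levenshtein(a, b) {
  var prev = [];
  for (var j = 0; j <= b.length; j++) prev[j] = j;
  for (var i = 1; i <= a.length; i++) {
    var curr = [i];
    for (var k = 1; k <= b.length; k++) {
      var cost = a.charAt(i - 1) === b.charAt(k - 1) ? 0 : 1;
      curr[k] = Math.min(prev[k] + 1, curr[k - 1] + 1, prev[k - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: return the closest known domain within maxDistance,
// or null when the domain is already known or nothing is close enough.
function suggestDomain(domain, knownDomains, maxDistance) {
  var needle = domain.toLowerCase();
  var best = null;
  var bestDistance = Infinity;
  for (var i = 0; i < knownDomains.length; i++) {
    var d = levenshtein(needle, knownDomains[i]);
    if (d === 0) return null; // exact match: nothing to suggest
    if (d < bestDistance) {
      bestDistance = d;
      best = knownDomains[i];
    }
  }
  return bestDistance <= maxDistance ? best : null;
}
```

With the list `["gmail.com", "hotmail.com", "hotmail.fr"]` and a maximum distance of 2, this suggests gmail.com for gmial.com, but it also suggests hotmail.fr for hotmail.de and gmail.com for ymail.com, which are exactly the false positives described above.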


It sounds like you have the email addresses of your users stored in plaintext. In that case, you should be able to extract all of the domains of verified email addresses from your existing information, thereby covering all of the likely domain names of your userbase and future users.
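
A minimal sketch of that extraction, assuming the verified addresses are available as an array of strings. The function name `topDomains` and the `limit` parameter are hypothetical.

```javascript
// Hypothetical helper: count domains across verified addresses and
// return the most frequent ones, most common first.
function topDomains(verifiedEmails, limit) {
  var counts = Object.create(null); // no inherited keys to collide with
  verifiedEmails.forEach(function (email) {
    var at = email.lastIndexOf("@");
    if (at < 1 || at === email.length - 1) return; // skip malformed entries
    var domain = email.slice(at + 1).toLowerCase();
    counts[domain] = (counts[domain] || 0) + 1;
  });
  return Object.keys(counts)
    .sort(function (a, b) {
      // Higher count first; ties broken alphabetically for a stable result.
      return counts[b] - counts[a] || (a < b ? -1 : 1);
    })
    .slice(0, limit);
}
```

The resulting list could then be fed to whatever domain checker is in use, so that domains your own users really have (ymail.com, for instance) are treated as valid.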


> It sounds like you have the email addresses of your users stored in plaintext

Just curious, but are you suggesting that plain text is the wrong way to store an email address? Your comment makes me draw that conclusion, which of course seems rather silly.


It can be viewed as a security vulnerability, as many folks use the same password everywhere. As such, if somebody compromises your user database, they now potentially have a recoverable password and a plain text email address to go with it. This potentially compromises all users' email accounts, as well as other services that use email as username, such as PayPal accounts.

If email addresses are obfuscated in some way, the difficulty for an attacker is increased.

The tradeoff in convenience is that you force a user who has forgotten their password to remember what email address they signed up with in order to recover it via email.


Obfuscated in some way implies that it's reversible, which simply means that it's just going to take a little bit of time to unobfuscate the database--in other words, it's probably not worth it.

Hashing an email address would be pointless because the email address is no longer usable to do things like, you know, send email to that person. As such, the only real option is to store it in plain text--and that makes the most sense.


Just correct TLDs separately, and I don't think it's a problem if you have false positives in there; after all, a user will recognize an incorrect suggestion and move on, should it happen at all. Using something like this is definitely better than using nothing at all.


Cool until I checked the source and realised that the developer has to hard-code a list of domains they want to check against (there are none included).

Off-the-shelf usefulness would be improved a lot if the plugin contained a list of say 100 or so of the most commonly-found email domains.

My 2c.


Here is the list they're using: http://kicksend.com/assets/splash.js


It would be even cooler if it checked against MX records.


I really like the idea. My only nitpick is that these days most projects seem to be ending up as jQuery plugins even when they don't really use much of its functionality. I think a standalone version would save a bit of effort for some of us bound to other frameworks (or not using any framework).


I'm not going to lose sleep over adding a jQuery dependency. The fact is it gets you code that targets 95% of browsers with no hacks. I don't want to code explicitly for IE6-8, and jQuery gets me a lot of functionality on those browsers for free. Sure, there may be other frameworks, and sure, jQuery has its faults, and sure, a framework seems like overkill. But I feel a lot more sure about the functionality of my code when I'm relying on a broadly tested cross-browser suite.


An option would be nice for those who did decide, for whatever reason, to not use jQuery.


The source is available and the license allows for derivatives. Would you be interested in forking the code and putting together a plain-javascript version?


No, sorry, I'm not much of a JavaScript programmer, nor am I that interested. I realize my post kind of looks like standard wankery-won't-work-himself; I was replying specifically to lukeschlather only regarding whether a jQuery dependency is advisable.


I totally hear you. The jQuery plugin actually wraps Kicksend.mailcheck, so it's easily decoupled (do fork it!). Since its main use would be on the client side, we released as a plugin for the ease of adoption.


ya, this. There's literally nothing jQuery related in this code. The jQuery part prevents it from being run in node or in non-jquery sites.


Great work and thanks for the mention. We've been thinking about doing something similar but going a little further and actually checking mx records, with a response if they don't exist. This would help with the long tail domains.

Edit: Another good idea from hinathan.


Be careful. Some legit domains might not have an MX record. (but could still work; messages will usually go to the A address)
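
A rough sketch of that fallback, assuming a Node.js environment where the built-in `dns.promises` API is available. The function name `canReceiveMail` and the injectable `resolver` parameter are hypothetical. The sketch only looks at IPv4 A records and treats any lookup error (including a timeout) as "no record", so its answer is a hint and not proof that a domain is dead.

```javascript
// `resolver` must offer resolveMx(domain) and resolve4(domain) returning promises,
// as Node's dns.promises does. It is a parameter so the logic can be tried offline.
function canReceiveMail(domain, resolver) {
  var dns = resolver || require("dns").promises;
  return dns.resolveMx(domain)
    .then(function (records) { return records.length > 0; })
    .catch(function () { return false; }) // no MX data, or the lookup failed
    .then(function (hasMx) {
      if (hasMx) return true;
      // No MX record: mail may still be delivered to the host's address record.
      return dns.resolve4(domain)
        .then(function (addresses) { return addresses.length > 0; })
        .catch(function () { return false; });
    });
}
```

Given the other caveats in this thread (typo domains that do resolve, for example), a `false` result seems better suited to a "did you mistype this?" prompt than to a hard rejection.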


No problem! The Mailgun webhooks helped us collect and analyze the delivery failures, so that was really helpful. Thanks for that.

An MX-checking service sounds awesome. Keep us posted if you guys do it.


Just doing DNS checks might not be enough:

`$ dig hotnail.com MX` yields results but is still invalid.


Well, technically, it's valid; it's just unlikely to be what you meant to type. That sort of case should probably be handled separately.


Well worth noting: auto-correcting the email addresses would be a very bad idea; ditto for stopping them from submitting the form if this is triggered.

But just showing a little note "are you sure you didn't mean username@gmail.com", then letting them submit anyway, isn't generally annoying even for misfires.

Because there actually are going to be occasional users who mistype "ymail.com" when they mean gmail. Hey, thanks for the warning; the y is right near the g.


I can understand email validation for a service like PayPal or Yammer where money or access is truly tied to an email address. But it seems like there are a lot of services that validate email addresses unnecessarily (and could thus improve rates by 100%). If Kicksend only sends documents to email accounts then it would be one of the latter. If it actually makes docs available in an account, then, yes, it would need to verify.


So how about keeping the user on-site for a while and notifying them immediately if you detect their confirmation email has bounced? If you push the temporary session downstream enough to correlate session with bounce, you can talk back to the user. You could probably account for a healthy chunk of those bounces which this JS doesn't catch.


This is being worked on.


Awesome. One issue found: hotmail.es was suggesting hotmail.fr.


This is seriously awesome work. I know a couple places where I want to use this already. Thanks for sharing!


Thanks! Glad to share it.


Just a quick test of doing HTTP-based checking as opposed to a string-distance checking:

http://richthegeek.co.uk/ui/input/email.html


We've been doing a simpler version of this for years now, having basically stolen the idea from a 2007 MarketingSherpa article [1]: we just raise a modal window for every subscriber address and ask something like "Please check again: is this your email address? - Yes / No (I want to correct it)" (with the email address in big letters). Has worked wonders ever since ;)

[1] http://www.marketingsherpa.com/content/?q=node/2223


That would really annoy me!


This is really cool. It would definitely be useful on my sites.

One thing we do differently on http://www.queondaspanish.com/ though is allow users who haven't confirmed their email to use the logged-in features, but with limitations. They can keep track of their lesson progress for example, but not send messages to other users. They can also change misspelled email addresses, which I think would help in your case.


Nice little gimmick, thanks! We put it in our sign-up script as well. Since most of our clients are Dutch, we added the most popular Dutch e-mail providers (+ a big German + Belgian):

casema.nl, chello.nl, hetnet.nl, home.nl, kpnmail.nl, kpnplanet.nl, live.nl, online.nl, planet.nl, quicknet.nl, schuttelaar.nl, skynet.be, t-online.de, tiscali.nl, upcmail.nl, wanadoo.nl, wxs.nl, xs4all.nl, zeelandnet.nl, ziggo.nl, zonnet.nl.


This is good for sanity checking the right side of the '@' but detectable things go wrong on the left side sometimes, too. One phenomenon I've been seeing for years is the erroneous "www." prefix, often tacked on to @aol.com and @yahoo.com addresses. I don't think I've seen one of these that hasn't bounced.
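
A tiny sketch of a check for that left-side case. The function name `stripWwwPrefix` is made up for illustration. Since someone could legitimately have an address starting with "www.", the result is probably best shown as a suggestion, not applied automatically.

```javascript
// Hypothetical helper: if the local part starts with "www.", return the
// address without that prefix as a suggestion; otherwise return null.
function stripWwwPrefix(email) {
  var at = email.lastIndexOf("@");
  if (at < 0) return null; // not an address at all
  var local = email.slice(0, at);
  // Require something after the "www." so we never suggest an empty local part.
  if (!/^www\./i.test(local) || local.length === 4) return null;
  return local.slice(4) + email.slice(at);
}
```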


Another option is checking server side using whatever is your language equivalent of: http://search.cpan.org/~rjbs/Email-Valid-0.188/lib/Email/Val...

This checks email addresses using regexps and DNS.


I'd imagine domain squatters have snapped up all the common misspellings of popular mail providers, so a DNS lookup probably isn't as beneficial as you might imagine.


Now if only this could prevent people from accidentally using my email address instead of their own...


Ha! Yup, in the past I had a catch-all on "tellmelater.com", and there were a bunch of people who apparently use that domain as part of their "I'm not going to give you an email address" address.


If you can predict the correction with good accuracy, why wouldn't you just fix the address on the backend? The only reason I can think of is to avoid spamming someone else in case of a wrong guess. But for popular domain name spelling errors, that should be almost never.


"Almost never" is not good enough when you silently send someone's sensitive files to the wrong person.


What if someone actually has a domain that is similar to a popular one? They would never actually be able to enter their email since it would constantly be "corrected" to the wrong address.


That would be an awful suggestion. What would this poor chap do? http://gail.com/


Do both? I'd personally like to know if/when I enter my email address incorrectly.


So awesome that this exists, and the timing is serendipitous -- I was about to ask someone on my team to make just such a thing after I noticed that the vast majority of our bounces are really obvious typos of popular domains, like gmial.com.


Out of curiosity, is there a significance or reason for using 2 as the threshold value?


Thank you very much, I just implemented it on a site I'm working on, in 5 minutes!


I'm not a huge JS fan but wow this is a brilliant solution. I had no idea how many people mistype their email addresses. Thanks!


This is awesome–a trained DB of common misspellings for email addresses would be so handy.


Does this check all possible/known domains or just the most common domains amongst users?


Very cool but how is this going to affect my hatmail.com address...?!


Awesome, we'll be implementing this ASAP! (:


Great resource. Will be implementing this.


50% from what?


Good idea! I think I'm going to make something like this for my website, but use PHP instead.


Great! Now can you stop video recording everybody that walks by your office on Castro st? What's up with that anyways?


email != identity. There are better alternatives for login these days, such as OAuth and OpenID, or FB for that matter. Nevertheless, this script could be useful - well done!


Maybe for you, but I prefer using my email address.

It's certainly plausible that Google or FB login is a more pleasant experience for customers, just be aware that a non-zero number of people will bounce from your site if that's all you offer.


No, I'm not talking about preference here. This is a general observation with email. It's so easy to go and get a throw-away account, which means that email is not best suited for identity as the other solutions. Sure, Facebook and OAuth login suck because they give the website you log in to too much access to your data. However, not many hackers know about OpenID and that it's really cool for login. With OpenID, access to your data is more restricted, and the website doesn't have permission to access your account like in OAuth. Google's OpenID gets it right: it is possible to log in to sites without disclosing your email address, and that's how it should be.


> It's so easy to go and get a throw-away account, which means that email is not best suited for identity as the other solutions.

How is that different from OpenID? Creating an account on e.g. MyOpenID is easy enough - you just need to write a username and a password; even the email field is optional.


who uses OpenID? I honestly don't know anyone that does, not even hackers


Do you have a Google Account? Your Google Account is an OpenId account. So is Yahoo. Many hackers use it without knowing...


Stack Exchange sites (e.g. Stackoverflow)?


Use something like OAuth, OpenID, or FB where email would suffice, and you lose me as a customer.


This is a great idea, thank you.


I bought a license for an app recently and didn't get my license key because I entered gmail.con or something. I was really pissed at them for about a day until I pestered them and they sorted it out. After it was over I was still kind of annoyed with them, really for no good reason, even though I was the one who goofed up.


Hey, maybe that was me! I sell software online via PayPal IPN, and at least once a day (often more) I get email from customers angry at me for not sending their license key. Invariably it's because they no longer use the email address that PayPal thinks they do.

I tried highlighting this in the FAQ and on the purchase page, but the rate at which it happens didn't seem to change.


A good solution for them would be doing an email confirmation before charging someone's card.


Well, in this case they use Paypal for payments, and I used my Paypal account, so I got the receipt just fine, but never got the license key. So the possibility of "I gave them the wrong email address" never entered my mind for me to double-check.



