Giving away the secrets of 99.3% email delivery (37signals.com)
245 points by themcgruff on Jan 31, 2012 | 85 comments



As anyone with half a clue about e-mail delivery will tell you: SMTP accept != delivery.

Almost every spamfilter in the world will accept your e-mail and then silently discard it. The 0.7% instant rejects that you see are merely the tip of the iceberg.

Thus, the only secret you gave away is that you don't seem to understand how spamfilters work.


Amen. Without a robust delivery testing program, there's no way to know whether emails are actually landing in the inbox of actual users.

37signals has an easier time of it because the vast, vast majority of their emails are transactional, so users will pipe up and complain if they are not getting them. Marketing emails are a lot harder to troubleshoot.

And before the masses get up in arms about spam, when I say "marketing" I am talking about even the most compliant, user-friendly, double-opt-in email campaigns. Even if people happily sign up for your emails, that doesn't mean they will tell you if they stop receiving them.


> Even if people happily sign up for your emails, that doesn't mean they will tell you if they stop receiving them.

Plus, when they decide they don't want it six months later, they'll use "Report Spam" to stop them from showing up in the inbox.


Your comment reminds me that we left out a biggie: We include a "stop sending me these messages" link in almost every email we send. The link actually works too.


I think ceejayoz was saying some users are lazy or don't remember signing up for your service, so they'll click the "Report as Spam" button as a quick way to "unsubscribe".


Or some users assume that by clicking the "unsubscribe" link you are actually confirming the email is read and might get even more spam.


That's probably because that's how it used to work. Remember the days before the CAN-SPAM laws?


What makes you believe it changed?


For some mailing lists you find yourself on, that's actually the case; removing yourself is next to impossible...


I know this wasn't clear from the post... We include a "fake" image (with a unique hash for that specific message) in all HTML email so we know if it's "really been delivered" (i.e., it was actually opened).


And you get a 99.3% return rate on your image? That seems highly unlikely given that many (most?) mail clients do not display images by default.


Copied from Noah's comment on the post (I asked him first):

"We do track open rates for emails that are already HTML formatted and making remote requests for images, but you’ll never get 100% accuracy with that metric because many people use plain text emails or don’t load images. Our experience is that the best you’ll ever see is between 60-70% “open” rate because of this. Some of our applications in some contexts only send plain text emails as well, so we don’t track open rate there at all.

Why is remote server acceptance rate important anyway? Some thoughts:

1) First, because hard bounces really do happen a lot, and at our scale, a 1% difference in hard bounce rate means 160k messages per week that aren’t making it to users, which means a poor experience for many and many support requests coming in to us. Based on all the information we’ve been able to find, we’re pretty sure a 0.7% hard bounce rate at our scale is pretty good.

2) Second, because it is a relative metric of overall deliverability. Our experience has been that when we do get on a blacklist, servers start hard bouncing our mail until we get off of it. As we’ve improved our SpamAssassin type scores over the last few years, we’ve seen an improvement in hard bounce rate. We also see a strong correlation between hard bounce rate and the number of email delivery related support requests we get. While it’s not the perfect measure, it’s the best measure we have available that we can reliably monitor.

Again, there’s no perfect way that I know of to reliably tell whether an email is getting to a user, since read status isn’t particularly accurate. We use whatever we can (hard bounce rate, open rate, number of support requests relating to email) to get as close to that as we can."


I would be interested to know what their open rate is. As mentioned in another comment, most email subscribers don't turn on images, which makes open tracking a metric that reads very low.


"Almost every spamfilter in the world will accept your e-mail and then silently discard it"

I don't think that's true. I've been actively running email servers for going on ten years now and the vast majority of mail servers do the opposite. They reject at SMTP time. Accepting and then silently discarding email is considered bad practice, and is not as common as SMTP time rejection.


Server-side host-based filter (e.g., blacklists): reject at SMTP

Server-side content-based filter (e.g., spamassassin): usually marks up the email based on rules, and may either reject/discard it, or leave it for the email client

Email-client filter (e.g., in thunderbird, outlook extensions, etc.): silently discards


Obviously when you're at the scale of 37Signals... you're gonna roll this stuff yourself. But! I don't want people here to get discouraged about using 3rd party services. I too am guilty of "NIH syndrome" or the need to reinvent the wheel all the time. I've set up mail infrastructure. It's not pretty, and it's actually not necessary any more.

I've been using Postmark for a while now and it's been fantastic. They provide a great admin interface with insight into what is being sent and why mail might not be getting delivered. The setup process is very simple (none of the downsides 37signals mentions... ) you essentially add some SPF and DKIM DNS entries and you're off to the races.


So glad you're happy with Postmark and thanks for the kudos - the entire team really appreciates it.

37S is doing a great job with all of their techniques. Of course, they also have an incredible sending volume on their side that helps improve the impact of any of these techniques.

One of the biggest benefits of our approach at Postmark is that you get those same benefits regardless of your send volume: http://blog.postmarkapp.com/post/14127210172/the-false-promi...


Postmark is magical. I've used them for several projects - emails just work. Really excited to try out their new in-bound features.


Postmark's inbound email parsing feature is very nice. I used it last week to set up automated order shipped confirmation emails for an e-commerce site we built. I just set up an auto forward to the Postmark inbound address, and they turn around and pass a nicely parsed JSON representation of the email to a URL on our web server.

See http://developer.postmarkapp.com/developer-inbound.html for details.


Thanks John - we're looking for creative uses of Postmark Inbound. If you're interested in sharing, drop me a line: alex@wildbit.com


I'm disappointed with the hand-waving dismissal of "why should we pay someone tens of thousands of dollars to do it?"

The list immediately after that of some of the headaches that getting e-mail delivered entails (monitoring and responding to blacklists, various configuration, feedback loops, etc.) is a very good argument for paying somebody to learn and handle the details. I don't see a convincing argument that getting 1% better delivery is worth spending time on instead of doing something else; indeed, they make the argument that improving validation and reporting on the app side is a much better use of time than fighting for that extra 1% on delivery.


"why should we pay someone tens of thousands of dollars to do it?"

Because it costs us less and we have better deliverability. Also, 1% of 50 million emails a month is a lot of undelivered mail.


Do you have better deliverability though? You said you use Campaign Monitor for newsletters and those are a different type of mail than transactional emails from an application (much more likely to be flagged as spam too!). Seems like two different things and would be hard to compare.

Campaign Monitor isn't what you'd use for your apps, you want something like SendGrid.


Sendgrid + their whitelabel stuff (SPF/DKIM) = better than 99% delivery for over 100k emails/day.

SG is probably the most pleasant email-related experience I've ever had.


We use SendGrid too at my company. It's really great and easy. Integrating with their SMTP server is really simple.


thanks to @getsat and @potyl for providing some insight to their experience with SendGrid. we appreciate the testimonials!

the discussion around managing email in-house vs. paying a 3rd party to do it for you, is of significant complexity, as is evidenced by the length of the original 37sig post and this comment thread.

i think it's important for each and every company to evaluate their own unique situation - the needs of their product, the resources at their disposal, the role email plays in their overall business model, their relative experience/expertise in email vs. other elements of the customer experience they are building, the maturity of the company/product, etc. it would be very unwise to make a decision on this type of matter, based solely (or predominately) on factors like "successful company X does it this way", or "successful company Y does it that way."

every company is different, and they often face this specific decision at different junctures in their life cycle.

the most useful lesson that can be gleaned from this conversation (which i've really enjoyed following), is this: as a business leader, entrepreneur, developer, etc, you have options! if you want to do it yourself, it's possible -- if 37signals can do it, so can you. but if you don't want to do it yourself, or aren't confident in your ability (for whatever reason), then there are several awesome companies out there to choose from, each of which has its own strengths and weaknesses, which puts you in the position to select the most ideal solution for your unique circumstances.

that's all :)


Agreed. Only at a huge scale is a custom made e-mail solution time or cost effective.

Obviously the blog post was only talking about how effective it is for 37signals, but I fear many young startups will misinterpret it and waste a ton of time rolling their own mass e-mail solution.


Having great delivery rates is actually a lot easier when you are big and famous, have your mail signed by (and containing links to) a domain with excellent PageRank, have substantial email volume, a long emailing history and so on.

But it gets much harder when you are running with a recently purchased domain on cold IPs, or with spammy subnet neighbors, or when your subnets are blocked by "know-it-all" small ESP admins.

It is very similar to having a great credit history: you're getting approved for much nicer interest rates, there's no secret. So I don't believe an average person will get 37signals results simply by following Noah's advice.

I work at Mailgun and we help startups and established companies get "37signals level" deliverability :) We also offer quite powerful parsing of incoming email into your app, so check out http://mailgun.net


I’m sure their solution of tailing and parsing Postfix log files works well, but I believe the more typical and elegant solution to tracking bounces is to send every email with a unique envelope sender address that identifies that particular email, which makes it easier to collect bounces without relying on correct parsing of an MTA log file.
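
The per-message envelope sender idea can be sketched roughly as follows. This is a minimal illustration, not anyone's production setup: the `bounce+` prefix, the `bounces.example.com` domain, the signing key, and both function names are assumptions made up for the example, and message ids are assumed to be lowercase alphanumerics that are safe in an address local part.

```python
import hashlib
import hmac

SECRET = b"change-me"                  # hypothetical signing key
BOUNCE_DOMAIN = "bounces.example.com"  # hypothetical bounce-handling domain


def _signature(message_id: str) -> str:
    """Short HMAC so that forged bounce addresses can be ignored."""
    return hmac.new(SECRET, message_id.encode(), hashlib.sha256).hexdigest()[:12]


def envelope_sender(message_id: str) -> str:
    """Build a unique envelope sender (return path) for one outgoing message."""
    return f"bounce+{message_id}.{_signature(message_id)}@{BOUNCE_DOMAIN}"


def message_id_from_bounce(recipient: str):
    """Recover the message id from the address a bounce was delivered to.

    Returns None when the address is not one of ours or the signature is wrong.
    """
    local, _, domain = recipient.partition("@")
    if domain.lower() != BOUNCE_DOMAIN or not local.startswith("bounce+"):
        return None
    message_id, _, sig = local[len("bounce+"):].rpartition(".")
    if message_id and hmac.compare_digest(sig, _signature(message_id)):
        return message_id
    return None
```

Any mail arriving at the bounce domain can then be mapped straight back to the message that caused it, without looking at the MTA log at all.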


By far the biggest cause of failed email delivery we see is due to bad email addresses that were entered in to the system—problems like ‘joe@gmal.com’ or ‘sue@yahooo.com’.

I wonder if you could hack up an 80% solution by comparing the submitted domain to common email domains, and giving a warning if it looks like a misspelling. That way, 'joe@gmal.com' would see a second step in the signup process asking him to double-check his email address. Any idea what percentage of misspellings that would catch?


I don't have any hard stats, but here's some anecdata.

One of my consumer-facing websites gets lots and lots of typoed email addresses. Based on the kind of support emails I get, my impression is that the general audience of this site is borderline illiterate.

I mined the user database for common email domains where users had signed up, but never confirmed the email address by clicking on the link in the welcome email. Based on that, I created a bunch of regexps that detect the most common misspellings of gmail, yahoo, hotmail, msn, etc. I also check for things like <domain>.con, <domain>.cm, <domain>.om, and the other various typo permutations.

If the user enters a suspect email address, the system asks them whether they're sure they entered it correctly. In most cases, it will also suggest what it thinks they were trying to type: "You entered example@verzon.cm as your email address. Did you mean example@verizon.net?"

This reduced the bounce rate significantly.

For those cases where I still get a bounce to the welcome email (mistyped username, or a domain I couldn't autocorrect), I have a process that parses the bounce messages and flags the user's account as bouncing. If that flag is set, every page on the site includes a warning box that basically says "hey, your email bounced... please update your email address". When the user updates their email address, the system sends them a new confirmation email.

This email update dialog also requires the user to type their correct email address twice, because at this point they're known to be a bad typist. :) The original signup form only asks for it once, which improves conversion rates over requiring double-entry.

The combination of both of these techniques has reduced my support load for bad email address cases down to basically nothing.
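
A rough sketch of the "did you mean" check described above, using plain lookup tables rather than regexps. The tables and the `suggest_email` name are hypothetical; a real list would come from mining unconfirmed signups, as described.

```python
# Hypothetical lookup tables for illustration only.
DOMAIN_TYPOS = {
    "gmal.com": "gmail.com",
    "gmial.com": "gmail.com",
    "yahooo.com": "yahoo.com",
    "hotmial.com": "hotmail.com",
}
TLD_TYPOS = {".con": ".com", ".cm": ".com", ".om": ".com"}


def suggest_email(address: str):
    """Return a suggested correction, or None if nothing looks suspicious."""
    local, at, domain = address.strip().rpartition("@")
    if not at or not local:
        return None  # not even shaped like an address; leave to other validation
    domain = domain.lower()
    fixed = domain
    # First repair a mistyped top-level domain, then look up the whole domain.
    for bad, good in TLD_TYPOS.items():
        if fixed.endswith(bad):
            fixed = fixed[: -len(bad)] + good
            break
    fixed = DOMAIN_TYPOS.get(fixed, fixed)
    return f"{local}@{fixed}" if fixed != domain else None
```

Since `.cm` and `.om` are real top-level domains, the result is only a suggestion to show the user ("Did you mean ...?"), never something to apply silently.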


Heh - take your database, stick a webservice in front of it (request: "is this address valid?" response: "probably" or "they may have meant X") and charge a (small) subscription fee.

I'm sure that there are a lot of people who would find this valuable and you would gain a bigger dataset to refine your responses.


Check out how we use Postmark bounce hooks (http://developer.postmarkapp.com/developer-bounces.html) to catch these problems in Beanstalk: http://blog.beanstalkapp.com/post/758755557/handling-email-d...


FWIW, gmal.com has neither MX, nor A records so a couple of quick DNS lookups could have confirmed that the email address was incorrect.


Others are nabbed though.

    $ for t in mx a; do dig +short gmial.com $t; done
    0 nullmx.domainmanager.com.
    208.87.34.15
    74.86.197.160
    $
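
The same two lookups can be scripted. The sketch below shells out to `dig +short` exactly as in the session above; the function names and the injectable `lookup` parameter are assumptions added so the logic can be exercised without network access.

```python
import subprocess


def dig_short(domain: str, rtype: str) -> list[str]:
    """Run `dig +short <domain> <rtype>` and return the non-empty answer lines."""
    result = subprocess.run(
        ["dig", "+short", domain, rtype],
        capture_output=True, text=True, timeout=10, check=False,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def has_mail_dns(domain: str, lookup=dig_short) -> bool:
    """True if the domain publishes at least one MX or A record."""
    return any(lookup(domain, rtype) for rtype in ("mx", "a"))
```

A `False` result is a strong hint that the address is mistyped. A `True` result proves little, since parked typo domains such as gmial.com do publish records.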


Yes. Some domains have DNS records. That is correct.


What happens if the user makes a typo but enters a valid email address at some other domain?


The email goes to that address at that other domain


Would some sort of distance score be more efficient though?
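
For comparison, a distance-score version might look like the sketch below. It uses plain Levenshtein edit distance against a short list of well-known domains; the list, the threshold of 2, and the function names are all assumptions chosen for illustration.

```python
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "msn.com"]  # hypothetical


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def closest_domain(domain: str, max_distance: int = 2):
    """Return the nearest common domain if it is close but not identical."""
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None
    best = min(COMMON_DOMAINS, key=lambda d: edit_distance(domain, d))
    return best if edit_distance(domain, best) <= max_distance else None
```

This catches misspellings nobody thought to list in advance, but it also flags legitimate domains that happen to sit near a famous one (mail.com is one edit away from gmail.com), so it works best as a prompt to double-check rather than as a hard rule.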


Getting whitelisted and managing your mail server isn't easy and usually costs a lot. I think it's only something that should be optimized when you are sending millions of emails per month (like they are).

For everybody else, you can save a lot of time and money by going for Amazon SES, Google App Engine, SendGrid or Postmark (etc.) A lot of these services also include analytics and monitoring and will be cheaper than rolling out your own customized solution (in terms of time and money).

Even for them they would only pay around $6000-$7000 per month by using Amazon SES.


Funny how Amazon AWS newsletter gets flagged as spam by gmail each and every time. Even if I tell it 'not spam' each and every month. Something tells me Amazon SES will not get a better gmail treatment.


It happens to me, too. Perhaps they can learn something from 37signals?


Maybe AWS is a Google competitor?


Has it gotten that bad? I remember back in the mid-90s, when I was still young and crazy enough to fiddle with sendmail.cf on my first few Linux boxen, setting up and running your own mail server wasn't that difficult (and again, I'm talking about sendmail here). Has the whole spamocalypse turned this into a nightmare of that magnitude?

There goes my plan for moving my personal accounts away from GMail before March 1st...


Can't speak for anyone else, but personally I'm getting fed up of legitimate personal e-mails going missing even when sent to long-time friends I've swapped mails with on numerous occasions, only to find they're in the spam tray. These are plain text messages, all sent from the same very small pool of mail servers, with correct headers, consistent From: address that really is me at a personal domain I've held for years, etc.

I should not have to jump through non-standardised and somewhat broken hoops like SPF and DKIM just to get a goddam e-mail delivered to my friend. When you reach that point, it's not your e-mail client or domain registration that's broken, it's the receiving e-mail system that's so paranoid about spam that it's regularly diagnosing and (silently) rejecting false positives.


No - it's not bad at all as long as you have some minimal understanding about what you are doing.

Graylisting + Delayed SMTP prompt + Blacklists + Whitelists works pretty well for filtering incoming spam.

For outgoing mail you just need to make sure that you're not sending mail from a dialup IP range, or some other IP range included in common blacklists. It also helps to register on one of the more common whitelists that are available for free. And of course you need a valid reverse lookup - many mailservers don't accept mails from hosts without them.


I run a mail server on my VPS, did nothing special to set it up, and it mostly works. I probably should look into SPF and reverse DNS and all the other TLAs, since at least craigslist.org silently drops all email from me. There have been two or three other cases over the years. But I'm really lazy and just fall back on gmail for those.


FWIW we haven't done any "paid" whitelisting. That's too mafia-esque for us.

SES requires each sending address to be verified upfront. For us that's an issue.


Why is it an issue to verify sending addresses? Your setup now requires that you control the domain as well.


You don't have to pay to get whitelisted with an ISP. You have to have good IP reputation then apply.


Many ISPs are turning to returnpath to manage their white lists, and that is indeed a daunting financial outlay.


Which big ones use that as their white label?


Cox, Comcast, Roadrunner, etc. Most of the large ones besides Gmail.


Thanks - that's why I use sendgrid & mailchimp. Don't have time to care about these things.


Don't know why you're being downvoted; any business <$1M annual revenue that has to send loads of e-mail shouldn't be wasting time rolling their own solutions


We do more than that, but only send a few hundred emails per day outside of our internal google apps accounts (using sendgrid) and then blast a newsletter once a week via mailchimp.


No secrets here, these are basics. Since your mail is mostly transactional your volume is significantly low compared to people who specialize in mailing. Transactional email is less likely to produce spam complaints. Misspelled addresses are detected on first sends, which is what you want. You're going to get an error at the smtp level or a bounce.

Here's a secret: monitor the IP-to-domain ratios. Usually Gmail will allow 1k of mail from the same IP per hour.


While we all love services like Postmark and Mailgun on HN (and they are great services), does anyone know of any easy-to-set-up open source projects that offer similar functionality? Traditional MTAs are a pain to set up and outdated to boot for webapps (mail parsing, signature and quote removal, UTF-8 transcoding, automatically updating spam detection, linkage to arbitrary storage handlers like MongoDB, and an HTTP+SSL JSON API should be a minimum).


I don't know if it provides precisely the functionality you're seeking, but Lamson was designed to address the need for an easier to set up, but still programmable and powerful mail system. It's designed to be able to use ORMs, databases, and data stores instead of text files as well.

http://lamsonproject.org/

Example list of supported backends:

    Django's ORM
    web.py's simple database library
    Tokyo Tyrant
    Raw SQLite3
    SQLObject
    CouchDB
    Mongo DB
    SQLAlchemy

Part of the problem with this sort of thing is that in most minds, the modern MTA is qmail or postfix, which is in my view just sendmail++.

Let me know if this helps you, if not, tell me why so I can try to figure something out.


Fantastic, thanks so much! It looks like it's almost there, and definitely something I will check out in more detail. Admittedly, my use case is limited, but as services like Mailgun and Sendgrid are demonstrating -- it is not an insignificant use case. Lamson looks like it's trying to achieve a bit more than that single case.

So a couple of things I noticed that might be missing from Lamson:

(1) Spam blocking auto-updates -- does this tap into Spamhaus, etc.? (2) UTF-8 transcoding? (3) Signature, quote stripping (like Mailgun). (4) DKIM signing? (5) Simple client libraries for sending mail (i.e., I don't want to build a "Lamson application", I just want to talk to a Lamson server on a privileged port using JSON).

Of course, all these concerns might have answers that I'm missing.

A bit more about my (fairly common, I'm guessing) use case: you have a VPS and a MongoDB instance. I want my MTA to take a minimal set of config parameters, say:

-- MongoDB database name
-- domain -> MongoDB table mapping
-- privileged API sending port

Now I want the MTA to receive mail for me, perform spam blocking, transcode to UTF-8, parse out signature and quotes in a separate field, and dump the whole object to a MongoDB document with the index field = the recipient's email address. I want it to auto-update its spam blocking rules using whatever external services and internal analysis necessary. I want it to periodically dump some usage statistics into a separate MongoDB table (maybe a circular connection).

Now for sending mail, all I should have to do is connect to the privileged port from a process on the same machine (it's firewalled to the outside world), and submit an HTTP POST request that specifies recipients, message body, attachments, etc. The MTA should accept the request, perform whatever queuing, rate-limiting, and retrying is necessary to send the message.

All failures and diagnostic information are dumped to separate MongoDB tables. The applications deal with the database/storage system directly, to keep the MTA simple.

I'm glossing over a lot, naturally, but I hope that helps.


Be advised there are messages lamson will fail to deliver because it can't transcode them at the server (regardless of what clients support). I have the impression it can't even bounce them, though I'm not sure about that. It should be reliable among speakers of English and other ISO Latin languages.


Lamson isn't an open source alternative to mailgun and sendgrid per se, it's an open source alternative to writing your own library for email.

With that said, specific functionality is more a question of learning lamson, and tying in libraries that do do what you want, or finding someone that already has on Github.

For your use case, you would be tying together and implementing bits and pieces of this yourself. Part of the reason for Lamson existing is to enable programmers to solve their own problems with a common sensical base to work off of.

It's more of an equivalent to a web framework than a CMS.

If that isn't workable, you'll have to either hire somebody familiar with Lamson to hack it up for you, or you'll be at the mercy of existing plugins to tie an MTA into MongoDB. (Doesn't exist).

Do you have a more specific question for me to address? The answer to pretty much everything you brought up is "Cool, so go make it". The point of Lamson being the way it is, is so that you're not limited by the creator's intentions, just by your programming ability.


Thanks for this. The more I look, the more it seems like the answer really is "cool, so go make it". :)


Also not mentioned is monitoring if your users actually read the emails. A web bug will work (some of the time) as will checks to see if users ever respond (eg if there is a link to click for full details) or login to the site after email receipt.

LinkedIn noticed that I never read one of the group messages I was getting and so switched to a far more infrequent digest of highlights. I think they may even have unsubscribed me completely from one group.


re: LinkedIn: I find that hard to believe. I have received the same emails from them for years and delete most of them unopened, and they never stop.


They are indeed inconsistent. They did the throttling for me for a few groups while others I never read keep on going. At some point I need to figure out how to unsubscribe, but it is currently quicker to hit delete (unread).


i recently finished a script that ran our 35k customer records through DNS/MX/SMTP servers and flagged all invalid addresses.

sadly any properties that use yahoo servers or sympatico.ca, bell.ca, always return "OK" and you physically need to bounce an email to verify it. many of our customers have @yahoo addresses so we still have about 25% that simply cannot be validated without a "probe" message :(
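
the SMTP step of a script like that could look roughly like the sketch below. it is an illustration built on Python's standard `smtplib`, not the script described above: `probe_rcpt`, the `probe@example.com` sender, and the `smtp_factory` parameter are hypothetical names, and the MX host is assumed to have been looked up already.

```python
import smtplib


def probe_rcpt(mx_host, address, mail_from="probe@example.com",
               smtp_factory=smtplib.SMTP):
    """Ask mx_host whether it would take mail for address, without sending any.

    Returns "accepted", "rejected" or "unknown".
    """
    try:
        smtp = smtp_factory(mx_host, 25, timeout=15)
    except OSError:
        return "unknown"  # could not connect at all
    try:
        smtp.helo()
        smtp.mail(mail_from)
        code, _ = smtp.rcpt(address)
    except (smtplib.SMTPException, OSError):
        return "unknown"
    finally:
        try:
            smtp.quit()  # end the session before any DATA is sent
        except (smtplib.SMTPException, OSError):
            pass
    if 200 <= code < 300:
        return "accepted"  # servers that always say OK land here too
    if 500 <= code < 600:
        return "rejected"
    return "unknown"       # 4xx: temporary failure, possibly greylisting
```

"accepted" only means the server did not object at RCPT time, which is exactly the always-"OK" case: those addresses still need a real probe message.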


> sadly any properties that use yahoo servers or sympatico.ca, bell.ca, always return "OK" and you physically need to bounce an email to verify it

This is a standard technique used to prevent spammers from mining the whole userbase.


yes, i realize this. still doesn't help the situation much :(


What are those options for feedback loops? Hotmail got one [1], but Gmail and Yahoo don't, AFAIK.

[1] https://support.msn.com/eform.aspx?productKey=edfsjmrpp&...


Can anybody recommend a service that would let me check whether an email (sent via my or the service provider server) was picked up? They discuss this functionality in their system but I'm looking for someone I can outsource it to.


It's not worth outsourcing. We just add a tiny GIF that has a unique hash. When that file is requested, record the hash and mark the identified message as read.
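
A minimal sketch of that idea, with in-memory structures standing in for a database and no web framework attached. The URL layout, the function names, and the use of a random token in place of a hash are assumptions for illustration, not 37signals' actual code.

```python
import base64
import secrets

# A commonly used 1x1 transparent GIF, kept inline so no file is needed.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

tokens = {}     # token -> message id (stand-in for a database table)
opened = set()  # message ids whose pixel has been requested


def tracking_url(message_id, base_url="https://example.com/open"):
    """Create an unguessable image URL to embed in one HTML email."""
    token = secrets.token_urlsafe(16)
    tokens[token] = message_id
    return f"{base_url}/{token}.gif"


def handle_pixel_request(path):
    """Call from the web handler serving the image; records the open."""
    token = path.rsplit("/", 1)[-1].removesuffix(".gif")
    message_id = tokens.get(token)
    if message_id is not None:
        opened.add(message_id)
    return PIXEL_GIF  # always serve the image, even for unknown tokens
```

The email body would then carry something like `<img src="...">` with the generated URL. Clients that block remote images never make the request, so this can only undercount opens.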


Seems like most e-mail clients these days don't load remote images automatically - do you find that significantly affects this technique?


The hashed image is the same method the big boys would use to record open rate.


Ok, but surely they have the same problem then?


Thanks - is there a library I can use? Wouldn't want to reinvent the wheel


I'll talk with Noah about open sourcing it. Right now it would take a little bit of effort, but it's probably worth it for the good of the community.


Do not rely on it. As the other message says it is trivial to do, but it will underrecord by a lot. Measure by actions not opens.


HN is becoming 37Signals News.


Marketing before the release of Basecamp.Next.


Regardless of whether HN has too many 37Signals posts, this is a valid point. HN is worth a thousand uniques an hour and startups would do well to study the success of 37Signals in writing popular content. I say this as someone who finds 37Signals very off putting!


I too find it rather annoying. I understand that the OP is one of their employees, but this is all he/she has been posting.


Other than the talk about tailing the logs, and using the three servers, there isn't much revealed as to why their delivery rate is so high.


Cool. Then if you use their findings published in this blog post, they will publicly shame you and your investors.




