Startup Fuck-ups: How we lost 25% of our monthly revenue overnight (medium.com/insync)
114 points by noellep on Oct 30, 2014 | 56 comments



This could be solved in an even more fundamental way: Don't run your own mailer as a startup. There are lots of companies that will be responsible for email deliverability on your behalf, via an API. If it took them 3 months to notice no mail was being sent at all, imagine how long it's going to take them to figure out that their IP is blacklisted in Spamhaus or any number of other deliverability issues?


Focus on your core competency. As someone who has run production mail servers in the past, I know it's not something you really want to do (my advice: don't). There is always some fire to put out or some blacklist you were put on that needs to be cleaned up.

Mailgun offers 10,000 emails per month free and is dead simple to use.


I'd modify that by saying: if you really know how to run a mail server, go for it, but if you don't, then leave it to someone who does. The same goes for just about every other service your startup relies on: accounting, legal, etc.

Cost benefit analysis (even a brief one) will always help, even if you get the answer wrong. It's just a part of planning; your plans don't always work, but if you don't plan then you'll never know if you succeeded or not (or why) until it's too late.


To add a couple more cents to the pile: email is serious business, so get serious about it. That doesn't mean spending a lot of time. Use services. But be serious about it.

Figure out if you're sending the right mail at the right time. Cut out unnecessary emails and unnecessary junk in the emails. Make sure they get delivered and read.

Any service, from a dating site to an app for accountants to a wholesaler of underpants, needs emails, and how well you use them is important. You can be minimalist. Analytical. Formal, casual. There are as many approaches to this as there are to anything else. Just don't take it lightly. Get good at email. Just because it's simple doesn't mean it's not important.


I don't entirely agree with this.

I don't use my own mailer currently. I DO, however, use postfix to queue and relay email to rackspace, who actually sends my email.

I don't think the API method is appropriate because then you need to run some other queue system so your app has an instant response time for the user. Their action would create a queue entry (with whatever data) that will eventually be fired off as an API call to whoever you're using to send email via an API.

Or, you just set up an SMTP relay and use sendmail/postfix/whatever locally to handle that part of it.

I often see these startups using the API calls as part of the customer facing flow (website or otherwise) and the increased latency waiting on that API call to return really, really sucks.


Just wondering: would it work if your app provider supported email submission via a job queue (e.g. Amazon SQS/IronMQ)?


Yes, it would. In that case you are also punting on the job queue responsibilities and using a third party for both email and queueing jobs. That's not necessarily a good thing or a bad thing - it's just a thing.

You'll want to queue those API calls so your queue mechanism returns to the user very quickly, and then the emails can fire out 0.5-5 seconds later.
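
For what it's worth, the in-process version of that idea can be sketched in a few lines of Python. This is only an illustration under assumptions: `start_email_worker` and `send_func` are hypothetical names, `send_func` stands in for whatever slow API or SMTP call you actually make, and an in-memory queue loses its contents if the process dies, so a real setup would want something durable.

```python
import queue
import threading


def start_email_worker(send_func):
    """Start a background thread that drains a queue of pending emails.

    send_func is a hypothetical callable that performs the slow send.
    Returns the queue to put messages on and a list collecting failures.
    """
    pending = queue.Queue()
    failed = []

    def worker():
        while True:
            message = pending.get()
            if message is None:  # sentinel value: stop the worker
                pending.task_done()
                break
            try:
                send_func(message)
            except Exception as exc:
                # Keep failures visible instead of silently dropping them.
                failed.append((message, exc))
            finally:
                pending.task_done()

    threading.Thread(target=worker, daemon=True).start()
    return pending, failed
```

The request handler only calls `pending.put(message)`, which returns immediately; the send happens on the worker thread. The `failed` list is the part worth monitoring.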


Every email service I've seen offers an API and SMTP. For many applications, SMTP will work just fine.


Excellent advice, the only thing I'd add is that it's like this for everything that isn't core to your service when you are getting started.


Maybe true, but a mail server in particular is one of those things that's a lot harder to set up and maintain than it looks.


Not everyone wants to outsource transactional mail handling to a third party.


Just because you don't want to doesn't mean you shouldn't.

The major advantage to outsourcing it is that the people you're handing the job over to actually know what they're doing and are experts at it. Plus they can spend 24 hours a day checking this stuff instead of you.

The arguments for being in-house on absolutely anything but your core competency when you're an early-stage startup are really hard to justify.


Especially with the number of services that offer zero-cost plans at low usage levels! You can rarely even make a fiscal argument for keeping this kind of stuff in-house, much less a skills argument.


If someone wants to do that then they've never tried to handle even a moderate volume of transactional mail themselves. Or they're starting a transactional mail service provider.

Either way, my hat's off to them. It's a giant pain in the ass that can be resolved for less than $20 per month.


I completely agree. I think unless your startup or core competency is something like that, why shard your core product's development/IT/whatever work to support an in-house product that's bound to have bugs(tm) right out of the gate? There very much should be a cost/risk analysis but there's an awful lot of NiH that honestly should be avoided if you can genuinely help it. You're still more than welcome to roll your own everything but you're just delaying your core competency to maintain something you could offload to someone that specializes in X, Y, or Z.

I would also suggest, if you do go that route, having accurate metrics on everything. How much time are you spending maintaining what essentially is a separate product? How much is it taking from your MVP, or is it part of it? If you can translate that to cost, then you can judge what offloading that service to someone that specializes in it would be worth.


That's the straight dope. #truth

Having worked on the IT and Biz side of tech companies (happily, SaaS is closing that gap daily) I can tell you the logic patterns are diametrically opposed.

IT = born problem solvers; nothing is too big, small or complicated.

BIZ = friction solvers; nothing can be too efficient.

The struggle isn't that 'you' could do it (better, faster or even cheaper) in your mind... but that if it isn't a core competency moving your business forward, write a check and let someone else handle the core. Break free from the burden of 'undifferentiated heavy lifting' and improve your core product.

We sell a SaaS solution and in turn I write ~20 checks to other SaaS providers a month to keep me focused on improving our customer's experience.


This is the whole idea behind economic specialization & division of labor[1]. And even if you were to be better at sending email than Mailchimp, it's likely that you're even better at your core competency, which means that you should farm out your email sending and focus on your specialization.

[1]http://en.wikipedia.org/wiki/Division_of_labour


I think the specific term you are looking for is comparative advantage: http://en.wikipedia.org/wiki/Comparative_advantage


Yup - that is indeed what I meant to put. That's what I get for posting while on the phone.


But clearly if an error in it is going to cause 40% of your business to drop off "overnight", you might want to outsource it to someone who can actually maintain it properly.


If you want to send bulk email blasts from a large list then definitely use a 3rd party, but it seems their emails were triggered programmatically from many different parts of their application. The email providers I have used don't make this easy so it is usually better to run the server yourself. You don't need much knowledge or experience to run an email server (if you are competent at general IT tasks already) but you need one thing: to actually check once in a while to see if your server is sending email.

Which is the real problem here. Team members knew mail wasn't being delivered in the forums and they chose to ignore it. They must have never done any follow-up (personal email, phone call, survey) on new customers even when they were doing their big marketing "ramp-up." They must not have even checked with a test walk-through of the new user process. Leadership was just too far removed from the customer experience, whether they used a 3rd party email service or not.


>> The email providers I have used don't make this easy so it is usually better to run the server yourself.

Mandrill, Mailgun, Sendgrid all make this easy.


It has been a few years since I have been saddled with email responsibilities so it's not surprising I am a little behind the times. But it still doesn't mean you shouldn't be checking that customers are receiving your emails.


With a lot of these services you can actually watch on a Google Maps app as customers open your emails.


I'm not sure I understand. Why couldn't they use a 3rd party hosted smtp server to send mail programmatically?


I've used SendGrid for transactional emails and it works like a charm.


I recently migrated a couple of my sites away from directly sending transactional email, to using MailChimp's Mandrill service. A few months later, I only wish I had done that sooner.


What we found was that a number of failed jobs were being kept by the system, which meant that these were taking up a ton of space that they shouldn’t have been.

To fix the issue, we put together a script to delete the failed items, since any retries to send them didn’t appear to work.

At this point my head was screaming "NOOOOooooo!" and I felt bad for the author, given whatever disaster would soon follow.

Not only was the problem not fixed, it wasn't even understood. Hiding the problem by fixing its symptoms will rarely get you far. I don’t think I'm even Captain Hindsighting here, as I've learned over and over that not understanding the root cause of any issue means you will be screwed by the issue sooner or later, and it likely will not be pretty.

Sure, sometimes you don't have the time to get to bottom of an issue, but even then you cannot pretend that it's fixed. It'll be back with a vengeance.


Well said. This is the end result of letting developers perform system administration tasks. I doubt that they were able to Google the root cause on Stack Overflow. :)


Exactly what I thought!


I don't think terms like "Fuck-ups" or "screwed" should belong in corporate communications, start-up or not. It's cool they are talking about this openly, but unfortunately what I took from their write-up is that their communication style is less than professional.


I don't care what language they use, personally. However, I am at work and having the F-word in 72pt font blaring from the top of my browser made me scroll down very quickly. I doubt my bosses would be thrilled to see that.


Man, I'm sorry, but what sort of job do you have where seeing the word "Fuck" is like an actual job risk?


It pays awesome and is easy work. If the handbook prohibited saying the word "zucchini" under penalty of dismissal I'd happily strike the word from my vocabulary.


Sorry, fuck-up is an acceptable professional term now. A fuck-up is an error that is so bad that from CEO to customer, there is no sense in calling it anything other than what it is. Send 10 emails at once to a customer? Sorry, we fucked up. Lose 25% of revenue for several months? Sorry, CEO, I fucked up. It is an admission that you have made an error that will happen less than once a year and hopefully only once a career.


Further evidence that "if you aren't monitoring it, it isn't happening"! Ensuring that code is monitorable needs to be right up there with ensuring it's testable.


"Small" is relative -- at the "small startup" I run, our email volume is low enough that I BCC myself on every email the site sends. Poor man's monitoring and it won't scale, but I usually notice within hours if something has broken.


Where I work, BCC-ing on all mails is not scalable. Instead a random sample of the mails is sent to ourselves.
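
A rough sketch of what that sampling could look like, assuming Python; `monitoring_bcc`, `monitor_address`, and `sample_rate` are made-up names for illustration, not anything from our actual system.

```python
import random


def monitoring_bcc(monitor_address, sample_rate, rng=random):
    """Return a BCC list that includes monitor_address for a random sample.

    sample_rate is the fraction of outgoing mails to copy (0.0 to 1.0).
    rng is any object with a random() method, which makes testing easier.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    # random() returns a float in [0.0, 1.0), so a rate of 1.0 always
    # matches and a rate of 0.0 never does.
    return [monitor_address] if rng.random() < sample_rate else []
```

You would add the returned list to each outgoing message's BCC field. With a rate of 0.01, about one mail in a hundred lands in the monitoring inbox, which is enough to notice when the flow stops entirely.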


As someone who has created quite a few of these mailers, the queue getting stuck on a single piece of mail and hanging indefinitely is incredibly common. As time has gone on my solutions have become simpler and more pragmatic, since additional complexity breeds additional problems.

For example, if I was going to design an emailer today:

- Grab the email from a database, save it to a file (likely one or several XML files), and place it in an "Outgoing" directory (ye olde file system).

- Then have a process which grabs an atomic lock (only one running at a time!), gets the directory listings, and launches the actual "sender" for every file individually (concurrently).

- When the launcher launches the sender it records the PIDs of the process against the actual emails/XML files internally.

- After a set wait period, if any processes are still running, the launcher kills them and moves the email/XML into a "Failed" directory which we monitor independently.

- Every email which is sent gets moved to an "Archive" directory by the sender process, and we monitor that to see if no emails have been archived for a long time (e.g. 30 minutes).

You can accomplish the same thing using a database (Outgoing, Archive, and Failed tables), but frankly with so many awesome file system tools already around it doesn't make sense to reinvent that wheel. Plus people intuitively understand that if a file is sitting in the "Outgoing" or "Failed" directories then it hasn't been sent yet (just like your client would!).
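
The launcher part of that design might look roughly like the following in Python. Treat it as a sketch under assumptions, not a finished tool: `run_launcher`, `sender_cmd`, and the lock file name are hypothetical, the sender is assumed to be a separate program that moves its file to the "Archive" directory on success, and a lock file left behind by a crashed launcher would need manual cleanup.

```python
import os
import shutil
import subprocess
import time
from pathlib import Path


def run_launcher(outgoing, failed, sender_cmd, timeout_s=60.0,
                 lock_name=".launcher.lock"):
    """Launch one sender process per queued file and quarantine stragglers.

    sender_cmd is a hypothetical command prefix; the file path is appended.
    Returns the names of files moved to the failed directory, or None if
    another launcher already holds the lock.
    """
    outgoing, failed = Path(outgoing), Path(failed)
    failed.mkdir(parents=True, exist_ok=True)
    lock_path = outgoing / lock_name
    try:
        # O_EXCL makes creation atomic: a second launcher fails here.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    try:
        os.close(fd)
        # Record each sender process against the file it was given.
        running = {}
        for path in sorted(outgoing.iterdir()):
            if path.is_file() and path.name != lock_name:
                running[path] = subprocess.Popen([*sender_cmd, str(path)])
        deadline = time.monotonic() + timeout_s
        moved = []
        for path, proc in running.items():
            try:
                proc.wait(timeout=max(0.0, deadline - time.monotonic()))
            except subprocess.TimeoutExpired:
                # Still running after the wait period: kill and quarantine.
                proc.kill()
                proc.wait()
                if path.exists():
                    shutil.move(str(path), str(failed / path.name))
                    moved.append(path.name)
        return moved
    finally:
        os.unlink(lock_path)
```

Files whose sender exits on its own without archiving them simply stay in "Outgoing" and get picked up on the next run; only hung senders end up in "Failed".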


I strongly approve. That's how I implemented the backend of my medical records exchange stack.

File system based queues. Point-to-point data interchange, so no concurrency; your notion of imprinted work tasks with PIDs is a good idea.

I used a "pull" model. A thread would take work from one directory and drop into another. Poor man's workflow. Worked great. Super easy to monitor and troubleshoot.

Using Java, implementing the cross platform file locking (so a downstream process wouldn't pull a task before it was ready) took some finesse, a small caveat.
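
For comparison, a different and simpler technique than the Java file locking mentioned above is to write each task under a temporary name and rename it into place once it is complete. A minimal Python sketch, assuming both names live in the same directory (and so the same file system); `publish_task` and `ready_tasks` are hypothetical helper names.

```python
import os
from pathlib import Path


def publish_task(directory, name, data):
    """Write data under a temporary dot-name, then rename it into place."""
    directory = Path(directory)
    tmp_path = directory / f".{name}.tmp"
    final_path = directory / name
    with open(tmp_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the bytes are on disk first
    # os.replace is atomic when source and destination share a file system,
    # so a reader sees either no file or the complete file.
    os.replace(tmp_path, final_path)
    return final_path


def ready_tasks(directory):
    """List fully written tasks, skipping in-progress dot-files."""
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and not p.name.startswith("."))
```

The downstream process only ever looks at `ready_tasks`, so it cannot pull a half-written file. This does not replace locking when two consumers compete for the same task.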


Sounds like you just recreated Microsoft Biztalk.

But to sound less like a dick - communicating with a low probability of errors is hard!


Having implemented, deployed, and supported "workflows" using BizTalk, ICAN/JCAPS, Orion, homegrown JMS-based stacks, Cache & Ensemble, etc...

Someone1234's solution is the antithesis of BizTalk, the complete refutation of the ultimate futility of using workflow engines.


As someone who's actually used Biztalk quite extensively: You wish Biztalk was that straightforward.


If you don't outsource this type of service (the preferred solution, imho), then from the very beginning you have to monitor internally (the solution they adopted after the fact) and also, most importantly, externally, in this case by having one or more monitored client-like email accounts.


I have a script that sets up 90% of a full nagios/icinga server automatically in about 5 minutes.

Why, in 2014, are people still not monitoring everything as job #1?

Why isn't this being taught in schools? How do people with tech jobs not know this?


Make sure your mistakes never directly affect your consumers. Don't spam them, don't overload them with ads, and don't leak their PII. Quickest way to lose customers.


I'd really like to know what the mail server software was that failed this way.


No one in the company was subscribed to their mailings?

Hope they're monitoring their backups.


scheduled emails to check in with our users.

That's not "transactional email". That's spam.

You're a spammer. Die.


"I am single, 41 and pregnant"...

I don't know when pregnancy became a part of a person's identity. I guess this adds to the coolness factor these days because you have to try harder? Sigh.


Pregnancy at 41 is generally high-risk. So you have lots of doctor's appointments, often lots of discomfort and you probably end up with mandated (and necessary) bed rest.

Being single while dealing with that is an enormous burden mentally and physically.

I didn't notice it referenced in the story, but it is definitely a factor that would influence a person's behavior that would be relevant, especially in a startup context when everyone is wearing 10 hats.


Where does it say this? (I guess it's been removed?)

And why does it matter to you whether or not the author considers being pregnant to be part of her identity?

IMO it would be more healthy to comment on the content of the article than the author's identity.


The pregnancy is just as relevant as being single and 41. Why are you complaining about it and not the others?


> The pregnancy is just as relevant as being single and 41

ie. not at all


All are fairly relevant in the user's profile, which is where this appeared.


You are being quite naive if you don't understand why it's there.

When is the last time someone introduced themselves to you as: "hi, i am xx and i am pregnant"? It makes no sense whatsoever.



