On the professional side of this, if you're an engineer at RH in the thick of it: many have been there. It seems dire now, but in a few years the fog, panic, and haze of no sleep will become a story you tell your peers at happy hour.
Many will cast stones - but they have been there too. If they haven't, well, maybe their day will also come. You may feel bad at the moment, but the best way forward professionally is "We try our best tomorrow."
If this were an outage directly caused by a natural disaster, I could understand. This outage was an availability problem. This clearly points to some prioritization problems within the leadership layers if robust and resilient infrastructure was not emphasized.
The prioritization problems may not be due to ignorance or malice though, and may be justifiable if there are other fires that are burning brighter. It's still pointing to problems though, and I think it's completely legitimate for engineers to question the stability of the company when this sort of thing happens.
At the very least, as an engineer I would be asking some pointed questions of my leadership. Maybe not dusting off the resume yet, but I'd still want reassurance internally that the leadership problems that caused this are being addressed.
Sometimes you just have to cut them some slack. Have you engineered a highly available cluster before? I'm not talking about the hot-standby Postgres master that gets called on once every 2 years; I'm talking about a 180-node Cassandra cluster that's doing 15,000 writes a second 24/7 and peaking at 60,000 writes a second every day, where you have to do node replacements every week or two because of the high load.
Or a 200-node Hadoop cluster that's doing the electrical metering and billing for 8 million people, and is NOT allowed to stop.
Or the trading platform that's running sub-millisecond trades, where downtime means $300,000 USD per minute.
These are systems I have engineered over the last 10 years, and I can say: these things are complex and can fail in 1,000 different ways, and while you're monitoring 999 of them, that one thing you're not looking at is festering under the surface (your monitoring system is tracking IRQ hardware interrupt response times, right???)
Part of being in a team is everyone pulling together, and yes, it's stressful at the time, but even very good management can't see all ends, just like very good engineering can't predict everything. I don't think it's useful to start pointing the finger at management and "asking some pointed questions of leadership", because sometimes everyone is doing their best. Yes, we should analyse our failures so we can do better, but your tone is very accusatory. I believe a better approach is an all-inclusive chat about how we can do better, and management saying "great job, engineering" for fixing it and giving them a break after the stressful event.
Does the duration of their downtime suggest a “1/1000” unmonitored oversight? Or is it more like a threshold that was met and probably could/should have been observed?
And FWIW, they have downtime every day and weekend, at least in a virtual sense; the load does drop off in a very real sense too. You are spiritually correct: they should pull together and sort it out, and they owe nobody money here (don’t use a discount broker if you want some sort of guarantee about trades), but as a general rule you should never feel too sorry for a banker under just about any circumstances. The harshest lesson here, for everybody, was that the only thing they would do for you was give you some commission-free trades, but that won’t work with this one, so a non-apology is what you get.
I think you may be focusing on the finger instead of the thing that it's pointing at.
The post reads to me like all those examples were meant to be concrete examples to drive home a more general argument that complex systems are, well, complex, and that there's an element of hubris in taking potshots from the peanut gallery.
I think the original point in this sub-thread boils down to: basic micro-level human error like typos + bad configuration deploys is completely understandable (to a certain extent), but macro level failures that happen by ignoring obvious trends and best practices is malfeasance.
Personally I don't think Robinhood will ever release a full honest post-mortem and so we'll never know (and never be able to judge fairly).
If the system failed by virtue of being too complex, that is also malfeasance, because any devops/SRE worth their salt (as might be expected at a 7 BILLION DOLLAR company) should smell unnecessary complexity from a mile away and slowly refactor it away over the course of several years - which, looking at Robinhood's downtime history, they never did.
The closest example to Robinhood's engineering woes is Reddit, which throughout its early history made fairly poor infrastructure and data modeling decisions but has since repaired and improved on them. We should hold Robinhood to higher expectations than Reddit for obvious reasons. Them having similar engineering capability to circa-2012 startup Reddit is INEXCUSABLE.
As with any big system, spinning it up is much harder than bringing it down. After an outage, they have to stay offline to audit their systems to ensure that all the nodes are synchronized, all queued trades have been processed, and no accounts are in invalid states. I'm sure they could have restarted in a matter of minutes, but the risk is ridiculously high.
No doubt there are many complex systems and they inevitably go down. Every provider has suffered meaningful outages.
I think the issue here isn’t so much that the system went down but the blog post.
It’s very light on details and doesn’t go far enough in terms of re-establishing trust with the customers that were affected. Which by the looks of it is everyone attempting any trade most of the day on Monday.
On the other hand, they've had plenty of time and resources to do just that in a reliable fashion, it's not like it's one guy in his bedroom (I hope!). It's not like they are volunteers doing this open source for the community, they are getting paid (very well, I assume) to run the system. And Management is getting paid (even better, I assume) to make sure the priorities are right and correct decisions are taken. "Who could've known there might be a lot more traffic" sounds like somebody failed in Management, and engineering might have failed by not foreseeing the issue and/or informing Management.
Sure, don't burn people at the stake, but "hey, it's hard, don't blame them, they are doing their best" doesn't cut it for me. I'm sure they're expecting to be paid and not for someone to "do their best" to pay them.
Can you give me a concrete example of a massive distributed system that has zero downtime?
Because the largest distributed system I have seen and worked on was at Apple (or maybe DFP at Google) - and even though they had some of the smartest people in the world and literally billions of dollars behind them, there were still an endless list of problems and downtime events.
The point isn't that "a system cannot fail", the point is "if the system fails, it's no big deal, shit happens, cut them some slack" is a weird way to look at it for corporate systems, especially in sensitive areas.
If you're running an HA system and you only need one nine to express your availability percentage, then sure, sure, you have the smartest people and you're doing such a great job, and yeah, yeah, show me one system that has 100% uptime, etc.
I didn't say it's no big deal; you're extrapolating and exaggerating my words because your argument is weak.
My point was that failure is inevitable in any complex system, and I was responding to the parent's point: he immediately pointed the finger at management in an accusatory way, and I was saying that's not constructive.
Also, your point "They expect to be paid" is actually implicitly "I expect management to do their best to pay me" - there could be a failure in the payroll system, there could be a failure in the banks, there could be many reasons outside management's control that mean I'm not getting paid. I can say "why don't you have redundant payroll systems" (which is a stupid waste of resources given the cost/benefit/low failure rate). But my point is again: complex systems have failures - and SOMETIMES, JUST SOMETIMES, YOU CAN CUT THEM SOME SLACK.
When a fiduciary breaks their duty to their clients, you don’t cut them slack. You sue them. This isn’t like Silicon Valley where you can get away with antics like this.
You must be new here, welcome to late stage capitalism. Nobody rich goes to jail, and lawsuits are cost of business. You just factor them into the 5 billion dollar company, pay your 300M dollar fine and walk away a billionaire.
Google doesn’t target zero downtime. The marginal cost is too high. For important services (like Search page and ads) they aim for 5 nines uptime (99.999%), which translates to 5 minutes of downtime per year.
Then surely you understand the importance of SLOs, and how SLOs and SLAs regulate the trade-off between reliability and feature velocity.
Let’s say I’m RobinHood. Let’s pick an SLO. I think a three-nines monthly SLO is a good start; that budgets ~43 minutes of downtime per month. Maybe I could argue for a more aggressive SLO, but let’s pick this one, because I think it will keep users relatively happy as trades aren’t blocked for more than an hour at worst. I drive an agreement with stakeholders that if we fall out of this SLO, we drop all feature work and focus on hardening reliability.
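For anyone who wants to sanity-check those numbers, here's a quick sketch of how an error budget falls out of an availability SLO (the function name and the figures are purely illustrative):

```python
def downtime_budget_minutes(slo: float, window_days: float) -> float:
    """Allowed downtime for a given availability SLO over a window."""
    return (1 - slo) * window_days * 24 * 60

# Three nines over a 30-day month: roughly 43 minutes of budget.
print(downtime_budget_minutes(0.999, 30))      # ~43.2
# Five nines over a year: the ~5 minutes quoted upthread for Google's targets.
print(downtime_budget_minutes(0.99999, 365))   # ~5.3
# A full 6.5-hour trading day of downtime blows the monthly budget ~9x over.
print(6.5 * 60 / downtime_budget_minutes(0.999, 30))  # ~9.0
```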
RobinHood was out for a whole day. This is unacceptable. It points to a complete organizational fuck up - product and feature development have too much power and priority at the expense of reliability.
I’m not sure that RobinHood has ever heard of SLOs or reliability engineering. I really hope their leadership is smart enough to hire and empower the right people that will drive organizational change.
Why would they burden themselves and their feature velocity with SLOs/SLAs when they can build a 5 billion dollar company insanely quickly even though they have downtime?
The users are not saying "We measured your 5 9's and I'm going to quit if you have 6 minutes more downtime"
Sure they lose some users who get annoyed, but they have a 5.6 billion dollar company, some users will go, a lot more are coming
Users are saying “you were down for an entire day and I lost money - I’m out”.
Your reliability target is a product decision. Maybe with the right features the market will tolerate shitty unreliable financial services that falls over for an entire day. Or maybe RobinHood will go from a 5.6 billion dollar company to a zero dollar company because users hate them.
Point is, high reliability is a choice based on priorities - one that RobinHood does not seem to care about. And I will certainly stay the fuck away from their platform.
This works in the acquisition phase, which I suspect Robinhood is nearing the end of.
Once their userbase turns into the retention or conversion (competitors have $0 trades now, too) phases, mistakes like this are much more costly in the long term.
You're missing the point. Reliability and Performance are features in financial markets. They are key features for brokerages, which they constantly advertise to differentiate themselves. These companies lay undersea cables to shave off a few milliseconds of latency and pay a very hefty premium to be colocated in the same DC/rack as the stock exchange. Performance and Reliability are therefore inseparable.
Nobody is debating whether people will continue using RH and that was never the issue. RH has massively damaged its reputation and reputation _is_ everything.
> Or the trading platform that's running sub-millisecond trades, where downtime means $300,000 USD per minute.
I mean, I'll bite. Assuming you only traded 6 hours a day (i.e. US hours), that'd be a $27bn-a-year strategy, and the only way for returns to be linear and trading to be sub-millisecond is market making/arbitrage.
Kudos, these are moderate-sized systems you've built over your career. There are a lot bigger and more mission-critical systems in the world, and you might build them one day.
I understand GP's tone wasn't exactly nice here. But here's the rub with RH's outage. RH is unfortunately in an industry (Finance, Healthcare, Aviation, Food, etc.) where people _need_ to trust them to be successful. The consequences of failure in these industries are catastrophic, not only for them but for their clients. Sure, failures happen, but the scale at which RH has failed and the lukewarm response they've put out have pissed people off. I don't recall any brokerage, old or new, that has failed so catastrophically and responded to it so poorly. If you think you have a worse example, I am all ears.
I don’t remember them offering any apology or explanation at all.
That’s an exchange mind you where things like the global price of oil and s&p futures trade. Not a small boutique brokerage.
Further they have planned downtime every week & at that point still had planned daily downtime I think.
I think Robinhood screwed up. I think they should learn a hard lesson. But people thinking that trading is some high reliability industry haven’t spent any time in it.
The scary thing to me is are healthcare, aviation & food the same?
Part of AWS's sell with elasticity is only spending what you need, but those industries have redundancies or unused capacity.
Someone in one of these threads said there's a hidden DNS within VPCs that can fail and isn't scaled, so if that's true, they might just have to architect around that unless they can get AWS to change it. It's on RH for not knowing that but it's also kind of on AWS too.
But as far as what you can do, you can really only split your cash across brokerages if you want to engineer the same redundancy yourself. Otherwise, RH would need to route everything to another exchange to keep satisfying orders, and even that is just another system that could fail. Keeping all of your money in one brokerage doesn't seem ideal if you want to completely avoid downtime. Doing the same redundancy yourself with those industries isn't really practical.
Boeing's failures have killed hundreds of people. Governments still pay them and people still fly on their planes. Stores sell salmonella-contaminated products all the time and people still shop there. RH's failure pales in comparison. Crypto exchanges fail all the time; people still use them. RH may lose a few customers in the short term, but I see no reason they won't bounce back: they provide a product people like, and the majority of people don't like change and will stay with them provided stability returns soon.
Non-technical people don't want a technical apology, they just want an 'our bad, working on it', which is what was provided. The company will be fine. Whether they should be is another question altogether.
Technically, people still fly the old Boeing planes that don't crash. The 737 MAX is still not in service, and there is a likelihood that it may never go back into service. All future orders are cancelled, and there isn't a clear pathway to the plane being re-certified and, more importantly, to people trusting them again.
High trust systems require just that, high trust. And once broken it's hard to re-establish.
Crypto exchanges certainly have their fair share of downtime issues, but don't forget that crypto exchanges for a long time operated purely for early adopters, as crypto wasn't something that everyone traded. There was also less competition available, because again the industry was newer and there were fewer choices.
And certainly Coinbase helped popularize crypto trading, and they had their fair share of issues, but I don't believe they had an outage of this exact magnitude, and again they were in an early-adopter area where mistakes are seen as part of the process. If not expressly, then at least subconsciously.
I think that we have entered a new 'trust' phase, where we pretty much don't care about it and just want familiarity. Look at Facebook, privacy has been violated a thousand times, and we still keep logging in. Experian is still chugging along. People used and paid for AOL for years when they did not have to.
Online consumption is different than in person. You go to a restaurant and the food is bad you probably don't go back. Online the bulk of consumers just keep going back because that's what they are used to. We love our favorites.
I remember all of AWS going down a couple years ago.
Boeing itself is fine even though one product killed hundreds. Robin Hood is going to be fine. This will be forgotten in a week.
It is not about scale, it is about the fact that people lost real money. If you can’t make it work you should not be in that business, and I don’t really care how hard they work.
I carry reasonable investment balances - I’m not an active trader but in this space I expect availability. I’d never put my money on RH - and this has nothing to do w risk profile
I've been trading for years, would not keep a penny on that platform. They've effectively cut off all liquidity for their customers for at least 2 days during high market volatility. You are missing out on tax loss harvesting, buying dips etc.
Nothing was stopping the Robinhood customers from opening an eTrade or TD Ameritrade account or something and doing their trading out of that platform for the duration of the outage. Robinhood isn't really an institutional platform in my understanding anyway.
I was a primary contributor on a migration of time series data to Scylla. As an anecdote, I once emailed our business contact about tracking down why we appeared to have data inconsistencies between our new (Scylla-backed) and old systems. I thought the e-mail got lost since we never heard back...until 8 months later (long after we had de-prioritized the migration since our old system was "good enough"), when they asked if we had tried the newly released version, which fixed a data loss issue.
Blew. My. Mind. Not only because of the radio silence and then dropping back in out of the blue as if no time had passed, but also because they had a data loss issue.
So I re-checked out my previous branch, upgraded Scylla versions, and sure enough, the data differences we were noticing before appeared to be resolved. I couldn't believe the amount of time I had spent combing through my code to see if I had a hard-to-detect bug somewhere...but nope, it was ScyllaDB (although I am sure there were plenty of other bugs...they just weren't the cause of this specific symptom).
I am actually a fan of ScyllaDB and what it's trying to do. Performance was great (as advertised) and management was simple enough; but they are going to need to work pretty hard to convince me "instability" is just a rumor after that experience not too many years ago.
Well, we moved to it from Cassandra. It's yet to fall over and we're querying it maybe 200k times per second. No changes were needed from the client driver side of things too. YMMV.
I've seen bigger, scarier, potentially costlier time-based bugs personally. I don't think this would make me reevaluate my employment if I were at Robinhood. As the parent says, you either learn these lessons the hard way or you haven't learned them yet. That doesn't translate to a "leadership failure."
Your smaller point about prioritization is spot on though. I don't believe I've seen any similar incidents lead to business-ending outcomes. I personally point to Sony or, more recently, Equifax as examples of the disparity between actual business impact and technical abhorrence. In light of that, why is it worth trying to preemptively solve technical challenges instead of business needs? Every calorie spent on "what if" subtracts from "what's needed."
Reminds me of the book Showstopper and the personal stories in it - it's about the creation of Windows NT. Pretty interesting how things were not so different some 30 years ago.
Important step though: have a retro (maybe many) and write a report explaining what was messed up and how you might mitigate it in the future. It looks like it's going to be a good one. If you can share a sanitised version publicly, that would hopefully make it all a little bit more worth it.
I think I speak for everyone here if I say that, if that report is public and interesting, everyone on this thread will be happy to get you a drink.
Robinhood opened up stock trading to a large portion of the population that would otherwise not have been interested in traditional trading platforms with high commissions.
Their success helped to pressure companies such as TD and Schwab to mostly get rid of commissions as well, which is great for the average trader
I think Robinhood has a lot of problems, but to say they're not pushing any boundaries ignores the huge changes they've brought to the industry.
The fees I am referencing were imposed by brokers, not exchanges. Our exchanges have stock splits, but that still doesn't make a $10 fee on a single $50 share very palatable to the small-time investor.
Having worked as a professional investor since 2012, I can say these outages can happen anywhere. I've seen day-long outages at exchanges where tens or hundreds of billions of dollars would have been trading, and at brokers where who knows how much would have traded. I've also experienced these outages at retail companies that are more established, including TD Ameritrade (I became a customer when ThinkOrSwim was acquired). I have also seen brokers screw over individuals on a significant scale without real ramifications.
The fact that Robinhood is telling people anything about the outage is only because they are the company they are, operating in the startup world/mentality.
To the people thinking they should be compensated in some way...If you are doing >$1m daily volume, maybe you can contact them to see what they can do but even then, I doubt it. The way this should be handled is to have multiple executing brokers. You can implement offsetting positions if needed and transfer positions when your main account becomes available, if you are using a broker that can clear. Right now it seems Robinhood is working to implement clearing but you could still go to neutral or put on your positions.
> The fact that Robinhood is telling people anything about the outage is only because they are the company they are, operating in the startup world/mentality.
Yep. Intercontinental Exchange and Eurex, two huge capital markets exchanges, routinely have multi-hour outages and don't even acknowledge that they've happened, let alone explain them.
I have mixed feelings of sympathy about this whole RH thing.
Anyone who has used RH regularly should be well aware of how inept it is. Any spike in volume or volatility, even on a single stock, brings it to its knees pretty often. Like not just the last week, but even during calm periods. I've personally lost 20-30% on positions solely because RH was bugging out; thankfully I use RH just for "fun trades", usually <$100.
I cannot fathom having the balls to trade any real amount of money on the platform while being aware of these long term issues.
On the flipside I feel for new users and perhaps even generally inactive users who weren't aware of RH's incredible flakiness. I'd imagine (or hope to) the losses of most of those users were small, assuming they were new or casual and just testing the waters.
Even if one of my small plays hit it big on RH, the money would just go to my main account on TD (which has been smooth all week shy of a few hiccups Fri morning during record volume). It's been obvious for a long time that RH should not and cannot be trusted. If you're trading options with a $60K account on RH, well, I don't even have words for that level of ignorance.
I abandoned Coinbase after having difficulties getting a few thousand bucks out of there. It worked out in the end.
Problems with my data I can tolerate up to a point. Problems with my money I absolutely can not tolerate. As you said, it's unfathomable how people can trade money on a platform that's flaky.
The interesting thing about working for a UK challenger bank - I now have visibility into all of the outages going on at large, high-street banks here.
Complete outages are rare, and well-publicised, but things go wrong a lot more[1] than you might think without any communications to customers that anything is wrong, sometimes outright denying[2] that there's a problem.
Everything has outages. Is this the new narrative now that we've moved on from the leap year thing? That RobinHood is just a bunch of shitty engineers?
There are no public details about the root cause.
I think RH is bad for people in general, but this pile-on is outrageous.
Robinhood crashing isn't an isolated unfortunate "well it happens to everyone" moment.
RH has constantly had issues at least since I started using it over a year ago. I didn't notice it really at first, but I also didn't know much about anything trading related back then. It didn't take long though for me to have my first "incident" where my market orders were seemingly vanishing into the abyss as the underlying moved. I'm not talking seconds, I'm talking minutes. For a market order on high liquidity options. Never mind trying to get filled at anything besides the ask (buying) or bid (selling).
RH has had serious underlying issues for a long time now. This incident didn't happen in vacuum. The writing has been in huge block letters on the wall for a long time.
> There are a couple of situations where outages are not normal or acceptable: 1. Dealing with other people's money 2. Monitoring/managing other people's health
Generally true, but there are a couple of exceptions to this rule: if everyone knows that the company is brand new and does not have an established reputation, then using that app requires a general acceptance of risk.
Robinhood was brand new, and outages should have been expected. The problem with Robinhood isn't the outage; it's that it was marketed to college students gambling with their parents' money, who know just enough about the stock market to be dangerous, but not enough to invest properly.
From what I've heard, the "teams" maintaining most of these aren't paid half as much as a mid-level FAANG team.
Luckily for everyone, those industries are so old, they have accidental redundancy built in (paper records for old doctors who can't be arsed to use a computer, etc.).
Saying 'everything has outages' is kind of disingenuous. There are many computer systems in the world today that can be considered to have practically perfect up-time. Mainframes have uptime measured in decades. I realize the concept of 1 gigantic iron box in a heavily-fortified installation with 2N+1 redundancies throughout is still not enough to ensure 100% uptime. But, when is the last time you swiped your credit card and had a failure to process the transaction?
I know quite a few people that were personally affected by this and lost money due to the two outages and they are all pulling their money from Robinhood. The fact that they can't offer any compensation might be a big problem for them, since they already have zero trading fees, which is what most brokerages offer as compensation.
Personally it doesn't pass the smell test for me. The load was much higher the previous week and load problems go away once the load disappears. They probably had a lot less load the rest of the day, so the fact they were down the entire day suggests it was something else. I would need a fully transparent post mortem before I believed anything they said.
You can't process the backlog on a trading platform. If i put in a trade at 2:20 pm and the system goes down, I don't want my trade to execute next morning at market open. That's insane. Especially the RH flavor of YOLO infinite leverage call option nonsense.
Exactly, you have to default to fill or kill within the trading day. You just can’t treat certain products like a standard queue... sometimes time is the most important component
FYI, FillOrKill/ImmediateOrCancel are not the same as a day order.
FoK/IoC means “do not queue this order”. It’s immediately filled (or not) (or, for IoC, partially) based on whatever orders are already in the book, and then you’re done.
Whereas a day order is queued until the end of the day or until it’s filled, whichever comes first.
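To make the distinction concrete, here's a toy Python sketch of the time-in-force semantics described above (illustrative names and logic only, not any real matching engine):

```python
from dataclasses import dataclass
from enum import Enum

class TimeInForce(Enum):
    FOK = "fill_or_kill"          # fill the whole order now, or cancel it
    IOC = "immediate_or_cancel"   # fill what's available now, cancel the rest
    DAY = "day"                   # rest in the book until filled or session end

@dataclass
class Order:
    qty: int
    tif: TimeInForce

def handle(order: Order, qty_available_now: int) -> str:
    """Illustrative handling only: FoK/IoC never queue, a day order does."""
    if order.tif is TimeInForce.FOK:
        return "filled" if qty_available_now >= order.qty else "canceled"
    if order.tif is TimeInForce.IOC:
        filled = min(qty_available_now, order.qty)
        return f"filled {filled}, canceled {order.qty - filled}"
    filled = min(qty_available_now, order.qty)
    return f"filled {filled}, {order.qty - filled} resting until end of day"
```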
Load problems don't go away when the load disappears. If the system isn't engineered very carefully (this takes a lot of work!), you may have cascading failures that may take hours to resolve, especially if you have bad retry policies (their mention of thundering herd problem seems to indicate that they might).
I would strongly caution anyone who thinks this subject is trivial ("just add a bit of load shedding and you're done"). I wrote a bit about my team's work (including a simplified view of some of the considerations that go into how we do retries) here: https://landing.google.com/sre/sre-book/chapters/handling-ov...
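The linked chapter covers this in depth; as a minimal illustration of one client-side piece, here's a sketch of capped exponential backoff with full jitter, which is the usual first defense against retry-driven thundering herds (an assumption-laden toy, not the policy from the book or anything Robinhood runs):

```python
import random
import time

def call_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=30.0):
    """Retry fn() with capped exponential backoff and full jitter.

    The randomized sleep spreads retries out so that clients recovering from
    the same outage don't all hammer the backend at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            cap = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, cap))
```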
They specifically said it led to a DNS failure. They didn't mention anything else, like corrupt data, etc. Sure, there are plenty of ways that problems other than load can cause significant outages, but what Robinhood specifically said was that they had load issues that led to a DNS failure. They should be more forthcoming about exactly what happened if they want people to trust them.
This is the correct sentiment. People who put anything more than play money into Robinhood should not be surprised when their financial life is ruined.
Quick example: They bought puts on Friday and couldn't unload them for a full day + following morning.
Monday morning puts were down - it was obvious the market was recovering in a big way. Instead of cutting losses at ~20% in the morning they lost ~99% of their position. Some lost 100% since the options expired EOD.
Yes and the point is that today it looked the same as yesterday in the morning but it didn't turn out to be a bounce. It wasn't "obvious" that everything would rise on Monday. Only in retrospect.
Is there a source you can cite for that? Why would anyone want retail investor order data? Especially since most of their orders execute immediately, so you can just get the trade data from the venue...
> Why would anyone want retail investor order data?
Former market maker here.
Retail flow is low risk. If I buy $100mm of institutional flow, I could get a bunch of corporate hedging orders. Or I could make a single bet against George Soros. With retail, one tends to find lots of small orders. Even if there are some with high information, i.e. they're smart money and I'm going to lose money trading against them, they're small enough to be manageable.
Retail is also low information. At an old job, we bought a prominent retail broker's options flow. The number of in-the-money unexercised options that would come through that pipe was mind-blowing. (Today, whoever was buying Robinhood's flow likely got the same.)
This is such an empty update. At the very least, they should have published a detailed postmortem or committed to one by a certain date. How are we supposed to know that they have learned their lessons?
I don’t work for them, but I am pretty sure we can blame the litigious nature of this industry for the lack of detail in the postmortem. Not everyone can afford to be cloudflare :)
Even for Cloudflare, I thought the company would get sued out of existence after the proxy data leak, but the finance industry/SEC etc. is a completely different ballgame.
I believe it's the fear of litigation rather than actual litigation. Other companies also manage to publish postmortems and don't get sued out of existence.
The compliance world isn’t quite as fast-moving as tech. Even a “high priority” business continuity post mortem at a financial institution is going to take at least a week for all of the lawyers & senior management to agree on the language.
Start from the email notification. They have been asking themselves the easy questions.
Just look at the top questions in their email:
* Are the funds in my account safe?
Yes, your funds are safe.
* Was my personal information affected?
No, your personal information was not affected.
* Can I use my Robinhood debit card?
Yes. If you have a debit card, you should have been, and should still be, able to use your card, but you may have had issues receiving notifications, viewing your balance, and seeing transactions in your app.
------------
The real question is: How is Robinhood compensating for the missed trades?
I think it's unlikely that Robinhood (or any brokerage) would compensate people for losses on hypothetical trades that could have been made during an outage. Such a policy would allow customers to pick their entry and exit points, and extract money from the brokerage at will.
Even if the trades were well-defined at the time the outage occurred, there would still be an asymmetry between people demanding compensation on their profitable trades while eschewing losses on their bad trades. It's doubtful any brokerage would be willing to eat that.
That works only if the holder can afford to exercise - and I can assure you that most RH users cannot afford to exercise a single contract (at least on most commonly traded stocks).
Otherwise it's on the broker to sell it at close to someone who can afford to exercise. And who knows if RH pulled that off or not.
I haven't seen official statement, but I did see a couple reddit threads where Robinhood exercised ITM calls without enough funding in the account, then collected the shares and paid the cash difference.
That's an interesting question. I suppose it's hypothetical in the sense that they now have to look at "what if" those options had been exercised; but unlike a spot trade that someone "would have" done, Robinhood might already have had obligations on its end of the original options trade.
Yeah, seriously: if you have more than a few hundred in options on Robinhood and you're waiting until the day they expire to unload them, you're dumb or don't care about your money.
No brokerage will do that. Here's an excerpt from the account agreement of Schwab, a respected discount broker:
> During periods of heavy trading and/or wide price fluctuations ("Fast Markets"), there may be delays in executing your order or providing trade status reports to you. […] Schwab is not liable to you for any losses, lost opportunities or increased commissions that may result from you being unable to place orders for these stocks through the Electronic Services.
This is absolutely not true. Broker-dealers and brokerages routinely credit clients for execution out of line with the market. Schwab does in fact give price adjustments for slowly or incorrectly handled orders.
The reason nobody will be compensated here is due to two things,
(1) There is no way to determine what a fair execution would have been, since clients couldn't submit orders in the first place.
(2) Clients will adversely select their losing trades for corrections and this would bankrupt Robinhood in about five minutes.
Maybe in some cases they go above and beyond their account agreement if they like you as a customer, but according to the agreement you sign with them, it's not their problem if things go bad in this way.
Unlikely to have compensation for trades, and only people with limit orders set before the outage would be able to claim damages.
It's no different than you breaking your phone or losing your network connection. Nothing is guaranteed to work all the time. RH might face fines for the extended nature of the outage though, especially since they've managed to avoid them for plenty of past mistakes so far.
If they compensate for missed trades due to service outages, then an attacker could take a position, repeatedly DDOS Robinhood until the position is favorable during a DDOS, and then demand reimbursement since they "would have" cashed out that favorable position.
It follows that Robinhood must never reimburse for outages.
I’d be interested to read a deep technical post-mortem like those which have become fairly standard among other big tech companies. Hoping Robinhood does the right thing here.
Still silence on the traders who lost tens of thousands of dollars? Are they going to be compensating or not?
This blog post doesn't appear to say anything. It's not an apology, it's not an explanation, it doesn't say what they're going to do in response.
This is after the incident in which there were no status updates or support availability for multiple hours. Why can't they commit to updates every hour or every 30 minutes?
You could view it as a business decision. Will they lose reputation and customers if they don't compensate for the outage? Do they expect that the long-term cost of that loss would be more than the one-time hit of paying out now?
They may not have a legal/contractual obligation here, but that doesn't mean that treating their customers poorly is without consequence.
The difference is regulation. There are very few regulations and oversight on cloud compute providers, whereas an average person cannot just spin up an app and begin selling securities in a month as you can being a cloud provider.
While RH's ToS does theoretically absolve them of technical issues, they are obligated to comply with 'best execution' securities mandates, no? Separately, it'd be extremely bad for business if they refused compensation.
The point is moot anyway, since they're offering "case-by-case" compensation.
Robinhood will have to deal with a flood of FINRA and SEC complaints from these outages. I'm unsure how much longer FINRA will allow them their broker dealer license with a copious amount of failure in the rear view mirror.
Arbitration is forced, but Robinhood is on the hook for the fees for everyone who decides to arbitrate. Robinhood users might not get anything, but they can still cause pain.
There were some people claiming that RH erroneously exercised their options on r/wallstreetbets. Could be a hoax, but if it isn't, then that seems like grounds for compensation.
Of course, no one complains when RH makes a mistake in the client's favor.
Those people don’t know what pin risk is. Basically, their long puts got exercised because automatic exercise is determined at 4pm and they didn’t object (creating a short position in the equity); their short puts didn’t get exercised because the stock rallied by 5pm and their counterparty was diligent and prevented auto-exercise (thus no long equity position to offset the short equity position). Robinhood couldn’t buy back the short equity position because the actual price rose above the put price, leading to a net loss.
Is it just me, or does it feel like the only people using Robinhood are college students gambling with their parent's money?
Given that many extremely smart people who have devoted their lives to the stock market cannot beat average returns, pointing to Robinhood users' lack of knowledge of "pin risk" seems to miss the greater point.
1) Somewhat pedantic: a big reason why performance is what it is is that at any real $$$, liquidity/volume becomes an issue. Lots of option markets are just not that liquid. If you play with only a few $k and Robinhood pays for much of the market friction, then you can potentially outperform the market at risk parity.
2) More real: for most people, active trading is not about investing, it is about easy and legal gambling. There is a thrill in throwing your money into high-risk options or skyrocketing meme stocks. Because markets are (relatively) efficient, the prices of these assets usually reflect their risk profile, so on average you should gain money (the flip side of it being hard to beat the market is that it is hard to severely underperform, on average, as long as you don't go all-in; normally friction costs make these kinds of strategies not work, but RH reduces that significantly). It ends up like going to a casino where on average you make a bit of money (but the high volatility means some people lose a lot and some people gain a lot).
That means there are no orders mishandled either. If no one has an SLA, then just switching the servers off without thinking about whether customers were planning on trading seems fully within their rights. This is terrible for their reputation, but that doesn't mean they are going to start handing out money because people argue they could have avoided losses if the servers had been up. It's going to be extremely difficult for any customers to back that up legally.
They can probably be fined by some authority, but that penalty isn’t the same as being liable for losses people claim they made because the site was inaccessible. The fine wouldn’t be paid to customers.
People lost the opportunity to place orders. Determining the actual cost is of course impossible since you don't know what orders people would have placed.
"Missed out" doesn't seem like the right phrase here. If you already owned the stock, you still held it, no?
So people who were going to continue to sell off got lucky that they couldn't make that trade, and people who were going to buy got unlucky?
Does anyone seriously expect compensation, or think that it's deserved, or is it group wishful thinking? How would it even work? Would they just take people's word for their supposed intent? Or are people wanting some sort of "here's a gift card" type deal?
This is not to defend RobinHood - I've personally kept my money with well-established companies cause conservative, old, proven systems seem like a good thing for a product in this space - but shit happens, no? There will be more good days, and more bad days, in the market, it's a long-run game anyway, and it's pretty easy to vote with your wallet in this space.
There could be folks holding leveraged Bear ETFs or similar after last week's downturn, who were waiting to see how the market moved Monday morning to decide whether to sell or hold. I could see those folks losing quite a bit of money due to the inability to sell off those types of positions after the market reversed course on Monday.
I suspect you're right though, that it's mostly sour grapes concerning the opposite case - inability to buy as the market rallied.
> There is an entire generation that has never traded through a crisis.
Given that most crises seem to occur roughly every 7-15 years, there will always be such a generation.
A hypothesis: the reason why crises occur roughly 7-15 years is because that is approximately the length of society's collective memory concerning monetary issues.
And even then, limit orders are placed on a best effort basis. I'm sure their terms of service say as much. I have had limit orders not get placed before on otherwise functioning platforms.
The close of today is effectively the open yesterday, so everyone is back where they were.
Of course the problem with the "compensate me" arguments is that a lot of people were going to make decisions that would have turned out poorly yesterday (indeed, the market is balanced and every transaction has a counterparty), though of course with the amazing clarity of hindsight few would recognize or admit that. So if they need to compensate for illusory lost trades, do some people have to pay them for losses they would have incurred?
[I get that there are some complex options that can legitimately be all downside when trading isn't available, but that's a less common option]
Genuine question: With no commission trading at places like Schwab and eTrade, is it even worth trading on Robinhood? For as far as I could remember (about 2 years ago), Robinhood has always failed to scale.
Options are completely free on Robinhood while they still have a per-contract fee at other brokerages. If you don't care about that then no, there's no reason to stick with Robinhood.
Additionally Robinhood self clears options (or for some other reason?) and does not charge the Options Clearing Corp fee of $0.055/contract or the Options Regulatory Fee of $0.0388/contract which all other brokers charge (incl. ones with $0 or flat rate commissions/fees like WeBull, Gatsby, Tradier). All you pay is the FINRA and SEC fees on sells of about a penny each for small trades.
Actually, if anyone knows of another broker who _doesn't_ charge these, please let me know. If you're first for the broker I'll give you $20 for the tip.
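For scale, a quick back-of-the-envelope using the per-contract figures quoted above (the 10-contract trade is hypothetical, just to show what the waived pass-through fees add up to):

```python
occ_fee = 0.055    # Options Clearing Corp fee, per contract
orf_fee = 0.0388   # Options Regulatory Fee, per contract
contracts = 10

per_contract = occ_fee + orf_fee
print(per_contract)              # ~$0.094 per contract
print(per_contract * contracts)  # ~$0.94 on a 10-contract trade
# versus the ~$0.65/contract commission mentioned downthread:
print(0.65 * contracts)          # $6.50 for the same trade
```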
Trust me, please trust me, you really really really want to be paying a competent broker when trading options.
If it's chump change you're trading, sure, use RH.
If it's serious money, the $0.65/contract or whatever pays for itself many times over. Even if it's just the ability to regularly get filled between the spread it pays for itself.
Yeah, there seems to be no end to the horror stories of options trades on Robinhood having significant delays before being filled, costing people far, far more than 65 cents.
Options are a derivative, meant for hedging. It's relatively recent that they've gained so much attention as a primary security for speculation. There are decent strategies to make consistent income, especially in selling options, but it takes discipline and capital.
Most people just want the high leverage and quick wins which usually ends badly.
I don't want to reveal too much, but basically undervalued far OTM spreads. I usually net -$0.02 on each trade but occasionally earn $1-10. If I use a "real" broker and pay $0.10-$0.65 per contract the math just doesn't work.
This happened to us at Hustle years ago. Basically if you run on AWS there’s a DNS server provided inside each VPC that usually works fine but which has no observable load metrics etc... so you don’t really know you are slamming it and are about to have a problem unless you audit your entire codebase.
Why? Well, that tiny DNS server has certain capacity constraints, and if you don't cache DNS lookups, by using an http/https agent for example (in NodeJS), you wind up looking up the same DNS info over and over and churning sockets like it's going out of style. If you run really, really hot, the poor thing falls over (rightly so).
The limits are high and DNS is fast so you usually don’t notice but when you are under load bugs like this come out of the woodwork. When it falls down you look up the AWS docs, lean back in your chair upon finding this isn’t an “elastic” part of AWS and say “FUUUUUUUUCK” so loud it can be heard from outer space.
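The comment above describes the Node.js fix (a keep-alive http/https agent). As a rough Python analogue of the same principle (reuse connections and memoize lookups so you stop hammering the VPC resolver), here's a sketch; example.com and the cache size are placeholders, and a real DNS cache must honor TTLs:

```python
import functools
import socket
import requests

# 1) Connection reuse: a requests.Session pools keep-alive connections, so
#    repeated calls to the same host reuse sockets instead of re-resolving
#    the name and opening a new connection every time.
session = requests.Session()
for _ in range(100):
    session.get("https://example.com/")

# 2) For code that resolves names directly, even a crude in-process cache
#    takes pressure off the VPC resolver (note: this ignores DNS TTLs,
#    which a real caching resolver must respect).
@functools.lru_cache(maxsize=1024)
def resolve(host: str, port: int = 443):
    return socket.getaddrinfo(host, port)
```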
If you are Robinhood though don’t you have some former Netflix SRE/DevOps beast on staff that knows this and so you run your own DNS and monitor it?
That's misleading. The way that this has worked for decades on Linux-based operating systems and on Unices is that one installs a local caching DNS proxy, choosing one of the many available: ISC's BIND, Bernstein's dnscache, unbound, dnsmasq, PowerDNS, MaraDNS, and so forth.
Every Unix system having a local caching DNS proxy was and is as much a norm as every Unix system having a local MTA. A quarter of a century ago, this would have been BIND and Sendmail. Things are more variable now.
To illustrate that this was considered the norm, here is a random book from the 1990s. Smoot Carl-Mitchell's _Practical Internetworking with TCP/IP and UNIX_ says, quite unequivocally:
> You must run a DNS server if you have Internet connectivity. The most common UNIX DNS server is the Berkeley Internet Name Daemon (BIND), which is part of most UNIX systems.
People sometimes think that this is not the case nowadays, and the fact that a computer is a personal computer magically means that a Unix or Linux-based operating system should offload this task and not perform it locally. They are wrong, and that is DOS Think. Ironically, they don't even get to play the resource allocation card nowadays. The amount of memory and network bandwidth that needs to be devoted to caching proxy DNS service on a personal computer is dwarfed by the amounts nowadays consumed by WWW browsers and HTTP(S).
There's no similar argument for a node in a datacentre.
Ideally, not only should every machine have a (forwarding/resolving) caching proxy DNS server, every organization (or LAN, or even machine) should have a local root content DNS server. A lot of (quite valid) DNS lookups stop at the root with fixed or negative answers. Stopping that from leaving the site/LAN/machine is beneficial.
Ironically, putting a forwarding caching proxy DNS service on the local end of any congested, slow, expensive, or otherwise limited link is advice that I and others have been handing out for over 20 years. It's exactly what one should be doing with things like Amazon's non-local proxy DNS server limited to 1024 packets/second/interface.
So the question is not whether a local DNS cache mechanism exists. It's whether it's set up by the company dishing out the VMs, and if not, why not. Amazon provides instructions on how to add dnsmasq, and clearly labels this as how to reduce DNS outages. So it's not even the case that Amazon is wrongly discouraging local caching proxy DNS servers.
The point of my comment wasn't to say "don't cache" but rather, don't expect that the OS is going to automatically do it for you (as would be the case on Windows and Mac).
Your VPC has a DNS server at .2 of your VPC CIDR block that is mounted via loopback on the dom0 and exposed to your VPC to let you do lookups via their DNS infra.
"Invisible?" I mean, everyone who builds AWS infra, even just single ec2 instances, is aware of it. It's definitely possible that application engineers aren't aware, though.
What scenarios cause this many DNS lookups though? Connections should be kept-alive after the IP translation, so if it's really new connections being setup constantly then wouldn't that show up as a major bottleneck first?
Running on Kubernetes this is easy, it's one of the first issues you hit.
Every DNS request for external domains turns into 10 if you don't explicitly configure FQDNs (dot at the end). This is because in the default configuration the resolver runs with ndots 5 to search all the possible internal Kubernetes and cloud-provider names. Then you have lookups for IPv4 and IPv6 in parallel. So for every external name you look up, you storm the upstream DNS with 10 requests for non existing domains.
Furthermore, the current default DNS service in Kubernetes doesn't have any kind of caching for these kinds of lookups (especially not NXDOMAIN) enabled.
But like I said, this is one of the first issues you hit running Kubernetes on Amazon. It is widely known and can easily be fixed by scaling up some more instances, changing ndots settings, using FQDNs or configuring caching. There is no way that this was the issue, it is plastered all over the internet, the logs are clear and the fixes can be implemented in minutes.
It also doesn't go down completely, the rate-limiter is packets/s on the interface.
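For what it's worth, the FQDN trick mentioned above is easy to illustrate (assuming a glibc-style resolver with a search list configured, as in a default Kubernetes pod):

```python
import socket

# A relative name with fewer than `ndots` dots (Kubernetes defaults to
# ndots:5) is tried against every domain in the search list, multiplying
# upstream queries, most of which come back NXDOMAIN.
socket.getaddrinfo("example.com", 443)

# A trailing dot marks the name as fully qualified, so the resolver asks
# for exactly this name and skips the search-list fan-out.
socket.getaddrinfo("example.com.", 443)
```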
It’s easy to have tens of thousands of DNS lookups per second if you don’t know what you’re doing or didn’t pay attention. Connections wouldn’t be the bottleneck if they are outbound.
Sad that there isn’t an actual apology anywhere to be found in the letter at all.
And now with the fed rate cut the interest on cash is only 1.3%, with more cuts expected later in the year, which was the last big differentiator. I don’t see how they don’t see massive net withdrawals going forward.
> And now with the fed rate cut the interest on cash is only 1.3%, with more cuts expected later in the year, which was the last big differentiator. I don’t see how they don’t see massive net withdrawals going forward.
This isn't really an issue because the fed rate cut impacts everyone. Other institutions will cut their interest rates as well. I know of a few banks (Canadian) that have already lowered their GIC rates.
If anything, this is actually good for RH. Now instead of comparing 1.8% at RH and 1% at another Financial Institution, you're comparing 1.3% and 0.5% -- a much bigger multiple.
most brokerages don’t actually pay anything. With another cut it’s going to be <1% vs 0%. Hardly anything even with a six figure balance. That’s my point.
Historic... Unprecedented... Thundering herd, a bunch of excuses to explain why they couldn't handle the volume that most real brokerages handle every second.
I'm curious about your thoughts on why a technical infrastructure which, by nature of being cloud-native, is supposed to be (and likely has been) architected as a highly elastic platform, has not stood the test of time in this regard.
Based on the information from Robinhood's careers site, their platform is largely based on the following technology stack:
- Python, Django, Django Rest Framework
- Go
- PostgreSQL
- Container and container orchestration technologies (Docker, Kubernetes)
- Microservice-oriented architectures and related OSS technologies (Kafka, Celery/RabbitMQ, nginx, Redis, Memcached, Airflow, Consul)
- Cloud-native infrastructure (AWS, GCP)
- Infrastructure as Code and configuration management (Terraform, SaltStack, Ansible, Chef, Puppet)
- CI/CD and test automation frameworks (Cypress.io, Jenkins, Appium, UIAutomation, Bazel)
Why would you use RH instead of a normal, mainstream brokerage like Vanguard, Fidelity, etc that already has (1) an app and (2) commission-free trades?
Easy answer: As someone who's used Vanguard for index funds and the like for a couple decades now, I had no idea they had an app or commission-free trades. They don't market this at all.
As a secondary answer, normal, mainstream brokerages have pretty bad tech, tbh. I don't expect it to be worse than Robinhood in terms of things like security, and I expect UX to be worse. (Side note: I just discovered that Vanguard actually has a secret security key option hidden under Account maintenance, so I can finally switch from sms 2fa. +1 to Vanguard.)
> Side note: I just discovered that Vanguard actually has a secret security key option hidden under Account maintenance, so I can finally switch from sms 2fa. +1 to Vanguard.
It looks like you still need security codes setup:
"You'll need to register for both security codes and security keys, however. That's because keys and codes go hand in hand—if you lose your key or don't have it, we'll need to send you a code in order for you to log on. In addition, you'll always need a code to access your accounts from a mobile device."
If an attacker can skip the security key you might as well not use one.
My brother has a Fidelity account and apparently even he was blocked from putting in orders online last Thursday, so I'm not sure they're immune either.
I don't think we will get a postmortem. Their lawyers will kill it because it will be an admission of guilt and open them up to even more legal liability.
I would argue that it is worse for a retail brokerage to be down for a day than it is for a trading firm to blow themselves up, though I suppose the latter was more about creating a disorderly market.
Maybe in another four years when they finally realize they still haven't fixed the leap bug. Didn't work out for this year apparently. Last leap year had the exact same problem. The problem is that the ticket is very low priority because right now it is working again and won't happen again until at least 2024 ... By then it will most likely be forgotten. Again.
I can't help but think this glitch was a good thing and Robinhood investors would do better if they traded less anyhow. According to an OpenFolio correlational study, traders who trade more than 12 times per year make 0.5% less than traders who trade less than 12 times per year. OpenFolio was one of the first three websites to have an API integration with Robinhood portfolios.
Companies like Robinhood regularly go down when markets are volatile. It was quite frustrating when the financial crisis was in full swing not being able to log in to my trading account. I reckon I would have made a killing.
"Multiple factors contributed to the unprecedented load that ultimately led to the outages. The factors included, among others, highly volatile and historic market conditions; record volume; and record account sign-ups. "
What a sad press release; I am sure people at their corporate office were sweating over this. The long and short of it is that users trusted the service would work, and possibly had a great deal invested, only to get a blame-deflecting comment when everything broke down: "OMG, we weren't prepared for what our users did!"
We live in a sad state of software. I expect things like this and the Equifax scandal to continue if things like software security, reliability, and performance aren't taken into account.
I don't know if it had anything to do with leap year, but I also checked dev tools and saw the same issue (requests for market data on March 3, 2020 on March 2, 2020 8AM PST). However, it was busted for both the website as well as the Android app (and I'm guessing iOS too) so it doesn't seem like it's purely a client-side problem unless all of their clients were built from the same source.
Sure, I've seen outages that are caused by DNS config problems. But I don't think I've ever seen one caused by a "thundering herd" overwhelming DNS servers.
Another giveaway that this is a lie is that support emails were getting a stock Postfix error message, which means that MX records at least were resolving.
Robinhood isn't a bitcoin company. That's just a feature they have. Its main product offering is commission-free trading--and their presence pushed a lot of big players to adopt the same offering. The wallstreetbets gang is silly and all, but I think they have really democratized stock trading and made the whole idea seem much more accessible. I think the founders are former finance guys. I hope this doesn't sound like guerilla marketing. I don't even use the app; I have used it, but I'm just not that interested in picking stocks. I just think it's cool as an ex-code-monkey-to-entrepreneur story.
With mutual fund companies, including Vanguard, generally you can open an account directly with them and buy directly from them (including partial shares, automatic monthly purchases, and dividend/capital gain reinvestments).
How much did it cost to place a trade for a $100 stock previously? RH definitely helped more people gain access to directly trading shares on the stock market, regardless of whether or not they were responsible in doing so.
It's cool that the founders of the company publish blog posts like this for a short outage. Hope other CEOs learn from this and become even more transparent in the future :D
Short outage? They were down for almost the entire trading day yesterday and for hours today. And there's barely any transparency in this post compared to standard post-mortems.
Just use Square’s Cash App! Free stock trades AND you can buy fractional shares AND a bunch of other stuff like P2P payments and bitcoin. I work there and so can say with some authority that we can handle more volume without going down than RH can.
Actually, bayonetz's posting is the only useful one in the comments for this article. Most of us are here for information from actual industry insiders, and this qualifies.
Here's some more inside info ...
If your "financial app" provider doesn't have a banking charter, run. None of the recent trendy fintech companies have a charter, and are thus clown cars.
Fidelity offers banking services and doesn't have a banking charter but they aren't a "clown car," they are one of the largest financial institutions in the world.