yes, pubsub is an old pattern. what HFT experience do you have that makes you th...

true_religion · on Aug 7, 2012

It's not as if many webpages couldn't be made to respond to faster.

Not to brag, but I've built trading engines in the past year that can process 20 years worth of data in less than a ms.

Now I'm a founder at your usually derrided photos-sharing start up. Sure we could make most requests come out of the server in less than 2ms, but what of it? It's still going to take the user upwards of 1.5 seconds to load all of the content, and at least 100ms spent on the wire. Working at a consumer level startup, there's no point in optimizing the server yet when network costs are so high, and there's no point in improving the network when there's no ROI in it.

----

Answering to his question: > what HFT experience do you have that makes you think of those systems as toys?

I'd imagine that massive distributed systems are more complex to think through that HFT systems because the later are more monolithic due to speed constraints.

MarkPNeyer · on Aug 7, 2012

HFT systems _are_ massive distributed systems.

where i worked - a small shop with 10 engineers - we had 350 machines running in 5 different data centers in chicago, new jersey, and korea. each machine ran around 100 different processes, which were coordinated using a combination of several different pubsub systems:

- in house reliable tcp-based broker mediated (we called it pubsub3 )

- in house unrelaiable ip-multicast system (i named it blue ray because i thought that sounded cool)

- 3rd party implementation of a PGM derivative that we licensed for god knows how much

the engines subscribed to various topics on the system, processed market ticks (sent by marketdata publishers over blue ray) and fired orders at whatever instrument they were handling

the risk engines modeled risk scenarios inresponse to trading shifts, powering through the risk models our quant trader wrote in c. we had a 'calcparser' system that constructed a DAG rerpresenting the AST of the c code and distributed the DAG across several machines.

shit was nuts. i miss it.

photorized · on Aug 8, 2012

>HFT systems _are_ massive distributed systems

The complexity is also in having to deal with limited resources, and factors outside of your control. Have you seen many bootstrapped HFT implementations? The budgets are just... different.

When you spend 10x or 100x hardware, get massively redundant backbone connectivity (connecting to a tier 1 core router directly, and not through an oversubscribed shared switch port that a typical bootstrapped startup would get), you end up with a much better controlled and somewhat predictable environment.

With a typical "web" (non-HFT) distributed system, you've got millions of "subscribers" aka users - and I don't think people outside of IT fully realize just how bad a typical user's [Windows] machine is messed up - and that was before they installed a couple of toolbars, adware, firewall, and AV products... And all of these subs are tying up your publishers in their own unique screwed up way, keeping sessions open, dropping packets, blocking ports...

I've designed global CDNs, and operated one for a decade - and the kind of randomness our distributed systems have to deal with are not always technical, even though we don't typically count microseconds.

true_religion · on Aug 7, 2012

I'm very interested in this.

Is there some reason that you'd need your Korean data center to constantly be talking to your US data centers.