~20 GB/s read/write over a thousand-plus cores seems slow, especially for embarrassingly parallel data such as this (split on security). That works out to megabytes per second per core. Am I missing something?
They're not doing sequential scans of files on disk; they're doing random reads and writes in a database, where each write is replicated and durable, in parallel, across the entire key space of market transactions. The task was to reconcile market transactions end-to-end by matching orders with their parent/child orders (e.g., as orders are merged/split or routed from broker-dealers to other broker-dealers or to exchanges for execution), thus building millions (billions?) of graphs across the entire dataset. You can see more details in the video of the presentation at the bottom of this blog post:
https://cloudplatform.googleblog.com/2016/03/financial-servi...
but I presume you're much more familiar with the intricacies of the stock market than I am. :)
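To make the access pattern concrete, here's a minimal sketch of how that kind of order-linkage data could be modeled in Bigtable and walked with point reads. It's purely illustrative, not FIS's actual schema: the project/instance IDs, the "orders" table, the "links" column family, and the single parent pointer are all my own assumptions.

    # Minimal sketch (not FIS's actual schema): each order is a row keyed by
    # its order ID, and a "links" column family stores a pointer to its parent
    # order. Reconciling one order is then a chain of single-row random reads,
    # which is the access pattern described above.
    from google.cloud import bigtable

    client = bigtable.Client(project="my-project")           # hypothetical IDs
    table = client.instance("my-instance").table("orders")

    def write_link(order_id: bytes, parent_id: bytes) -> None:
        """Record that order_id was split from / routed out of parent_id."""
        row = table.direct_row(order_id)
        row.set_cell("links", b"parent", parent_id)
        row.commit()  # a replicated, durable write

    def walk_to_root(order_id: bytes) -> list:
        """Follow parent pointers via random point reads back to the original order."""
        chain = [order_id]
        while True:
            row = table.read_row(order_id)                    # single-row random read
            if row is None or b"parent" not in row.cells.get("links", {}):
                return chain
            order_id = row.cells["links"][b"parent"][0].value
            chain.append(order_id)

The real pipeline builds these graphs across the whole dataset in parallel, which is why the workload is measured in random reads/writes rather than in scan bandwidth.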
Here's the performance you can expect to see per Cloud Bigtable server node in your cluster, whether for random reads/writes or for sequential scans:
https://cloud.google.com/bigtable/docs/performance
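As a back-of-the-envelope way to relate those per-node numbers to an aggregate figure like "~20 GB/s": random access is quoted in rows per second per node, so the bytes per second you observe depends on node count and row size. The constants below are assumptions for illustration only; the current per-node figures are on the page linked above.

    # Rough arithmetic only; the per-node throughput and row size here are
    # assumptions, not the published Cloud Bigtable numbers.
    NODES = 300                     # assumed cluster size
    ROWS_PER_SEC_PER_NODE = 10_000  # assumed random read/write throughput per node
    ROW_BYTES = 1_000               # assumed ~1 KB rows

    aggregate_rows_per_sec = NODES * ROWS_PER_SEC_PER_NODE
    aggregate_gb_per_sec = aggregate_rows_per_sec * ROW_BYTES / 1e9
    print(f"{aggregate_rows_per_sec:,} random rows/s ~ {aggregate_gb_per_sec:.1f} GB/s")

Each of those rows is an independent, replicated, durable read or write, so "MB/s per core" understates the work compared to a streaming scan over flat files.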
Here's a benchmark comparing Cloud Bigtable to HBase and Cassandra that may be of interest (it uses a different workload than the one presented in the FIS blog post, but it shows the relative price/performance):
https://cloudplatform.googleblog.com/2015/05/introducing-Goo...
Disclosure: I am the product manager for Google Cloud Bigtable. Let me know if you have any other questions; I'm happy to discuss further.
Disclosure: I work on Google Cloud, so I want you to use Bigtable ;).