There's a re:Invent presentation about them running an in-house 1 PB+ cluster. What they don't tell you is that 101 8xl nodes cost half a million dollars a year in reserved instances, before you include any other associated costs, and that that particular workflow (log scanning) is nicely suited to any columnar store.
Operations are also a disaster with Redshift: anything that requires touching the cluster itself, at any scale past a handful of nodes, typically needs a support ticket along the lines of "hey, when this breaks, please work your magic and fix it?" There's also queue tuning, a whole extra layer that you, the customer, have to manage yourself. Their suggestion tools are getting better on that front, though.
Just use something else if you have more than a few TB, unless you have a ton of time and money to throw around.