Moving the NYT Games Platform to Google Cloud With Zero Downtime

chatmasta · on Dec 7, 2017

> We found that some web customers were unable to access the puzzle, and found the cause of the problem to be App Engine’s limit on the size of outbound request headers (16KB). Users with a large amount of third-party cookies had their identity stripped from the proxied request. We made a quick fix to proxy only the headers and cookies we needed and we were back in action.

That’s pretty funny. Users of NYT are sending request headers with sixteen kilobytes of tracking data. Maybe that’s the real problem eh?

I wonder which news website has the largest amount of trackers. If I let the CNN home page sit open in chrome, I can come back an hour later and find thousands of requests blocked by uBlock.
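For anyone curious what the quoted fix ("proxy only the headers and cookies we needed") could look like, here is a minimal sketch. It is not NYT's code. The allowlists, the `session_id` cookie name, and the helper names are all hypothetical. Only the 16KB limit comes from the article.

```python
from __future__ import annotations

# Hypothetical allowlists: the article does not say which headers or cookies were kept.
ALLOWED_HEADERS = {"accept", "accept-language", "user-agent", "content-type"}
ALLOWED_COOKIES = {"session_id"}  # assumed name for the identity cookie

HEADER_LIMIT_BYTES = 16 * 1024  # the outbound header limit cited in the article


def filter_cookies(cookie_header: str, allowed: set[str]) -> str:
    """Keep only the allowlisted cookies from a Cookie header value."""
    kept = []
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name in allowed:
            kept.append(f"{name}={value}")
    return "; ".join(kept)


def filter_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return only the headers (and cookies) that should be forwarded upstream."""
    out = {}
    for name, value in headers.items():
        lower = name.lower()
        if lower == "cookie":
            cookies = filter_cookies(value, ALLOWED_COOKIES)
            if cookies:
                out[name] = cookies
        elif lower in ALLOWED_HEADERS:
            out[name] = value
    return out


def header_size(headers: dict[str, str]) -> int:
    """Approximate wire size, counting 'Name: value\\r\\n' per header."""
    return sum(len(f"{k}: {v}\r\n".encode("utf-8")) for k, v in headers.items())
```

Using an allowlist rather than a blocklist means a new third-party cookie cannot push the proxied request back over the limit.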

benp84 · on Dec 7, 2017

Try a Cracked.com article. I left one open for 10 minutes and had 50,000 requests and 80MB transferred.

awj · on Dec 7, 2017

Holy crap, you weren't kidding. Loaded their main page, 12MB transferred in the first minute before I noped out of there.

paulddraper · on Dec 8, 2017

How times change.

That would've taken half an hour on a dial-up connection.
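A quick back-of-the-envelope check of that figure, assuming the 12MB page load mentioned above, a nominal 56 kbit/s modem, and decimal megabytes (real dial-up throughput was usually lower):

```python
def transfer_minutes(megabytes: float, kbit_per_s: float) -> float:
    """Minutes needed to move `megabytes` (decimal MB) at `kbit_per_s`."""
    bits = megabytes * 1_000_000 * 8
    seconds = bits / (kbit_per_s * 1_000)
    return seconds / 60


# 12 MB over an assumed 56 kbit/s link
print(round(transfer_minutes(12, 56), 1))  # prints 28.6
```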

manno889 · on Dec 7, 2017

> Users of NYT are sending request headers with sixteen kilobytes of tracking data.

How do you see this? What tools in the browser?

chatmasta · on Dec 7, 2017

Well nyt saw it by proxying user traffic. You could do the same with a chrome extension, similar to uBlock.

pixl97 · on Dec 7, 2017

If you use chrome just look at network under the developer tools.

godzillabrennus · on Dec 7, 2017

If the website is free then you are the product. Not sure why it’s a surprise these companies aim to maximize the value they can derive from their product (aka our data).

rhizome · on Dec 7, 2017

OP said it was funny, not that it was a surprise.

freeone3000 · on Dec 7, 2017

It's not free. NYT relies on a subscription model.

tgb · on Dec 7, 2017

The mini-crossword is free as well as some of their full crosswords.

iamthirsty · on Dec 7, 2017

Some. I actually cancelled my subscription with them because of the horrendously large Google Home ads on all the paid crosswords. When I asked them about it I got about four non-answers — it's ridiculous that even if you pay they still serve you ads.

dragonwriter · on Dec 7, 2017

Newspapers have always included ads even if you pay; not sure why anyone would expect “on the internet” to change that.

iamthirsty · on Dec 7, 2017

Because I pay specifically for only the crossword part. I bought a book of 50 (of NYT puzzles) for the same price, and it came with no ads. Much better deal, if you ask me.

sgolestane · on Dec 7, 2017

I'm a paid subscriber to NYT (so it's not free) but I'm still prone to this issue

jdavis703 · on Dec 7, 2017

If you're a paid subscriber then I think it's morally OK to use an ad blocker on them. You're paying for what you're consuming, and in all likelihood you're only earning them a few cents to maybe a dollar from ads anyways. What you spend on your subscription drives way more value.

froindt · on Dec 7, 2017

Something I haven't thought of before - with paper newspapers, there is no way to opt out of ads, and those are also subscription based. Why is the internet different? Presumably publishers are getting less revenue than the customer agreed upon if the ads aren't displayed or aren't clicked on. Is using an ad blocker effectively stealing?

Note: I use an ad blocker primarily as a means against malware.

vkou · on Dec 7, 2017

If internet ads were as unobtrusive as newspaper print ads (I'm not talking about the shitty free tabloids), a lot fewer people would be using ad-blockers.

chatmasta · on Dec 7, 2017

It’s not a surprise, but it is a certain kind of schadenfreude to see a bug caused purely by the sheer amount of trackers they’re forcing into their users’ browsers. It might have been a good time for them to do some internal reflection.

And btw it’s not free; there is a paywall and you can subscribe to the NYT.

untog · on Dec 7, 2017

NYT isn't forcing these trackers into users' browsers; ad banners are. You might say that's a distinction without a difference, but I disagree. Until you work with programmatic ad stuff it's difficult to fathom just how stupid it is, but also how unavoidable it is if you want to make money.

ianlevesque · on Dec 7, 2017

They could try a subscription model.

untog · on Dec 7, 2017

Glibness aside, there is probably more written about NYT's revenue from ads vs subscription than any other major media company.

IIRC they recently made more money from digital subs than they made from ads for the first time ever. But if they removed ads tomorrow it would still destroy the business as it currently operates.

duijf · on Dec 7, 2017

Is anyone else curious why the NYT uses Medium? Their own website is literally about reading stuff

(Sorry if this is off-topic)

hueving · on Dec 7, 2017

"How we did X" is "not worthy" of the main brand.

dgritsko · on Dec 7, 2017

It's an engineering blog post, and those usually serve double duty: they are informative and also useful for recruiting ("Look at the cool stuff we are building! Come be a part of it!"). Case in point, the post ends with "we’re currently hiring for a variety of roles and career levels".

In addition to not being really appropriate for nytimes.com, I'm guessing that publishing content there brings along a lot of extra cruft that is probably not necessary for a post like this (advertising, paywall system, isolating it from the "real" NYTimes content, etc.). Easier to just throw it up on Medium and call it a day.

jprob · on Dec 7, 2017

Bingo.

Our CTO made our first post to Medium explaining the move: https://open.nytimes.com/introducing-the-new-open-blog-23eba...

natural219 · on Dec 7, 2017

This is really fascinating to me. Is this because engineers who they want to recruit dislike the New York Times brand, or because readers of the New York Times don't want to read things as informal as transparent blog posts about internal NYT decisions?

It's very easy for me to see something like blog.newyorktimes.com with a similar design / community philosophy as Medium, but would that somehow cheapen the experience for NYT readers? Or does NYT just not see itself as a "hip tech company" like Medium? I have endless questions about this, haha.

It seems to me like there's a lot of unstated assumptions hiding in "not appropriate for nytimes.com". Some things mentioned include -- "advertising, paywall system, isolating it from the "real" NYTimes content, etc.". This is absolutely baffling to me! I would be much more inclined to read regular NYT content were it not for these things.

untog · on Dec 7, 2017

I think you're over-thinking this. There isn't a huge crossover audience for engineer blog posts and general NYT audience, and I imagine the engineering blog posts do not go through the same editorial process content on nytimes.com does. That alone makes the case for using a different domain.

natural219 · on Dec 7, 2017

I'm being a cheeky detractor of NYT here. I think "candid, engineering-style blogposts" are the future of news, and ancient vehicles like New York Times are long dead. I think the "general NYT audience" is participating in #FakeNews, and they should radically reconsider their information diet.

As a vivid example of this, consider James Bridle breaking the "Youtube exploitative kid videos" story way before, and in greater depth than, any major publication. This is actually the future of news, and pretending like aging institutions like the New York Times are remotely relevant anymore is longshot wishful thinking.

Editorialization, fact-checking, and cultural leadership have important roles to play, and I'm excited to see these features unbundled into separate services. I'm long on services like Verrit and Snopes, and wish that I, as an independent publisher, could pay an intern to get official statements, cross-check narratives with history, and perform some of these functions. As is, I think people are operating under the delusion that ONLY NYT-style institutions can perform these functions, which baffles me.

(Actually, the future is probably more like James posting on jamesbridle.com, and then aggregating it through sites like Hacker News. But what do I know, I'm just a millennial who doesn't understand all these big partisan topics like modern journalism)

https://medium.com/@jamesbridle/something-is-wrong-on-the-in...

untog · on Dec 7, 2017

Hmm. I'm going to disagree with that! I think "candid, engineering-style blogposts" are and will continue to be great for an engineering audience, but I'm very skeptical that they will be great for a wide audience. For instance, I read and was fascinated by the YouTube kids post, but I do not know anyone outside of the tech industry that read it. And you're wrong to say he reported it way before, the NYT published this two days previous:

https://www.nytimes.com/2017/11/04/business/media/youtube-ki...

and Bridle's post itself links to reporting by New York Magazine from 2016. I don't dispute that his post goes into more detail, I just dispute that longer automatically equals better. Someone with domain knowledge reporting a story in great depth and a major publication reporting a simplified version for mass consumption is certainly not a new model.

I'd also strongly disagree that NYT is an ageing institution unable to adapt to this modern tech reality. John Herrman writes some of the most perceptive pieces about the state of tech out there:

https://www.nytimes.com/by/john-herrman

(and a minor quibble: I don't think the post linked here and the Youtube Kids post are in any way comparable. The engineering writeup is not news in any way, shape or form, it's just a guide to how NYT implemented something)

natural219 · on Dec 7, 2017

Thank you for this good response. I didn't notice the NYT covering this story before, because I have cut NYT out from my life due to their malicious, partisan reporting. So maybe I should be less bold about my evaluations of them and just continue to enjoy my personally-curated, high-information-dense feed.

marksomnian · on Dec 7, 2017

Doesn't this lead to vendor lock-in? All these Google proprietary services seem like they would be a big issue if they decide for whatever reason to migrate away from GCP.

ddorian43 · on Dec 7, 2017

They can rewrite it again on the next cool lang (rust,kotlin etc) using nanoservices and some new per-column/second-pricing db.

jrs95 · on Dec 7, 2017

Distributing data across the /tmp directories of many AWS lambda functions is the future of storage.

dboreham · on Dec 7, 2017

Wait...you can do that?

azurezyq · on Dec 7, 2017

It's a trade-off between time to market and risk of vendor lock-in. Also, a typical tech stack gets fully or partially rewritten every couple of years.

outworlder · on Dec 7, 2017

Which ones? There are ways to use way, way more Google services in your architecture. They have a bunch of industry standard stuff there, presumably that's how they were even able to migrate from AWS in the first place.

merb · on Dec 7, 2017

well if you just use appengine without using a vendor lock-in service, i.e. using cloudsql instead of datastore, etc., then you probably won't run into trouble. but it looks like appengine still has its momentum (they actually added java8 support lately)

Top19 · on Dec 7, 2017

It’s actually double lock-in, so 2x worse.

You used to have to just be afraid of lock-in, which I don’t think is as big an issue as it sometimes seems.

But with Google, you’re not only locked in but might be LOCKED OUT when they kill your product.

chatmasta · on Dec 7, 2017

Don’t be ridiculous. Google killing a feed reader is way different from Google killing a cloud service with paying customers and SLAs.

un_montagnard · on Dec 7, 2017

Like the QPX Express API?

chatmasta · on Dec 7, 2017

Interesting point. However that’s not a google cloud product and never had an SLA (the QPX FAQ says “we do not guarantee support”). It’s also a unique case because of its reliance on third party data vendors.

If google starts killing their cloud products, I will eat my socks. Just let me wash them first.

nik736 · on Dec 7, 2017

?? This happened before.

non_sequitur · on Dec 7, 2017

This is from their legal agreement:

7.1 Discontinuance of Services. Subject to Section 7.2, Google may discontinue any Services or any portion or feature for any reason at any time without liability to Customer.

7.2 Deprecation Policy. Google will announce if it intends to discontinue or make backwards incompatible changes to the Services specified at the URL in the next sentence. Google will use commercially reasonable efforts to continue to operate those Services versions and features identified at https://cloud.google.com/terms/deprecation without these changes for at least one year after that announcement, unless (as Google determines in its reasonable good faith judgment)

So technically they can do it, though their enterprise customers likely have stronger agreements that require at least X time (probably 1 year) notice

chatmasta · on Dec 7, 2017

Of course they can do it. I’m sure similar language exists in AWS and Azure agreements.

Look, I hate a lot of what Google stands for and where it’s going. But I find it very implausible they’ll kill any non-beta products that are part of google cloud platform. GCP is poised to take the place of AdWords as the google golden goose, helping them to diversify from their heavy reliance on advertising for revenue. They do not want to screw that up.

I’m sure they are well aware of the uprising that would cause amongst developers, aka the core customers of GCP. It would be a stupid move in a highly competitive cloud market, effectively telegraphing the fact that you can not rely on GCP services to exist in perpetuity. Their competitors would likely respond by re-implementing the shut down product with a compatible API so they could literally steal disgruntled users from GCP.

If you’re really concerned about this, the solution is pretty simple: don’t use GCP. If you want to use it, then only rely on the very core services that google clearly has strong incentives not to kill. Those would likely be VMs and any products that have an equivalent at another cloud vendor.

jacksmith21006 · on Dec 7, 2017

Silly statement. Google is in the cloud business and this is very different than a free product they offer.

kuschku · on Dec 7, 2017

The Flights API they just killed? Custom searches, which many websites paid for, which they killed?

Google has a habit of killing things, no matter if you pay for it and your business relies on it, or not.

jedmeyers · on Dec 7, 2017

If I remember correctly, they were required to keep that API up for a specified amount of time after the acquisition and they have kept it longer than that.

kuschku · on Dec 7, 2017

And that’s an excuse how?

Many of their cloud APIs are also acquisitions. The entire Firebase product line, and the Fabric.io product line are acquisitions.

Should we expect those to also disappear suddenly?

panopticon · on Dec 7, 2017

I think the difference is that the QPX was the byproduct of an acquisition (ITA Software) while Firebase and Fabric.io were the desired targets in those respective acquisitions.

jedberg · on Dec 7, 2017

Usually when a large company relies heavily on a cloud provider, they have an additional contract that specifies, among other things, advanced warning of any pending shutdown, often measured in years, to give them enough time to adjust and also to appease their shareholders and auditors.

eitally · on Dec 8, 2017

And even without this, Google has a history of proactively notifying paying customers years in advance of termination of a commercial enterprise service. The Search Appliances are a perfect example -- EOL was announced a couple years ago but support has persisted for existing customers and only next spring will they finally be fully unsupported. Moreover, Google is actively offering migration plans & assistance to move GSA customers to the new Cloud Search service, or even to third party indexers like Elastic.

I get the gist of the OP's complaint, but like you said, that behavior pattern is just not tenable in the kind of operating environment Google Cloud finds itself in these days.

Disclaimer: I work for Google Cloud, but not on any of the aforementioned products.

dnr · on Dec 7, 2017

If anyone wants to try collaborating on crosswords in real-time, try

https://squares.io/

You can upload .puz files or let it download from NYT with your subscription, then share the link with friends.

(Web only for now, sorry.)

johns · on Dec 7, 2017

Since there are people from NYT here, can you point me in a direction to help figure out why my streaks have been all messed up the past few months? Not sure if it's an app bug or something on the backend, but puzzles are retroactively being marked as being completed perfectly when they're not. Email in bio if you'd like to discuss more.

amelius · on Dec 8, 2017

What does zero downtime mean?

No interruptions of services?

Or just that people could still log in all the time?

degenerate · on Dec 8, 2017

Pretty sure they mean people could log in again after the switch.

I can't imagine what the purpose would be of capturing session data for each logged in user and transferring that over... I wouldn't even expect that of a Fortune 500 company moving platforms.

If that is what they did, it warrants a post on its own.

everyplace · on Dec 8, 2017

Imagine if you were half way through the puzzle when the cutover happened, and then you lost your entire puzzle state and the board was reset. For the die-hard crossword players, this would be devastating.

amelius · on Dec 8, 2017

Imho, in that case "zero downtime" is the wrong term.

Because it implies that nobody's running session went "down".

That's much harder because otherwise you'd just start a new service parallel to the other one, and flip a switch that directs all new logins to the new service.

spyspy · on Dec 8, 2017

It means users' puzzle and game progress was never interrupted, along with login sessions. A half-played puzzle before the cutover could be picked up as it was afterward.
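One common way to get that property is to write to both the old and new stores during the migration, backfill older records, and only then flip reads to the new store. This is a generic sketch under that assumption, not a description of what NYT actually built. `ProgressStore`, `MigratingStore`, and every method name here are hypothetical, and the stores are in-memory stand-ins.

```python
class ProgressStore:
    """In-memory stand-in for a puzzle-progress database."""

    def __init__(self):
        self._data = {}

    def save(self, user_id, puzzle_id, state):
        self._data[(user_id, puzzle_id)] = state

    def load(self, user_id, puzzle_id):
        return self._data.get((user_id, puzzle_id))

    def items(self):
        return list(self._data.items())


class MigratingStore:
    """Dual-writes to both stores; reads follow a switch flipped at cutover."""

    def __init__(self, old, new):
        self.old = old
        self.new = new
        self.read_from_new = False

    def save(self, user_id, puzzle_id, state):
        # Every write during the migration lands in both stores.
        self.old.save(user_id, puzzle_id, state)
        self.new.save(user_id, puzzle_id, state)

    def load(self, user_id, puzzle_id):
        primary = self.new if self.read_from_new else self.old
        return primary.load(user_id, puzzle_id)

    def backfill(self):
        # Copy records written before dual-writing began, without
        # clobbering anything the new store already has.
        for (user_id, puzzle_id), state in self.old.items():
            if self.new.load(user_id, puzzle_id) is None:
                self.new.save(user_id, puzzle_id, state)

    def cut_over(self):
        self.read_from_new = True


old, new = ProgressStore(), ProgressStore()
old.save("u1", "p1", "half-filled grid")    # progress from before the migration
store = MigratingStore(old, new)
store.save("u2", "p1", "one clue left")     # progress saved mid-migration
store.backfill()
store.cut_over()
print(store.load("u1", "p1"))  # half-filled grid
```

A half-played puzzle survives the flip because its state already exists in the new store before any read is directed there.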