Two things keep coming up while comparing GCP and AWS:
* This accomplishment would not have been possible for our three-person team of engineers without Google Cloud (AWS is too low-level, hard to work with, and does not scale well).
* We’ve also managed to cut our infrastructure costs in half during this time period (per-minute billing, seamless autoscaling, performance, sustained use discounts, ...).
This thread came at the right time. I spent the whole day today in DynamoDB training. Honestly, the one thing I took away is that its cost is based on reads and writes per second. Regardless of how much data an operation actually reads (whether it's 1 byte or 100 bytes), you are always charged for a full 1KB. The workaround they suggested is to use Kinesis, a Lambda, and yet another service to batch the writes so that reads end up close to 1KB each. He pitched it as the perfect way to do it. The problem I see is too many moving pieces for such a simple thing. If the DynamoDB team charged reads based on the actual data read, we'd all be set.
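If it helps to picture it, here's a rough sketch of the batching pattern as I understood it: a Lambda triggered by a Kinesis stream, flushing records into DynamoDB in batches. The table name and payload shape are made up, and this assumes Python with boto3:

    import base64
    import json
    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("events")  # hypothetical table name

    def handler(event, context):
        # Kinesis hands the Lambda a batch of records in one invocation,
        # so we can write them to DynamoDB as a batch instead of one at a time.
        with table.batch_writer() as batch:
            for record in event["Records"]:
                payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
                batch.put_item(Item=payload)

Whether an extra stream and function are worth it just to pad operations out toward 1KB is exactly the "too many moving pieces" problem.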
* Kinesis Streams: writes limited to 1,000 records/sec and 1 MB/sec per shard, reads limited to 2 MB/sec per shard. Want a different read/write ratio? Nope, not possible. Proposed solution: use more shards (a resharding sketch follows this list). It does not scale automatically. There is a separate service, Kinesis Firehose, that does not offer read access to the streaming data at all.
* EFS: Cold-start problems. If you have a small amount of data in EFS, reads and writes are throttled. We ran into some serious issues due to write throttling.
* ECS: Two containers cannot use the same port on the same node. An anti-pattern for containers.
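On the "use more shards" point: resharding is an explicit API call, not something the service does for you in response to load. A minimal sketch with boto3 (the stream name is made up):

    import boto3

    kinesis = boto3.client("kinesis")

    # Doubling capacity means explicitly asking for more shards; each shard
    # adds roughly 1 MB/s of write and 2 MB/s of read throughput.
    kinesis.update_shard_count(
        StreamName="clickstream",       # hypothetical stream name
        TargetShardCount=4,             # e.g. going from 2 shards to 4
        ScalingType="UNIFORM_SCALING",
    )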
AWS services come with lots of strings attached and minimums for usage and billing. Building services like that (based on fixed quotas) is much easier than building services billed purely pay-per-use. Those constraints, plus the pressure to optimize costs, add complexity and demand more people and time as well. AWS has a good lead in the cloud space, but they need to improve their services instead of letting them rot.
Totally agree. Their solution to those shortcomings in AWS is to put band-aids on them with more services (at least that's their suggestion). I do understand it's not feasible to provide a service that fits everyone, but it would be good if they solved the fundamental problem.
One more to add to the list.
With DynamoDB, during peak (rush hour) you can scale up, which increases the underlying replicas (or partitions) to keep reads smooth. However, after the rush hour there is no way to drop those additional resources. Maybe someone can correct me if I am wrong.
Thanks. Not the autoscaling part. I thought that even when you scale up manually with new replicas, you can't scale back down. I should read the manual and get a clearer picture.
> * ECS: Two containers cannot use the same port on the same node. An anti-pattern for containers.
Could you elaborate on this? I'm not sure I understand: are you saying that two containers cannot be mapped to the same host port? That would seem normal; you can't bind to a port where something is already listening. But I guess I must be missing something.
The OP is talking about how, when using a classic load balancer in AWS, your containers are all deployed exposing the same port, kind of like running "docker run -p 5000:5000" on each EC2 instance in your cluster. Once the port is in use, you can't deploy another copy of that container on the same EC2 node.
The solution is to use AWS's Application Load Balancer instead, which lets you dynamically allocate ports for your containers and route traffic to them as ECS services.
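Concretely, the dynamic-port trick amounts to setting hostPort to 0 in the task definition so ECS picks an ephemeral host port for each task, and letting the ALB target group track those ports. A rough boto3 sketch; the names, image, and ARN are placeholders:

    import boto3

    ecs = boto3.client("ecs")

    # hostPort=0 lets ECS assign a random host port per task,
    # so several copies of the same container can share one node.
    ecs.register_task_definition(
        family="web",
        containerDefinitions=[{
            "name": "web",
            "image": "example/web:latest",     # placeholder image
            "memory": 256,
            "portMappings": [{"containerPort": 5000, "hostPort": 0}],
        }],
    )

    # The service registers each task's dynamically assigned host port
    # with the ALB target group.
    ecs.create_service(
        cluster="default",
        serviceName="web",
        taskDefinition="web",
        desiredCount=4,
        loadBalancers=[{
            "targetGroupArn": "arn:aws:elasticloadbalancing:...",  # placeholder ARN
            "containerName": "web",
            "containerPort": 5000,
        }],
    )

With hostPort pinned to a fixed value instead, you're back to one copy of the container per node.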
I'm not familiar with the details of AWS here, but maybe the OP means mapping two different host ports to the same port on two different containers? That's all I can imagine that would be a container antipattern in the way described.
That is perfectly possible with ECS, so I don't know what the OP was referring to. What I do remember, though, is that you have to jump through a lot of hoops, like making four API calls (or more with pagination) for what should have been a single call, to make such a system work on ECS.
Nowadays you would often run containers with a container network (Flannel, Calico, etc.) that assigns a unique IP per container, which avoids conflicting port mappings regardless of how many containers with the same port run on a single host.
Adding more context; sorry for leaving it out in the first place. I mostly work in the big data space. Google Cloud's big data offerings are built for streaming, storing, processing, querying, and machine learning on internet-scale data (Pub/Sub, Bigtable, Dataflow, BigQuery, Cloud ML). AWS scales to terabyte-level loads, but beyond that it's hard and super costly. Google's services autoscale to petabyte levels and millions of users smoothly (for example BigQuery and the load balancers). On AWS, that requires pre-warming and allocating capacity beforehand, which costs tons of money. In companies working at that scale, the usual saying is "to keep scaling, keep throwing cash at AWS". This is not a problem with Google.
Quoting from the article: "This accomplishment would not have been possible for our three-person team of engineers to achieve without the tools and abstractions provided by Google and App Engine."
Take the use case from the article: they release the puzzle at 10 and need the infrastructure ready to serve all the requests. On AWS, you need to pre-warm load balancers, increase your DynamoDB capacity, scale up instances so they can withstand the wall of traffic, and so on, and then scale everything back down after the traffic passes. All of this takes time, people, and money. Add the other things the author mentioned (monitoring/alerting, local development, combined access and app logging, ...) and it pulls focus from developing great apps to building out the infrastructure for them.
Currently, I am working on projects that use both Amazon and Google clouds.
In my experience, AWS requires more planning and administration to handle the full workflow: uploading data; organisation in S3; partitioning data sets; compute loads (EMR-bound vs. Redshift-bound vs. Spark (SQL) bound); establishing and monitoring quotas; cost attribution to different internal profit centres; etc.
GCP is - in a few small ways - less fussy to deal with.
Also, the GCP console, itself not very great, is much easier to use and operate than the AWS console.
Could you please post the URL for the resource and the number of hits it receives? I'm interested in high-load websites and I have a hard time picturing how this could reach petabytes.
The impression I'm getting is not that GCP scales better, but that it scales with less fuss: the anecdotes here all suggest that with AWS, once you hit any meaningful load (i.e. gigabytes), you need to start fiddling with stuff.
I don't know if this is actually true, I've never done any serious work in AWS.
Hold on, please do not say Google Cloud scales well. Yes, they have services that make a ton of claims, but unlike AWS, things don't work as promised, which is magnified by the fact that their support is far worse.
Additionally, BigQuery is far more expensive than Athena, and you have to pay a huge premium on storage.
The biggest difference is that Amazon provides you infrastructure, whereas Google provides you a platform. While App Engine is certainly easier to use than Elastic Beanstalk, you have very little control over what is done in the background once you let Google do its thing.
GCP support can sometimes be bad, but these other claims don't add up. What isn't working as promised? BigQuery can do a lot more than Athena, and its storage pricing is the same as or cheaper than S3's.
We've used five different providers for a global system, and GCP has won on both performance and price. We still use Azure and AWS for some missing functionality, but the core services are much more solid and easier to deal with on GCP, which is also far more than just App Engine.
It seems very strange to paint AWS with such a broad brush, considering that AWS has tons of services at various levels of abstraction (including high-level abstractions like Elastic Beanstalk and AWS Lambda).
Sorry for not adding context. I was referring to the use case the author of the article was talking about: running a website. You need to stitch together ELB, EC2, a database, caching, service splitting, auth, scaling, and so on, whereas on Google Cloud, App Engine covers most of those points.
As a DevOps consultant I've actually worked with clients migrating stacks to and from GCE/AWS (Yeah, both ways, not the same client).
What I've found in aggregate is that GCE is a bit easier to use at first, as AWS has a LOT of features and terminology to learn. When it comes down to it, though, many GCE services felt really immature, particularly their Cloud SQL offering.
One client recently moved from GCE to AWS simply because their Cloud SQL instance (fully replicated with failover, set up according to GCE recommendations) kept randomly dying for several minutes at a time. After a LOT of back and forth, Google finally admitted that they had updated the replica and the master at the same time, so when it failed over, the replica was also down.
There were other instances of downtime that were never adequately explained, but overall that experience was enough for me (and the client) to totally lose faith in the GCE team's competence. Even getting a serious investigation into the intermittent downtime, and an explanation, took over a month. By that time our migration to AWS was in progress.
GCE never did explain why they would choose to apply updates to the replica and the master SQL instance at the same time, and as far as I know they are still doing this. I asked if we could at least be notified of update events and was told that's not possible.
There were other issues as well that, taken together, just made GCE seem amateurish. I'm sure things will get better as they mature a bit, and it is cheaper, which is why I wouldn't necessarily recommend against them for startups just getting going today. By the time you are really scaling, it's likely they'll have more of the kinks worked out.
(GCP support here)
This is a known bug; I've worked at least a few cases where this happened. There is a feature coming out soon that will allow different maintenance schedules to be set for masters and replicas, which will likely be set to different times automatically. And once the kinks get worked out, hopefully we'll be able to re-deploy the feature that shifts traffic to the failover while the master is being updated, eliminating maintenance downtime altogether.
I use Azure all the time (App Services, Storage, Cloud Services, VMs, SQL, CDN, etc.) and almost never run into this issue. Can you share some examples of what you mean?
I use all that stuff and constantly run into it. Are you using it in anger?
Here are a few I can remember off the top of my head:
- A (relatively) huge 6ms lag between the website and the DB
- One (random) site will mysteriously max out on memory on app-pool startup and take all the others down
- Their scheduler has no concept of timezones
- Their scheduler uses your local time when setting up the job, but UTC for other parts (this has been an open issue for over a year I think)
- Files will get mysteriously locked in deployments and the deployment process will silently fail
- Deployments will suddenly take an absolute age for no reason
- The entire admin UI will slow to an absolute crawl for hours on end
- Some admin tasks always claim they've failed, even though they've succeeded
- Their API wrapper is just wrong on almost every level
Add on top of that the worst management UI I've ever had to deal with, and it makes Azure very painful to use at times. Someone thought nesting menus in a standardised format was a good idea. It wasn't. Everything is fairly terribly named too. Want to see how your deployment is doing? That's under "Deployment Options".
Performance is also dogshit compared to the cost; my 4-year-old laptop is faster than their "premium" offerings.
No, I'm using it happily. We don't use the scheduler, and we create new deployment slots when deploying (which maybe prevents the locking issues). Sometimes I experience oddness in the Azure portal and have to refresh, but I've never had it slow down. As for SQL latency, it's been insignificant for us, so I'm not sure whether what we experience is better or worse than yours.
I agree the portal is partly confusing/messy, but we set things up once and then do deploys via CI infrastructure. Even the initial setup we try to automate using PowerShell instead (to make it reproducible).
When it comes to pricing I agree, but my laptop does not do multi-datacenter so well.
I'm not questioning anything you say, of course. Maybe I've gone blind, or I don't see the issues as critical as you do, or maybe I'm just luckier.
Hi, I am not sure you will read this as it's 10 hours after posting. Cloud SQL updating both copies sounds like a bug. If you want to email me your case number, I can look into it. I know you don't work with GCP anymore, but I'd like to resolve the issue for other users.
Email: tsg@google.com
Disclosure: I work on gcp support. Not paid to be here.
As a database engineer working for one of the largest e-commerce companies in the market, I can clearly see your point. AWS RDS is definitely very mature compared to Cloud SQL. I think Cloud SQL only provides MySQL and Postgres (still in beta?), so GCE needs to build out its database arsenal soon.
Next, your client faced issues with replication in GCE; that's not good to hear, but we face issues in our AWS RDS MySQL and Aurora very frequently too. RDS MySQL error logs are not generated properly, Aurora has weird memory leaks and connection spikes and starts behaving sporadically when memory crosses 80%, and so on. We are still working with AWS to figure it out (credit to AWS Support for trying to help us). So, to conclude: whether you are on AWS or GCE, this is the trade-off of "cloud". We have to live with that if we're moving to the cloud!!
Cloud SQL for Postgres is also hilariously hard-limited to 100 simultaneous connections (the default). It doesn't matter how much RAM you give it.
My experience with GCP over the past 4 months has led me to revise my "friends don't let friends use App Engine" motto to "friends don't let friends use Google Cloud". There isn't a single service I touched (except maybe Compute Engine) that didn't have half-baked client libraries, documentation gaps, bugs on the server side, or a complete failure by Google to even have their engineers use the competition's tooling before inventing their own shitty clone (DNS).
I was super keen to switch to GCP (for cost savings, etc.), but this mirrors a lot of my experiences. Deploys to App Engine took 20 minutes, not 2 minutes, and I have absolutely no faith in their firewall settings actually working. I have no idea what the problem is, but it's basically impossible to boot a Rancher master node on Compute Engine, even with all ports open. In the end I just bailed on the platform as a whole, and I'm moving to a hybrid approach on smaller providers like Packet.net and Digital Ocean.
And I would have been totally fucked over by that Postgres connection limit when we went into production; I'm glad I dodged that bullet! I hadn't bumped into it when playing with dev environments, and I haven't seen that limit mentioned anywhere.
App Engine Flex takes a long time to deploy, and always has. App Engine Standard is what deploys quickly, and it also scales faster.
Firewall settings work just fine for our platinum clients with complex network architectures; I don't see why they wouldn't in your case unless something was misconfigured.
Isn't Flex the newer of the two platforms? Is there a reason why it's so slow to deploy? I deployed an app with it that deploys in a couple of minutes anywhere else, including build time, but it took an insane amount of time on GAE, and I never managed to find a good reason why.
Normally I'd assume I had configured something wrong, except in this case it was insanely simple: a network tag allowing all ports, both ingress and egress, to any destination/source, definitely applied to the servers, and yet they had constant connection issues with each other.
It probably was something I did, but the combination of those issues, surprise egress bills, a very laggy UI, and various other little niggles just made it not worth my time for now. I'm keen to avoid vendor lock-in anyway, so GCP and AWS don't have that many extra features over smaller providers for me.
Because a key-value store is a fundamentally simpler data structure (it's a hash) than a relational database, which tracks the relations between different data types. If you make advanced use of a key-value store, you end up with a lot of logic in the application (for example key management, cascade operations between related data, ...) that a relational database would do for you. The comparison isn't fair because it ignores the development cost of using the key-value store.
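To make that concrete, here's a rough sketch of what a "cascade delete" looks like when the application has to do it against a key-value store like DynamoDB (table and key names are made up, assuming Python with boto3); in a relational database this would be a single DELETE with ON DELETE CASCADE:

    import boto3
    from boto3.dynamodb.conditions import Key

    dynamodb = boto3.resource("dynamodb")
    # Hypothetical table keyed by customer_id (hash) and order_id (range).
    orders = dynamodb.Table("orders")

    def delete_customer_orders(customer_id):
        # The application, not the database, finds and removes every related
        # item; the "cascade" lives in your code (pagination omitted here).
        resp = orders.query(KeyConditionExpression=Key("customer_id").eq(customer_id))
        with orders.batch_writer() as batch:
            for item in resp["Items"]:
                batch.delete_item(Key={"customer_id": customer_id,
                                       "order_id": item["order_id"]})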
DDB -> NoSQL, no automatic backups, no support for ad-hoc querying, eventually consistent (though you can request strong consistency with a few tradeoffs).
Spanner -> RDBMS, automatic backups, rich SQL, strong consistency.
Let me know if you still think it's fair to compare these two databases.
Hey community, let me share my experience with App Engine. I work at a small firm where we've developed a massive software application comprising 12 medium-sized apps. I went with Phoenix 1.3 with the new umbrella architecture.
With App Engine, the beauty is that you can have many custom-named microservices under one App Engine project, and each microservice can have many versions. You can even decide what percentage of traffic should be split between the versions of each service.
What's awesome is that, in addition to the standard runtimes (Python, Java, Go, PHP, etc.), Google also provides custom runtimes for App Engine, meaning you can push Docker-based setups into your App Engine service with basically any stack you want. This alone is a HUGE incentive to move to App Engine, because a custom stack would usually require you to maintain the server side of things, but with Docker + App Engine it's zero devops. Their network panel is also very intuitive for adding/deleting rules to keep your app secured.
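To illustrate the versions and traffic-splitting bit, deploying a new version without promoting it and then shifting a slice of traffic is roughly this (the service and version names are made up):

    # Deploy a new version of the service but keep serving the current one.
    gcloud app deploy --version v2 --no-promote

    # Gradually shift traffic: 90% to v1, 10% to v2.
    gcloud app services set-traffic api --splits v1=0.9,v2=0.1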
I've been using App Engine for over 4 years now, and every time I've tried a competing offering (such as AWS Elastic Beanstalk, for example) I've only been disappointed.
App Engine is great for startups. For example, a lesser-known feature within App Engine is its real-time image processing API. This allows you to scale/crop/resize images in real time, and the service is offered free of charge (except for storage).
Works really well for web applications with basic image manipulation requirements.
The best part is that you call your image with specific parameters that do the transformations on the fly. For example,
<image url>/image.jpg?s=120
will return a 120px image. Appending -c will give you a cropped version, etc.
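For reference, that serving URL comes from the App Engine Images API; a minimal sketch in the Python standard environment looks roughly like this (the blob key is a placeholder for an image you've already uploaded):

    from google.appengine.api import images

    def image_urls(blob_key):
        # blob_key: placeholder; comes from your upload handler (Blobstore/GCS).
        # get_serving_url returns a stable URL backed by Google's
        # image-serving infrastructure; size/crop options are appended.
        serving_url = images.get_serving_url(blob_key)
        return {
            "original": serving_url,
            "thumbnail": serving_url + "=s120",   # 120px version
            "cropped": serving_url + "=s120-c",   # 120px, cropped
        }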
I really hope to see App Engine get more love from startups, as it's a brilliant platform and much more performant than its competitors' offerings. For example, I was previously a huge proponent of Heroku, and upon comparing numbers I realized App Engine is way more performant (in my use case). I'm so glad we made the switch.
If you're looking at or considering a move to App Engine, let me know here and I'll try my best to answer your questions.
Full disclosure: I DON'T work for Google and I DON'T sell their services/products. I run a startup myself, with a SaaS on top of App Engine. This is just my documented (positive) experience with the stack above. I get paid nothing by Google, neither in dollars nor in credits; this is just my own personal experience.
Being a long-time HN member, I would responsibly disclose it if I were somehow affiliated with Google (trust me, I wish I was).
The way you've delivered this "experience" makes you sound like you either work for Google or were asked to make a sponsored statement - for credits or $.
The reason is that the accusation is false orders of magnitude more often than it is true—because people falsely assume someone else can't possibly be holding an opposing view in good faith—and false accusations damage the community.
Or a passionate user who had a great experience?! I love finding solutions that require less "square peg, round hole". Unfortunately, that's rare these days when piecing together a stack from the myriad of platforms/frameworks/etc.
I'm usually not skeptical of comments but this comment definitely feels "artificial".
I think Google has better things to do than pay people to comment on HN, but I do think this person is trying a bit too hard to sell us on Google Cloud because they like it (which isn't a bad thing per se).
Edit: I thought about it, and they probably aren't affiliated with Google, just really enthusiastic about it (a good thing), but they do want to sell us on it (eh, not sure how I feel about that).
Yep - I guess I'm a bit empathetic to the comment. I'm always trying to sell what I'm using to others, to get more people into that camp and to generate more discussion and innovation. But it's all just like ice cream [1] anyway.
Sadly, this is rarely true. Many of the people defending AMP, for example, turn out to be Google employees who didn't disclose it (found via their comment history), and the people who complain about AMP are not just privacy activists but often also people involved with publishers or advertisers that lose money from it.
I doubt it would be much different in this thread.
I migrated a big data stack from AWS to GCP. Reasons: GCP has better documentation, the AWS console and various services confuse the heck out of me (I guess I'm getting too old), and the security integration between GCP services saves a huge amount of time. It's super easy and very fast to use Google Compute Engine VMs. Given that the company I work for uses G Suite, it's a piece of cake to implement SSO and other integration pieces. It's also cheaper for us than AWS and more performant.
Wait until GCP matures a bit more, with BigQuery and the other services. Also, GCP does not have enough regions ready, and it takes forever to get a new one; I've been waiting for the Mumbai region for more than 3 to 4 months now.
But one thing I like about GCP is that it lets you set a hard spending limit and ensures you won't cross it. AWS can send alerts, but say your whole team is in one location, there's an emergency like a flood, and nobody checks email: then you are done. I stopped using AWS after I learned that there is simply no way to set a hard limit. Waiting for GCP to open their Mumbai region. Sigh.
Also, AWS is very deceiving with the free tier; there is simply no way to understand which products are free, and in the worst case you get charged once the free tier runs out.
AWS doesn't add a new region every 3-4 months either. Adding a new region is very complicated. Normally the vendor does not actually build the data centers themselves; they source from other data centers in the region whenever possible. Building a new DC is not something that can be taken lightly. And then, finally, there are local laws.
"Due to the inelastic architecture of our AWS system, we needed to have the systems scaled up to handle our peak traffic at 10PM when the daily puzzle is published."
WT... I had to reread this to make sure I didn't misunderstand... why not work on making the current architecture elastic?! #cloudPorn
The "inelastic" might have been a shot at AWS. When pressed, the AWS people do use phrases like "pre-warming", "over provisioning" and "advance notice" around their ELB/ALB setup and ECS.
Google's cloud salespeople pitch that they don't require any of that.
App Engine instances can typically start in 30 seconds or so. So if your spike is because your video went viral on Facebook and lots of people are looking at it, that's fine.
If your spike is because you have 10 million clients with an app set to make an HTTP request at exactly 10:00:00pm, and they all arrive within a quarter of a second, that's a problem.
30 seconds is actually a gross overestimate for App Engine startup. Your description of our problem is very accurate, though. We still need to over-provision just before the spike, which we do with a cron job that scales up to several hundred instances 5 minutes before 10pm and then back down to normal levels 5 minutes after. Autoscaling takes care of the rest.
Curious whether the need for pre-warming ELB/ALB still applies. Last time this came up, an AWS employee mentioned it is no longer necessary (https://news.ycombinator.com/item?id=14052079), but it would be nice if this were documented.
I don't want to dox myself, but about a year ago, when my employer forgot to notify AWS about switching our production traffic (about 5K rps at that time) from one ELB to another, we failed requests for several minutes before we decided to just switch back to the old ELB and then ask them to do a pre-warming before we switched again.
The "advance notice" and "over provision" advice is still being given for things that could scale up fairly large. (where fairly large isn't anything that exciting, really)
IMO, one really killer bit is the first piece they mentioned:
"Google provides an SDK that enables users to run a suite of services along with an admin interface, database and caching layer with a single command."
I really wish AWS had a decent local dev story, rather than relying on 10 separate half-baked OSS solutions
Have you tried the documentation for Google's SDK? I gave up in disgust trying to track down what I needed and then trying to manage things through multiple different admin interfaces.
I've been using the Google bits recently. I agree that the docs for e.g. all the Python SDKs need a lot of work.
That said, boto3 has thorough docs, but I wouldn't consider them particularly well organized. I can only really navigate the AWS SDK docs because I already know what I want to do and can google the specific terminology.
(Google cloud support here)
The pages you linked to are meant to serve only as client library references. If you want higher-level instructions and examples, always start with the main Google Cloud docs. On any page that offers instructions, the top of the code window lets you select from the client library languages, CLI tools, and REST API available for that task. For Cloud Storage, start here:
Choose a topic, and select "Python" at the top. It should provide instructions and examples using the Python libraries.
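For example, basic Cloud Storage usage with the Python client library looks roughly like this (the bucket and object names are placeholders):

    from google.cloud import storage

    client = storage.Client()                       # uses your application default credentials
    bucket = client.bucket("my-bucket")             # placeholder bucket name
    blob = bucket.blob("reports/2017-08-01.csv")    # placeholder object name

    blob.upload_from_filename("local-report.csv")   # upload a local file
    print(blob.public_url)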
Also, we have a repo of demo projects and examples for nearly every GCP product/service, and then some. Great examples to be found here (some might be out of date though):
Thanks for the link! You're right that this is what I was looking for. Unfortunately, it hadn't shown up in a convenient place while I was googling around. It would be good to add direct links to those from the client library references, because those pop up first for e.g. "google storage python". (Unless they're already there and I didn't see them.)
I don't disagree with you, but it's also true that you don't have to go all-in. I run services that run server-fully on EC2, while requests to the "front end" are handled by Lambda. If you've got one or two very hot endpoints, and the majority of your server load spike comes from processing those endpoints, moving just those endpoints to Lambda can give you the reassurance that you don't have all of your eggs in one basket (by not going fully serverless) while still getting the benefit of being able to scale up and down essentially effortlessly.
Of course, there are lots of other considerations when going even partially serverless, and it could be that the NYT chose not to experiment with Lambda for other reasons. For instance, you're effectively tied to CloudWatch for logging and monitoring, which could be a deal-breaker. Much of the processing could be happening in the DB, which would make Lambda moot. Or it may simply be that their estimated usage of Lambda was too costly.
I'm not sure why people are so enamored with ELB: just terminate SSL at your web boxes using nginx and publish the public IPs of all of those machines in your DNS records.
You remove a bunch of ELB per-request costs this way, and you can scale it however you see fit.
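For what it's worth, the DNS side of that setup is just a multi-value A record; a rough sketch with boto3 and Route 53 (the zone ID, domain, and IPs are all placeholders):

    import boto3

    route53 = boto3.client("route53")
    web_box_ips = ["203.0.113.10", "203.0.113.11", "203.0.113.12"]  # placeholder IPs

    # Publish every web box's public IP under one A record; clients pick
    # among them (round-robin-ish), so there is no ELB in the path.
    route53.change_resource_record_sets(
        HostedZoneId="Z123EXAMPLE",                 # placeholder zone ID
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com.",
                "Type": "A",
                "TTL": 60,                          # low TTL so removals propagate quickly
                "ResourceRecords": [{"Value": ip} for ip in web_box_ips],
            },
        }]},
    )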
That works great until a machine starts failing health checks and you need to take it out of rotation ASAP. It's also the case that DNS gets cached and not all users create the same load: one user making many requests will burden one server instead of having the load evenly distributed.
Clients will try another IP if they can't connect. Partial failures may be a problem, but in my experience, as long as nginx is alive, you can load-balance to a different web backend if the app processes on that machine are wedged.
I've deployed this solution in 50k req/s environments and haven't seen a single user be a problem like you mention; any motivated bad actor could cause problems in either scenario, I expect.
Clients might not fail to connect, which is even worse: they connect, and the server hangs and returns no response (perhaps due to bad configuration changes). Now you're stuck, and your server is too oversaturated to SSH in and fix it.
It all depends on your application and users. Building a website? Probably not much of an issue. Building a low-latency API? YMMV. Keeping your load evenly balanced across your front-end cluster can also keep your costs low, since you are able to distribute load more evenly.
That's another point: if you scale your cluster size up and down frequently to accommodate load, doing that with DNS is a nightmare.
Getting pro-GCP articles to the top of HN must no doubt be a high priority for the Google marketing team. This is the nature of modern advertising: sneakily trying to subvert your thinking by masquerading as something else.
We actually spent the entire day at the Giants game today. ¯\_(ツ)_/¯
There's no incentive for high-ranking HN posts, or any HN posts, actually. If there were, you wouldn't see others continually submit our news here before we do. This was a nice and unprompted post, and a good read for everyone on GCP as well.
(Disclosure: I work on GCP as a product marketer.)
A GCP product marketer responding to negative commentary on HN within ~20 minutes of it being posted strikes me as automated.
EDIT: It is a double standard, though. HN readers want access to and responses from people on the GCP team, but at the same time cry tinfoil-hat subliminal marketing, etc.
The first statement is partially true (it's certainly nice), but that's true of any marketing team. To be clear, I don't think the NYT people were pushed to do this. Engineering blogs commonly describe what a team did so they can take pride in their work. I'm not in marketing, so I don't know if we were involved. Maybe they sent it over for review, but I kind of doubt it. (To the comment below: we don't pay people for content, and it's demeaning to the engineers at NYT to suggest that.)
Disclosure: I work on Google Cloud (as an engineer, not in marketing, despite how much I like HN).
I sometimes ask Google Cloud customers who post this type of thing: what's your motivation? The one consistent theme is that telling the world you're doing cool stuff with cool tech raises your organization's profile and helps with recruitment.
From my experience working with the NY Times, they are certainly a top-notch engineering organization. They should be free to advertise that.
I don't know whether these engineers are being forced to publish these kinds of blog posts, but the far more likely (and more respectful to said engineers) scenario is that they just want to talk about their work. This isn't the first time the NY Times has done this [0][1].
(Work at Google Cloud on same team as boulos and worked with NY Times on some of their migration pieces like BigQuery)
I imagine the resulting discussion may also open the team's eyes to alternatives that others have succeeded and failed with, which in turn helps the team in further refining their process.
> Engineering blogs commonly say what they did to take pride in their work
Yep, engineering blogs are also marketing, aimed right at HN readers. Attracting good engineering talent isn't easy, so companies have to do marketing on that front as well.
But, as usual, people being marketed to (in this case, HN readers) don't realize they are being marketed to.
Heaven forbid a software engineer post about a major technical change their team just accomplished. It couldn't be because they want to raise their team's profile for hiring, or discuss the pros and cons of their experience, or raise their own profile as an engineer.
That's the nature of conspiracy theories: sneakily insinuating a well-organised plot to harm, while masquerading as someone who thinks Occam's razor is a hairdresser.
Google likely offers significant discounts to companies which write these sorts of pieces for them. And obviously a lot of Googlers participate actively here, so getting upvotes naturally is likely not a challenge.
Has anybody had a successful experience deploying Docker containers on App Engine? Last time I tried, I had such a bad experience in terms of deployment speed (time to build the image, then upload it, then wait for the whole thing to deploy) that I reverted to managing my own GCE instance.
With GKE you don't necessarily update your load balancer rules with a deploy, correct? The linked thread blames App Engine deploy times on waiting for GCLB to update.
Correct, you have software based load balancing under the covers.
My Google load balancers never move. It's a single thing that points at each node (physical machine) in the cluster and distributes traffic between them.
Each node knows how to route traffic to each app. So when I deploy an app, the software load balancer at the node level slowly moves traffic over from the old app to the new one. The entire thing is MAGICAL: zero downtime and very, very fast deploys.
Edit - But yes, this explains it. Changing the Google load balancers is like a 5-minute ordeal. A total pain. It's nice that with GKE you only need to touch them when your node count changes, which can be very rare (~monthly for me).
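For context, on GKE that gradual hand-off is just a standard Kubernetes rolling update; for example (the deployment and image names are placeholders):

    # Point the deployment at the new image; Kubernetes rolls pods over gradually.
    kubectl set image deployment/web web=gcr.io/my-project/web:v2

    # Watch the rollout finish (old pods drain as new ones become ready).
    kubectl rollout status deployment/web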
Funny you mention that; I was experiencing this just yesterday in the Flex environment. Re-deploy time for a 3-character text change on one page is 8+ minutes per re-deploy. This is insane.
(Google cloud support)
Unfortunately, this is a pain point with no easy solution at the moment. It doesn't matter how trivial a change is; it isn't the Docker deployment that takes all that time. As mentioned above, the bottleneck is updating the GC load balancer with the new routing rules, which takes time to propagate throughout the system. This is a high-priority issue internally, but updating the load balancers is no trivial task and will take a lot of time and testing.
In the meantime, I recommend the following mitigation strategies:
1. Try to get into the habit of carefully reviewing and testing new versions locally before deployment. Client libraries should still work if you have a valid default application credential set up. I say this because I have a hard time remembering to do this as well.
2. Static content and templates for your site should be hosted on GCS, not deployed with your app in a "static/" folder or the like. It's easy to fix typos, HTML/CSS, and JS errors by simply using gsutil to copy the fixed file over, and it takes only a second (see the one-liner at the end of this comment).
3. Always keep a stable version of your app available in case you break something in a new deployment. It's quicker to route traffic to an older version than it is to track down a bug and wait for the fix to finish deploying.
Not ideal, but again, Googlers have to suffer through this too and are very motivated to find a way to fix it.
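For the GCS route in point 2, the fix-and-copy step really is a one-liner (the bucket and file names are placeholders):

    # Overwrite just the broken file; no app re-deploy needed.
    gsutil cp static/style.css gs://your-app-assets/static/style.css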
I was sure this was going to be about some multiplayer game thing, but no, it's a crossword. I'm not entirely sure what they are even scaling here; I was expecting an article about a CDN.
It’s a paid service that hundreds of thousands of people use. Not world changing stuff, but considering competitors are still using Java applets and Flash, I for one applaud their efforts.
Does anyone know how much it costs to add a custom domain and SSL to App Engine (Standard and Flexible)? I have been looking and haven't been able to find out how much it costs.
It's free for custom domains, but only using existing certs; you still have to set up the cert and such with your domain as you normally would. You also get the *.appspot.com domain with automatic SSL.