"Unlike traditional relational databases, the Datastore uses a distributed architecture to automatically manage scaling to very large data sets. ..
Datastore reads scale because the only queries supported are those whose performance scales with the size of the result set (as opposed to the data set). This means that a query whose result set contains 100 entities performs the same whether it searches over a hundred entities or a million. This property is the key reason some types of queries are not supported."
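To make the quoted claim concrete, here is a minimal sketch of why a range scan over a sorted index costs about the same whatever the size of the data set. It is an illustration only, not how Datastore is actually implemented: the sorted Python list standing in for an index and the `range_query` helper are both assumptions. Strictly speaking the seek step is logarithmic in the index size, so "performs the same" is an approximation.

```python
import bisect


def range_query(index, lo, hi):
    """Return all keys k with lo <= k <= hi from a sorted list.

    Hypothetical stand-in for an index scan: two binary searches find
    the boundaries (O(log n)), then the slice copies only the k matches.
    """
    start = bisect.bisect_left(index, lo)
    end = bisect.bisect_right(index, hi)
    return index[start:end]


# Two "data sets" of very different size, both already sorted.
small_index = list(range(1_000))
large_index = list(range(1_000_000))

# Same 100-entity result set from both; the work done after the seek
# depends on the 100 results, not on the size of the index.
small_hits = range_query(small_index, 100, 199)
large_hits = range_query(large_index, 100, 199)
```

A query that cannot be answered by one contiguous scan like this (a join or an aggregate over everything, for example) is the kind the quote says is not supported.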
"You should be aware that Cloud Datastore has a serving component that runs on Google App Engine, so there will be instance hour costs."
It is only after reading this bit that I went, d'oh, this is the same Datastore that you use on App Engine, abstracted and factored out. So basically you don't need your app to run on App Engine to use it anymore -- this is what the noise is about. Got it. Sheesh could have just said that. :)
The most comparable thing seems to be providers of Big Ass CouchDB in the cloud like: https://cloudant.com/
It's because that URL erroneously has the left-to-right control character[1] at the end, which encodes as https://cloudant.com/%E2%80%8E - a missing page on that website.
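For anyone who wants to reproduce this, a small sketch using only the standard library. U+200E is the invisible left-to-right mark; the `clean_url` helper is a hypothetical name, and stripping only U+200E and U+200F is an assumption about which invisible characters you care about.

```python
from urllib.parse import quote

# The URL as pasted, with an invisible left-to-right mark (U+200E) at the end.
pasted_url = "https://cloudant.com/\u200e"

# Percent-encoding turns the mark into its three UTF-8 bytes.
encoded = quote(pasted_url, safe=":/")
# encoded → https://cloudant.com/%E2%80%8E


def clean_url(url):
    """Strip leading/trailing directional marks (hypothetical helper)."""
    return url.strip("\u200e\u200f")


cleaned = clean_url(pasted_url)
```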
That looks like an awesome product! Kudos to the team, they truly did a great technical job here!
From a market perspective, it kind of makes no sense. I challenge CTO/CIOs to justify building their company on a closed source critical product with no option. While that used to be how things worked in the 90s, I don't think that is as acceptable as it used to be.
Maybe the answer is "well it's the only thing that works", in which case I congratulate you for building such an in demand product!
Absolutely agreed. For most companies, a couple instances of MySQL (or PostgreSQL/Oracle/SQLServer) with replication and backup should provide 4 nines uptime (which is probably comparable with what Google/Amazon offers in terms of availability). Throw in a few memcache here and there and you can easily scale up to millions of requests a day. FWIW, even Facebook is running on mysql/memcache.
It's not the hardware costs, it's the head count to set up and keep those database instances running, backed up, and tuned. That's the IaaS pitch -- no sysadmin.
As for the vendor lock, most of the datastore API is very similar to other NoSQL APIs I've seen. It wouldn't be trivial to switch, but not impossible either. I imagine it'd be even easier if you used an ORM that is widely supported but I stay away from those, so I couldn't say for sure.
In theory I agree with you, but the reality is that even though SQL databases are all on the same "standard", there are enough non-standard things that in practice switching becomes impossible for high performance apps.
I'm afraid I don't see your point. Docker and some of the other new system tools seem like great products. I'm sure they make sysadmins' lives easier than when they had to run everything via custom shell scripts. It probably makes them more efficient, so you don't need as many sysadmins per server. But I don't see how it is ever going to reduce the need to less than 1.0 sysadmins. And just as it will make the sysadmins that work for your company more efficient, so too will it make the IaaS/PaaS providers' sysadmins more efficient.
As a developer, I have very little interest in administering databases, servers, networks, and so on. It's not what I'm good at, it's not what I'm interested in, and it's not what I know. That the tools now use Turing complete configuration languages that also happen to be fairly popular general purpose languages doesn't change any of that.
Certainly, the IaaS/PaaS pitch isn't for everyone. But for those who are in the sweet spot, it is a very compelling pitch. Frankly, whether or not the platform is open source doesn't really impact its appeal very much. At least from where I sit.
AFAIK Facebook is no longer running on memcache, they built their own service. Besides, Amazon and Google offer 5 nines, which is much better than you can get on a conventional database, not to mention the scale.
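For a sense of what the "4 nines" versus "5 nines" figures in this subthread mean in practice, here is a back-of-the-envelope sketch. It assumes a 365-day year and says nothing about what any provider actually guarantees; `downtime_minutes_per_year` is a hypothetical helper name.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_minutes_per_year(nines):
    """Allowed downtime per year for an availability of N nines.

    4 nines means 99.99% available, i.e. unavailable 10**-4 of the time.
    """
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


four_nines = downtime_minutes_per_year(4)  # roughly 52.6 minutes a year
five_nines = downtime_minutes_per_year(5)  # roughly 5.3 minutes a year
```

So the gap being argued about is on the order of 47 minutes of downtime a year.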
"a closed source critical product with no option."
Sadly, I disagree. While it isn't SOP for a startup to go down this route, many companies still end up dealing with vendor lock-in.
Unfortunately, Google is going to continue dealing with the stigma of abandoning and sunsetting their own product lines, even if it hasn't happened in the enterprise sector. I just don't trust them.
On the downside it can't do aggregates, it can't do query filters on more than one property, it can't do joins, it has its own unique query language, and it is a proprietary system with no chance of moving to another provider or hosting it yourself.
Speaking of it having its own query language, perhaps that isn't a big deal given you can't do any non-trivial queries.
For me, the pejorative NoSQL claim is actually a marketing turn off.
I don't get this attitude towards commercial Google products. You are a paying customer that is very likely to use other Google cloud services. It would be stupid to annoy a large subset of the paying customers by not providing a clear roadmap of sunsetting. And Google stupid is not.
I am excited and will gladly try it on some side projects.
You may have missed when they jacked up AppEngine prices. It was fine for a lot of people already making significant money with their app but it basically killed my app (would have been unjustifiably expensive to keep running). I won't be trying proprietary google infrastructure again anytime soon.
A clear sunsetting roadmap still means that you trusted your business to something Google no longer cares about. It doesn't protect you from the cost of migration.
This is a cloud service. I'd imagine they will be much less likely to kill these services outright. The impact on paying customers would just be too large. That's not to say they won't sunset certain features though.
The impact of killing Google Checkout for customers using it was probably minimal, since nobody was using Google Checkout (hence the reason Google killed it.)
Cloud services, on the other hand: killing those would be highly disruptive and would probably effectively kill any products built on top of them. It's hard to imagine Google killing any of these services, unless one of them is, say, wildly unpopular and only a small handful of people would be outraged if it were killed. Even then, it seems insane to think about Google shutting off database or computing services people are using to run businesses.
You don't know that any company will keep a service alive a few years from now. Either (a) the company is willing to shut down services that don't make sufficient money to justify keeping them, which creates a risk, or (b) the company is not willing to do that, which creates a risk of the company failing and either going out of business or being purchased by someone who is willing to shut down the parts that aren't making money.
Still not standalone: “You should be aware that Cloud Datastore has a serving component that runs on Google App Engine, so there will be instance hour costs.”
Even if the serving component runs on Google App Engine, you don't have to deploy or manage it yourself. Once you activate Cloud Datastore you can access it like any other Google API:
https://developers.google.com/datastore/docs/apis/v1beta2/
But then what I don’t get is why is it advertised as a separate product. If it’s just a managed App Engine application then it shouldn’t be called “Cloud Datastore” and have a top link on your Cloud products page. However if it’s a standalone product then Google should eat the instance costs. It would be crazy if Maps API started charging for instances suddenly…
If you want to charge for instances then the way to do it is the DynamoDB’s way — charge for throughput. The actual number of instances should be an implementation detail. At least that way it’s deterministic and you can feature it prominently on your pricing page with an appropriate calculator.
Until then the advertising is completely misleading if not dishonest, since I’m pretty sure the instance costs would be quite significant for any use case that tries to use “Cloud Datastore” as a standalone db solution for GCE ecosystem, rather than to just run some Map Reduce for an App Engine app.
This service looks pretty good. After Google dropped Wave, I more or less stopped using AppEngine (useful for Wave robots). I think that their charging a higher price for AppEngine makes it look more appealing (because the service is less likely to be cancelled). Still I was pretty sure I would just go with my own servers and AWS in the future.
My opinion changed when I started working as a contractor at Google 2 months ago. I love the Google infrastructure, from inside the company. It is amazing (that is an understatement!). When my project is done next year and I go home, I am fairly sure that I will go back to using AppEngine, and other services for my own projects just out of nostalgia.
For the general public, I am not sure if the AppEngine and other Google services story is a good one though. It will be interesting to see how their market share for the PaaS does over the next few years.
Sorry if I've missed something, but what's the news? The schemaless datastore has been a part of App Engine since the beginning, and this particular evolution for at least a year...
I've got a little experience with using GAE for a production app, and the datastore was my favorite part. A lot of their components are a little bit jank, but the Python ndb library was pretty nice.
I definitely hope they plan to include ndb in their googledatastore library. I feel like ndb is the actual selling point to me, and would make me really want to use this for a production app.
I don't think I could trust Google enough to build anything serious with this, going by their recent track record it could be shuttered in under a year.
This looks just like Windows Azure Table Service. I'm guessing it has the same limitations such as no joins, aggregates, etc. They're great systems when you ONLY need to store and retrieve things by a single key. They work for some systems, but don't expect it to be your main data-store.
These links are much more informative:
https://developers.google.com/datastore/docs/concepts/
https://developers.google.com/cloud/pricing#cloud-datastore