Google Cloud SQL: your database in the cloud (googlecode.blogspot.com)
175 points by boundlessdreamz on Oct 6, 2011 | 55 comments



If this means not having to be locked into App Engine's proprietary database shenanigans just to consider deploying a project on it, then Google just made it back onto my list of useful PaaS providers.


Using a database wrapper like http://www.sienaproject.com/ you wouldn't be locked in.


My feelings exactly. It was always a blocker before; being locked in to a provider is murder.


My biggest complaint with Heroku is that they don't have a database offering between $15/month and $200/month.

This addition to App Engine makes them more appealing than Heroku at this point.


True, but since they sit on AWS, you can just use Amazon RDS with your Heroku app on a small instance and be just fine.


Which starts around $80/month. There is still a hole there.


But a reserved instance for a year is $227, which is not far from $15 a month.


This is misleading.

The reservation fee for a small instance is $227.50 annually (http://aws.amazon.com/ec2/pricing/) but you still pay an hourly rate to run it, albeit a lower rate than for a non-reserved instance. Running a small reserved instance for a year costs $227.50 + ($0.03 x 24 x 365) = $490.30/year, or $40.86/month. (This is in the US East region.)
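
For anyone checking the arithmetic, a quick sanity check in Python:

    # US East, small reserved instance, 2011 prices quoted above
    upfront = 227.50              # one-year reservation fee
    usage   = 0.03 * 24 * 365     # reserved hourly rate, running 24/7
    total   = upfront + usage
    print(total, total / 12)      # ~490.30 per year, ~40.86 per month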


Oops, of course. Sorry about that.


Thanks, I missed the reserved instances somehow.


The $227 buys you a lower hourly rate, it doesn't buy you use of an instance for a year.


Nothing has been said on pricing yet. It's free while it's still in developer preview.

Pricing will be revealed later. But even if it ends up being a 'mess' like App Engine's pricing situation, it should be much easier to move to another platform.


Cloud SQL is available free of charge for now, and we will publish pricing at least 30 days before charging for it.

Yeah. We know how well that went with GAE.


I'm convinced they were giving it away at a loss before. Now it's priced the same as similar services, which makes sense. Obviously it sucked for the people that got caught in it, but the expectations are properly set this time around.


Honestly, after what happened last time, this:

     Cloud SQL is available free of charge for now, and we will publish pricing at least 30 days before charging for it. 
seems a lot less nice.


And the fact that Google likes to shut down services.


Agreed, but the difference is that it should be far easier to migrate away if anything turns out badly for you.


Wow, does that mean we finally get full text search (i.e., LIKE '%%')? Bonus points for OR, JOINs, and multiple inequality filters.

Since they forbid these operations on the datastore, I'm guessing this will either not scale as well or be significantly more expensive. Since the sign-up sheet implies you can combine technologies, I suppose you could use a "hybrid" approach and keep denormalized data in the datastore.
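
If it helps, here's a rough sketch of what those newly-allowed queries could look like from Python, assuming the DB-API-style rdbms module the preview docs point at (the instance, database, and table names here are made up):

    from google.appengine.api import rdbms  # Cloud SQL DB-API module (assumed)

    conn = rdbms.connect(instance='my_instance', database='myapp')
    cur = conn.cursor()
    # Exactly the kind of query the datastore forbids:
    # a JOIN plus LIKE filters combined with OR.
    cur.execute(
        "SELECT p.title FROM posts p JOIN users u ON p.user_id = u.id"
        " WHERE u.name LIKE %s OR p.title LIKE %s",
        ('%alice%', '%cloud%'))
    for (title,) in cur.fetchall():
        print(title)
    conn.close()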


The Google App Engine team has been working on a full-text-search service for GAE projects. They gave a demo at Google I/O back in May. Here's a video of their presentation:

http://www.youtube.com/watch?v=7B7FyU9wW8Y



Yep, it currently limits you to a 10GB database size. Makes sense, though - SQL techniques like LIKE filters or joins really do require all of your data to be in memory with a very high-bandwidth, low-latency interconnect - i.e., it all has to be in memory on a single box.

This is the reason the datastore forbids these operations - because they're extremely difficult to efficiently implement and still scale indefinitely (without making other, potentially very large sacrifices).


What? A join requires all of your data to be in memory? I've sure as hell seen databases that executed queries with joins and the data was not entirely in memory...


Well, sure, it doesn't _require_ it. But the alternative is higher query latency. You have to collect a filtered set of the join column from table A, then ship it over to the servers responsible for table B. If your interconnect has high latency or low bandwidth, this can be painful - particularly if the intermediate set of keys is very large.

Hence the datastore simply disallows this - yes, it's _possible_ to make joins work on larger datasets, but it's not possible to make _arbitrary_ joins work well on larger datasets.
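
To make the two-phase shape of that concrete, a toy sketch (plain Python, everything in-process; in real life the key set built in phase 1 has to cross the network before phase 2 can run):

    def distributed_join(table_a, table_b, predicate):
        # Phase 1: filter table A locally and collect the join keys.
        keys = set(row['user_id'] for row in table_a if predicate(row))
        # Phase 2: ship `keys` to the server holding table B (a network
        # round trip in reality) and probe for matches. If `keys` is
        # huge, so is the payload you just shipped.
        return [row for row in table_b if row['user_id'] in keys]

    users = [{'user_id': 1, 'name': 'alice'}, {'user_id': 2, 'name': 'bob'}]
    posts = [{'user_id': 1, 'title': 'hello'}, {'user_id': 2, 'title': 'hi'}]
    print(distributed_join(users, posts, lambda r: r['name'] == 'alice'))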


Joins on indexed columns don't require anything apart from the index to be in memory in order to do the filtering.


Or preferably, Sphinx support. From the announcement I can't see that you can't do it, but I also can't see that you can.

InnoDB and MyISAM are not good at full text search even on local disk; no reason to make that worse.

A somewhat germane post from Percona: http://www.mysqlperformanceblog.com/2009/09/10/what-to-do-wi...


As far as I can tell this is just for App Engine stuff, with APIs for Python and Java. They could have opened it up to the web with REST APIs or something similar. Am I wrong?


Wouldn't the latency be a huge issue?


Depends on the use. Plenty of systems have a local memory cache and query the database only for large chunks of data.
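
That's the classic cache-aside pattern. A minimal sketch, with fetch_from_db standing in for the remote SQL round trip:

    import time

    _cache = {}        # key -> (expires_at, value)
    TTL = 60           # seconds

    def fetch_from_db(key):
        # Hypothetical remote query; this is where the WAN latency lives.
        return 'row-for-%s' % key

    def get(key):
        hit = _cache.get(key)
        if hit and hit[0] > time.time():
            return hit[1]                        # local hit, no network hop
        value = fetch_from_db(key)               # slow path: remote database
        _cache[key] = (time.time() + TTL, value)
        return value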


Only as much of an issue as with other database hosts? (i.e., not much of an issue at all)


What stops you from writing a GAE app to do just that? There are papers out there on how to map SQL to REST APIs; I've written a few myself.
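
The mapping really is mechanical. A toy stand-alone sketch (Python stdlib, with sqlite3 standing in for the hosted MySQL instance; note that forwarding raw SQL from the wire like this is wide open to abuse, so a real proxy would need auth and a query whitelist):

    import json, sqlite3
    from http.server import BaseHTTPRequestHandler, HTTPServer

    db = sqlite3.connect(':memory:', check_same_thread=False)
    db.execute('CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)')

    class SqlOverRest(BaseHTTPRequestHandler):
        def do_POST(self):
            # Request body: {"sql": "...", "args": [...]}
            length = int(self.headers['Content-Length'])
            req = json.loads(self.rfile.read(length))
            rows = db.execute(req['sql'], req.get('args', [])).fetchall()
            db.commit()
            body = json.dumps(rows).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            self.wfile.write(body)

    if __name__ == '__main__':
        HTTPServer(('localhost', 8080), SqlOverRest).serve_forever()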


This should make App Engine a fantastic place to host Django projects.


Except every Django project I've done in the last several years has been on PostgreSQL, not MySQL.

And I assume all the other issues still apply, e.g. background tasks and processing time limits.


Who cares what DB it is when you don't have to manage it?

Django's data access layer is abstracted, so it's actually pretty easy to migrate between backends.
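
In principle it's a settings change. A sketch of stock Django/MySQL settings (host and credentials made up; whatever engine string Cloud SQL actually ends up requiring may differ):

    # settings.py
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',  # stock MySQL backend
            'NAME': 'myapp',
            'USER': 'myapp',
            'PASSWORD': 'secret',                  # placeholder
            'HOST': 'sql.example.com',             # hypothetical host
            'PORT': '3306',
        }
    }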


I care much, much more when I don't manage it, because I've relinquished control to a third party.

Because I need to use it. Because I need to build features such as GIS, full-text search, and transactions. Features whose existence and level of maturity vary wildly across DB platforms.


I didn't see any references to ways by which one might use the special JDBC driver to access one's hosted MySQL data. It would be cute to create an app which proxies JDBC calls so that developers can use TOAD/DBViz/whatever to manipulate remote data. Such a driver might also be useful for one-off schema migration tasks. Did I miss something?


It seems pretty doubtful you would be able to use this to access the hosted data remotely.


This is huge for allowing corps to migrate apps onto Google's cloud. Many open source web frameworks can be used on GAE (albeit with limitations), but de-SQLing an existing app is probably a non-starter for most corporate dev groups. Even if scale-up is an issue past some threshold, the long tail of most mid-sized companies' in-house apps includes a bunch of semi-legacy ones which don't have very high usage levels. Think about stuff like the environmental regulation compliance database -- important, but probably not used by more than tens of people.


I don't know if it's so huge for corps. They tend not to mind spending the extra money to get EC2 + RDS in return for reduced lock-in, an environment you can control, and serious SLAs.


The buzzword 'scalability' is not in the blog post. I assume it's just a shared MySQL server like the ones shared hosting vendors offer.


I hope that the cost ends up being purely usage-based. I might use this for a few apps that have very limited database requirements. If my database is small and I run relatively few transactions per day, then I hope that the cost is minimal. If there is a large minimum charge, that would be a showstopper for my use.


I'm really curious how they implemented this. They say "data is replicated synchronously to multiple data centers." Synchronous MySQL replication over a WAN is pretty impressive. As others have said, this post doesn't mention scalability.


There are disk-device-level replication systems like DRBD that mirror disks synchronously across the network. It's a generic solution for any disk mirroring; MySQL just sits on top of it.

Google probably has some super-fast cross-datacenter links to allow fast replication. Still, the latency will be worse than with local mirroring for synchronous replication.


I believe they (Google) have a MySQL backend implemented on BigTable.

(I don't think that's been officially announced, but it was mentioned - somewhat accidentally - at a public Google event.)


How scalable is this? Are there capacity or transaction rate limits on Cloud SQL?


You get MySQL instances, just like Amazon RDS. All the limitations of MySQL in the real world still apply.


Is there any area Google doesn't stick its nose into? Now they're also in the DB business?


Google: bringing the reliability of the Internet to relational databases!


Gotta love the ways you have to literally just give away your data to Google.


The other cloud providers have similar offerings. It has nothing to do with Google.


Same issue. Most of the time it's Google, though. Their business is based on it and they're good at it.


> Gotta love the ways you can CHOOSE to give away your data to any third-party service provider.

Fixed


> Gotta love the ways you can optionally give away your data to any third-party service provider.

Fixed


Fixed

Please don't do that.


Ways, with an s. It's called English :-) Unfortunately, many just misread it because the comment expresses criticism.


> Gotta love the ways you can elect to give away your data to any third-party service provider.

Fixed



