AWS Database Migration Service (amazon.com)
141 points by hepha1979 on March 15, 2016 | 62 comments



The truly interesting thing to me about this product is the pricing. Competitors in this space sell software starting at ~$5,000 and going all the way up into five and, I presume, six figures.

AWS is giving away their software and only charging for the hardware to run this migration software on. The catch is that you need to be migrating to AWS resources which you are paying for. I'm sure this tool will have fewer features than the more established competitors, but for customers that fit the sweet spot, it will be very attractive.

Disclosure: I know some of the people at http://www.dbvisit.com/, a database migration company, and competitor to this.


You can migrate in or out of AWS, but one or both of your endpoints needs to be on AWS.


I wonder if you could realistically do a two stage migration, X -> Amazon -> Y


Amazon probably wouldn't care if you did. They still get paid to shuffle the bits.


What they really want is to keep you on RDS, not just get paid for the migration bandwidth.


Sure. But by forcing you to go through RDS even if you migrate X->RDS->Y, they are halfway there.

In the process you need to set up an AWS account, create an RDS database, import your data into it... and perhaps you stop there despite the initial plans. Or use RDS for your next project, since you are already more familiar with it.


Well it's in Amazon's best interest to get people to use and migrate to their infrastructure at the end of the day, so it's no surprise that a service that helps you move to AWS is competitively priced.


The lock-in is where the catch is, right?


There's not a whole lot of lock-in using an RDS instance. It's MySQL.


... with automatically managed updates, backups, monitoring, auto-scaling, auto-replication ...

The managed services are a huge part of the lock-in with AWS services, because even though they generally cost more than the competition, they often come out about equal once you account for the aggregate hours otherwise lost to doing those things yourself.


Lock-in refers to the effect of being unable to change vendors due to switching costs. This can be things like a large amount of effort/money required that a company just can't afford, or sometimes the lack of any other vendor who can even offer the same product or service.

In this case, there is basically zero switching cost, since relational databases are everywhere and you can always take your data/schema with you. This new migration service actually makes it even easier than before.

There is no lock-in.


It's not lock-in if you don't want to leave – that's just being competitive. Real lock-in would be something like removing an export function or introducing a proprietary storage backend with incompatible semantics so anyone leaving would need to rewrite application logic.

If you want to talk lock-in, look at services like Lambda where there isn't a direct counterpart (although the model there is so simple that most users could port relatively quickly).


That's what you call "paying for a service." Lock-In is an entirely different matter.


Lock-in happens when migrating would be expensive, too. And leaving a managed service like this could introduce substantial cost.


When the vendor lets you export your data in a form that you can easily import into your own copy of the exact same software, I fail to see how that is "lock-in". If the vendor increases the cost of the service or changes other terms in a way that's unfavorable to you, you can easily leave; that's the opposite of lock-in.

You can't really accuse a vendor of lock-in just because its product is priced lower than it would cost you to do it yourself.


I think that's a very rigid definition of "lock-in" that implies it must be done with some anti-competitive malice. That's not how I've encountered the term in the past, so I'm open to other interpretations.

Per Wikipedia (not authoritative, obviously):

"In economics, vendor lock-in, also known as proprietary lock-in or customer lock-in, makes a customer dependent on a vendor for products and services, unable to use another vendor without substantial switching costs."

I agree with this. It shouldn't require that the service is purposefully preventing you from changing; lock-in can also occur because a service offers advantages that make changing costly in some other way.


"lock-in" implies some sort of lock. "I don't won't to leave because the service saves me so much money" is not a lock, you can leave, you just choose not to because it's better for you to stay. And if the vendor increases his price you can just take your data and go - your same application will run on another vendor's hosted MySQL instance.

If I say I'll pay you $1 if you sit in my office for an hour, you can't say that you were "locked in" by me, if Mary offers you $2 to sit in her office, you can walk over to her office and get more money.


That definition makes it very clear why people are disagreeing with you. Switching costs are those incurred by the difficulty of switching, not the marginal cost of one provider or another. If internet provider A is $80/month and provider B is $60/month, the switching cost has nothing to do with the $20 monthly discrepancy, or any speed differences; rather, it reflects how hard it is to cancel my subscription: being on hold for hours, cancellation fees, and so on.


Operational costs != switching costs.

You can backup/restore and run your database on basically any hosted service so switching costs are trivial here.


Is it really "lock-in" if you use it because it's cheaper and better for you and migrating somewhere else is as simple as a database export?

I can see something like DynamoDB being vendor lock-in, since you can't just download your data into your own self-hosted DynamoDB instance, but if you're using a standard off-the-shelf RDS database (MySQL, PostgreSQL, SQL Server, etc.), I don't think you can call it lock-in when you're staying only because it's easier than running it yourself.


The whole idea of a migration service which lets you migrate to or from an AWS resource means there is no lock-in.


One of the biggest problems I've had with RDS (postgres specifically) is not being able to do incremental offsite backups. I wonder if the database migration service can help with that, as they say the process is "reliable" and "automatically restarts the process and continues the migration from where it was halted"


You can, you just need to implement it yourself on EC2 (other compute platforms are available) with trigger-based replication, using something like Bucardo or Londiste.

https://aws.amazon.com/blogs/aws/rds-postgres-read-replicas/
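
Roughly, the trigger-based approach those tools automate looks something like this (a minimal sketch with psycopg2; the table and column names are made up, and it skips everything Bucardo/Londiste actually handle for you around conflicts, retries and DDL):

    # A minimal sketch of trigger-based change capture (what Bucardo/Londiste
    # automate for real). Table/column names are hypothetical; needs psycopg2.
    import psycopg2

    SETUP_SQL = """
    CREATE TABLE IF NOT EXISTS orders_changes (
        change_id  bigserial PRIMARY KEY,
        op         text NOT NULL,   -- 'I', 'U' or 'D'
        row_data   json
    );

    CREATE OR REPLACE FUNCTION capture_orders_change() RETURNS trigger AS $body$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO orders_changes (op, row_data) VALUES ('D', row_to_json(OLD));
        ELSE
            INSERT INTO orders_changes (op, row_data)
            VALUES (left(TG_OP, 1), row_to_json(NEW));
        END IF;
        RETURN NULL;
    END;
    $body$ LANGUAGE plpgsql;

    DROP TRIGGER IF EXISTS orders_capture ON orders;
    CREATE TRIGGER orders_capture
        AFTER INSERT OR UPDATE OR DELETE ON orders
        FOR EACH ROW EXECUTE PROCEDURE capture_orders_change();
    """

    def ship_changes(rds_dsn, offsite_dsn, last_id):
        """Copy queued changes newer than last_id to the offsite server."""
        with psycopg2.connect(rds_dsn) as src, psycopg2.connect(offsite_dsn) as dst:
            with src.cursor() as s, dst.cursor() as d:
                s.execute(
                    "SELECT change_id, op, row_data::text FROM orders_changes"
                    " WHERE change_id > %s ORDER BY change_id",
                    (last_id,),
                )
                for change_id, op, row_data in s:
                    d.execute(
                        "INSERT INTO orders_changes (change_id, op, row_data)"
                        " VALUES (%s, %s, %s::json)",
                        (change_id, op, row_data),
                    )
                    last_id = change_id
        return last_id

    # Run SETUP_SQL once against the source, then call ship_changes() on a
    # schedule, persisting last_id between runs.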


One feature that I wish was supported in RDS is migrating from the MySQL engine to the Aurora engine without having to create a new database instance at all. It looks like Amazon DMS makes this process easier, but since Aurora is fully compatible with MySQL, I don't see why I have to create a new database instance and point my application to talk to the new server when Amazon DMS could do this behind the scenes on the one I already have.

Or am I allowed to set the source and target databases to be the one and the same?


You can snapshot your RDS database and restore it as an Aurora DB. Only caveat is it has to be MySQL 5.6.

http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora...
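
In boto3 terms that route is roughly the following (all identifiers below are placeholders, and per the docs the source has to be a MySQL 5.6 instance):

    # Rough sketch of the snapshot-and-restore route with boto3; every
    # identifier here is a placeholder.
    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # 1. Snapshot the existing RDS MySQL 5.6 instance.
    rds.create_db_snapshot(
        DBInstanceIdentifier="my-mysql56-instance",
        DBSnapshotIdentifier="my-mysql56-pre-aurora",
    )
    rds.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier="my-mysql56-pre-aurora",
    )

    # 2. Restore that snapshot as a new Aurora cluster.
    rds.restore_db_cluster_from_snapshot(
        DBClusterIdentifier="my-aurora-cluster",
        SnapshotIdentifier="my-mysql56-pre-aurora",
        Engine="aurora",
    )

    # 3. Add an instance to the cluster so it can actually serve queries,
    #    then repoint the application at the new cluster endpoint.
    rds.create_db_instance(
        DBInstanceIdentifier="my-aurora-instance-1",
        DBInstanceClass="db.r3.large",
        Engine="aurora",
        DBClusterIdentifier="my-aurora-cluster",
    )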


Not quite the same, unless I'm misunderstanding. I'm talking about an online migration on a single RDS instance with an engine switch handled fully behind the scenes. The link provided doesn't seem to handle data changes since the snapshot, and also requires a switch in the application to point to the new instance which has been restored via snapshot.

DMS gets closer than the above link, I think, since it will handle replication onto the target database for you. But, for two fully compatible database engines, it seems like you should be able to make the switch without having to fuss with replication configuration or spinning up a new instance at all.


The on-disk format for Aurora is likely different.


I wonder if it will be possible for me to "roll back" to my "plain" database server if the level of service does not satisfy me.


Not unless you're ready to try to create a physical host replica, and then fail over to it as a master during downtime.


Is it possible to run a read replica outside of RDS these days?


Funny this showed up on here. I just spent the last 4-5 days playing with DMS.

My company has been using AWS for a while, but management wanted a "backup" backup off Amazon, JUST IN CASE. Due to Amazon blocking replication credentials for MySQL servers, we had to basically dump over scp to our off-Amazon server and run the update on that machine via a script. We tried a number of different options but none were reliable. All nasty stuff.

Anyway, after setting up a dms.t2.medium replication instance, I was able to create a number of tasks pulling from our Amazon server to our off-Amazon server. (You have the option to just pull a full dump, pull a full dump and continue replication, or simply replicate data.) It's been running for a little under 24 hours now and has been solid so far with the replication. I know, not even a day yet, but it's looking promising. Fingers crossed!
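
For anyone who'd rather script it than click through the console, the same setup looks roughly like this with boto3 (every identifier, hostname and credential below is a placeholder):

    # Roughly what the console setup above amounts to, via boto3.
    import json
    import boto3

    dms = boto3.client("dms", region_name="us-east-1")

    instance = dms.create_replication_instance(
        ReplicationInstanceIdentifier="offsite-replication",
        ReplicationInstanceClass="dms.t2.medium",
        AllocatedStorage=50,
    )

    source = dms.create_endpoint(
        EndpointIdentifier="rds-mysql-source",
        EndpointType="source",
        EngineName="mysql",
        ServerName="mydb.xxxxxxxx.us-east-1.rds.amazonaws.com",
        Port=3306,
        Username="repl_user",
        Password="********",
    )

    target = dms.create_endpoint(
        EndpointIdentifier="offsite-mysql-target",
        EndpointType="target",
        EngineName="mysql",
        ServerName="mysql.example.com",
        Port=3306,
        Username="repl_user",
        Password="********",
    )

    # MigrationType maps to the three options mentioned above:
    # "full-load", "full-load-and-cdc", or "cdc".
    dms.create_replication_task(
        ReplicationTaskIdentifier="orders-schema-sync",
        SourceEndpointArn=source["Endpoint"]["EndpointArn"],
        TargetEndpointArn=target["Endpoint"]["EndpointArn"],
        ReplicationInstanceArn=instance["ReplicationInstance"]["ReplicationInstanceArn"],
        MigrationType="full-load-and-cdc",
        TableMappings=json.dumps({
            "rules": [{
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "orders-only",
                "object-locator": {"schema-name": "orders", "table-name": "%"},
                "rule-action": "include",
            }]
        }),
    )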

A small bonus to doing the setup for this: I found out the hard way that there was bad schema in our database, which I spent the last couple of days fixing. DMS is rather sensitive, and will fail and not restart if it encounters too many errors trying to replicate data.

Overall it looks like it's going to cost me about $150 a month for the replication instance, which is only marginally more than the bandwidth costs I was incurring doing full dumps to our off-Amazon server.

Benefits are almost-instant replication and an interface that gives me almost-instant feedback on failed replication tasks, all within AWS, which is where we are hosting everything else at this time. I was also able to create individual tasks for separate schemas, so I can watch and manage errors on a schema-by-schema basis, which is nice.

Overall I'm happy with it, but only time will tell if it can continue to be a reliable replication option.

O.


Thanks for all the info. Is your on-Amazon DB on RDS?

We use Postgres so it might be that we couldn't do the same. Presently I run the DB on EC2 instance(s) but one day I'm sure I'll switch to RDS. Just trying to understand what an exit strategy might look like in the future.


Yes, we are running a MySQL 5.6 database on RDS.

If the database you want to migrate/replicate from is running on Amazon, you should be able to make it a replication source.

I saw nothing that stated the database HAD to be on RDS. You use the server's address to set up the endpoints, and provide a database username and password to connect with, so it should be doable.


I don't see why not. Replication is just a command stream over a network. You can secure it with the database's built-in TLS support or over a VPN. The only question is bandwidth and latency between your DC and AWS's.


I believe AWS stops you from connecting to anything outside your RDS subnet.


Then connect your outside DB server via a VPN.


Unless something has changed, that's the problem. When I last looked RDS was a sandboxed environment. You can't ssh to it or connect it to a VPN.


You don't need to get the RDS box on a VPN. Instead, you get your server in your data center into the AWS VPC via the AWS Gateway VPN thing. From there you tell your data center server that its master is the RDS box and you should be all good (assuming you can do the usual song and dance with the full DB dump and replication coordinates).
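
Once the VPN is in place, the "usual song and dance" on the data center box is roughly this (a sketch with pymysql; hosts, credentials and the binlog coordinates are placeholders you'd take from your own dump, e.g. mysqldump --master-data):

    # Sketch of pointing an external (data-center) MySQL replica at an RDS
    # master over the VPN. Hosts, credentials and coordinates are placeholders.
    import pymysql

    # On the RDS master: keep binlogs around long enough for the replica to
    # catch up (RDS-provided procedure).
    master = pymysql.connect(host="mydb.xxxxxxxx.us-east-1.rds.amazonaws.com",
                             user="admin", password="********")
    with master.cursor() as cur:
        cur.execute("CALL mysql.rds_set_configuration('binlog retention hours', 24)")
    master.commit()

    # On the data-center replica: point it at the RDS endpoint using the
    # coordinates recorded when the full dump was taken.
    replica = pymysql.connect(host="localhost", user="root", password="********")
    with replica.cursor() as cur:
        cur.execute("""
            CHANGE MASTER TO
                MASTER_HOST = 'mydb.xxxxxxxx.us-east-1.rds.amazonaws.com',
                MASTER_USER = 'repl_user',
                MASTER_PASSWORD = '********',
                MASTER_LOG_FILE = 'mysql-bin-changelog.000123',
                MASTER_LOG_POS = 4
        """)
        cur.execute("START SLAVE")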


Some RDS flavors now support offsite replication, it's quite easy to set up.


Do you know if PostgreSQL is included in that list? Can I just pg_basebackup to start up a read slave?


Does anyone know how the MySQL migration & replication works under the hood? How is Amazon doing all this remotely via just a DB connector?


Typical MySQL replication works by having the slave listen for changes on the master's binlog, which, essentially, is a log of all operations performed on the dataset. I did some quick googling and apparently it can be accessed quite easily: http://dev.mysql.com/doc/refman/5.7/en/mysqlbinlog.html so I'd assume that this is how they are doing the replication, or something very similar.
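
If it works the way I'm guessing, it would look a lot like what the open-source python-mysql-replication library does: tail the binlog over an ordinary client connection and turn it into row events (connection settings and server_id below are placeholders):

    # A guess at the mechanism: tailing the binlog with python-mysql-replication.
    # Connection settings and server_id are placeholders.
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent,
    )

    stream = BinLogStreamReader(
        connection_settings={
            "host": "mydb.xxxxxxxx.us-east-1.rds.amazonaws.com",
            "port": 3306,
            "user": "repl_user",
            "passwd": "********",
        },
        server_id=4242,          # must be unique among the master's replicas
        blocking=True,           # keep waiting for new events
        resume_stream=True,
        only_events=[WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent],
    )

    for event in stream:
        for row in event.rows:
            # Apply each change to the target database here.
            print(event.schema, event.table, row)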

As for migration, I don't know; you could export the database and listen to the binlog, but that will lock tables for a bit, depending on the database size. But maybe that's acceptable.

Would be curious to hear from folks with more DBA experience :)


Why is Amazon Redshift not a target for this service?


Syncing to Redshift for reporting is a somewhat different use case - you don't always want to sync everything, you want to sync in larger batches, and the "migration" never ends.

There are several startups that will do this for you, including mine (Fivetran).


..or a source.

I've deployed a Redshift-based analytics application, and we're discovering operationally that -- for some workloads -- we would be better served by a Postgres RDS instance instead. It would be nice if I could one-click (or few-click) migrate everything from our Redshift instance to RDS.


Finally! This is one of the pieces of functionality we've wanted most since the AWS re:Invent announcement!


The "Price and Availability" section doesn't seem to mention anything about the price?!


"Because you only pay for the compute resources used during the migration process, a terabyte-sized database can be migrated for as little as $3."

The above is quoted from an email I received from them regarding the service this afternoon.

Note: using continuous replication and adding logging dashboards to your links will also increase the cost. I seem to recall seeing the dashboards are $3 apiece after the first 3, or something silly like that, and then data costs on top of that. My costs so far don't match that, though.

I'm guesstimating my DMS costs are going to be about $30 if the $6.48 I've spent over the last 7 days of setup and configuration averages out.


Free!


Not free! They have now updated the post to include pricing details.


This has been in beta for a while; the key thing lacking for our uses, at least with Postgres, was full support for DDL.

The lack of "true" replication into postgres RDS instances is one of the things blocking us from fully using it.


I'm hearing a few mixed things, and the page doesn't make it clear, so can someone clear it up for me, please: if you wanted to, could you migrate _away_ from AWS using this service?


Yes, you could migrate away. As long as either the source or the target is within AWS, you can use this.


What are my chances of a relatively successful SQL Server migration with several hundred tables and complex views to PostgreSQL using their migration tool?


I tried this; SQL Server isn't supported yet... it's just not even in the dropdown options. However, it is listed in the marketing and product overview text, so it seems like it's coming.

EDIT: looks like it's there now. Will try it out tonight, then.


Great! Let me know how it goes. It will be a couple of days before I can try it. Hopefully I can migrate just the schema and cherry pick some relatively static tables. We'll see.


Postgres and SQL Server have different concurrency semantics; it's unlikely that you won't have subtle bugs unless your updates are very straightforward.


Looks like this could be a better way to migrate between major versions of PostgreSQL on RDS than rolling your own.


If you didn't catch it, AWS now supports upgrades between so-called "major versions" (e.g. 9.3 -> 9.4). https://aws.amazon.com/about-aws/whats-new/2015/11/rds-postg...


I didn't catch that, thanks for pointing it out. Like another commenter said: this seems to be an in-place upgrade, which is not advisable for major PostgreSQL versions. Additionally, it looks like this involves restarting the cluster (downtime). The Database Migration Service looks like it wouldn't require downtime.


Folks in the know claim that in-place major version upgrades are inadvisable, as it's not a super heavily tested area of Postgres development, and in the worst-case scenario you might end up with unwanted side effects in the data that can go uncaught for a while.


When 9.5 lands in RDS, they will inevitably take six months to roll out the major version upgrade to 9.5 feature, during which time this can be very useful.



