AWS – Migrating from DynamoDB to RDS

Tags: amazon-dynamodb, amazon-rds, amazon-web-services, database, migration

I'm considering the different options available for migrating a DynamoDB Database to RDS. The data structure makes much more sense in a relational format.

There are 8 tables with around 1 million documents in each. We have worked out the mapping between primary / foreign keys.

From the documentation I've read on AWS I have a few options.

  • AWS Data Pipeline -> S3 -> convert to CSV -> AWS Database Migration Service
  • Custom program writes tables to S3 in CSV format -> AWS Database Migration Service
  • Custom program reads from DynamoDB -> inserts immediately into RDS, table by table, until complete
  • Maybe use AWS Data Pipeline to copy from DynamoDB to RDS directly?

Has anyone else had experience with this kind of migration? Are there any other options?

Best Answer

8 million documents is not that many. I wouldn't spend too much time over-optimising a process that in the end may run only once, and only for a few minutes.

If you scale the DynamoDB read throughput up to 10k read capacity units, you should be able to read the entire dataset in less than 15 minutes. At the same time, run your RDS on an instance big enough to sustain writing the 8M rows without slowing down. Don't use the db.t2.* class: it runs on CPU credits, and once you run out of them it slows down. Instead use something big (e.g. db.r4.2xlarge) with a lot of memory; once the import is done you can immediately downgrade it to whatever suits your long-term needs to save money. Also consider Aurora instead of the old-fashioned RDS.
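The 15-minute figure can be checked with some rough arithmetic. The sketch below assumes an average item size of 4 KB, which is my assumption and not something stated in the question, and uses the standard definition of a read capacity unit: one strongly consistent read of up to 4 KB per second, or two eventually consistent reads of that size. The function name `scan_minutes` is made up for illustration.

```python
def scan_minutes(item_count, rcu_per_second, avg_item_kb=4.0,
                 eventually_consistent=False):
    """Roughly estimate the minutes needed to scan a dataset.

    avg_item_kb is an ASSUMED average item size; measure your own tables.
    """
    # One RCU pays for 4 KB of strongly consistent reads per second,
    # or 8 KB of eventually consistent reads.
    kb_per_rcu = 8.0 if eventually_consistent else 4.0
    total_rcu = item_count * avg_item_kb / kb_per_rcu
    seconds = total_rcu / rcu_per_second
    return seconds / 60.0


# 8 tables x ~1 million items each, read at 10k RCU.
strong = scan_minutes(8_000_000, 10_000)
eventual = scan_minutes(8_000_000, 10_000, eventually_consistent=True)
print(f"strongly consistent:   {strong:.1f} min")
print(f"eventually consistent: {eventual:.1f} min")
```

Under those assumptions the estimate is roughly 13 minutes with strongly consistent reads and about half that with eventually consistent reads. Smaller items would make it faster still, and larger items proportionally slower.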

You can experiment with different approaches on a small subset of records and, once the process works, run it on the entire dataset. I would probably choose a simple custom program that reads from DynamoDB on one side and writes to RDS on the other. I would run it from an EC2 instance for performance and cost reasons (it keeps the traffic inside AWS, since traffic leaving AWS costs money). Unless you already use Data Pipeline for something else, it's probably not worth learning for such a small one-off job. But if you already know how to use Data Pipeline, use it. In the end anything that can read from DynamoDB and write to RDS will do the job, so choose something you're already familiar with.
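As a rough illustration of the "simple custom program" idea, here is a minimal Python sketch, not a finished migration tool. It follows the DynamoDB Scan pagination contract (`Items`, `LastEvaluatedKey`, `ExclusiveStartKey`) and writes each page with a DB-API `executemany`. The helper names (`scan_pages`, `copy_table`, `fake_scan`), the `users` table and its columns are all hypothetical. To keep the sketch self-contained, an in-memory SQLite database and a fake scan function stand in for RDS and DynamoDB.

```python
import sqlite3


def scan_pages(scan_fn, **scan_kwargs):
    """Yield the items of each page from a DynamoDB-style paginated scan."""
    while True:
        page = scan_fn(**scan_kwargs)
        yield page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        scan_kwargs["ExclusiveStartKey"] = last_key


def copy_table(scan_fn, conn, insert_sql, to_row):
    """Copy every scanned item into a SQL table, one batch per page."""
    copied = 0
    cur = conn.cursor()
    for items in scan_pages(scan_fn):
        rows = [to_row(item) for item in items]
        if rows:
            cur.executemany(insert_sql, rows)
            conn.commit()
            copied += len(rows)
    return copied


def fake_scan(ExclusiveStartKey=None, **_):
    """Stand-in for a real scan: returns three items in pages of two."""
    data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
    start = ExclusiveStartKey["id"] if ExclusiveStartKey else 0
    page = [d for d in data if d["id"] > start][:2]
    result = {"Items": page}
    if page and page[-1]["id"] < data[-1]["id"]:
        result["LastEvaluatedKey"] = {"id": page[-1]["id"]}
    return result


# Demo with SQLite standing in for RDS (hypothetical table and columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
copied = copy_table(
    fake_scan,
    conn,
    "INSERT INTO users (id, name) VALUES (?, ?)",  # psycopg2/PyMySQL use %s
    lambda item: (int(item["id"]), item["name"]),
)
print(copied, "rows copied")
```

Against real services, `scan_fn` would be something like `boto3.resource("dynamodb").Table("users").scan` and `conn` a psycopg2 or PyMySQL connection to the RDS instance. In that case the row mapper also has to convert the `Decimal` values boto3 returns for numbers. Tables with foreign keys would need to be copied parents first, or with constraints added after the load.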

Hope that helps :)
