Sql-server – Best tool to migrate a PostgreSQL database to MS SQL 2005

migration, postgresql, sql-server

I have a PostgreSQL 8.3.1 database that I'd like to migrate to MS SQL Server 2005 (or maybe 2008), including both the table schema and the data. The database is about 50 GB in size with about 400,000,000 rows, so I think simple INSERT statements are out of the question. Could anyone recommend the best tool for performing this migration? Obviously it needs to be reliable, so that the data in the target DB is exactly the same as in the source one, and it needs to be able to copy this volume of data within a reasonable time.

Best Answer

I ended up not using any third-party tool for the data, as none of the ones I tried worked for the large tables. Even SSIS failed. I did use a commercial tool for the schema, though. My conversion process was as follows:

  1. Full Convert Enterprise to copy the schema (no data).
  2. pg_dump to export the data from Postgres in "plain text" format, which is basically a tab-separated values (TSV) file.
  3. Python scripts to transform the exported files into a format bcp would understand.
  4. bcp to import the data into MSSQL.
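
As a rough illustration of how steps 2 and 4 might be driven from Python, the helpers below only build the command lines. This is a sketch, not the exact commands used in the migration: the function names are hypothetical, and the `|~|` and `|#|` terminators are arbitrary placeholders. The flags are standard pg_dump and bcp options, but they are worth checking against the installed versions.

```python
# Hypothetical helpers: every database, table, server and file name passed
# to them is a placeholder, not something taken from the original migration.

def pg_dump_cmd(database, table, out_path):
    """Build a pg_dump call that exports one table's data as plain text."""
    return ["pg_dump", "--data-only", "--table", table,
            "--file", out_path, database]


def bcp_cmd(target, data_path, server, field_sep="|~|", row_sep="|#|"):
    """Build a bcp call that loads a character-mode file into `target`.

    `target` is a three-part name such as "mydb.dbo.mytable".
    -c selects character mode, -t and -r set the field and row terminators,
    -S names the server and -T requests trusted (Windows) authentication.
    """
    return ["bcp", target, "in", data_path, "-c",
            "-t", field_sep, "-r", row_sep, "-S", server, "-T"]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. The terminators given to bcp have to match whatever the transformation step in between writes into the file.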

The transformation step took care of some differences in the formats used by pg_dump and bcp, such as:

  • pg_dump puts some Postgres-specific statements at the start of the file and ends the data with "\.", while bcp expects the entire file to contain data
  • pg_dump stores NULL values as "\N", while bcp expects nothing in place of a NULL (i.e. no data in between column separators)
  • pg_dump encodes tabs as "\t" and newlines as "\n", while bcp treats those literally
  • pg_dump always uses tabs and newlines as separators, while bcp allows the user to specify separators. Custom separators become necessary if the data contains any tabs or newlines, since bcp does not encode them.
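
The differences listed above could be handled roughly as follows. This is a minimal sketch, not the original scripts: the function names, the `|~|` field separator and the `|#|` row terminator are assumptions chosen for illustration, and it assumes each dump file holds a single table's data. Note that NULLs and empty strings both come out as empty fields here, and that octal or hex escapes are not decoded.

```python
# Escape sequences that appear in pg_dump's plain-text COPY data.
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r",
            "t": "\t", "v": "\v", "\\": "\\"}


def unescape_field(field):
    """Turn one COPY-format field into the literal text bcp should load."""
    if field == r"\N":          # pg_dump's NULL marker -> empty field
        return ""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Unknown escapes fall back to the escaped character itself.
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def convert(lines, field_sep="|~|", row_sep="|#|"):
    """Yield bcp-ready rows from the lines of a pg_dump plain-text file."""
    in_data = False
    for line in lines:
        line = line.rstrip("\n")
        if not in_data:
            # Skip everything up to and including the COPY header.
            if line.startswith("COPY ") and line.endswith("FROM stdin;"):
                in_data = True
            continue
        if line == "\\.":       # end-of-data marker
            in_data = False
            continue
        fields = line.split("\t")
        yield field_sep.join(unescape_field(f) for f in fields) + row_sep


def convert_file(src_path, dst_path, encoding="utf-8"):
    """Stream a dump file to a bcp input file (encoding is an assumption)."""
    with open(src_path, encoding=encoding, newline="\n") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        dst.writelines(convert(src))
```

Because the file is processed one line at a time, memory use stays flat even for very large tables. The separators must not occur anywhere in the data, so it is worth picking strings that are unlikely to appear and verifying that before loading.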

I also found that some unique constraints that were fine in Postgres were violated in MSSQL, so I had to drop them. This is because MSSQL treats NULL as equal to NULL when enforcing a unique constraint (i.e. NULL counts as a single value that may appear only once), whereas Postgres allows any number of NULLs.
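
To find out in advance which unique constraints would be affected, the exported data can be scanned for repeated keys that contain a NULL. The function below is a hypothetical sketch (its name and arguments are not from the original process); it expects only the raw data rows of a pg_dump plain-text export, with `key_columns` giving the zero-based positions of the constrained columns.

```python
from collections import Counter


def null_key_collisions(lines, key_columns):
    """Return keys containing a NULL that occur more than once.

    Postgres accepts such rows under a unique constraint, while MSSQL
    would reject the second occurrence.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        key = tuple(fields[i] for i in key_columns)
        if r"\N" in key:        # only keys with a NULL behave differently
            counts[key] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

An empty result suggests the constraint can be kept as-is on the MSSQL side; any returned key is one that would trigger a violation during the import.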