New Nov 19, 2024

Planning Your PostgreSQL Migration: Best Practices and Key Considerations

Company/Startup Blogs All from Heroku View Planning Your PostgreSQL Migration: Best Practices and Key Considerations on blog.heroku.com

Your organization may have many reasons to move a cloud service from one provider to another. Maybe you’ve found a better performance-versus-cost balance elsewhere. Maybe you’re trying to avoid vendor lock-in. Whatever your reasons, the convenience and general interoperability of cloud services today put you in the driver's seat. You get to piece together the tech stack and the cloud provider(s) that best align with your business.

This includes where you turn for your PostgreSQL database.

If you’re considering migrating your Postgres database to a different cloud provider, such as Heroku, the process might seem daunting. You’re concerned about the risk of data loss or the impact of extended downtime. Are the benefits worth the effort and the risk?

With the right strategy and a solid plan in place, migrating your Postgres database is absolutely manageable. In this post, we’ll walk you through the key issues and best practices to ensure a successful Postgres migration. By the end of this guide, you’ll be well equipped to make the move that best serves your organization.

Pre-migration assessment

Naturally, you need to know your starting point before you can plan your route to a destination. For a database migration, this means evaluating your current Postgres setup. Performing a pre-migration assessment will help you identify any potential challenges, setting you up for a smooth transition.

Start by reviewing the core aspects of your database.

Database version

Ensure the target cloud provider supports your current Postgres version. When you’re connected via the psql CLI client, the following commands will help you get your database version, with varying levels of detail:

psql=> SELECT version();

PostgreSQL 12.19 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), 64-bit

psql=> SHOW server_version;

12.19

Extensions

Check for any Postgres extensions installed on your current database which are critical to your applications. Some extensions might not be available on your new platform, so be sure to verify this compatibility upfront.

psql=> \dx


List of installed extensions
-[ RECORD 1 ]--------------------------------------------------------------
Name        | fuzzystrmatch
Version     | 1.1
Schema      | public
Description | determine similarities and distance between strings
-[ RECORD 2 ]--------------------------------------------------------------
Name        | plpgsql
Version     | 1.0
Schema      | pg_catalog
Description | PL/pgSQL procedural language
-[ RECORD 3 ]--------------------------------------------------------------
Name        | postgis
Version     | 3.0.0
Schema      | public
Description | PostGIS geometry, geography, and raster spatial types and…

Configurations

Determine and document any custom configurations for your database instance. This may include memory settings, timeouts, and query optimizations. Depending on the infrastructure and performance capabilities of your destination cloud provider, you may need to adjust these configurations.

You might be able to track down the files for your initial Postgres configuration (such as pg_hba.conf and postgresql.conf). However, in case you don’t have access to those files, or your configuration settings have changed, then you can capture all of your current settings into a file which you can review. Run the following command in your terminal:

$ psql \ # Include any connection and credentials flags
    -c "\copy (select * from pg_settings) to '/tmp/psql_settings.csv' with (format csv, header true);"

This will create a file at /tmp/psql_settings.csv with the full list of configurations you can review.

Schema and data compatibility

Review the schema, data types, and indexes in your current database. Ensure they’re fully compatible with the Postgres version and configurations on the target cloud provider. The feature matrix in the Postgres documentation provides a quick reference to see what is or isn’t supported for any given version.

Performance benchmark

Measure the current performance of your PostgreSQL database. When you establish performance benchmarks, you can compare pre- and post-migration metrics. This will help you (and any other migration stakeholders) understand how the new environment meets or exceeds your business requirements.

When making your performance comparison, focus on key metrics like query performance, I/O throughput, and response times.

Identify dependencies

Create a detailed catalog of the integrations, applications, and services that rely on your database. Your applications may use ORM tools, or you have microservices or APIs that query your database. Don’t forget about any third-party services that may access the database, too. You’ll need this comprehensive list when it’s time to cutover all connections to your new provider’s database. This will help you minimize disruptions and test all your connections.

Migration strategy

When deciding on an actual database migration strategy, you have multiple options to choose from. The one you choose primarily depends on the size of your database and how much downtime you’re willing to endure. Let’s briefly highlight the main strategies.

#1: Dump and restore

This method is the simplest and most straightforward. You create a full backup of your Postgres database using the pg_dump utility. Then, you restore the backup on your target cloud provider using pg_restore. For most migrations, dump and restore is the preferred solution. However, keep in mind the following caveats:

#2: Logical replication

Logical replication replicates changes from the source instance to the target. The source instance is set up to publish any changes, while the target instance listens for changes. As changes are made to the source database, they are replicated in real time on the destination database. Eventually, both databases become synchronized and stay that way until you’re ready to cutover.

This approach allows you to migrate data with little to no downtime. However, the setup and management of replication may be complex. Also, certain updates, such as schema modifications are not published. This means you’ll need some manual intervention during the migration to carry over these changes.

#3: Physical replication

Adopting a physical replication strategy means copying the actual block-level files that make up your database and then transferring them to the target database machine. This is a good option for when you need the consistency of an exact replication of data and system steps.

For this strategy to work, your source and target Postgres versions must be identical. In addition, this approach introduces downtime that is similar to the dump and restore approach. So, unless you have a unique situation that requires such a high level of consistency, you may be better off with the dump and restore approach.

#4: Managed migration tools

Finally, you might consider managed migration tools offered by some cloud providers. These tools automate and manage many aspects of the migration process, such as data transfer, replication, and minimization of downtime. These tools may be ideal if you’re looking to simplify the process while ensuring reliability.

Migration tools are not necessarily a silver bullet. Depending on the size of your database and the duration of the migration process, you may incur high costs for the service. In addition, managed tools may have less customizability, requiring you to still do the manual work of migrating over extensions or configurations.

Data transfer and security

When performing your migration, ensuring the secure and efficient transfer of data is essential. This means putting measures in place to protect your data integrity and confidentiality. Those measures include:

Network and availability connections

Along with database configurations, you’ll need to set up the network with your new cloud provider to ensure smooth connectivity. This includes configuring VPCs, firewall rules, and establishing peering between environments. Ideally, completing and validating these steps before the data migration is important.

To optimize performance, tune key connection settings like max_connections, shared_buffers, and work_mem. Start with the same settings as your source database. Then, after migration, adjust them based on your new infrastructure’s memory and network capabilities.

Lastly, configure failover and high availability in the target environment, potentially setting up replication or clustering to maintain uptime and reliability.

Downtime minimization and rollback planning

Minimizing downtime during a migration is crucial, especially for production databases. Your cutover strategy outlines the steps for switching from the source to target database with as little disruption as possible. Refer to the list you made when identifying dependencies, so you won’t overlook modifying the database connection for any application or service.

How much downtime to plan for depends on the migration strategy that you’ve chosen. Ensure that you’ve properly communicated with your teams and (if applicable) your end users, so that they can prepare for the database and all dependent services to be temporarily unavailable.

And remember: Even with the best plans, things can go wrong. It’s essential to have a clear rollback strategy. This will likely include reverting to a database backup and restoring the original environment. Test your rollback plan in advance as thoroughly as possible. If the time comes to execute, you’ll need to be able to execute it quickly and confidently.

Testing and validation

After the migration, but before you sound the all clear, you should test thoroughly to ensure everything functions as expected. Your tests should include:

Post-migration considerations

With your migration complete, you can breathe a sigh of relief. However, there’s still work to do. Close the loop by taking care of the following:

Conclusion

Migrating your Postgres database between cloud providers can be a complex process. However, with proper planning and preparation, it’s entirely possible to experience a smooth execution.

By following the best practices and key steps above, you’ll be well on your way toward enjoying the benefits of leveraging Postgres from whatever cloud provider you choose.

To recap quickly, here are the major points to keep in mind:

Stay tuned for our upcoming guides, where we'll walk you through the specifics of migrating your Postgres database from various cloud providers to Heroku Postgres.

Scroll to top