AWS RDS Blog Insights: Performance Tuning

published on 30 December 2023

As an AWS user looking to optimize database performance, you likely agree that efficient resource utilization is critical for supporting application needs.

By exploring proven RDS optimization strategies, you can dramatically improve query speed and throughput while reducing costs.

In this post, we'll cover key areas like upgrading instance type, configuring read replicas, partitioning tables, enabling caching, and more. You'll discover practical steps to tune database configuration, hardware allocation, and data access patterns.

Introduction to Performance Tuning AWS RDS

AWS RDS provides relational database instances running engines like PostgreSQL, MySQL, MariaDB, Oracle, and SQL Server. Performance tuning optimizes database workloads by improving query speed and lowering latency for applications. It also reduces resource costs through more efficient utilization of RDS instances.

This section outlines key strategies to tune RDS performance covered in this article.

What is AWS RDS

AWS RDS is a fully managed relational database service for popular database engines. RDS handles database administration tasks like backups, software patching, automatic failure detection, and recovery.

RDS provides high availability with multi-AZ deployments. It offers scalable compute and storage capacity to meet application workload demands. Performance tuning helps optimize RDS to improve query speed and lower costs.

Why Performance Tuning Matters

  • Faster query performance reduces application latency and improves end user experience. Slow queries negatively impact user interactions.

  • Efficient utilization of RDS resources lowers costs for a given workload. Rightsizing RDS instance type, memory, and storage saves money.

  • Handling demand spikes is easier with a well-tuned system. Tuning lets you absorb traffic surges without overprovisioning capacity.

Main Optimization Strategies

We will cover several performance tuning techniques:

  • Upgrading RDS instance type and memory/storage allocation
  • Implementing read replicas to scale read operations
  • Partitioning large tables to distribute I/O
  • Enabling caching to reduce load on storage
  • Analyzing slow query logs to identify optimizations

Tuning RDS requires benchmarking, load testing, and monitoring to quantify improvements. We will demonstrate using these tools to guide optimization.

Upgrading RDS Instance Type

Switching your RDS DB instance to a larger instance type with more CPU, memory, and network resources can significantly improve performance for resource-constrained workloads.

Assess Current Utilization

Check your CloudWatch metrics for CPU, memory, and storage to determine whether your DB instance is resource constrained, since any of these limits can cap SQL performance:

  • Monitor the CPUUtilization metric over time to see if CPU usage is frequently high, indicating a need for more compute capacity. Sustained values over 75-80% may warrant upsizing.

  • Review FreeableMemory to detect if available memory is often low. Frequently dipping near zero suggests insufficient memory allocation.

  • Track write IOPS using the WriteIOPS metric to determine storage requirements. If the instance is nearing its maximum provisioned IOPS, a larger instance type with higher I/O performance may be beneficial.
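These checks can be sketched as a simple heuristic over exported metric samples. The function, data, and exact thresholds below are illustrative assumptions, not an AWS API:

```python
# Hypothetical helper: flag resource constraints from CloudWatch-style samples.
def flag_constraints(cpu_samples, freeable_memory_mb, write_iops, max_iops,
                     cpu_threshold=80.0, memory_floor_mb=256):
    """Return a list of constrained resources based on simple heuristics."""
    constrained = []
    # Sustained CPU above ~75-80% suggests the instance needs more vCPUs.
    high_cpu = [s for s in cpu_samples if s >= cpu_threshold]
    if len(high_cpu) / len(cpu_samples) > 0.5:
        constrained.append("cpu")
    # FreeableMemory frequently dipping near zero suggests insufficient memory.
    low_mem = [m for m in freeable_memory_mb if m <= memory_floor_mb]
    if len(low_mem) / len(freeable_memory_mb) > 0.5:
        constrained.append("memory")
    # Write IOPS near the provisioned maximum suggests an I/O bottleneck.
    if max(write_iops) >= 0.9 * max_iops:
        constrained.append("iops")
    return constrained

print(flag_constraints(
    cpu_samples=[85, 92, 88, 40],
    freeable_memory_mb=[2048, 1900, 2100, 2200],
    write_iops=[500, 800, 950],
    max_iops=1000,
))  # → ['cpu', 'iops']
```

In practice you would pull these samples from CloudWatch over a representative window rather than hard-coding them.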

Select Appropriate Instance Size

Choose an RDS instance type adequate for your typical and peak workloads based on resource metrics data while optimizing costs:

  • Size up incrementally to the next logical instance type based on currently constrained resources. For example, if CPU maxes out, choose an instance type with more vCPUs.

  • Consider instance families like T3/T4g (burstable), M5 (general purpose), or R5 (memory optimized) based on your primary bottleneck.

  • Review AWS instance type pricing and select the most cost efficient upgrade that meets your performance needs.

Test New Instance Type

Validate performance gains on a test RDS instance before modifying production databases to confirm improved query speed:

  • Provision a new test DB instance using the desired upgraded instance type.

  • Replay a representative sample of production workloads and queries.

  • Compare performance metrics like query latency between old and new instance types.

  • If satisfactory, modify production RDS instance type and decommission original DB.
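The comparison step above can be reduced to a small test-harness calculation; the sample latencies and helper name are hypothetical:

```python
def latency_improvement(old_latencies_ms, new_latencies_ms):
    """Compare mean query latency between two test instances; return % improvement."""
    old_mean = sum(old_latencies_ms) / len(old_latencies_ms)
    new_mean = sum(new_latencies_ms) / len(new_latencies_ms)
    return round(100 * (old_mean - new_mean) / old_mean, 1)

# Replayed-workload latencies from the current and candidate instance types.
pct = latency_improvement([120, 140, 130], [60, 70, 65])
print(pct)  # → 50.0
```

A real benchmark would also compare tail latencies (p95/p99), not just the mean.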

Configuring Read Replicas

Read replicas can enhance RDS performance by routing read queries away from the primary DB instance. Analyzing slow query logs helps identify read-heavy workloads suitable for replication.

Identifying Read-Heavy Workloads

When assessing if read replicas could improve query performance:

  • Enable RDS slow query logs to capture queries exceeding a defined execution time threshold.
  • Review slow logs over a representative period, such as 1-2 weeks.
  • Identify the most common/expensive read queries, noting:
    • % of total queries
    • Overall execution time
    • Read vs write operations
  • Queries that are primarily reads and retrieve large data volumes are good candidates for redirection to read replicas.
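The candidate selection above can be sketched over parsed slow-log entries. The entry shape and the 5% share cutoff are assumptions for illustration:

```python
def replica_candidates(slow_log, min_share=0.05):
    """Given parsed slow-log entries, pick read queries worth redirecting."""
    total = sum(e["count"] for e in slow_log)
    candidates = []
    for e in slow_log:
        share = e["count"] / total
        # Only frequent read queries are worth routing to a replica.
        if e["type"] == "read" and share >= min_share:
            candidates.append({"query": e["query"], "share": round(share, 2),
                               "total_time_s": e["count"] * e["avg_time_s"]})
    return candidates

log = [
    {"query": "SELECT ... FROM orders", "type": "read", "count": 700, "avg_time_s": 0.8},
    {"query": "UPDATE inventory ...", "type": "write", "count": 200, "avg_time_s": 0.3},
    {"query": "SELECT ... FROM users", "type": "read", "count": 100, "avg_time_s": 1.2},
]
print(replica_candidates(log))
```

Here both SELECT queries qualify; the UPDATE is excluded because writes must stay on the primary.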

Creating Read Replicas

To launch one or more read replicas:

  1. In the RDS console, select the source DB instance.
  2. Choose 'Create read replica' and configure:
    • DB instance identifier
    • Compute and storage resources
    • Availability Zone
    • Encryption, monitoring, etc.
  3. Click Create read replica.

Newly created replicas inherit security groups, parameter groups, and more from the source RDS instance.

Updating Connections

Modify application connection strings to leverage read replicas:

  • Programmatically check the endpoint currently in use when making a new connection.
  • If opening a read connection, direct to the replica endpoint.
  • If writing data, connect to the primary instance endpoint.

This improves read performance by shifting reads to the replicas. Monitor database load to ensure sufficient resources.
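The routing rules above can be sketched as follows; the endpoint names are hypothetical placeholders for your actual RDS endpoints:

```python
import itertools

# Hypothetical endpoints; real applications would read these from configuration.
PRIMARY_ENDPOINT = "mydb.primary.example.rds.amazonaws.com"
REPLICA_ENDPOINTS = [
    "mydb.replica-1.example.rds.amazonaws.com",
    "mydb.replica-2.example.rds.amazonaws.com",
]

_replica_cycle = itertools.cycle(REPLICA_ENDPOINTS)

def endpoint_for(operation):
    """Route writes to the primary, reads round-robin across replicas."""
    if operation == "write":
        return PRIMARY_ENDPOINT
    return next(_replica_cycle)

print(endpoint_for("write"))  # primary endpoint
print(endpoint_for("read"))   # replica-1
print(endpoint_for("read"))   # replica-2
```

Many drivers and proxies (e.g. RDS Proxy or a load balancer in front of replicas) can handle this split for you, but the decision logic is the same.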


Partitioning Large Tables

Partitioning very large database tables can help improve query performance by splitting the data into more manageable chunks. Here are some tips on detecting problematic tables, choosing an appropriate partition key, and implementing table partitioning in Amazon RDS.

Detecting Problem Tables

  • Run SHOW TABLE STATUS to check table sizes and identify extremely large tables that may benefit from partitioning. Tables over 1GB or with over 1 million rows are good partitioning candidates.

  • Monitor slow queries with tools like Amazon RDS Performance Insights. Queries scanning large tables may indicate a need for partitioning.

  • Review table schema and access patterns. Tables frequently queried by date range or ID are ideal for partitioning on those columns.
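The size and row-count thresholds above can be applied to SHOW TABLE STATUS output with a small helper. The dictionary keys below mimic, but do not parse, the real output columns:

```python
def partition_candidates(tables, size_bytes_limit=1 << 30, row_limit=1_000_000):
    """Flag tables whose data size exceeds ~1GB or row count exceeds 1 million."""
    return [t["name"] for t in tables
            if t["data_length"] > size_bytes_limit or t["rows"] > row_limit]

# Shapes mirror SHOW TABLE STATUS columns (Name, Rows, Data_length).
tables = [
    {"name": "events", "rows": 25_000_000, "data_length": 8 << 30},
    {"name": "settings", "rows": 120, "data_length": 16_384},
]
print(partition_candidates(tables))  # → ['events']
```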

Choosing Partition Key

  • Date columns are a common RDS partitioning key for time series data. Splitting data by month or year bounds query scope and speeds date range queries.

  • Auto-incrementing ID columns also make good partition keys. Define numeric ranges to shard data across partitions by ID subsets.

  • Optimal partition key depends on typical query filters and access patterns. Analyze query logs to determine best columns.

Implementing Table Partitioning

  • Use PostgreSQL/MySQL syntax to partition tables by RANGE, LIST, HASH or KEY. Define partition bounds and mapping rules.

  • Migrate data from original table into new partitioned table with INSERT INTO ... SELECT statement for seamless transition.

  • Set up new partitions and attach them to the partitioned table. Add monthly date range partitions or ID range partitions to support scalability.

  • Test queries against partitioned tables to validate performance gains before switching applications.
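As a sketch of the monthly date-range approach, MySQL-style RANGE partition DDL can be generated programmatically; the table and column names here are hypothetical:

```python
def monthly_partition_ddl(table, column, year):
    """Build MySQL-style RANGE partition clauses, one per month of the year."""
    parts = []
    for month in range(1, 13):
        # Each partition's upper bound is the first day of the following month.
        ny, nm = (year + 1, 1) if month == 12 else (year, month + 1)
        parts.append(
            f"PARTITION p{year}{month:02d} "
            f"VALUES LESS THAN (TO_DAYS('{ny}-{nm:02d}-01'))"
        )
    return (f"ALTER TABLE {table} PARTITION BY RANGE (TO_DAYS({column})) (\n  "
            + ",\n  ".join(parts) + "\n);")

ddl = monthly_partition_ddl("events", "created_at", 2023)
print(ddl.splitlines()[0])  # → ALTER TABLE events PARTITION BY RANGE (TO_DAYS(created_at)) (
```

PostgreSQL uses declarative `PARTITION BY RANGE` with `CREATE TABLE ... PARTITION OF` instead; the month-bound logic carries over unchanged.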

Enabling Query Caching

For engines that support it (notably MySQL 5.7 and earlier; the query cache was removed in MySQL 8.0), enabling the query cache can reduce database workload by storing result sets in memory for repetitive read queries instead of hitting storage. Subsequent executions of cached queries then return results much faster by avoiding disk I/O.

Identifying Cacheable Queries

To determine which queries may benefit from caching:

  • Enable RDS slow query logging to capture queries exceeding a defined execution time threshold. This helps uncover repetitive and expensive read queries.
  • Analyze slow logs to detect queries frequently run against largely static reference or lookup tables. These are good caching candidates since the underlying data rarely changes.
  • Focus on read-only or read-mostly workloads. Write-intensive queries are less likely to benefit.

Calculating Cache Memory Size

Once cacheable queries are identified, estimate potential cache memory requirements:

  • Check historical slow logs for the result set size of cacheable queries.
  • Sum the result set sizes for frequently repeated queries to approximate needed cache capacity.
  • Allocate sufficient memory to achieve a high cache hit rate without overprovisioning.
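A rough capacity estimate based on these steps might look like the following; the entry shape, frequency cutoff, and headroom factor are all assumptions:

```python
def estimate_cache_mb(query_stats, headroom=1.25):
    """Sum result-set sizes of frequently repeated queries, plus headroom, in MB."""
    total_bytes = sum(q["result_bytes"] for q in query_stats
                      if q["runs_per_hour"] >= 10)  # ignore rarely-run queries
    return round(total_bytes * headroom / (1024 * 1024), 1)

stats = [
    {"query": "lookup countries", "result_bytes": 2 * 1024 * 1024, "runs_per_hour": 500},
    {"query": "monthly report", "result_bytes": 50 * 1024 * 1024, "runs_per_hour": 1},
]
print(estimate_cache_mb(stats))  # → 2.5
```

Note the large but infrequent report query is excluded: caching it would consume memory for little hit-rate benefit.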

Configuring Query Cache

Using RDS parameter groups (applicable to MySQL 5.7 and earlier):

  • Set query_cache_type to 1 to enable caching.
  • Adjust query_cache_size based on required memory calculations.
  • Monitor cache hit rate metrics to fine tune as needed for optimal memory utilization.

Enabling caching for repetitive read queries can significantly improve query performance and reduce database load. Continually tune the cache configuration to ensure optimal memory usage and cache hit ratios.

Using Change Data Capture

Change data capture (CDC) can help reduce replication lag for RDS databases with heavy write volumes by asynchronously propagating data changes to read replicas. This allows read-only queries to be offloaded to replicas without experiencing stale data.

Measuring Replication Lag

Before enabling CDC, it's important to measure and confirm that replication lag is actually causing issues:

  • Check the ReplicaLag metric in CloudWatch to see the current replication delay
  • Identify thresholds where lag starts impacting application performance
  • Determine if lag is from an overall lack of capacity or inefficient replication processes

If replica DB instances are falling too far behind on replicating changes, CDC may help offload some of that processing overhead.
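One way to quantify how often lag breaches an application threshold, given exported ReplicaLag samples (the threshold and samples here are illustrative):

```python
def lag_breaches(lag_samples_s, threshold_s=30):
    """Return the fraction of ReplicaLag samples above an application threshold."""
    over = [s for s in lag_samples_s if s > threshold_s]
    return len(over) / len(lag_samples_s)

# Lag samples in seconds over a monitoring window.
samples = [5, 12, 45, 60, 8, 90, 3, 40]
print(lag_breaches(samples))  # → 0.5
```

If a meaningful fraction of samples exceeds the threshold your application tolerates, that supports investigating CDC or additional replica capacity.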

Enabling CDC on Source DB

To use CDC with RDS, change data capture needs to be enabled at the database level on the primary RDS instance:

  • For MySQL, this means enabling the binary log with binlog_format set to ROW
  • For PostgreSQL, this enables publishing data changes to logical decoding slots

This allows data changes written to the primary database to be asynchronously captured and streamed in near real-time.

Consuming CDC Log

With CDC enabled, applications can consume the CDC log and propagate data changes directly to read replicas:

  • Use services like DMS or Kafka to read the CDC log
  • Handle translating data changes and applying to replicas
  • Offload replication processes from the primary for reduced lag
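The apply step can be illustrated with an in-memory stand-in for a replica table; the event shape below is a simplification of what a DMS task or Kafka consumer would actually deliver:

```python
def apply_cdc_events(replica, events):
    """Apply insert/update/delete change events to a dict-backed replica table."""
    for e in events:
        if e["op"] == "insert":
            replica[e["key"]] = e["row"]
        elif e["op"] == "update":
            replica[e["key"]].update(e["row"])  # merge changed columns only
        elif e["op"] == "delete":
            replica.pop(e["key"], None)
    return replica

events = [
    {"op": "insert", "key": 1, "row": {"name": "alice", "plan": "free"}},
    {"op": "update", "key": 1, "row": {"plan": "pro"}},
    {"op": "insert", "key": 2, "row": {"name": "bob", "plan": "free"}},
    {"op": "delete", "key": 2},
]
print(apply_cdc_events({}, events))  # → {1: {'name': 'alice', 'plan': 'pro'}}
```

Real consumers must also handle ordering, idempotent replay, and schema changes, which this sketch omits.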

This helps shorten the replication chain from primary DB -> read replicas and may significantly improve replica lag.

Conclusion

Using techniques like upgrading RDS instances, read replicas, partitioning, caching, CDC logs, and more can significantly optimize database performance. Continually monitor metrics to identify and resolve bottlenecks.

Key Optimization Takeaways

  • Upsize RDS instance type for improved CPU and memory.
  • Enable read replicas to offload read traffic.
  • Partition large tables for more efficient queries.
  • Cache repetitive read queries, or add a Redis or Memcached caching layer, to reduce storage load.
  • Process CDC logs for changed data replication.

Importance of Ongoing Monitoring

Continually collecting and analyzing performance metrics is critical for:

  • Detecting database bottlenecks under load.
  • Tuning database configuration settings.
  • Right-sizing RDS instance types over time.
  • Identifying needed schema changes.

Ongoing monitoring and incremental optimizations ensure efficient RDS resource utilization even as application demands evolve.
