Sean Xie
3 min readOct 21, 2020

--

Why I choose AWS Aurora over RDS Postgres?

When we started to build our application 2 years ago, we chose RDS Postgres because RDS is managed by AWS. We have not noticed performance issues until recently when more and more customers start to use our application.

We are using Django framework for the backend Restful endpoints. In order to improve the performance of our application, Newrelic was used to monitor the performance of the backend endpoints. For endpoints that have a high # of database calls for each transaction, we optimized the code to reduce the # of database calls. We useQuerySet.select_related() and prefetch_related(). This is particularly important if you have a query that is executed in a loop, and could therefore end up doing many database queries when only one was needed.

Our 2nd effort is to tweak the API server. uWSGI is used for serving Django Restful API in conjunction with Nginx, which offers direct support for uWSGI’s native uwsgi protocol. Data may flow like this:

HTTP client ↔ Nginx ↔ uWSGI ↔ Python.

For more info, you can refer to this article https://www.techatbloomberg.com/blog/configuring-uwsgi-production-deployment/.

We also increased the # of concurrent TCP connections, and create swap space to handle more concurrent connections. Below is our /etc/sysctl.conf:

net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6
net.core.somaxconn=1024
vm.swappiness=60

Our 3rd effort is to tune our database, the main topic of this article. I am going to discuss Amazon Aurora architecture, and present some performance testing results using Aurora and RDS database.

Let’s dive deep into Aurora. Amazon Aurora architecture has three significant advantages over traditional database approaches:

(1) database engine and storage are separated, so that network and storage tier does not affect database performance.

(2) Aurora only writes redo log records to storage, which reduces network IOPS by an order of magnitude. By removing this bottleneck, Aurora obtains significant throughput improvements over the base MySQL codebase from which Aurora started.

(3) The most complex and critical functions (i.e., backup and redo recovery) have been moved from one-time expensive operations in the database engine to continuous asynchronous operations. Backup and crash recovery does not interfere with foreground processing.

To compare the performance of Aurora and RDS, I use a tool called Apache benchmark (https://httpd.apache.org/docs/2.4/programs/ab.html) to run stress tests against one of the endpoints. The endpoint GET /quotes/xxxx returns details of the quote with id=xxxx. It also returns the information of quote recipients and the sender of the quote. We need to join the quote table and the user table. In our tests, 2000 requests are sent with concurrency level = 100. Below shows the benchmark results.

  1. Amazon Aurora
Document Path:          /quotes/xxxx
Concurrency Level: 100
Time taken for tests: 127.412 seconds
Complete requests: 2000
Requests per second: 15.70 [#/sec] (mean)
Time per request: 6370.586 [ms] (mean)
Time per request: 63.706 [ms] (mean, across all concurrent requests)
Aurora database engine load is low

2. RDS

Document Path:          /quotes/xxxx
Concurrency Level: 100
Time taken for tests: 147.085 seconds
Complete requests: 2000
Requests per second: 13.60 [#/sec] (mean)
Time per request: 7354.262 [ms] (mean)
Time per request: 73.543 [ms] (mean, across all concurrent requests)
RDS database engine load can be rather high and bursty

When Amazon Aurora database is used, the time per request is 63.706ms. When RDS database is used, the time per request is 73.543ms. Aurora is a better choice than RDS.

--

--

Sean Xie

Hands on Technology leader, Principal/Chief Architect, and software developer. https://www.linkedin.com/in/shunxingseanxie/