Mysql – Why the time taken to fetch results of a query for 10,000 times from a RDS is so uneven? (experiment)

tl;dr : Why the time taken to fetch results of a SELECT statement for 10,000 times from a RDS when requested from an EC2 is so uneven?

Updated the question with results of small and medium RDS servers

While experimenting with AWS for checking time being taken for fetching the result of a SQL query I got the following very uneven result:

I wrote a PHP code to report me the time taken to fetch the reqult of a SELECT query for n times from the server.

while($flag<n)
    {
       $t=microtime(true);
       $result=$con->query($q);
       $t=microtime(true)-$t;
       $total+=$t;
       $flag++;
    }

Enviornment:

All transactions were done inside a private vpc in AWS
All servers are in different zones
Configuration of MySQL on each server is : MySQL on EC2: version = 5.6, on RDS : 5.5, query_cache_size = 16777206, query_cache_state = ON.
Database A Large databass ~= 5GB, Table queried had ~= 20000 rows.

Servers :

EC2 A, Availability zone : us-east-1e, Type : t2.micro.
EC2 B, Availability zone : us-east-1b, Type : t2.micro.
RDS Availability zone : us-east-1c, Type : db.t2.micro, db.t2.small (updated), db.t2.medium (updated)

Results :

Time taken for 10,000 loops of performing a SELECT query :

Requesting server B, Database server B

Results for 5 trials were : 20, 21, 20, 20, 21 (All in sec)

Requesting server A, Database server B

Results for 5 trials were : 33, 33, 33, 33, 3 (All in sec)

Requesting server A, Database server RDS (micro)

Results for 11 trials were : 272, 709, 49, 48, 711, 593, 47, 316, 153, 47, 636 (All in sec)

Requesting server A, Database server RDS (small)

Results for 5 trials were : 53, 54, 53, 158, 698 (All in sec)

Requesting server A, Database server RDS (medium)

Results for 5 trials were : 96, 123, 579, 252 (All in sec)

Why is the time taken by RDS in the test of 10,000 loops of the SELECT statement so uneven? And why is it so high than EC2 servers?

[I don't think its due to network, because when I did the experiment with lesser loops (1000 loops) the readings for EC2 -> RDS were 4, 5, 5, 5, 4.]

When I logged time for each fetch request I noticed the following:

For the case when it took 153 seconds for 10,000 loops on RDS:

Average time Taken by each query : 0.015419

No of Queries Took more than average time out of 10000 :1644

Total Time Taken by the Queries Which took more than Average Time to Complete : 119.364 (78% of total time)

For the case when it took 636 seconds for 10,000 loops on RDS:

Average time Taken by each query : 0.063605

No of Queries Took more than average time out of 10000 :8629

Total Time Taken by the Queries Which took more than Average Time to Complete : 628.6426 (98.8% of total time)

Edit1 :

I logged time for each fetch request in the cycle of 10,000 req
I noticed that after some requests the time for each req increases to ~0.07 (from ~0.003) sec. But this increases happens after a random number of requests. Like, sometimes after ~8000 requests and sometimes after ~3000 req. What could be the reason? Also when it takes ~45 sec for 10,000 req then CPU utilization in RDS is around 5% . While when it takes >100 sec then CPU is around 10-15%.

Edit2:

I upgraded the RDS server from t2.micro to t2.small and further to t2.medium. Again the performance was uneven:

Requesting server A, Database server RDS (small)

Results for 5 trials were : 53, 54, 53, 158, 698 (All in sec)

Requesting server A, Database server RDS (medium)

Results for 5 trials were : 96, 123, 579, 252 (All in sec)

Edit3:

I switched the RDS to another zone. Now the readings seem to be consistent. May be the problem was CPU stealing by some noisy neighbor.

Requesting server A, Database server RDS (small, in a different zone)

Results for 5 trials were : 156, 151, 151, 151, 151, 302 (All in sec)

Best Answer

I noticed you're using the db.micro instance. As with EC2, the micro instances are designed to be budget-friendly, but at a cost of performance. That said, you'll get a much worse performance when loading these types of servers versus the normal instances because the CPU time is given to the instance "last" compared to other instances sharing the same hardware.

To prove the point, run your tests again against a db.medium instance and you'll find it much more consistent.

Enviornment:

Servers :

Results :

Time taken for 10,000 loops of performing a SELECT query :

Edit1 :

Edit2:

Edit3:

Best Answer

Related Solutions

Nginx + PHP-FPM Benchmarking with JMeter – Spikes

Nginx – MySQL Execution Time Spikes

Related Topic