I have an Amazon (AWS) Aurora DB cluster, and every day, its [Billed] Volume Bytes Used
is increasing.
I have checked the size of all my tables (in all my databases on that cluster) using the INFORMATION_SCHEMA.TABLES
table:
SELECT ROUND(SUM(data_length)/1024/1024/1024) AS data_in_gb, ROUND(SUM(index_length)/1024/1024/1024) AS index_in_gb, ROUND(SUM(data_free)/1024/1024/1024) AS free_in_gb FROM INFORMATION_SCHEMA.TABLES;
+------------+-------------+------------+
| data_in_gb | index_in_gb | free_in_gb |
+------------+-------------+------------+
| 30 | 4 | 19 |
+------------+-------------+------------+
Total: 53GB
So why an I being billed almost 75GB at this time?
I understand that provisioned space can never be freed, in the same way that the ibdata files on a regular MySQL server can never shrink; I'm OK with that. This is documented, and acceptable.
My problem is that every day, the space I'm billed increases. And I'm sure I am NOT using 75GB of space temporarily. If I were to do something like that, I'd understand. It's as if the storage space I am freeing, by deleting rows from my tables, or dropping tables, or even dropping databases, is never re-used.
I have contacted AWS (premium) support multiple times, and was never able to get a good explanation on why that is.
I've received suggestions to run OPTIMIZE TABLE
on the tables on which there is a lot of free_space
(per the INFORMATION_SCHEMA.TABLES
table), or to check the InnoDB history length, to make sure deleted data isn't still kept in the rollback segment (ref: MVCC), and restart the instance(s) to make sure the rollback segment is emptied.
None of those helped.
Best Answer
There are multiple things at play here...
Each table is stored in its own tablespace
By default, the parameter group for Aurora clusters (named
default.aurora5.6
) definesinnodb_file_per_table = ON
. That means each table is stored in a separate file, on the Aurora storage cluster. You can see which tablespace is used for each of your tables using this query:SELECT name, space FROM INFORMATION_SCHEMA.INNODB_SYS_TABLES;
Note: I have not tried to change
innodb_file_per_table
toOFF
. Maybe that would help..?Storage space freed by deleting tablespaces is NOT re-used
Quoting AWS premium support:
But there is some obscure way to re-use some of that wasted space...
Again, quote AWS premium support:
OPTIMIZE TABLE is evil!
Because Aurora is based on MySQL 5.6,
OPTIMIZE TABLE
is mapped toALTER TABLE ... FORCE
, which rebuilds the table to update index statistics and free unused space in the clustered index. Effectively, along withinnodb_file_per_table = ON
, that means running anOPTIMIZE TABLE
creates a new tablespace file, and deletes the old one. Since deleting a tablespace file doesn't free up the storage it was using, that meansOPTIMIZE TABLE
will always result in more storage being provisioned. Ouch!Ref: https://dev.mysql.com/doc/refman/5.6/en/optimize-table.html#optimize-table-innodb-details
Using temporary tables
By default, the parameter group for Aurora instances (named
default.aurora5.6
) definesdefault_tmp_storage_engine = InnoDB
. That means every time I am creating aTEMPORARY
table, it is stored, along with all my regular tables, on the Aurora storage cluster. That means new space is provisioned to hold those tables, thus increasing the total VolumeBytesUsed.The solution for this is simple enough: change the
default_tmp_storage_engine
parameter value toMyISAM
. This will force Aurora to create theTEMPORARY
tables on the instance's local storage.Of note: the instances' local storage is limited; see the
Free Local Storage
metric on CloudWatch to see how much storage your instances have. Larger (costlier) instances have more local storage.Ref: none yet; the current Amazon Aurora documentation doesn't mention this. I asked the AWS support team to update the documentation, and will update my answer if/once they do.