Mysql – How to find and fix fragmented MySQL tables

MySQL

I used MySQLTuner which pointed out some tables were fragmented. I used

mysqlcheck –optimize -A

to optimize all tables. It fixed some tables but MySQLTuner still finds 19 tables fragmented. how can I see which tables are in need of defragmenting? Maybe OPTIMIZE TABLE will work where mysqlcheck didn't? Or what else should I try?

Best Answer

the short answer:

select  ENGINE, TABLE_NAME,Round( DATA_LENGTH/1024/1024) as data_length , round(INDEX_LENGTH/1024/1024) as index_length, round(DATA_FREE/ 1024/1024) as data_free from information_schema.tables  where  DATA_FREE > 0;

The "You must know" answer

first at all you must understand that Mysql tables get fragmented when a row is updated, so it's a normal situation. When a table is created, lets say imported using a dump with data, all rows are stored with no fragmentation in many fixed size pages. When you update a variable length row, the page containing this row is divided in two or more pages to store the changes, and these new two (or more) pages contains blank spaces filling the unused space.

This does not impact performance, unless of course the fragmentation grows too much. What is too much fragmentation, well let's see the query you're looking for:

  select  ENGINE, TABLE_NAME,Round( DATA_LENGTH/1024/1024) as data_length , round(INDEX_LENGTH/1024/1024) as index_length, round(DATA_FREE/ 1024/1024) as data_free from information_schema.tables  where  DATA_FREE > 0;

The DATA_LENGTH and INDEX_LENGTH are the space your data and indexes are using, and DATA_FREE is the total amount of bytes unused in all the table pages (fragmentation).

Here's an example of a real production table

| ENGINE | TABLE_NAME               | data_length | index_length | data_free |
| InnoDB | comments                 |         896 |          316 |         5 |

In this case we have a Table using (896 + 316) = 1212 MB, and have data a free space of 5 MB. This means a "ratio of fragmentation" of:

5/1212 = 0.0041

...Which is a really low "fragmentation ratio".

I've been working with tables with a ratio near 0.2 (meaning 20% of blank spaces) and never notice a slow down on queries, even if I optimize the table, the performance is the same. But apply a optimize table on a 800MB table takes a lot of time and blocks the table for several minutes, which is impracticable on production.

So, if you consider what you win in performance and the time wasted in optimize a table, I prefer NOT OPTIMIZE.

If you think it's better for storage, see your ratio and see how much space can you save when optimize. It's usually not too much, so I prefer NOT OPTIMIZE.

And if you optimize, the next update will create blank spaces by splitting a page in two or more. But it's faster to update a fragmented table than a not fragmented one, because if the table is fragmented an update on a row not necessarily will split a page.

I hope this helps you.