Overview of fragmented MySQL InnoDB tables

The major engine for MySQL is InnoDB, it complies with the ACID properties for a transactional database engine. Even if InnoDB is the most recommended engine for MySQL, it has also some caveats. The biggest criticism lies in the amount of disk space used. And even when we remove data, tablespaces don’t shrink.

This post is about those tables that are fragmented.

To find the top 10 of tables with free space (free space means gaps that may happen when for example large delete operations happened or if many pages had to be moved around), a simple query can be run:

SELECT CONCAT(table_schema, '.', table_name) as 'TABLE', 
       ENGINE, CONCAT(ROUND(table_rows / 1000000, 2), 'M')  ROWS, 
       CONCAT(ROUND(data_length / ( 1024 * 1024 * 1024 ), 2), 'G') DATA, 
       CONCAT(ROUND(index_length / ( 1024 * 1024 * 1024 ), 2), 'G') IDX, 
       CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL SIZE', 
       ROUND(index_length / data_length, 2)  IDXFRAC, CONCAT(ROUND(( data_free / 1024 / 1024),2), 'MB') AS data_free 
FROM information_schema.TABLES  
ORDER BY data_length + index_length desc LIMIT 10;

2M records table

Now, let’s experiment to better understand this. I’ve created a table (with sysbench ) of 2M records. let’s see the output of the preview query related to this new table:

mysql> SELECT CONCAT(table_schema, '.', table_name) as 'TABLE', 
       ENGINE, CONCAT(ROUND(table_rows / 1000000, 2), 'M')  ROWS, 
       CONCAT(ROUND(data_length / ( 1024 * 1024 * 1024 ), 2), 'G') DATA, 
       CONCAT(ROUND(index_length / ( 1024 * 1024 * 1024 ), 2), 'G') IDX, 
       CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL SIZE', 
       ROUND(index_length / data_length, 2)  IDXFRAC, CONCAT(ROUND(( data_free / 1024 / 1024),2), 'MB') AS data_free 
       FROM information_schema.TABLES  WHERE table_name='sbtest1';
+----------------+--------+-------+-------+-------+------------+---------+-----------+
| TABLE          | ENGINE | ROWS  | DATA  | IDX   | TOTAL SIZE | IDXFRAC | data_free |
+----------------+--------+-------+-------+-------+------------+---------+-----------+
| sbtest.sbtest1 | InnoDB | 1.92M | 0.41G | 0.00G | 0.41G      |    0.00 | 5.00MB    |
+----------------+--------+-------+-------+-------+------------+---------+-----------+

We can see that we have 5MB free and that the statistics shows an estimation of the size and the amount of rows.

We can also verify on the filesystem:

[root@mysql1 innodb_ruby-master]# ls -lh /var/lib/mysql/sbtest/sbtest1.ibd 
-rw-r-----. 1 mysql mysql 472M Dec 30 12:16 /var/lib/mysql/sbtest/sbtest1.ibd

Our table’s size is indeed 472MB .

I hacked innodb_ruby from Jeremy Cole (a tool that I really appreciate and that I find very useful especially when used to show the lsn heat map), to have the possibility to output

a map of the table showing the used and the free pages.

This is the output for our tables ( sbtest.sbtest1 ):

As you can see the 5MB of data free are mostly at the end (happening when allocating extra extends) but there are also some small gaps (I didn’t check yet why?).

Deleting the first 100k records

Now let’s delete the first 100k records and see what happens:

mysql> delete from sbtest1 order by id limit 100000;
Query OK, 100000 rows affected (4.10 sec)

As innodb_ruby reads from the table space file, it’s mandatory to flush all pages from the Buffer Pool to disk (all the changes must be applied to disk). The best way to achieve this is to

stop mysqld not using the default fast shutdown and restart it:

mysql> set global innodb_fast_shutdown=0;                                                                                                                                                                           Query OK, 0 rows affected (0.08 sec)
[root@mysql1 mysql]# systemctl restart mysqld

We can check now the if there is some free pages:

+----------------+--------+-------+-------+-------+------------+---------+-----------+
| TABLE          | ENGINE | ROWS  | DATA  | IDX   | TOTAL SIZE | IDXFRAC | data_free |
+----------------+--------+-------+-------+-------+------------+---------+-----------+
| sbtest.sbtest1 | InnoDB | 1.92M | 0.41G | 0.00G | 0.41G      |    0.00 | 25.00MB   |
+----------------+--------+-------+-------+-------+------------+---------+-----------+

And verify this with the output of the page:

We can definitely see where the allocated but unused space is located.

Deleting 200k random records

Let’s check by deleting more records but completely random:

mysql> delete from sbtest1 order by rand() limit 200000; 
Query OK, 200000 rows affected (4 min 3.42 sec)

Now let’s have a look at the unused space:

+----------------+--------+-------+-------+-------+------------+---------+-----------+
| TABLE          | ENGINE | ROWS  | DATA  | IDX   | TOTAL SIZE | IDXFRAC | data_free |
+----------------+--------+-------+-------+-------+------------+---------+-----------+
| sbtest.sbtest1 | InnoDB | 1.67M | 0.40G | 0.03G | 0.43G      |    0.07 | 25.00MB   |
+----------------+--------+-------+-------+-------+------------+---------+-----------+

mmm… !? less rows but the data_free is still the same (25MB). What does the output tell ?

We don’t see any difference !?

In fact, this is because the records deleted were certainly stored on a page with others records that were not deleted. So the page is still in use. Not completely but still in use, so not yet

totally free.

Deleting 100k records in the middle

OK… let’s then delete 100k sequential records in the middle of the table:

mysql> delete from sbtest1 where id> 1500000 order by id limit 100000;
Query OK, 100000 rows affected (5.36 sec)

Let’s verify the free space:

+----------------+--------+-------+-------+-------+------------+---------+-----------+
| TABLE          | ENGINE | ROWS  | DATA  | IDX   | TOTAL SIZE | IDXFRAC | data_free |
+----------------+--------+-------+-------+-------+------------+---------+-----------+
| sbtest.sbtest1 | InnoDB | 1.67M | 0.40G | 0.03G | 0.43G      |    0.07 | 48.00MB   |
+----------------+--------+-------+-------+-------+------------+---------+-----------+

This time we have more free space and we can see it:

However the size on disk is still the same (we didn’t gain back those 40MB):

[root@mysql1 mysql]# ls -lh /var/lib/mysql/sbtest/sbtest1.ibd 
-rw-r-----. 1 mysql mysql 472M Dec 30 13:03 /var/lib/mysql/sbtest/sbtest1.ibd

Deleting all the records

Now let’s see what happens if we delete all records from our table:

mysql> delete from sbtest1 ;
Query OK, 1600000 rows affected (1 min 36.80 sec)

We can check the free space using the query in Information_Schema:

+----------------+--------+-------+-------+-------+------------+---------+-----------+
| TABLE          | ENGINE | ROWS  | DATA  | IDX   | TOTAL SIZE | IDXFRAC | data_free |
+----------------+--------+-------+-------+-------+------------+---------+-----------+
| sbtest.sbtest1 | InnoDB | 0.00M | 0.38G | 0.03G | 0.41G      |    0.08 | 463.00MB  |
+----------------+--------+-------+-------+-------+------------+---------+-----------+

No more rows and a lot of free space.

Still no change on the filesystem (as expected):

[root@mysql1 mysql]# ls -lh /var/lib/mysql/sbtest/sbtest1.ibd 
-rw-r-----. 1 mysql mysql 472M Dec 30 13:09 /var/lib/mysql/sbtest/sbtest1.ibd

Let’s see the table’s illustration:

Optimizing

So we don’t have any records and we use a “lot” of diskpace, which is completely wasted. Can we do something ?

Yes we can optimize the table (the table doesn’t support it but will perform a recreate + analyze). We can so run one of these commands:

mysql> optimize table sbtest1;

or

mysql> alter table sbtest1 engine=innodb;

After such operation, we can finally see the result on the filesystem and on the page’s illustration:

[root@mysql1 mysql]# ls -lh /var/lib/mysql/sbtest/sbtest1.ibd 
-rw-r-----. 1 mysql mysql 112K Dec 30 13:14 /var/lib/mysql/sbtest/sbtest1.ibd

Conclusion

In conclusion, if diskpace is a concern for you (or if you are doing physical backups regularly), it could be interesting for you to check the fragmentation of your tables (when you use InnoDB). And if like me (and many others) you like InnoDB, please take a look at innodb_ruby if it’s not yet done :wink:

lefred's blog责编内容来自:lefred's blog (源链) | 更多关于

阅读提示:酷辣虫无法对本内容的真实性提供任何保证,请自行验证并承担相关的风险与后果!
本站遵循[CC BY-NC-SA 4.0]。如您有版权、意见投诉等问题,请通过eMail联系我们处理。
酷辣虫 » 综合技术 » Overview of fragmented MySQL InnoDB tables

喜欢 (0)or分享给?

专业 x 专注 x 聚合 x 分享 CC BY-NC-SA 4.0

使用声明 | 英豪名录