What's the difference between varchar(50) and varchar(500) in MySQL? Do you know?
I. Problem Description
When designing table schemas, our design guidelines include the following rule:
- For variable-length fields, use the shortest length that still satisfies the requirement.
Why this rule? A quick look around the internet turns up two main reasons:
- storage-space considerations
- performance considerations
It is commonly claimed that varchar(50) and varchar(500) occupy exactly the same storage space. Is that really true?
And on the performance side, does an overly long declared length really hurt query performance?
In this article I will investigate and verify both questions.
II. Verifying the Storage-Space Difference
1. Prepare two tables
CREATE TABLE `category_info_varchar_50` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `name` varchar(50) NOT NULL COMMENT 'category name',
  `is_show` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'visible: 0 disabled, 1 enabled',
  `sort` int(11) NOT NULL DEFAULT '0' COMMENT 'sort order',
  `deleted` tinyint(1) DEFAULT '0' COMMENT 'deleted flag',
  `create_time` datetime NOT NULL COMMENT 'creation time',
  `update_time` datetime NOT NULL COMMENT 'update time',
  PRIMARY KEY (`id`) USING BTREE,
  KEY `idx_name` (`name`) USING BTREE COMMENT 'name index'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='category';
CREATE TABLE `category_info_varchar_500` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `name` varchar(500) NOT NULL COMMENT 'category name',
  `is_show` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'visible: 0 disabled, 1 enabled',
  `sort` int(11) NOT NULL DEFAULT '0' COMMENT 'sort order',
  `deleted` tinyint(1) DEFAULT '0' COMMENT 'deleted flag',
  `create_time` datetime NOT NULL COMMENT 'creation time',
  `update_time` datetime NOT NULL COMMENT 'update time',
  PRIMARY KEY (`id`) USING BTREE,
  KEY `idx_name` (`name`) USING BTREE COMMENT 'name index'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='category';
2. Prepare data
Insert identical data into both tables. To make any difference stand out, insert 1,000,000 rows into each:
DELIMITER $$
CREATE PROCEDURE batchInsertData(IN total INT)
BEGIN
  DECLARE start_idx INT DEFAULT 1;
  DECLARE end_idx INT;
  DECLARE batch_size INT DEFAULT 500;
  DECLARE insert_values TEXT;
  SET end_idx = LEAST(total, start_idx + batch_size - 1);
  WHILE start_idx <= total DO
    -- Build one multi-row VALUES list per batch of 500 rows
    SET insert_values = '';
    WHILE start_idx <= end_idx DO
      SET insert_values = CONCAT(insert_values, CONCAT('(\'name', start_idx, '\', 0, 0, 0, NOW(), NOW()),'));
      SET start_idx = start_idx + 1;
    END WHILE;
    SET insert_values = LEFT(insert_values, LENGTH(insert_values) - 1); -- remove the trailing comma
    -- Insert the same batch into both tables
    SET @sql = CONCAT('INSERT INTO category_info_varchar_50 (name, is_show, sort, deleted, create_time, update_time) VALUES ', insert_values, ';');
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    SET @sql = CONCAT('INSERT INTO category_info_varchar_500 (name, is_show, sort, deleted, create_time, update_time) VALUES ', insert_values, ';');
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    SET end_idx = LEAST(total, start_idx + batch_size - 1);
  END WHILE;
END$$
DELIMITER ;
CALL batchInsertData(1000000);
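A quick sanity check (assuming the procedure ran to completion) is to confirm that both tables ended up with the same number of rows:
-- Both counts should be 1,000,000
SELECT COUNT(*) FROM category_info_varchar_50;
SELECT COUNT(*) FROM category_info_varchar_500;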
3. Verify storage space
SQL to check the first table's size:
SELECT
  table_schema AS "Database",
  table_name AS "Table",
  table_rows AS "Rows (estimate)",
  TRUNCATE ( data_length / 1024 / 1024, 2 ) AS "Data size (MB)",
  TRUNCATE ( index_length / 1024 / 1024, 2 ) AS "Index size (MB)"
FROM
  information_schema.TABLES
WHERE
  table_schema = 'test_mysql_field'
  AND TABLE_NAME = 'category_info_varchar_50'
ORDER BY
  data_length DESC,
  index_length DESC;
Query result (screenshot not reproduced here).
SQL to check the second table's size:
SELECT
  table_schema AS "Database",
  table_name AS "Table",
  table_rows AS "Rows (estimate)",
  TRUNCATE ( data_length / 1024 / 1024, 2 ) AS "Data size (MB)",
  TRUNCATE ( index_length / 1024 / 1024, 2 ) AS "Index size (MB)"
FROM
  information_schema.TABLES
WHERE
  table_schema = 'test_mysql_field'
  AND TABLE_NAME = 'category_info_varchar_500'
ORDER BY
  data_length DESC,
  index_length DESC;
Query result (screenshot not reproduced here).
4. Conclusion
The two tables occupy exactly the same amount of space; there is no difference.
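This matches how VARCHAR is stored: InnoDB writes only the actual bytes of each value plus a 1- or 2-byte length prefix, so the declared maximum does not change the on-disk size when the data itself is identical. (Strictly speaking, with utf8mb4 a varchar(500) value needs a 2-byte length prefix while varchar(50) needs only 1 byte, but that single byte per row is invisible at the page-level granularity reported by information_schema.) A minimal check at the data level, using the test tables above:
-- The stored byte lengths are identical in both tables,
-- regardless of the declared VARCHAR length
SELECT MAX(LENGTH(name)) AS max_bytes, AVG(LENGTH(name)) AS avg_bytes FROM category_info_varchar_50;
SELECT MAX(LENGTH(name)) AS max_bytes, AVG(LENGTH(name)) AS avg_bytes FROM category_info_varchar_500;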
III. Verifying the Performance Difference
1. Covering-index queries
select name from category_info_varchar_50 where name = 'name100000';
-- elapsed: 0.012s
select name from category_info_varchar_500 where name = 'name100000';
-- elapsed: 0.012s
select name from category_info_varchar_50 order by name;
-- elapsed: 0.370s
select name from category_info_varchar_500 order by name;
-- elapsed: 0.379s
With covering-index queries the performance difference is negligible.
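These queries touch only the name column, so they can be answered entirely from the idx_name secondary index (a covering index). A quick way to confirm this, with output that varies slightly by MySQL version, is EXPLAIN:
EXPLAIN select name from category_info_varchar_50 where name = 'name100000';
-- Expected (version-dependent): key = idx_name, Extra = Using index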
2. Indexed queries
select * from category_info_varchar_50 where name = 'name100000';
-- elapsed: 0.012s
select * from category_info_varchar_500 where name = 'name100000';
-- elapsed: 0.012s
select * from category_info_varchar_50 where name in('name100','name1000','name100000','name10000','name1100000',
'name200','name2000','name200000','name20000','name2200000','name300','name3000','name300000','name30000','name3300000',
'name400','name4000','name400000','name40000','name4400000','name500','name5000','name500000','name50000','name5500000',
'name600','name6000','name600000','name60000','name6600000','name700','name7000','name700000','name70000','name7700000','name800',
'name8000','name800000','name80000','name6600000','name900','name9000','name900000','name90000','name9900000')
-- elapsed: 0.011s - 0.014s
-- with ORDER BY name added: 0.012s - 0.015s
select * from category_info_varchar_500 where name in('name100','name1000','name100000','name10000','name1100000',
'name200','name2000','name200000','name20000','name2200000','name300','name3000','name300000','name30000','name3300000',
'name400','name4000','name400000','name40000','name4400000','name500','name5000','name500000','name50000','name5500000',
'name600','name6000','name600000','name60000','name6600000','name700','name7000','name700000','name70000','name7700000','name800',
'name8000','name800000','name80000','name6600000','name900','name9000','name900000','name90000','name9900000')
-- elapsed: 0.012s - 0.014s
-- with ORDER BY name added: 0.014s - 0.017s
Index range (IN-list) queries perform essentially the same; once ORDER BY is added, a small gap starts to appear.
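For reference, EXPLAIN on the IN query should report a range scan over idx_name (the exact plan depends on the MySQL version and optimizer statistics):
EXPLAIN select * from category_info_varchar_500
where name in ('name100', 'name1000', 'name100000');
-- Expected (version-dependent): type = range, key = idx_name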
3. Full-table scans and sorting
Full table scan, no sort:
select * from category_info_varchar_50;
-- elapsed: 0.875s
select * from category_info_varchar_500;
-- elapsed: 0.866s
Full table scan with sort:
select * from category_info_varchar_50 order by name;
-- elapsed: 1.498s
select * from category_info_varchar_500 order by name;
-- elapsed: 4.875s
Conclusion:
With a full scan and no sort, the two tables perform identically; with a full scan plus ORDER BY, the performance gap is huge.
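One way to see where the extra time goes is to reset the session status counters, run the slow query, and inspect the sort-related counters (the next section reaches the same conclusion via query profiling). Note that Created_tmp_files is a server-wide counter, so on a busy instance it also reflects other sessions' activity:
FLUSH STATUS;                              -- reset session status counters
select * from category_info_varchar_500 order by name;
SHOW SESSION STATUS LIKE 'Sort%';          -- Sort_merge_passes, Sort_rows, ...
SHOW GLOBAL STATUS LIKE 'Created_tmp%';    -- temporary tables / files created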
Analyzing the cause
Profiling the full-table ORDER BY on category_info_varchar_50
(Profiling screenshot not reproduced here.)
About 86% of the time is spent on sending data. Next, look at the status counters, focusing on Created_tmp_files and Sort_merge_passes.
(Status screenshots not reproduced here.)
Created_tmp_files is 3.
Sort_merge_passes is 95.
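The per-stage breakdown above was captured with a GUI client; a similar breakdown can be obtained from any client with the classic profiling interface (deprecated in favor of the Performance Schema, but still available):
SET profiling = 1;
select * from category_info_varchar_50 order by name;
SHOW PROFILES;                 -- list recent queries with their ids
SHOW PROFILE FOR QUERY 1;      -- stage-by-stage time for the chosen query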
Profiling the full-table ORDER BY on category_info_varchar_500
(Profiling screenshot not reproduced here.)
This time an additional temporary-table sort stage appears in the profile.
(Status screenshots not reproduced here.)
Created_tmp_files is 4.
Sort_merge_passes is 645.
MySQL's documentation describes Sort_merge_passes as follows:
Number of merge passes that the sort algorithm has had to do. If this value is large, you may want to increase the value of the sort_buffer_size.
In other words, Sort_merge_passes is the number of merge passes MySQL's sort algorithm had to perform. A large value means the data being sorted is much larger than sort_buffer_size; we can reduce the number of merge passes either by increasing sort_buffer_size or by making the sort tuples written into the sort buffer smaller.
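To see this effect, the sort buffer can be inspected and, for the current session only, enlarged; the 8 MB figure below is purely illustrative, not a tuning recommendation:
SHOW VARIABLES LIKE 'sort_buffer_size';            -- 262144 (256KB) by default
SET SESSION sort_buffer_size = 8 * 1024 * 1024;    -- 8MB, illustrative value only
FLUSH STATUS;
select * from category_info_varchar_500 order by name;
SHOW SESSION STATUS LIKE 'Sort_merge_passes';      -- should drop with a larger buffer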
IV. Final Conclusion
At this point the picture is clear: when MySQL sorts on a column, it estimates the memory needed per sort row from the column's declared length rather than from the actual data. If a variable-length field is declared far longer than necessary, this estimate can exceed what fits in sort_buffer_size, and MySQL falls back to an on-disk merge sort using temporary files, which ultimately hurts query performance.
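A rough back-of-the-envelope illustration, assuming utf8mb4 (up to 4 bytes per character), the default 256 KB sort buffer, and that MySQL budgets sort-key space by the declared maximum length (as the profiling above suggests): the name sort key is reserved at 50 × 4 = 200 bytes for varchar(50) but 500 × 4 = 2,000 bytes for varchar(500), so roughly ten times fewer rows fit into the sort buffer per pass. That is consistent with the jump in Sort_merge_passes from 95 to 645 observed above, and it is exactly why the design guideline quoted at the start of the article asks for the shortest variable-length field that satisfies the requirement.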