Mysql复杂查询总结

一、窗口函数查询

数据表结构

字段(user_invite表) 描述
id 主键编号
uuid 用户唯一编号
gift_quantity 金币数量
gift_type 礼物类型
state 状态(0.未成功1.成功)
create_time 创建时间
update_time 更新时间

涉及时间函数

函数 描述
to_days() 获取日期总共天数函数
now() 获取当前日期+时间函数,例如:2008-11-11 12:45:34
curdate() 获取当前日期函数,例如:2008-11-11
curtime() 获取当前时间函数,例如:12:45:34
dense_rank() 并列排序,不会跳过重复序号
rank() 并列排序,会跳过重复序号
row_number() 顺序排序,不跳过任何一个序号,就是行号
over() 窗口函数,必须定义别名,且只在高版本存在Mysql_5.7没有
partition by 为分组函数,跟group by雷同但只对其中某些字段做分组排序,例如:over(PARTITION BY a order by b),对over()内数据a进行分组后再按照b来排序
group by 为分组函数,进行分组聚合查询,只保留参与分组的字段和聚合函数的结果

按天统计state为不同值的总条数(以便做概率统计)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
SELECT
DATE_FORMAT( create_time, '%Y-%m-%d' ) DATA,
COUNT( * ) AS total,
sum( CASE WHEN state = 1 THEN 1 ELSE 0 END ) AS successTotal,
sum( CASE WHEN state = 0 THEN 1 ELSE 0 END ) AS failTotal,
concat(
ROUND( sum( CASE WHEN state = 1 THEN 1 ELSE 0 END ) / COUNT( * ) * 100, 2 ),
'%'
) AS successRate
FROM
user_invite
WHERE
-- 统计当天数据
-- TO_DAYS( CreateTime ) = TO_DAYS(NOW())

-- 统计近一周里每一天的数据
-- TO_DAYS (NOW()) - TO_DAYS(create_time) < '7' AND TO_DAYS(NOW()) - TO_DAYS(create_time) >= 0
-- 或者
-- DATE_SUB(CURDATE(), INTERVAL 7 DAY) <= date(create_time)

-- 统计近一个月里每一天的数据
-- DATE_FORMAT(create_time,'%Y%m') = DATE_FORMAT(CURDATE(),'%Y%m')
-- 或者 DATE_SUB(CURDATE(), INTERVAL 30 DAY) <= date(时间字段名)

GROUP BY
DATA
ORDER BY
DATA

查询结果例子:

image-20220923194143769

密集连续排行-(筛选出各礼物类型金币数量消耗前三的数据且包含金币相同)

1
2
3
4
5
6
7
-- 按照gift_type分组再按礼物数量降序
SELECT
*
FROM
( SELECT *, ( dense_rank () over ( PARTITION BY gift_type ORDER BY gift_quantity DESC ) ) AS sort FROM gift_order ) AS m
WHERE
sort <=3

查询结果例子:

image-20220923202607769

二、大批量插入优化

Mybatis推荐ExecutorType.BATCH的方式进行大数据批量插入(替代<foreach>标签插入)参考:官网

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
//<forearch>主要用于20~50条的批量数据插入,随着数据增长越多所需的执行时间也越多
SqlSession session = sqlSessionFactory.openSession(ExecutorType.BATCH);
try {
SimpleTableMapper mapper = session.getMapper(SimpleTableMapper.class);
List<SimpleTableRecord> records = getRecordsToInsert(); // not shown

BatchInsert<SimpleTableRecord> batchInsert = insert(records)
.into(simpleTable)
.map(id).toProperty("id")
.map(firstName).toProperty("firstName")
.map(lastName).toProperty("lastName")
.map(birthDate).toProperty("birthDate")
.map(employed).toProperty("employed")
.map(occupation).toProperty("occupation")
.build()
.render(RenderingStrategy.MYBATIS3);

batchInsert.insertStatements().stream().forEach(mapper::insert);

session.commit();
} finally {
session.close();
}

三、深分页优化

标签记录法(思路:记录这一次查询的id,将id带入下次查询快速定位)

1
select  id,name,balance FROM account where id > 100000 limit 10;

延迟关联法(思路:快速定位需要获取的id字段,再关联)

1
select  a.* FROM account as a,(select id from account where 条件 limit 100000,10) as a2 where a1.id=a2.id

四、切片大查询删除优化

一个大查询如果一次性执行的话,可能一次锁住很多数据、占满整个事务日志、耗尽系统资源、阻塞很多小的但重要的查询

1
DELEFT FROM messages WHERE create < DATE_SUB(NOW(), INTERVAL 2 MONTH);

优化后:

1
2
3
4
5
6
7
8
9
10
11
12
13
-- 定义存储过程
CREATE PROCEDURE DeleteData()
BEGIN
DECLARE rows_affected INT DEFAULT 1; -- 初始化为一个大于0的值以进入循环

WHILE rows_affected > 0 DO
delete FROM messages WHERE create < DATE_SUB(NOW(), INTERVAL 2 MONTH) limit 1000;
SET rows_affected = ROW_COUNT(); -- 获取受影响的行数
END WHILE;
END

-- 调用存储过程
CALL DeleteData()

五、切片大更新优化

主要适用于大批量更新表字段,并且表字段没有索引,会导致上升为表锁

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
-- 定义存储过程
CREATE PROCEDURE batchUpdate()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE current_pid INT;
DECLARE batch_size INT DEFAULT 100;
DECLARE offset INT DEFAULT 0;
DECLARE total_rows INT;
DECLARE cur CURSOR FOR SELECT id FROM `user` WHERE age > 200 LIMIT batch_size OFFSET offset;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

-- 获取总行数
SELECT COUNT(*) INTO total_rows FROM `user` WHERE age > 200;

WHILE offset < total_rows DO
OPEN cur;
read_loop: LOOP
FETCH cur INTO current_pid;
IF done THEN
LEAVE read_loop;
END IF;
-- 执行更新操作
UPDATE `user` SET `name` = CONCAT(`name`,"-01")
WHERE id = current_pid;
END LOOP;
CLOSE cur;
COMMIT; -- 提交当前批次的更新操作

SET offset = offset + batch_size;
SET done = FALSE; -- 重置 done 标志
END WHILE;
END

-- 调用存储过程
CALL batchUpdate()

六、查询各表的占用情况

1
2
3
4
5
6
7
8
9
SELECT
table_name ,
round(table_rows, 0) AS 'Rows',
round(data_length / 1024 / 1024, 2) AS 'Data MB',
round(index_length / 1024 / 1024, 2) AS 'Index MB',
round((data_length + index_length) / 1024 / 1024, 2) AS 'Total MB',
round(data_free / 1024 / 1024, 2) AS 'Free MB'
FROM information_schema.tables
WHERE table_schema = 'dj_put' ;