uid | status |
---|---|
1 | 1 |
2 | 2 |
1 | 2 |
1 | 3 |
2 | 3 |
查询只存在 status=1 的用户,这两种写法有什么区别吗
SELECT * FROM t where `status` = 1 and uid not IN (
SELECT uid FROM t as t1 WHERE `status` in (2,3) and t.uid = t1.uid
)
SELECT * FROM t where `status` = 1 and uid not IN (
SELECT uid FROM t as t1 WHERE `status` in (2,3)
)
场景是查历史用户,即 status != 2,3 的,以上是简化的模型,两个字段都有索引
上面的多了一个子查询和外表的条件,在表数据量大的情况下性能谁更好,leader 说第一种查询效率会更高,因为它只在满足外表的条件下进行了过滤。或者有没有其他性能更好的写法?
1
kkwa56188 2023-04-23 21:16:53 +08:00
不是做 Mysql 的, 感觉是 1 比较快.
其次 用 in (value1, value2) 通常是很慢的, 索引不一定管用. 我会换成: SELECT * FROM t where t.status = 1 and not exists ( SELECT uid FROM t as t1 WHERE status = 2 and t1.uid = t.uid ) and not exists ( SELECT uid FROM t as t2 WHERE t2.status = 3 and t2.uid = t.uid ) |
2
joetse 2023-04-23 23:03:51 +08:00 via iPhone
为什么不用 bitmap?
|
3
akira 2023-04-23 23:20:47 +08:00 1
status 区分度这么低,再来个 not in ,性能不好上来吧。试试这么考虑呢,虽然是有点取巧了
select uid, min(status) `mins`, max(status) `maxs` from t group by uid having `mins` = 1 and `maxs` = 1 |
4
512357301 2023-04-23 23:29:12 +08:00 via Android
用 group by 呢?
select uid, sum(if(status = 1,1,0)) one_cnt, sum(if(status = 2 or status = 3,1,0)) other_cnt from t group by uid having one_cnt > 0 and other_cnt = 0 没在机器上测试,反正我下意识的写法是这么写,楼主可以测试下。 PS:对楼主和楼上各位的子查询写法感觉很别扭,我一直用不惯子查询嵌套,更习惯 left join ,不过 mysql 出身的貌似都这么写,存在即是合理,我不争辩。 |
5
xuanbg 2023-04-24 09:49:51 +08:00
我会写成这样:
with have23 as (SELECT uid FROM t as t1 WHERE `status` in (2,3)) select * from t left join have23 h on h.uid = t.uid where h.uid is null; |