【SQL】查询已经连续7天登陆的用户

论坛 期权论坛 编程之家     
选择匿名的用户   2021-5-28 19:45   11   0

整体思路:

  • 用户登录日期去重-->row_number()函数分组排序获得排序编号-->日期减去编号——根据每个用户count(值)判断连续登陆天数

具体实现:

  • 用户每天登陆次数可能不止一次,所以需要将日期去重。
  • 用row_number()函数将用户分组,按照去重后的日期排序后计数得到登录次数。
  • 然后用日期减去计数得到结果(如果每次减去的结果相同,则表明用户一直处于连续登陆登录状态)。
  • 最后按照用户和结果分组,得到相同值出现的次数,此时得到的值大于等于7的就是连续7天登陆的用户了。

实现步骤:

  • 日期去重
select DISTINCT dt as dt,id from tb_log;
  • row_number()函数分组排序获得排序编号
select *,row_number() over(PARTITION by id order by dt) as cnt 
from 
(
select DISTINCT dt as dt,id 
from tb_log
)a;
  • 日期减去排序编号得到结果值
select *,date(dt)-cnt as diff_
from 
(
    select *,row_number() over(PARTITION by id order by dt) as cnt 
    from 
    (
        select DISTINCT date(date) as dt,id 
        from log_data
    )a
)b;
  • 按照用户和结果分组
select 
 id,count(1) as cnt 
from 
(
 select *,date(dt)-cnt as diff_
 from 
 (
  select *,row_number() over(PARTITION by id order by dt) as cnt 
  from 
  (
   select DISTINCT date(dt) as dt,id 
   from log_data
  )a
 )b
)c 
group by id,diff_
having count(1)>=7;
  • 思路梳理
iddtrow_cntdt-row_cntdiff
user_id120201101120201101-120201100
user_id120201102220201101-220201100
user_id120201103320201101-320201100
user_id120201104420201101-420201100
user_id120201105520201101-520201100
user_id120201106620201101-620201100
user_id220201101120201101-120201100
user_id320201102120201102-120201101
user_id320201103220201103-220201101

应用:

计算用户的连续登陆天数【根据uid和(时间-分组编号)分组聚合】

SELECT
uin,
COUNT(uid) AS days_
FROM(
 SELECT
 uid,
 dt,
 row_number() OVER(PARTITION BY uid order by dt asc) AS rn
 FROM
 (
     SELECT 
     uid,
     statis_day  
     FROM tab_user_log 
     WHERE dt>= 20170101    
     AND dt <= 20180901
 )a 
)
GROUP BY uid, date_sub(dt,CAST(rn AS INT))

分享到 :
0 人收藏
您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

积分:3875789
帖子:775174
精华:0
期权论坛 期权论坛
发布
内容

下载期权论坛手机APP