MySQL删除复杂的重复数据的解决方案(一条数据项中包含多个值的情况)

步骤:

1.建立一个序列表并初始化数据

2.对有重复数据的字段项进行列转行

3.删除列转行后的重复记录

4.用group_concat函数将需要去重的字段项转化成以逗号分隔的字符串

5.对原表进行关联更新

6.删除去重数据创建的辅助表

数据演示,仅供参考

建表,初始化数据

create table user1_practice(
id int not null auto_increment primary key,
user_name varchar(3),
`over` varchar(5),
mobile varchar(100));
insert into user1_practice(user_name,`over`,mobile) 
values ('唐僧','旃檀功德佛','12112345678,14112345678,12112345678'),
('猪八戒','净坛使者','12144643321,14144643321'), 
('孙悟空','斗战胜佛','12166666666,14166666666,18166666666,18166666666'),
('沙僧','金身罗汉','12198343214,14198343214');

查看表数据:select * from user1_practice;

 在表中我们可以发现user_name为唐僧的mobile有两条重复数据12112345678,user_name为孙悟空的mobile有两条重复数据18166666666,下面按步骤来将这两条记录去重

一.建立一个序列表并初始化数据

create table tb_sequence(id int not null auto_increment primary key);
insert into tb_sequence values(),(),(),(),(),(),(),(),();

查看表数据:select * tb_sequence;

 此表只包含了一串自增id序列号

二.列转行后的表user1_trans1(user1_practice)

create table user1_trans1 as 
select a.id,user_name,`over`,replace(substring(
substring_index(mobile,',',a.id),
char_length(substring_index(mobile,',',a.id-1))+1),',','') as mobile 
from tb_sequence a 
cross join(select user_name,`over`,concat(mobile,',') as mobile,
length(mobile)-length(replace(mobile,',',''))+1 as size 
from user1_practice b) b on a.id <= b.size;

查看表数据:select * from user1_trans1;

三.删除user1_trans1表中的重复记录

delete a from user1_trans1 a 
join (select user_name,`over`,mobile,count(*),max(id) as id 
from user1_trans1  
group by user_name,`over`,mobile having count(*) > 1  ) b
on a.user_name = b.user_name and a.`over` = b.`over` and a.mobile = b.mobile 
where a.id < b.id;

查看表数据:select * from user1_trans1;

 四.用group_concat函数将mobile转化成以逗号分隔的字符串

create table user1_trans2 as 
select user_name,`over`,group_concat(mobile) as mobile 
from user1_trans1 group by user_name,`over`;

查看表数据:select * from user1_trans2;

 五.对原表user1_practice进行关联更新

update user1_practice a 
inner join user1_trans2 b on a.user_name = b.user_name
set a.mobile = b.mobile; 

查看表数据:select * from user1_practice;

 六.删除tb_sequence,user1_trans1,user1_trans2等辅助表

drop table tb_sequence;
drop table user1_trans1;
drop table user1_trans2;

此文来源于慕课网地址:https://www.imooc.com/video/8281

 

 

 

原文地址:https://www.cnblogs.com/zhukf/p/13408754.html