How can large volumes of Excel data be deduplicated and then imported into MySQL accurately and quickly?

Problem description

The customer table users has the following fields (shown in the original post as a screenshot, clipboard.png):

  • There are 5 sales groups. Each sales group imports at least one batch of 100,000 rows into the table every month, and can only import data for its own group.
  • Within one sales group, the mobile phone numbers of customers with normal status (state=0 or 1) must be unique. Duplicates are discarded.
  • A customer's mobile phone number may be repeated across different sales groups.
  • The number of normal customers per sales group is estimated to stay under 10 million.

The data must be imported into the users table accurately and quickly.

Methods and problems you have tried

  • Currently, the mobile phone numbers in the users table are synchronized to Redis, stored in one set per sales group ID:
SADD users:1 13100010001 13100010002 13100010003
SADD users:2 13100020001 13100020002 13100020003
SADD users:3 13100030001 13100030002 13100030003
  • When importing, first write the mobile numbers from the Excel file to Redis under the file ID (assumed to be 8 here), which gives the set mobiles:8.
  • Then take the difference between mobiles:8 and the sales group's existing set users:1 to get diff:1.
  • Finally, write the numbers in diff:1 to MySQL.
SADD mobiles:8 13100010001 13100010004
SDIFFSTORE diff:1 mobiles:8 users:1
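
The difference step can be mirrored with plain Python sets to check what diff:1 should contain. This is only an illustration: the variable names are made up here, and nothing in it talks to Redis.

```python
# Plain Python sets standing in for the Redis sets in the post.
# SADD users:1 13100010001 13100010002 13100010003
users_1 = {"13100010001", "13100010002", "13100010003"}
# SADD mobiles:8 13100010001 13100010004
mobiles_8 = {"13100010001", "13100010004"}

# SDIFFSTORE diff:1 mobiles:8 users:1
# Members of mobiles:8 that are not in users:1.
diff_1 = mobiles_8 - users_1

print(sorted(diff_1))  # ['13100010004']
```

Only 13100010004 is new for sales group 1, so it is the only number that would be written to MySQL.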

[Problem] The management side can add or delete rows in the users table in several places and, together with other unknown causes, this leaves the data in MySQL and Redis inconsistent.

What result do you expect?

Is there a better way to import the data into MySQL accurately and quickly?

Jun.30,2022

In fact, the main problem is that the Redis and MySQL data are inconsistent. If consistency can be guaranteed, the current method is feasible as well.

If it cannot be guaranteed, the only option is to check against MySQL whether each mobile phone number already exists.
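
A rough sketch of that check, looking numbers up in chunks rather than one by one. It uses Python's built-in sqlite3 as a stand-in so it runs without a server. The column names gid, mobile and state are assumed from the thread, and existing_mobiles is a hypothetical helper. Against MySQL the query shape would be the same, though most Python MySQL drivers use %s rather than ? as the placeholder.

```python
import sqlite3


def existing_mobiles(conn, gid, mobiles, chunk_size=500):
    """Return the numbers in `mobiles` already stored as normal customers of `gid`."""
    found = set()
    mobiles = list(mobiles)
    for start in range(0, len(mobiles), chunk_size):
        chunk = mobiles[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        sql = (
            "SELECT mobile FROM users "
            f"WHERE gid = ? AND state IN (0, 1) AND mobile IN ({placeholders})"
        )
        found.update(row[0] for row in conn.execute(sql, [gid, *chunk]))
    return found


# In-memory stand-in for the users table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (gid INTEGER, mobile TEXT, state INTEGER)")
conn.executemany(
    "INSERT INTO users (gid, mobile, state) VALUES (?, ?, ?)",
    [
        (1, "13100010001", 0),  # normal customer in group 1
        (1, "13100010002", 1),  # normal customer in group 1
        (1, "13100010003", 2),  # group 1, but not a normal state
        (2, "13100010004", 0),  # same kind of row, different group
    ],
)

batch = ["13100010001", "13100010003", "13100010004"]
already_there = existing_mobiles(conn, 1, batch)
to_insert = [m for m in batch if m not in already_there]
```

Here only 13100010001 counts as a duplicate for group 1, so to_insert keeps the other two numbers. The check and the later insert are separate steps, so two imports running at the same time could still both insert the same number unless the database also enforces uniqueness.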


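Create a unique index on gid and mobile, then read the Excel table and batch insert the rows directly into the table (for example with INSERT IGNORE, so that rows hitting the index are discarded).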
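
A minimal sketch of the unique-index approach, again using Python's built-in sqlite3 as a stand-in for MySQL. The table layout and the index name uk_gid_mobile are assumptions for illustration. SQLite spells the statement INSERT OR IGNORE, while MySQL's counterpart is INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (gid INTEGER, mobile TEXT, state INTEGER DEFAULT 0)")
# The database itself now rejects a second row with the same (gid, mobile).
conn.execute("CREATE UNIQUE INDEX uk_gid_mobile ON users (gid, mobile)")
conn.execute("INSERT INTO users (gid, mobile) VALUES (1, '13100010001')")

# Rows as they might come out of the Excel file: one number already stored,
# one repeated inside the file, one that belongs to another group.
rows = [
    (1, "13100010001"),
    (1, "13100010004"),
    (1, "13100010004"),
    (2, "13100010001"),
]
# Conflicting rows are skipped instead of aborting the whole batch.
conn.executemany("INSERT OR IGNORE INTO users (gid, mobile) VALUES (?, ?)", rows)
conn.commit()

stored = conn.execute("SELECT gid, mobile FROM users ORDER BY gid, mobile").fetchall()
print(stored)  # [(1, '13100010001'), (1, '13100010004'), (2, '13100010001')]
```

One caveat: a plain unique index on (gid, mobile) also blocks numbers held by rows whose state is not 0 or 1, which the rule in the question would allow, so that case needs separate handling.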


You can write a program that synchronizes MySQL to Redis and schedule it with crontab so that it runs at regular intervals.
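
The core of such a sync job could look roughly like this. plan_sync is a hypothetical helper that only works out which members are missing from, or stale in, one sales group's Redis set. Reading the rows from MySQL and sending the resulting SADD and SREM commands is left out.

```python
def plan_sync(db_mobiles, redis_members):
    """Return (to_add, to_remove) that would make the Redis set match MySQL."""
    db_mobiles = set(db_mobiles)
    redis_members = set(redis_members)
    to_add = db_mobiles - redis_members      # in MySQL, missing from Redis
    to_remove = redis_members - db_mobiles   # in Redis, no longer in MySQL
    return to_add, to_remove


# Example for one sales group: MySQL is treated as the source of truth.
to_add, to_remove = plan_sync(
    db_mobiles=["13100010001", "13100010004"],
    redis_members=["13100010001", "13100010002"],
)
```

In this example to_add holds 13100010004 (a candidate for SADD users:1) and to_remove holds 13100010002 (a candidate for SREM users:1). Between two runs the sets can still drift apart, so an import that runs just before a sync may work from stale data.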
