While executing an INSERT statement with many rows, I want to skip duplicate entries that would otherwise cause failure. After some research, my options appear to be the use of either:

ON DUPLICATE KEY UPDATE, which implies an unnecessary update at some cost, or
INSERT IGNORE, which implies an invitation for other kinds of failure to slip in unannounced.

Am I right in these assumptions? What's the best way to simply skip the rows that might cause duplicates and just continue on to the other rows?
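For concreteness, the two candidates might look like the following sketch. The table tags and its columns are hypothetical and only serve to illustrate the syntax; the no-op assignment name = name is one common way to make ON DUPLICATE KEY UPDATE leave the existing row unchanged.

```sql
-- Hypothetical table: the names are illustrative, not from the question.
CREATE TABLE tags (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  UNIQUE KEY uq_name (name)
);

INSERT INTO tags (name) VALUES ('mysql');

-- Option 1: on a duplicate, assign the column to itself so the row is left as it was.
INSERT INTO tags (name) VALUES ('mysql'), ('sql'), ('insert')
ON DUPLICATE KEY UPDATE name = name;

-- Option 2: duplicate-key errors are downgraded to warnings and those rows are skipped.
INSERT IGNORE INTO tags (name) VALUES ('mysql'), ('innodb');

-- Lists the warnings raised by the previous statement.
SHOW WARNINGS;
```

In both cases the non-duplicate rows of the multi-row statement are still inserted; the difference lies in what else each form silently tolerates.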
Best Answer
I would recommend using INSERT...ON DUPLICATE KEY UPDATE.

If you use INSERT IGNORE, then the row won't actually be inserted if it results in a duplicate key. But the statement won't generate an error. It generates a warning instead. The cases that are downgraded to warnings this way include inserting a duplicate key into columns with PRIMARY KEY or UNIQUE constraints, and inserting a NULL into a column with a NOT NULL constraint.

If you use REPLACE, MySQL actually does a DELETE followed by an INSERT internally, which has some unexpected side effects: dependent rows with foreign keys may be deleted (if the foreign keys cascade) or else prevent the REPLACE, and triggers that fire on DELETE are executed unnecessarily.

Correction: both REPLACE and INSERT...ON DUPLICATE KEY UPDATE are non-standard, proprietary inventions specific to MySQL. ANSI SQL 2003 defines a MERGE statement that can solve the same need (and more), but MySQL does not support the MERGE statement.

A user tried to edit this post (the edit was rejected by moderators). The edit tried to add a claim that
INSERT...ON DUPLICATE KEY UPDATE causes a new auto-increment id to be allocated. It's true that a new id is generated, but it is not used in the changed row.

This was tested with Percona Server 5.5.28, with the configuration variable innodb_autoinc_lock_mode=1 (the default). The test demonstrates that the IODKU statement detects the duplicate and invokes the update to change the value of u. Note that AUTO_INCREMENT=3 indicates an id was generated but not used in the row.

REPLACE, by contrast, does delete the original row and insert a new row, generating and storing a new auto-increment id.
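A minimal sketch of statements that exercise both cases, assuming a hypothetical table foo with a unique column u (the table definition and the values are illustrative, not taken from the answer's own session):

```sql
-- Hypothetical table; the name foo, the column u and the values are illustrative.
CREATE TABLE foo (
  id INT AUTO_INCREMENT PRIMARY KEY,
  u  INT,
  UNIQUE KEY (u)
) ENGINE=InnoDB;

INSERT INTO foo (u) VALUES (10);
SELECT * FROM foo;          -- note the id of the row before the next statements

-- IODKU: the duplicate on u is detected and the existing row is updated in place.
INSERT INTO foo (u) VALUES (10) ON DUPLICATE KEY UPDATE u = 20;
SELECT * FROM foo;          -- compare the row's id with the earlier value
SHOW CREATE TABLE foo;      -- the AUTO_INCREMENT table option shows the next id to be handed out

-- REPLACE: the conflicting row is deleted and a new row is inserted.
REPLACE INTO foo (u) VALUES (20);
SELECT * FROM foo;          -- compare the row's id again
```

Comparing the id column and the AUTO_INCREMENT table option after each step shows whether an id was merely consumed or actually stored. The exact counter values may vary with the server version and the innodb_autoinc_lock_mode setting.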