Jan 20, 2016

Delete duplicates and keep only the latest record

This SQL code could be helpful since it’s a general requirement to remove duplicates while keeping latest record.

Considered example, we have a primary key called Id and Account Number is the field we are concentrating on.

Though this code is a loop, this works quickly since loop goes only for records with duplicates.

DECLARE @AccountNumber uniqueidentifier
DECLARE @ID VARCHAR(50)

DECLARE _cursor1 CURSOR for
 SELECT A1.AccountNumber
 FROM Tbl_Account as A1
 WHERE A1.AccountNumber is not null
 GROUP BY A1.AccountNumber
 HAVING (SELECT COUNT(A2.AccountNumber) FROM  Tbl_Account A2 WHERE A2.AccountNumber = A1.AccountNumber)>1

OPEN _cursor1

FETCH NEXT FROM _cursor1 INTO @AccountNumber
WHILE (@@FETCH_STATUS = 0)
BEGIN

    SET @ID = NULL

    SELECT TOP 1 @ID = Id FROM Tbl_Account WHERE AccountNumber = @AccountNumber ORDER BY [ModifiedOn] DESC
    
    DELETE FROM Tbl_Account WHERE AccountNumber = @AccountNumber AND Id <> @ID
 

FETCH NEXT FROM _cursor1 INTO @AccountNumber
END      
CLOSE _cursor1

DEALLOCATE _cursor1

Enjoy!