Wednesday, June 13, 2018

A Solution for Removing duplicate records from parent table and update child tables accordingly in oracle

Think a requirement from your manager like below -


"We are trying to clean up the ORDERs table. We want to remove the duplicates records in this table with the same PRODUCT_ID, EFFECTIVE_STATE_DATE combination which are active (RECORD_STATUS=1, DEACTIVE_DATE is null and EFFECTIVE_END_DATE is null).

We only want to retain one record of such combination (oldest ORDER.PRODUCT_ID) and mark others as deleted (RECORD_STATUS=99).

When we mark any row as deleted, we need to go to the ORDER_DETAILS table and
Check if there’s any record with that PRODUCT_ID which is going to be marked as deleted. If there’s any record, that record in ORDER_DETAILS table needs to get updated with the retained PRODUCT_ID for that combination."

Solutions- 

You can use this as a template to solve this type of requirement.  This is not a real time working script. I modified the working script with sample tables (ORDERS AND ORDER_DETAILS) . please change your tables name and columns name as necessary and fixed the script if needed.

We need to go through few steps for this.

1)     Creating a temp table WITH RANK () - First TEMP table



create table TEMP_TABLE_WITH_RANK as
SELECT  order_id,product_id,effective_start_date,record_status,
       RANK ()
       OVER (PARTITION BY product_id,effective_start_date
             ORDER BY product_id)   rnk  ---------------RANKING them in this combination  --we will keep only  rnk=1 rest of the records need to marked deleted.
  FROM ORDERS
WHERE (product_id,effective_start_date) in (   
        select product_id,effective_start_date ----------Fetching all records including all duplicates and original
                      from ( select product_id,effective_start_date, count(*)  ------Finding dulicate orders in this combination
                                from ORDERS t
                                where t.RECORD_STATUS=1
                                    and t.DEACTIVE_DATE is null
                                    and t.effective_end_date is null
                                group by product_id,effective_start_date
                                having count(*) >1 )) ;
                                                                             
                                                                             
                                                                             


2)  Creating second temp table - Summary table with self joining to make a relation with the DELETING order_id and KEEPING order id.                       


                              

CREATE TABLE temp_table_summary                 
AS
select a.product_id,a.effective_start_date, b.order_id need_to_changed_order_id , a.order_id keeping_order_id
from  TEMP_TABLE_WITH_RANK a, TEMP_TABLE_WITH_RANK b
where a.rnk=1
and b.rnk<> 1
and a.product_id = b.product_id
and a.effective_start_date = b.effective_start_date ;





3)  ORDER_DETAILS  Update statement -First update statement

    --using second temp table - temp_table_summary

   

UPDATE order_details a 
   SET ORDER_ID =
          (SELECT keeping_order_id
             FROM temp_table_summary    b
            WHERE b.need_to_changed_order_id = a.ORDER_ID)
WHERE EXISTS
       (SELECT keeping_order_id
             FROM temp_table_summary    b
            WHERE b.need_to_changed_order_id = a.ORDER_ID) ;
           

4) ORDERS table update statement --Second update statement

  --using first temp table        



update ORDERS
set record_status=99 --deleted
where order_id in (select order_id from TEMP_TABLE_WITH_RANK where rnk<>1) ;



No comments: