Comparing a value from a column in dataframe2 with a value in a column in dataframe1 and if found delete the dataRow in the Dataframe2

Hello Community,
Hope you are doing well and staying safe.

I’m trying to achieve the below
“Comparing a value from a datarow’s column(eg. Id) in dataframe2 with all the value in a column(eg. Id , but check the value in all data rows) in dataframe1 and if found delete the dataRow in the Dataframe2…”

Let me explain in detail.

An excel file with multiple columns and 1 column is the ID column, the input excel file can have multiple rows with for a single ID value.

An excel (with status report) for the inputs. This excel will have 2 sheets, one for Success and one for Failure.

I take individual rows from the input data, process it and will have either of the 2 outcomes (success or failure).

What I’m trying to implement:

I have a requirement to process the input excel, lets assume I’m going to create output an excel file with 2 sheets namely Success and Failure with 3 columns each (id, Status, Comment), and lets say I build two data frames to capture the success and failure data based on the output of the transactions I process from the input excel.

Now say if I have same ID that I had processed multiple times (as the input source for me is an excel which can have multiple rows for the same ID and I can’t filter this out as the other corresponding columns value might be different).

Based on the transaction processing, I build 2 Dataframes (1 for success records status and 1 for failure records status)

Now lets assume that the process of an ID repeated multiple times (lets say I processed the same ID 3 times as I had 3 rows in my input), now say that 1 row got successfully processed and 2 rows failed, now in the DataFrames I build, 1 row will correspond to this ID in the successDF and 2 rows will correspond to this ID in the failureDF.

What I’m intending to do is, if a ID that is present in failureDF is also present in successDF, I should delete the corresponding dataRow in the failureDF.

Eventually when I build the output Status Excel file, it will have 2 sheets success and failure, and for the above case I just need the status corresponding to the ID in the success status sheet (and the 2 rows in the FailureDF should not be present in the failure status sheet, as this ID ran successfully once), to put in simple terms if a ID processed successfully atleast once, then that ID should not be present on the failure status sheet(even though if transactions for that ID failed).

It would be great if somebody can give me an efficient way to do this.
Thank you very much in advance. Stay safe.

D Logan

Hello Community,
I came up with a solution for this.
Thanks for your time.

Best Regards,
D Logan

In case anyone ever comes across a similar problem, it is always nice to share the solution.

Hello Community,

The solution I built consists of the below steps,

  1. I created a new DT filteredFailureDT with schema similar to failureDT for storing the filtered values of the failureDT.

  2. I had the failureDT as input and did a for each for the entries in failureDT.

  3. I checked if the row in scope is present in the successTransactionsDT in a if logic inside the for each,
    I used the below condition

successTransactionsDT.AsEnumerable().Any(Function(x) x(“Request Number”).ToString = inProcessRequestNumber)

  1. If the row in scope is not present in the successTransactionsDT, meaning the transaction is only present in failureDT, (which logically will be in the else condition of the if loop), I did a invoke method to add the dataRow in scope, to the datatable built in the step 1 filteredFailureDT.

  2. In this way, after all the forEach transactions of the failureDT are processed, now the filteredFailureDT will have all the transactions that was present in failureDT, but that were not present in the successTransactionsDT.

  3. To build the Output Excel Status Report:
    Now I use, the Datatables successTransactionsDT and filteredFailureDT, to create my status excel file will have 2 sheets, 1 sheet for success scenarios that will have values, that were in the successTransactionsDT and 1 sheet for failure scenarios that will have values, that were in filteredFailureDT.

Screenshot of the forEach Logic:

Hope this helps. Have a wonderful day ahead and happy automation.
Stay safe :slight_smile:

Best Regards,
D Logan

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.