Hope you are doing well and staying safe.
I’m trying to achieve the following:
“Compare the value in a column (e.g. Id) of each data row in dataframe2 against all the values in the same column (e.g. Id, checked across every data row) of dataframe1 and, if a match is found, delete that data row from dataframe2.”
Let me explain in detail.
Input: an Excel file with multiple columns, one of which is the ID column. The input file can have multiple rows for a single ID value.
Output: an Excel file (a status report for the inputs) with 2 sheets, one for Success and one for Failure.
I take the rows from the input data one at a time and process each one, which gives one of 2 outcomes (success or failure).
What I’m trying to implement:
I have a requirement to process the input Excel file. Let’s assume the output is an Excel file with 2 sheets, Success and Failure, with 3 columns each (Id, Status, Comment), and that I build two DataFrames to capture the success and failure data from the transactions I process.
The same ID can be processed multiple times, because the input Excel file can have multiple rows for the same ID. I can’t filter these out, as the values in the other columns might differ between those rows.
Based on the transaction processing, I build 2 DataFrames (1 for the status of successful records and 1 for the status of failed records).
Now let’s assume an ID is processed 3 times because it has 3 rows in my input, and that 1 row is processed successfully and 2 rows fail. In the DataFrames I build, 1 row will correspond to this ID in successDF and 2 rows will correspond to it in failureDF.
What I intend to do is this: if an ID that is present in failureDF is also present in successDF, delete every data row for that ID from failureDF.
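To make the intent concrete, here is a minimal sketch of the result I’m after. It assumes the DataFrames are pandas DataFrames (I’m not tied to that), and the sample rows, the `success_df` / `failure_df` names and the comment texts are made up for illustration. It uses `Series.isin` to flag failure rows whose Id also appears in the success data, then keeps only the unflagged rows.

```python
import pandas as pd

# Hypothetical sample data: Id 101 succeeded once and failed twice,
# Id 102 only succeeded, Id 103 only failed.
success_df = pd.DataFrame({
    "Id": [101, 102],
    "Status": ["Success", "Success"],
    "Comment": ["processed", "processed"],
})
failure_df = pd.DataFrame({
    "Id": [101, 101, 103],
    "Status": ["Failure", "Failure", "Failure"],
    "Comment": ["timeout", "bad data", "timeout"],
})

# Boolean mask: True for failure rows whose Id appears anywhere in success_df.
already_succeeded = failure_df["Id"].isin(success_df["Id"])

# Keep only the rows that are NOT flagged, and renumber the index.
failure_df = failure_df[~already_succeeded].reset_index(drop=True)

print(failure_df)
```

With this data only the row for Id 103 should remain in `failure_df`. The filtered frame could then be written to the Failure sheet (for example with `DataFrame.to_excel` and a `pd.ExcelWriter`), but I’m open to a different approach if there is a more efficient one.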
Eventually, when I build the output status Excel file, it will have 2 sheets, Success and Failure. For the case above I only need the status for that ID on the Success sheet; the 2 rows from failureDF should not appear on the Failure sheet, because this ID ran successfully once. To put it simply: if an ID was processed successfully at least once, it should not be present on the Failure sheet, even if other transactions for that ID failed.
It would be great if somebody could suggest an efficient way to do this.
Thank you very much in advance. Stay safe.