Linq : Join Datatable ("Assign: Object reference not set to an instance of an object")

Hi , I have 2 excel with huge data (10,000+) rows. I want to do a join and get the data so I have written Linq. However I am getting an error while running the code I have attached my code and the test files. Can some one please help me to understand what is wrong in my query.

I am getting an error “Assign: Object reference not set to an instance of an object.”

Main.xaml (10.4 KB)
New Stock Statement (Draft) (5).xlsx (1.8 MB)
test.xlsx (8.4 KB)

DT_Final is to prepare in advance e.g. with build datatable activity and configured for the target data column structure

He doesn’t need to Build it. Merge will build it. He needs to initialize DT_Final by putting new system.data.datatable into the default value.

LINQ is not working in that way
But feel to implement it and share the working XAML with us.

Oh I missed the Linq part, sorry. Just saw Join Datatable and thought he was using activities.

As per usual, I don’t see why it’s being complicated with Linq when it’s obviously a challenge, and activities would work fine. Sounds like another example of someone who thinks they need to, and thinks they are, avoiding loops by using Linq.

If you’re trying to merge two Excel files, can’t you just read one of them into a datatable with Read Range and then write it to the other file with Write Range? Or if you need to manipulate before writing the merged file, read both into datatables, manipulate with activities, then write both to the merge file. There doesn’t sound like a reason to try to merge the datatables.

Hi, I was able to fix the issue that I was getting by assigning it. Attached code. However is there any better way? If I use UiPath join activity it is taking too much time so I am going for Linq. Any better way so that I can have the better performance?
Main.xaml (11.4 KB)

Why do you need to join the datatables? If I have files A.xlsx and B.xlsx and want to combine them into C.xlsx I would…

  • Read Range A.xlsx into myDT
  • Append Range myDT into C.xlsx
  • Read Range B.xlsx into myDT
  • Append Range myDT into C.xlsx

There’s no reason to even have two datatables, let alone merge/join them. Maybe there’s something about your process I’m not understanding?

And if for some reason you do need to merge them, then…

  • Read Range A.xlsx into DT_A
  • Read Range B.xlsx into DT_B
  • Merge Datatable DT_A and DT_B into DT_C
  • Append Range DT_C into C.xlsx

I agree. what you are saying also makes sense. However my req is

  1. I need to extract certain data from the table for each Material Code (Specific column)
  2. Once I extract the data I need to place it in one sheet on specific rows.

With your logic after append I will have to read the merged sheet and copy specific column at specific location. Did I make sense?

Then your process becomes…

  • Read Range A.xlsx into DT_A
  • Read Range B.xlsx into DT_B
  • Filter Datatable DT_A to remove unwanted rows, or otherwise manipulate the data to fit your desired result
  • Filter Datatable DT_B to remove unwanted rows, or otherwise manipulate the data to fit your desired result
  • Merge Datatable DT_A and DT_B into DT_C
  • Append Range DT_C into C.xlsx

…although maybe for your process it would be better to merge into DT_C then manipulate its data.

You don’t need to write to the sheet then read it again. Do all your data manipulation in the datatables before writing to the final sheet.