PDF to Excel: need advice

Hi all, please advise.
I have the details below in an Excel file with 2 sheets.

I need to find the ID in the PDF for each car name listed in Excel. Each ID has a price in the “Codes” sheet in Excel, and that price needs to be pulled and written to the Price column of the Vehicle sheet.

Please advise which approach I should use to complete this activity. I have attached the PDF and the Excel file: Auto Sales.pdf (205.0 KB), RPA.xlsx (13.6 KB)

In the first stage we can process the PDF as follows:

  • prepare an empty datatable with the columns BrandModel, Year, ID

  • read in the PDF

  • extract the data with RegEx

  • use the information from the different regex groups to add rows to the datatable
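To illustrate the extraction step outside of UiPath, here is a minimal Python sketch. The sample text and the line layout (`<BrandModel> <Year> ID: <ID>`) are assumptions made up for the example; the real Auto Sales.pdf may look different, so the pattern would need adjusting to the actual text.

```python
import re

# Hypothetical text standing in for the output of the PDF read step.
pdf_text = """\
Toyota Corolla 2018 ID: TC-1001
Honda Civic 2020 ID: HC-2002
Ford Focus 2019 ID: FF-3003
"""

# Named groups mirror the three datatable columns: BrandModel, Year, ID.
# The layout matched here is an assumption, not the real file's format.
ROW_PATTERN = re.compile(
    r"^(?P<BrandModel>.+?)\s+(?P<Year>\d{4})\s+ID:\s*(?P<ID>\S+)\s*$",
    re.MULTILINE,
)


def extract_rows(text):
    """Return one dict per regex match, keyed by column name."""
    return [match.groupdict() for match in ROW_PATTERN.finditer(text)]


pdf_rows = extract_rows(pdf_text)
for row in pdf_rows:
    print(row)
```

Each match becomes one row, and each named group becomes one column value, which is the same idea as adding a data row per match in the workflow.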

In the next stage:

  • use techniques like lookup / join DataTable (I suggest doing it with LINQ) and use the combined column values to identify the matching rows across the datatables:
    • dtVehicle: match the columns Brand + Line, and Year, against dtPDFData to identify the ID
    • use the ID to retrieve the price from dtCodes
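The two lookups can be sketched in plain Python as follows. This is only an illustration of the matching logic, not the LINQ you would write in the workflow; the sample rows, IDs, and prices are invented, and the column names in dtCodes (ID, Price) are assumed.

```python
# Hypothetical rows standing in for dtVehicle, dtPDFData and dtCodes.
vehicles = [
    {"Brand": "Toyota", "Line": "Corolla", "Year": "2018", "Price": None},
    {"Brand": "Honda", "Line": "Civic", "Year": "2020", "Price": None},
    {"Brand": "Kia", "Line": "Rio", "Year": "2017", "Price": None},
]
pdf_data = [
    {"BrandModel": "Toyota Corolla", "Year": "2018", "ID": "TC-1001"},
    {"BrandModel": "Honda Civic", "Year": "2020", "ID": "HC-2002"},
]
codes = [
    {"ID": "TC-1001", "Price": 15000},
    {"ID": "HC-2002", "Price": 18500},
]


def fill_prices(vehicles, pdf_data, codes):
    """Look up the ID by (Brand + Line, Year), then the price by ID."""
    # Lookup 1: combined BrandModel + Year -> ID (case-insensitive match).
    id_by_key = {
        (row["BrandModel"].strip().lower(), row["Year"]): row["ID"]
        for row in pdf_data
    }
    # Lookup 2: ID -> Price.
    price_by_id = {row["ID"]: row["Price"] for row in codes}

    for vehicle in vehicles:
        key = (f'{vehicle["Brand"]} {vehicle["Line"]}'.strip().lower(), vehicle["Year"])
        vehicle_id = id_by_key.get(key)
        # Stays None when no ID or no price is found for the vehicle.
        vehicle["Price"] = price_by_id.get(vehicle_id)
    return vehicles


for vehicle in fill_prices(vehicles, pdf_data, codes):
    print(vehicle)
```

Vehicles without a match keep an empty price, which makes it easy to spot rows that need a manual check.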

Just search the forum and you will find several solutions for this.

I would also recommend removing any duplicates from dtPDFData, as they would lead to wrong matches for the ID.
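A small Python sketch of the duplicate removal, assuming a duplicate means the same BrandModel and Year, and that keeping the first occurrence is acceptable (both are assumptions; the sample rows are invented):

```python
def drop_duplicate_keys(rows):
    """Keep only the first row for each (BrandModel, Year) combination."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = (row["BrandModel"].strip().lower(), row["Year"])
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical extracted rows containing one repeated vehicle.
sample_rows = [
    {"BrandModel": "Toyota Corolla", "Year": "2018", "ID": "TC-1001"},
    {"BrandModel": "Toyota Corolla", "Year": "2018", "ID": "TC-1001"},
    {"BrandModel": "Honda Civic", "Year": "2020", "ID": "HC-2002"},
]
deduplicated = drop_duplicate_keys(sample_rows)
print(len(deduplicated))
```

If the same vehicle appears with two different IDs, silently keeping the first one may hide a data problem, so it is worth logging those cases instead.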

Find starter help here:
PDF_Case_001.xaml (8.2 KB)

Thank you very much, it's really helpful.
I am still getting an error when I try it. Please advise.

Make sure of the following:

Have a look at the variable used for the datatable (PDF) in the screenshot. Use the proper variable name (the one from the Build Data Table activity) and check in general that all variables are correctly declared and used. Then try again.

I checked, but I am getting the same error again on the Assign activity. Please help.

I have attached the workflow: Main.xaml (10.0 KB)


Check your Read PDF Text activity. There is no output variable assigned. Use the variable name that you use in the regex statement (it looks like it is strPDF).

Thank you very much, bro. Now it ran successfully, but as you suggested I need to do a lookup or join the datatables to see the information in the output.

Is this one of the selector methods?