Extracting specific data from multiple PDFs

Hello, suppose I have multiple PDFs in the same folder. The PDFs all have different formats, but some fields are common to all of them, such as invoice number and invoice date. I want to extract the invoice number and invoice date from every PDF and store them in a CSV file. Please help.
Here are the files: invoice.wordpress.pdf (42.6 KB)
invoice 2.pdf (171.1 KB)

I checked the PDFs and I see that the invoice date is labeled with a different term in each PDF.

You will have to read all the PDFs in a For Each loop and then extract the data with string manipulation, according to the terms that appear in each PDF.
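To make the string-manipulation step concrete, here is a minimal sketch in Python (not the UiPath workflow itself). It assumes the text of each PDF has already been read into a string, and the label variants and value patterns are illustrative guesses that would need adjusting to the real invoices. The names `find_value`, `extract_fields`, and `write_rows` are hypothetical helpers.

```python
import csv
import re

# Hypothetical label variants, tried in priority order; extend per layout.
NUMBER_LABELS = [r"invoice\s*number", r"invoice\s*no\.?", r"invoice\s*#"]
DATE_LABELS = [r"invoice\s*date", r"date\s*of\s*issue", r"\bdate"]

# Assumption: an invoice number is one token containing at least one digit.
NUMBER_VALUE = r"[A-Za-z0-9/-]*\d[A-Za-z0-9/-]*"
# Assumption: dates look like 2024-01-15, 15/01/2024, or January 15, 2024.
DATE_VALUE = (r"\d{4}-\d{2}-\d{2}"
              r"|\d{1,2}[./-]\d{1,2}[./-]\d{2,4}"
              r"|[A-Za-z]+\.? \d{1,2},? \d{4}")


def find_value(text, labels, value_pattern):
    """Return the value that follows the first matching label, else ''."""
    for label in labels:
        pattern = label + r"\s*[:\-]?\s*(" + value_pattern + r")"
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            return match.group(1)
    return ""


def extract_fields(file_name, text):
    """Build one CSV row from the already-extracted text of one PDF."""
    return {
        "file": file_name,
        "invoice_number": find_value(text, NUMBER_LABELS, NUMBER_VALUE),
        "invoice_date": find_value(text, DATE_LABELS, DATE_VALUE),
    }


def write_rows(rows, out_file):
    """Write the collected rows to an open text file in CSV format."""
    writer = csv.DictWriter(
        out_file, fieldnames=["file", "invoice_number", "invoice_date"]
    )
    writer.writeheader()
    writer.writerows(rows)
```

In a loop over the folder you would call `extract_fields` once per PDF, collect the rows in a list, and pass them to `write_rows` with a file opened using `newline=""`. An empty string in a row means none of the labels matched, which is a hint that the PDF uses a term that still has to be added to the label lists.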

So how do I deal with it? Please help.

See this: :point_right: Main.xaml (12.0 KB)
This will extract the invoice date and number from all PDFs of format type 1 (wordpress.pdf).

The second PDF has editable text boxes. With the above XAML, the OCR will not pick up those fields, as they are not text/images.
Can you confirm whether the invoice PDF has editable fields? :thinking:
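One rough way to check this without opening each file by hand is to look for the `/AcroForm` entry that PDFs with interactive form fields declare in their document catalog. The sketch below is only a heuristic written in Python: it scans the raw bytes, so it can miss forms whose catalog is stored in a compressed object stream, and both function names are hypothetical.

```python
from pathlib import Path


def has_acroform_marker(data):
    """Heuristic: True if the raw PDF bytes mention an /AcroForm entry."""
    return b"/AcroForm" in data


def looks_like_fillable_pdf(pdf_path):
    """Apply the heuristic to a file on disk."""
    return has_acroform_marker(Path(pdf_path).read_bytes())
```

If this returns True for invoice 2.pdf, the values are most likely stored as form field data rather than as page text, and they would have to be read as fields instead of through OCR.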

Hi Sandeep, if each invoice has a different format and, moreover, some of them need OCR, I think it’s by far easiest to use a data capture API to get the fields. A tutorial that uses such an API is at https://rossum.ai/blog/2018/07/30/automating-data-extraction-from-invoices-using-rossum-api-and-uipath/