Different Format PDFs Extraction

Respected Sir/Ma’am,

I want to extract data from PDFs that come in different formats and write it to Excel.
Please let me know how I can do this.

Hi @badal_patel

You can try the Document Understanding approach.

Here are the docs and the video link.

Can I run this flow using a robot?

If your PDFs come in different formats and different types, please go with regex-based extraction.

Step 1: Read the PDF with OCR.

Step 2: Build the regex expressions to extract the data.

Step 3: Build a data table.

Step 4: Add each set of extracted values to the data table as an array, using Add Data Row.
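
The steps above can be sketched outside UiPath as well. The following is a minimal Python illustration, not the workflow itself: the sample text stands in for the OCR output of step 1, and the field labels and patterns are assumptions that would need adjusting for each PDF layout. It writes CSV, which Excel can open, to stay within the standard library.

```python
import csv
import io
import re

# Step 1 stand-in: hypothetical OCR output. A real run would get this
# text from the OCR activity; the values here are invented.
SAMPLE_TEXT = """\
Invoice No: INV-001
Invoice Date: 29/07/2020
Gross Total: 1,180.00
"""

# Step 2: one pattern per field. These labels and formats are assumptions.
PATTERNS = {
    "Invoice Num": re.compile(r"Invoice\s*No\.?\s*:?\s*(\S+)", re.IGNORECASE),
    "Invoice Date": re.compile(r"Invoice\s*Date\s*:?\s*(\d{2}/\d{2}/\d{4})", re.IGNORECASE),
    "Gross Total": re.compile(r"Gross\s*Total\s*:?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(text):
    """Return one row (dict) with an empty string for any field not found."""
    row = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        row[name] = match.group(1) if match else ""
    return row


def rows_to_csv(rows):
    """Steps 3 and 4: build the table header, then add one row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(PATTERNS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [extract_fields(SAMPLE_TEXT)]
csv_text = rows_to_csv(rows)
print(csv_text)
```

Leaving a field empty when its pattern does not match keeps every row the same width, so PDFs in a format that is not covered yet are easy to spot in the output.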

Yes @badal_patel. Before that, we need to classify the values in the document type. Just go through the document.


Can you please send a simple flow?

Hi @badal_patel

Can you share a sample PDF template here?


@badal_patel Check the sample workflow below. After opening this workflow, make sure you install the packages below. You also have to replace the Document Understanding API key with the one that you have in your Orchestrator.



InvoiceProcessing - Copy.zip (101.5 KB)

Mahavir_Inv_290720 - 786.pdf (100.6 KB)

Can you please send a solution?

Hi Badal,
Can you please let me know what data you want to extract from the pdf?
I will help you extract the data using regex.

Billing Address (only the company name), Invoice Num, Invoice Date,
Item Code, HSN, Quantity, Unit Price, CGST, SGST, IGST, Total Amount,
Gross Total
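
For the per-item fields in this list (Item Code, HSN, Quantity, Unit Price, CGST, SGST, IGST, Total Amount), one possible approach is a single regex with named groups applied to every line of the extracted text. The sketch below is in Python and assumes a space-separated line-item layout. The sample lines are invented, since the actual layout of the attached PDF is not shown in this thread, so the pattern would need adapting.

```python
import re

# Invented line-item text; the column order is an assumption and does not
# reflect the real invoice.
SAMPLE_LINES = """\
ITM001 84818090 2 500.00 90.00 90.00 0.00 1180.00
ITM002 39174000 10 12.50 11.25 11.25 0.00 147.50
"""

AMOUNT = r"[\d,]+\.\d{2}"  # e.g. 1,180.00

# One named group per requested column; [ \t]+ keeps a match on one line.
ITEM_PATTERN = re.compile(
    r"^(?P<item_code>\S+)[ \t]+"
    r"(?P<hsn>\d{4,8})[ \t]+"
    r"(?P<quantity>\d+)[ \t]+"
    rf"(?P<unit_price>{AMOUNT})[ \t]+"
    rf"(?P<cgst>{AMOUNT})[ \t]+"
    rf"(?P<sgst>{AMOUNT})[ \t]+"
    rf"(?P<igst>{AMOUNT})[ \t]+"
    rf"(?P<total_amount>{AMOUNT})[ \t]*$",
    re.MULTILINE,
)


def extract_items(text):
    """Return one dict per line that matches the assumed item layout."""
    return [m.groupdict() for m in ITEM_PATTERN.finditer(text)]


items = extract_items(SAMPLE_LINES)
for item in items:
    print(item)
```

Each dict maps directly to one row of the output table. Lines that do not fit the pattern are skipped, so header and footer text around the item table does not need to be removed first.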

Can you please check and provide a solution?