I want to extract the invoice number, date, and file name from multiple PDFs. The PDFs are tagged. Which method should I use to get the relevant text data? Later I want to write the data to an Excel file.
@raja.arslankhan
Sequence.xaml (13.7 KB)
Invoice1.pdf (92.9 KB)
Invoice2.pdf (93.1 KB)
Invoice3.pdf (93.1 KB)
Invoice4.pdf (93.1 KB)
project.json (1.6 KB)
Hi @Vidhi_Patel
You can try a regular expression (Regex).
Check out the XAML file:
ExtractDatafromPDFRegex.xaml (10.7 KB)
Output
PdfFile1.xlsx (7.3 KB)
Regards
Gokul
@Vidhi_Patel your PDFs share the same template, so I recommend using the PDF activities package and getting the data through OCR.
Please share the output data you want to extract.
Thank you, Gokul, but I don’t understand regular expressions well. Let me know if there is another way.
Which OCR engine should I use? UiPath Screen OCR?
You can easily achieve this with a regular expression; it is a simple way to get the output. Have you tested the XAML file? @Vidhi_Patel
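For readers new to regular expressions, here is a minimal sketch of the idea, written in Python rather than as UiPath activities. It is not the contents of the attached XAML file. The sample text, the labels "Invoice No" and "Date", and the helper name extract_fields are all assumptions for illustration; the real patterns depend on what the text read from your PDFs actually looks like.

```python
import re

# Hypothetical text standing in for what a PDF text-reading step might return.
# The labels "Invoice No" and "Date" are assumptions, not taken from the attached PDFs.
SAMPLE_TEXT = """ACME Ltd
Invoice No: INV-1001
Date: 12/03/2021
Total: 450.00
"""

# Capture the first non-space token after an "Invoice No" label.
INVOICE_NO = re.compile(r"Invoice\s*No\.?\s*[:#]?\s*(\S+)", re.IGNORECASE)
# Capture a numeric date such as 12/03/2021, 12-03-21 or 12.03.2021 after a "Date" label.
INVOICE_DATE = re.compile(r"Date\s*:?\s*(\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4})", re.IGNORECASE)


def extract_fields(text, file_name):
    """Return one row (file name, invoice number, date); missing fields are None."""
    no_match = INVOICE_NO.search(text)
    date_match = INVOICE_DATE.search(text)
    return {
        "FileName": file_name,
        "InvoiceNo": no_match.group(1) if no_match else None,
        "Date": date_match.group(1) if date_match else None,
    }


row = extract_fields(SAMPLE_TEXT, "Invoice1.pdf")
print(row)
```

Each row produced this way can then be collected into a table and written out to Excel. Returning None for a missing field makes it easy to spot a PDF whose layout did not match the pattern.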
Hi @Vidhi_Patel
You can learn regular expressions easily from the tutorial.
You can also practice with this.
Regards
Gokul
I am getting this error while scraping with Read PDF With OCR: "Error performing OCR: Invalid API key specified UiPathOCRInvalidApiKey". What should I enter in the API key property?