I got this exercise from the internet but got stuck partway through. Can someone guide me on how to solve it?
1. Read the PDF
2. Fetch the invoice values
3. Create a new folder named with the date
4. Copy the invoice PDFs into the new folder and create a new Excel sheet (the file name should be date-specific)
5. Create a header row and then feed the invoice data into it
6. Send an email (with the Excel file attached)
Note: I have used screen scraping, but sometimes it does not capture the correct values. Could anyone tell me which activities I should use for these 6 steps?
Thank you in advance.
Regards,
Pranav
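The folder, spreadsheet, and email steps from the list above are plain file handling. As a rough sketch of the logic outside UiPath (this is Python, not UiPath activities), it could look like the following. The column names and the `invoices_<date>` file name are hypothetical, and a CSV file stands in for the Excel sheet so that only the standard library is needed.

```python
import csv
import shutil
from datetime import date
from email.message import EmailMessage
from pathlib import Path


def archive_invoices(pdf_paths, rows, base_dir, run_date=None):
    """Copy PDFs into a dated folder and write a dated CSV report beside them."""
    run_date = run_date or date.today()
    stamp = run_date.isoformat()  # e.g. 2024-01-31
    target = Path(base_dir) / stamp
    target.mkdir(parents=True, exist_ok=True)

    for pdf in pdf_paths:
        shutil.copy2(pdf, target / Path(pdf).name)

    # Hypothetical column names; adjust to the fields on your invoices.
    header = ["invoice_number", "invoice_date", "total"]
    report = target / f"invoices_{stamp}.csv"
    with report.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=header)
        writer.writeheader()    # header row first
        writer.writerows(rows)  # then one row per invoice
    return report


def build_email(report, sender, recipient):
    """Build (but do not send) a message with the report attached."""
    msg = EmailMessage()
    msg["Subject"] = f"Invoice report {report.stem}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("Invoice report attached.")
    msg.add_attachment(
        report.read_bytes(),
        maintype="application",
        subtype="octet-stream",
        filename=report.name,
    )
    return msg
```

Sending is left out on purpose; with a real mail server it would be a call such as `smtplib.SMTP(host).send_message(msg)`, where `host` is your own server.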
It depends on the type of PDF you are extracting from. If it is purely text-based, you can use the Read PDF Text activity and apply string manipulation to the resulting string to fetch the data you need. If it is a scanned PDF, you can use Read PDF With OCR instead.
As noted above, you can apply string manipulation functions (substring, length) or a regex to extract the right data.
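To illustrate the regex approach, here is a minimal sketch in Python rather than UiPath. The field labels and patterns are assumptions about what an invoice might look like; real invoices will need their own patterns.

```python
import re

# Hypothetical patterns; adapt the labels and formats to your invoices.
PATTERNS = {
    "invoice_number": re.compile(
        r"\bInvoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"\bDate\s*[:\-]?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})", re.IGNORECASE
    ),
    # \b keeps "Total" from matching inside "Subtotal".
    "total": re.compile(
        r"\bTotal\s*(?:Due|Amount)?\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return the first match for each field, or None when nothing matches."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


sample = "ACME Ltd\nInvoice No: INV-1042\nDate: 12/03/2021\nSubtotal: 100.00\nTotal: $1,250.50\n"
print(extract_fields(sample))
```

A field that comes back as `None` is a signal that the pattern does not fit that document, which is easier to handle than a silently wrong value.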
Hi qwerty123,
Thank you for the quick reply. For (1), these documents are scanned and I did use the Read PDF With OCR activity, but for some PDFs it is not capturing the exact value. Is there anything else you can suggest for this issue?
And by the way, thank you very much for the guidance. I look forward to hearing from you.
Regards,
Pranav
Hi
That is indeed a genuine issue, as no OCR engine gives 100% correct results. Other factors, such as the quality of the PDF and the legibility of the text, also affect the result.
You can try Google Cloud or ABBYY Cloud OCR, as they generally give better results than the standard Google/Microsoft OCR engines. They are licensed products, though, so you won't be able to use them for free.
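Whichever engine you use, it can help to sanity-check numeric fields after extraction. The sketch below is Python, not a UiPath activity, and rests on two assumptions: that amounts have exactly two decimal places, and that a comma is a thousands separator. It repairs a few characters that OCR commonly confuses with digits and returns `None` when the value still does not look like an amount, so it can be flagged for manual review.

```python
import re
from decimal import Decimal

# Letters that OCR commonly confuses with digits inside a number.
OCR_DIGIT_FIXES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def parse_amount(raw):
    """Return a Decimal for an OCR'd amount, or None if it still looks wrong."""
    cleaned = raw.strip().translate(OCR_DIGIT_FIXES)
    cleaned = re.sub(r"[^\d.,]", "", cleaned)  # drop currency signs and spaces
    cleaned = cleaned.replace(",", "")         # assumes ',' is a thousands separator
    if not re.fullmatch(r"\d+\.\d{2}", cleaned):
        return None                            # flag for manual review
    return Decimal(cleaned)
```

Only apply this kind of substitution to fields that should be purely numeric; on free text it would corrupt ordinary words.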
Exactly. One last question: there are 8 PDFs in total, so do I need to record a sequence for each PDF, or is there an activity that can handle all of them?
I think if all 8 PDFs have the same layout, you can have one sequence carrying the logic to read the required data. You can make it dynamic enough that it works on all of the PDFs.
If the 8 PDFs follow different layouts, you might have to work on the data extraction for each one individually.
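One way to keep a single routine even with different layouts is to detect the layout first and then apply the rule that belongs to it. The following is a minimal Python sketch of that idea, not UiPath code; the vendor names, marker strings, and patterns are all hypothetical.

```python
import re

# Hypothetical layouts: a marker string that identifies the vendor, plus the
# pattern that finds the total on that vendor's invoices.
LAYOUTS = {
    "acme": {
        "marker": "ACME Ltd",
        "total": re.compile(r"\bTotal\s*:\s*([\d,]+\.\d{2})"),
    },
    "globex": {
        "marker": "Globex Corp",
        "total": re.compile(r"\bAmount Payable\s+([\d,]+\.\d{2})"),
    },
}


def detect_layout(text):
    """Return the name of the first layout whose marker appears in the text."""
    for name, layout in LAYOUTS.items():
        if layout["marker"] in text:
            return name
    return None


def extract_total(text):
    """Return (layout_name, total); either can be None when nothing matches."""
    name = detect_layout(text)
    if name is None:
        return None, None
    match = LAYOUTS[name]["total"].search(text)
    return name, (match.group(1) if match else None)
```

With this shape, supporting a new layout means adding one entry to the table rather than building a separate workflow, and the folder, spreadsheet, and email steps stay shared.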
Hi qwerty123,
Actually, they do have different formats, so I need to record each one separately. But I can copy and paste the email activity, can't I?
Regards,
Pranav