Searching multiple PDF files and Excel

Hi All,

Please help me out with this.

I am trying to:

  1. Read the PDF invoice files one by one (by scraping).
  2. Check the extracted invoice number in the Excel sheet.
  3. After finding the exact match, read that row from start to end.
  4. Go to the next PDF, read its invoice number, and so on.

I have attached the XAML, the PDFs and the Excel sheet.

Invoice 1.pdf (28.8 KB) Invoice 2.pdf (28.8 KB) Name.xlsx (8.1 KB) Scenario3.xaml (7.9 KB)

Hi @sachinsm

I have prepared a complete workflow that gives exactly the output you want and fulfills all your requirements.

Below is the working workflow with all files.
Invoice 1.pdf (28.8 KB)
Invoice 2.pdf (28.8 KB)
Name.xlsx (8.1 KB)
Scenario3.xaml (19.8 KB)

Output :-
image
image

Hope this helps solve your query.
Please mark it as the solution and like it. :innocent:

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

@Pratik_Wavhal, thanks a lot :slight_smile:

One small change, if you can help me with it:
Instead of reading the PDF in the background with regex, can we open the PDF and use the Get Value / Get Text activity?

It's my mistake that I did not mention this in my question; really sorry about that.

Hi @sachinsm

PDF automation itself is done in the background.
In the scenario you describe now, the particular PDF from which you want to extract data must always be kept open.
It may also fail if the window is small, because the Invoice No would then be hidden from view.

@Pratik_Wavhal: You are correct and I agree 100%, but in one of my scenarios I need to open the PDF.

Hi @sachinsm

So you want the PDF to be opened manually, and then the bot will perform the operation, right?

Hello @Pratik_Wavhal: First, thanks a lot for your reply.
No, the PDFs should open one by one when the workflow runs (Start Process, etc.): open the first PDF and compare it against the Excel sheet, then open the second PDF and check it against the Excel sheet. I only need to do this for 2 PDF files.

Hi @sachinsm

Yes, it is possible. I will update and share the workflow soon.

@Pratik_Wavhal Thanks a lot :pray:.
I have attached the Excel file for your reference; only two rows need to match the PDF files (Invoice no and Customer ID).

What I actually want to do is:

  1. Open the PDF file and get the Invoice no (I am using the Get Text activity).
  2. Go to the Excel sheet and find where the Invoice no is present.
  3. Once the Invoice no is found in the Excel sheet, match the remaining data in the same row against the PDF file.
  4. Repeat from step 1 for the next PDF.

Please let me know if you need any other information.
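
For readers following along, the lookup-and-compare steps listed above can be sketched outside UiPath in plain Python. This is only an illustration under stated assumptions: the sheet rows and scraped values are hypothetical stand-ins (the real columns of Name.xlsx may differ), and `find_row` and `mismatched_fields` are made-up helper names, not UiPath activities.

```python
from typing import Optional


def find_row(rows: list[dict], invoice_no: str) -> Optional[dict]:
    """Return the first sheet row whose "Invoice no" equals invoice_no, else None."""
    wanted = invoice_no.strip()
    for row in rows:
        if str(row.get("Invoice no", "")).strip() == wanted:
            return row
    return None


def mismatched_fields(row: dict, pdf_fields: dict) -> list[str]:
    """List the column names whose sheet value differs from the PDF value."""
    return [
        name
        for name, pdf_value in pdf_fields.items()
        if str(row.get(name, "")).strip() != str(pdf_value).strip()
    ]


# Hypothetical stand-in for the rows of the Excel sheet.
sheet_rows = [
    {"Invoice no": "2170", "Customer ID": "C-001"},
    {"Invoice no": "2171", "Customer ID": "C-002"},
]

# Hypothetical values scraped from two PDFs, one dict per file.
scraped = [
    {"Invoice no": "2170", "Customer ID": "C-001"},
    {"Invoice no": "2171", "Customer ID": "C-999"},
]

# Step 4 of the list: repeat the lookup and comparison for every PDF.
results = []
for pdf_fields in scraped:
    row = find_row(sheet_rows, pdf_fields["Invoice no"])
    if row is None:
        results.append((pdf_fields["Invoice no"], "not found"))
    else:
        results.append((pdf_fields["Invoice no"], mismatched_fields(row, pdf_fields)))

print(results)
```

In a UiPath workflow the same shape would typically be a loop over the PDF files with a row lookup on the DataTable read from the sheet; the sketch only shows the matching logic.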

Name.xlsx (8.0 KB)

Hi @sachinsm

I tried hard to implement the steps you mention, but the selector is not working for the Get Text activity on the PDF. The selector comes out static, so in Acrobat Reader it works for only one PDF. UI Explorer is not showing any tags that would make it dynamic.

Even if you do it with OCR, in the end you will still have to use regex to extract the Invoice No.
So it is best to do it with the Read PDF activity.
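
As a rough illustration of the regex step, here is a minimal Python sketch that pulls an invoice number out of text such as the Read PDF activity would return. The label wording in the pattern and the sample text are assumptions, not taken from the attached invoices, so the pattern would need adjusting to the real layout.

```python
import re
from typing import Optional

# Assumed label wording ("Invoice No", "Invoice Number", "Invoice #").
INVOICE_NO = re.compile(r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*(\d+)", re.IGNORECASE)


def extract_invoice_no(pdf_text: str) -> Optional[str]:
    """Return the digits following the invoice label, or None if no label is found."""
    match = INVOICE_NO.search(pdf_text)
    return match.group(1) if match else None


# Hypothetical text standing in for the output of a PDF text read.
sample_text = "Billing Invoice\nInvoice No: 2170   Date: 01/02/2020\nCustomer ID: C-001"
print(extract_invoice_no(sample_text))
```

Returning None when nothing matches lets the caller skip or flag that PDF rather than look up an empty value in the sheet.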

One option: just to show the invoice opening in the PDF viewer, you can use the Start Process activity, while the search for the Invoice No is still done with the method I have already implemented.

@Pratik_Wavhal: Thanks a lot for trying :pray: :pray: I am also trying but have not been able to do it. Could you share whatever workflow you have?

I will go with the approach you suggested above :slight_smile:.

Hi @sachinsm

Sure.
temp.zip (74.1 KB)

@Pratik_Wavhal

I am able to get the Invoice ID for both PDF files. I have attached the workflow; can you help with the rest if possible?
Scenario4.xaml (31.5 KB)

The changes below were made to the selector.

<wnd app='acrord32.exe' cls='AcrobatSDIWindow' title='Billing Invoice Template - Adobe Acrobat Reader DC' />
<wnd aaname='Document Pane' cls='AVL_AVView' title='AVScrolledPageView' />
<wnd cls='AVL_AVView' title='AVPageView' />
<ctrl role='row' idx='7' />
<ctrl name='*  ' role='text' idx='2' />

Hi @sachinsm

First of all, thanks for providing a selector that works perfectly.
May I know where you got the extra tag from, or how you worked out which tag must be there to pin down the exact position of the Invoice No in the invoice PDF?

Below are the working files, which meet all your requirements. Please have a look.
temp.zip (64.9 KB)

You can now mark this as the solution and like it. :innocent:

@Pratik_Wavhal Invoice documents are structured documents, correct? :slight_smile:
And the PDF files you provided are not scanned copies, so we can use the Get Text activity to capture the text. When you use Get Text and click on the invoice no., you get the selector below. When you open a new invoice, the position of the invoice no. is the same but the number changes, so he put * in place of the number, like ctrl name='*'.

ctrl name='2170 ' role='text'

Hi @vijaybrijmohan14

You are absolutely correct. I agree with you.

But the selector that @sachinsm gave me afterwards includes idx='2', which I was not getting in my UI Explorer. That is the only tag I am talking about.

Otherwise I was getting the entire selector too, and I also used * as a wildcard to make it dynamic. But when I validated that dynamic selector without idx='2', it pointed to the date beside the Invoice No, which was wrong.
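
The effect described here can be modelled with a small Python sketch. It is a simplified toy model, not UiPath's actual matching engine: the element list is hypothetical, `resolve` is a made-up helper, and it assumes that a selector without idx resolves to the first match and that idx counts from 1.

```python
from fnmatch import fnmatchcase

# Hypothetical text elements in the invoice row, in document order.
row_elements = [
    {"role": "text", "name": "01/02/2020 "},  # the date beside the Invoice No
    {"role": "text", "name": "2170 "},        # the Invoice No itself
]


def resolve(elements, name_pattern, role, idx=None):
    """Return the idx-th element (1-based) matching role and wildcard name, else None."""
    matches = [
        e for e in elements
        if e["role"] == role and fnmatchcase(e["name"], name_pattern)
    ]
    position = 1 if idx is None else idx  # assumption: no idx means first match
    if 1 <= position <= len(matches):
        return matches[position - 1]
    return None


# A static name only matches the invoice it was recorded on.
print(resolve(row_elements, "2170 ", "text"))

# The wildcard alone matches both text elements, so the first one (the date) wins.
print(resolve(row_elements, "*", "text"))

# Adding idx=2 picks the second match, which is the Invoice No.
print(resolve(row_elements, "*", "text", idx=2))
```

The point of the model is that a wildcard widens the set of matching elements, so some other attribute such as idx is then needed to say which of them is meant.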

That's why I asked how @sachinsm worked out that idx='2' should be included in the selector so that it works perfectly.

By the way, thanks for explaining the useful stuff again.

@Pratik_Wavhal
image

He might have added the row idx.

Hi @vijaybrijmohan14

Yes. That is what I want to know: when a tag is not shown, on what basis do we work out which tag must be there to get the value we want to extract?
If you know, please tell me.

@Pratik_Wavhal We need to try all the combinations in the UI selector so that we get the best possible match, one that works in all similar cases.

@vijaybrijmohan14

Yup. Thanks.
