Get similar data repeatedly in a loop from within a PDF file

Hi,
I am working on a project where I have to extract groups of similar data repeatedly from a large PDF file. How can I use a loop activity within a PDF file? I need help urgently, please.

Hello @Rimsha_Aizaz1

I assume you are trying to extract the data from a set of similarly structured PDF files. If so, do the following.

  1. Use an Assign activity with Directory.GetFiles("folder path"). It returns the list of files in that folder.
  2. Use a For Each activity to loop through each file.
  3. To extract the value, use either regex or the Get Text activity.
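
The three steps above can be sketched outside UiPath too. This is a minimal Python sketch of the same flow, not UiPath code: `list_pdf_files` plays the role of Directory.GetFiles, and the `extract` callback is a hypothetical placeholder for whichever extraction method (regex or Get Text) you end up using.

```python
from pathlib import Path
from typing import Callable, Dict, List


def list_pdf_files(folder: str) -> List[Path]:
    """Step 1: return the PDF files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def process_folder(folder: str, extract: Callable[[Path], str]) -> Dict[str, str]:
    """Loop over every PDF in `folder` and collect one result per file."""
    results: Dict[str, str] = {}
    for pdf in list_pdf_files(folder):    # step 2: For Each file
        results[pdf.name] = extract(pdf)  # step 3: extract the value
    return results
```

Keeping the extraction step as a separate function means you can swap regex for Get Text later without touching the loop.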

Regex: use the Read PDF Text activity to read the data into a string variable. Then apply a regex expression to that string in the Matches activity.
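
As a rough illustration of the regex route, here is a Python sketch (the Matches activity uses .NET regex, but the idea is the same). The sample text and the `Invoice No` / `Amount` labels are made up for the example; replace the pattern with one that fits the labels in your own PDF.

```python
import re
from typing import Dict, List

# Hypothetical pattern: one record is "Invoice No: <id>" followed by "Amount: <number>".
RECORD_PATTERN = re.compile(
    r"Invoice No:\s*(?P<invoice>\S+)\s+Amount:\s*(?P<amount>[\d.,]+)"
)


def find_records(pdf_text: str) -> List[Dict[str, str]]:
    """Return every repeated record found in the extracted PDF text."""
    return [m.groupdict() for m in RECORD_PATTERN.finditer(pdf_text)]


# Made-up text standing in for the string read from the PDF.
sample_text = """
Invoice No: INV-001
Amount: 120.50
Invoice No: INV-002
Amount: 89.00
"""

for record in find_records(sample_text):
    print(record["invoice"], record["amount"])
```

Each match is one group of data, so looping over the matches gives you the repeated records one by one.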

Get Text: open the PDF in a PDF reader, then use the Get Text activity to read the required data. You can anchor each value to its label for better accuracy.

Another method is Document Understanding. You can either use one of the ML models available in UiPath or create your own model.

Hi @Rimsha_Aizaz1,

Welcome to the community.

Document Understanding is a very useful way to do this. You can follow the documentation and these videos.

https://www.youtube.com/results?search_query=uipath+document+understanding+demo&sp=EgQIBRAB

Regards,
MY

Hi

Welcome to the UiPath forum.

You can use a loop such as FOR EACH, WHILE, or DO WHILE and parse each line read from the PDF file.
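
To make the line-by-line idea concrete, here is a small Python sketch rather than a UiPath workflow. It assumes the PDF text has already been read into a string and that every record starts with a line beginning with a known label (`Name:` here is a hypothetical example).

```python
from typing import List


def group_records(pdf_text: str, start_label: str = "Name:") -> List[List[str]]:
    """Split text into records; a new record begins at each line starting with `start_label`."""
    records: List[List[str]] = []
    for line in pdf_text.splitlines():
        line = line.strip()
        if not line:
            continue                    # skip blank lines
        if line.startswith(start_label):
            records.append([line])      # start a new record
        elif records:
            records[-1].append(line)    # continuation of the current record
        # lines before the first label (page headers, titles) are ignored
    return records
```

This only works when the labels are static; if the layout varies from page to page, the approaches mentioned elsewhere in this thread are a better fit.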

But to choose the right one, we need to know what final output you expect.

Here you want to get a similar set of data from a PDF file, which can require a lot of training and ML skills.

Try DOCUMENT UNDERSTANDING with AI CENTER.

Check the AI Center documentation ("About AI Center") for more details.

Or, if your scenario is simpler than that, please explain it in more detail.

Cheers @Rimsha_Aizaz1

Thank you for your kind support. I just wanted to use the Get Text activity to capture a similar set of data each time. I was trying to make the selector dynamic, but unfortunately it is not validating.

I'm unable to find the built-in template.

@Rimsha_Aizaz1

Can you please confirm whether you are trying to capture the same set of data from the PDF each time, and whether the labels are static?

Also, please share the selector you tried. If it contains dynamic attributes, you can either remove them or replace the changing part with the wildcard *.
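
To show what replacing a dynamic value with * does, here is a small Python analogy using `fnmatch`. It is not how UiPath evaluates selectors internally, and the window titles are invented for the example. (Python's `fnmatch` also treats `[...]` specially, so keep brackets out of the pattern when trying this.)

```python
from fnmatch import fnmatchcase

# Hypothetical window titles: the file name changes on every run.
titles = [
    "report_2021_01.pdf - PDF Reader",
    "report_2021_02.pdf - PDF Reader",
    "Untitled - Notepad",
]

# The dynamic part of the title is replaced by the * wildcard.
pattern = "*.pdf - PDF Reader"

matching = [t for t in titles if fnmatchcase(t, pattern)]
print(matching)
```

The static part of the attribute still has to match exactly, so keep as much of it as you can and wildcard only what actually changes.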

Here is a snapshot of the PDF file I want to read. Please help.