How can I use OCR to automatically read data from a PDF (e.g. name, date, ID number) and then transfer it to a specified Excel file?

Hello, I am a UiPath newbie and I have a question for you. How can I use OCR to automatically read data from a PDF (e.g. name, date, ID number) and then transfer it to a specified Excel file?

Because the location of the data in the PDF image may differ from file to file, I hope UiPath can find it by "keyword" instead of by location. Finally, since I am a newbie and not familiar with the system, could you please list the steps or share screenshots for my reference? Thanks.

Here is the sample PDF image.

Here is the Excel sample. Thank you.

Hi @sally

After the OCR step, you have to use regular expressions to extract the required data, and then you can write it to the Excel file.

But you can’t directly extract the required data by using OCR.

Hope it helps!!


Hi,

OCR can't extract all the data automatically. You either have to train it by indicating elements, or read the PDF using OCR and then do string manipulation to get your data.

Thanks


Hi @sally

Welcome to the UiPath Community!

To achieve this you can use Document Understanding; see "Document Understanding - Introduction" (uipath.com).

Thanks,
Ashok 🙂


@sally

Welcome to the community.

Let me give you some definitions.

OCR is used to read or identify the data and store it, but not to extract only the required data.

OCR is a tool for reading images, PDFs, etc., but not for extracting only specific information.

To get the data you need, you have to combine it either with string manipulation or with Document Understanding, a more advanced concept.

Cheers


@Anil_G @Jayesh_678
Hello,

Thanks for your reply (-^〇^-)

Sorry, I'm not very familiar with OCR, so my question may not have been a good one.

May I know how to combine OCR with string manipulation to get the data I need? (=^▽^=)

@sally

  1. Use the Read PDF Text activity or the Read PDF With OCR activity to read the PDF data into a String variable.
  2. Write the data into a text document and check how the required data comes out.
  3. Depending on that, use Split or regular expressions to pick out each field.
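
The three steps above can be sketched outside UiPath in plain Python, purely to illustrate the keyword-based idea; in a workflow the same patterns would go into a regex-matching step. The sample text and the labels `Name`, `Date`, and `ID Number` are assumptions, since the actual PDF layout is not shown in this thread.

```python
import re

# Hypothetical OCR output; the real text depends on the PDF and the OCR engine.
ocr_text = """Application Form
Name: Sally Chen
Date: 2023-05-14
ID Number: A123456789
"""

# One pattern per field, anchored on the keyword rather than on a page position.
# The labels and value formats are assumptions; adjust them to the text dump.
# [ \t]* (not \s*) keeps a match from spilling onto the next line.
FIELD_PATTERNS = {
    "name": r"Name[ \t]*:[ \t]*(.+)",
    "date": r"Date[ \t]*:[ \t]*(\d{4}-\d{2}-\d{2})",
    "id_number": r"ID[ \t]*Number[ \t]*:[ \t]*([A-Z0-9]+)",
}


def extract_fields(text):
    """Return a dict of field name -> captured value (None when not found)."""
    result = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        result[field] = match.group(1).strip() if match else None
    return result


fields = extract_fields(ocr_text)
print(fields)
```

Because each pattern searches for its keyword anywhere in the text, the fields can move around on the page without breaking the extraction, as long as the label wording stays the same.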

Cheers

@Anil_G @Jayesh_678
Thanks
This may be a little difficult for me to understand 🤔. I need some time to try it (because I am a beginner) ⊙﹏⊙.

If I really can't finish it, could you give me more detailed steps?

Cheers


@Anil_G @Jayesh_678
May I know which activity I can use in my case? (I found [extract pdf text] and [split text], but I am not sure which of the two I should use.)

Also, how do I set up a [variable], such as its variable type, scope, and default value?

Could you please answer?

Thanks

@sally

You have to use both: the first to read the text, the second to extract the data.
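
A minimal sketch of that read-then-extract order in Python, using plain string splitting in the role of [split text]. The input string stands in for whatever the read step returns; the `Label: value` line layout and the CSV output (a stand-in for the Excel sheet) are assumptions.

```python
import csv
import io

# Stand-in for the string produced by the "read" step (hypothetical content).
pdf_text = "Name: Sally Chen\nDate: 2023-05-14\nID Number: A123456789\n"


def split_fields(text):
    """Split each 'Label: value' line at the first colon."""
    row = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip lines that carry no label
        label, value = line.split(":", 1)
        row[label.strip()] = value.strip()
    return row


row = split_fields(pdf_text)

# Write one header line and one data line to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue())
```

Splitting is simpler than regular expressions but only works when every field sits on its own line with a consistent separator.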

Go to the Variables panel to create variables and adjust their type, scope, etc.

I would suggest going through the Academy trainings first to understand the basics, and then trying again.

Cheers