How to extract specific data such as invoice number and supplier code from multiple invoice PDFs with the same layout?

Hello All,

Can anyone share their workflow for extracting specific data from PDF to Excel? I need to extract the invoice number, invoice date, etc. from the PDF and put them into an Excel file. Thanks.

You can extract the data by converting the PDF to text and then applying regex to that text.
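As a rough sketch of the regex approach, assuming the PDF has already been converted to plain text: the field labels ("Invoice No", "Invoice Date", "Supplier Code"), the patterns, and the sample text below are all hypothetical and would need adjusting to the actual invoice layout. The output is CSV text, which Excel can open.

```python
import csv
import io
import re

# Hypothetical patterns: adjust the labels and value formats to your invoices.
FIELD_PATTERNS = {
    "invoice_no": re.compile(
        r"Invoice\s*(?:Number|No\.?)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"Invoice\s*Date\s*:?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})", re.IGNORECASE
    ),
    "supplier_code": re.compile(
        r"Supplier\s*Code\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return one row (dict) with the first match of each field, or '' if absent."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        row[name] = match.group(1) if match else ""
    return row


def rows_to_csv(rows):
    """Serialize the rows to CSV text with one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(FIELD_PATTERNS), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for the text of one converted invoice.
sample_text = """ACME Supplies Ltd
Invoice No: INV-2023-0042
Invoice Date: 14/03/2023
Supplier Code: SUP-778
Total: 1,250.00
"""

rows = [extract_fields(sample_text)]
csv_text = rows_to_csv(rows)
print(csv_text)
```

Because all the invoices share one layout, the same patterns can be run over the text of each file, adding one row per invoice before writing the CSV. If the OCR text is noisy, the patterns may need to be loosened.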

Have a read here

Thanks @Jersey_Practical_Sho … I am not getting the desired text after running OCR. Using regex to pull out the specific data would be the next step.

Hello Pankit, how do I apply regex?

Hi vishal, how do I apply regex?

The best way is to learn regex. Personally I think it will be quite time-consuming, but it is possible to extract the information that way.

Hi @Jersey_Practical_Sho

Is there any simpler solution than regex for extracting the data?

If you use something like Extract Semi Structured Document from UiPathTeam.MLModel.Activities, you can output the result to a file in the form of a list:

Invoice Number…