Extract information from a PDF

Hi guys

I am working on a project in which I have several PDF files that are scanned images, and I have to extract certain data that does not appear in a fixed order. I managed to build a loop that reads each file with OCR and writes the result to a text file. What I cannot do is search that text file for a word and copy what follows it. For example:

NIT 123
I search for the word NIT and want to get back the 123. The number will always change; the word will not.

Could you please help me with how to do this? Thanks.

You can loop through all of the lines in the text file with a For Each activity.
The TypeArgument should be String and the Values should be File.ReadAllLines(filepath)

Then inside the loop you can do an If statement and check whether the current line contains “NIT”. If it does you can split the line by spaces and grab the last item, which should be your number.
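
In case it helps to see the whole idea in one place, here is a minimal VB.NET sketch of the same logic outside of a workflow. The module name, the function name FindValueAfter and the sample lines are made up for illustration; in UiPath the loop would be the For Each activity and the lines would come from File.ReadAllLines(filepath).

```vbnet
Imports System
Imports System.Linq

Module NitLookup
    ' Returns the last space-separated piece of the first line that contains
    ' the keyword, or Nothing when no line matches.
    ' (FindValueAfter is a hypothetical helper, not a UiPath activity.)
    Function FindValueAfter(lines As String(), keyword As String) As String
        For Each currentLine As String In lines
            If currentLine.Contains(keyword) Then
                ' Trim first so a trailing space does not produce an empty last piece.
                Return currentLine.Trim().Split(" "c).Last()
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        ' Sample lines standing in for File.ReadAllLines(filepath).
        Dim lines As String() = {"Some header text", "NIT 123", "Other data"}
        Console.WriteLine(FindValueAfter(lines, "NIT")) ' prints 123
    End Sub
End Module
```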

Hi @DanielMitchell, thank you for your prompt response. I did what you told me and looped through the lines of the file, but it cannot find the word NIT.

I did it this way

image

Let’s say item is equal to “NIT 123”.

item = "NIT" fails because item isn't exactly equal to "NIT"; it holds the whole line.
Instead, use item.Contains("NIT")
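
A small VB.NET sketch of the difference, using the same sample value. The case-insensitive check on the last line is an extra suggestion of mine, in case the OCR returns "Nit" or "nit"; it is not something the original workflow used.

```vbnet
Imports System

Module ContainsCheck
    Sub Main()
        Dim item As String = "NIT 123"

        ' Equality compares the whole line, so this is False.
        Console.WriteLine(item = "NIT")

        ' Contains looks for the word anywhere in the line, so this is True.
        Console.WriteLine(item.Contains("NIT"))

        ' Assumption: OCR casing may vary, so ignore case when searching.
        Console.WriteLine(item.IndexOf("nit", StringComparison.OrdinalIgnoreCase) >= 0)
    End Sub
End Module
```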

@DanielMitchell

It works well, it finds the word. Sorry, I'm a little new to UiPath. Could you help me with how to split the line by spaces so that I can take the last element?

@cristian_urrego

Str = "Robotics Process Automation"

Str.Split(" ".ToCharArray)(0) returns "Robotics"

Str.Split(" ".ToCharArray)(1) returns "Process"

Str.Split(" ".ToCharArray)(2) returns "Automation"

@cristian_urrego you can refer to @lakshman’s answer. The String Split method splits a single string up into an array of strings. You can then loop through them or process them however you want.

For your specific case, if the line is “NIT 123” then you can use
item.Split(" ".ToCharArray)(1) to split the line into “NIT” and “123” and grab the second item. (Indexes start at 0, so index 1 is the second piece.)
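
One thing to watch for with OCR text: if the line has more than one space between the word and the number, a plain split leaves empty pieces and index 1 comes back empty. Below is a rough VB.NET sketch of a more forgiving version, assuming a sample line with repeated spaces; StringSplitOptions.RemoveEmptyEntries is the standard .NET option for this, and the length check avoids an index error on a line that only contains "NIT".

```vbnet
Imports System

Module SplitExample
    Sub Main()
        ' Assumed sample: OCR output with repeated spaces.
        Dim item As String = "NIT   123"

        ' Plain split keeps the empty pieces between consecutive spaces.
        Dim naive As String() = item.Split(" ".ToCharArray)
        Console.WriteLine(naive(1) = "") ' True: index 1 is an empty string

        ' Dropping empty entries leaves just "NIT" and "123".
        Dim parts As String() = item.Split(" ".ToCharArray, StringSplitOptions.RemoveEmptyEntries)
        If parts.Length > 1 Then
            Console.WriteLine(parts(1)) ' 123
        End If
    End Sub
End Module
```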

@DanielMitchell It works perfectly. Thank you very much for your help, you're a star.

Hi @lakshman

Thank you very much, it works perfectly for me.
