I am automating a process where I need to get data from a scanned invoice. I am using relative scraping, with an image as the anchor that gives the coordinates of the data to be fetched.
Since this image changes on each bill, I am not getting any data from the bill.
I have seen an action, Get OCR Text Position. How can I use it to get the position of a piece of text instead of the image, and then get the text at a relative position from that retrieved position?
Since it changes every time, the relative text position and Get Text will not give the expected results. If there is anything constant near the text you want to capture, we can use OCR to get the entire text from the image and then take a substring to get the required text.
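To make the substring idea concrete, here is a minimal sketch in Python. It only illustrates the logic; the same steps carry over to an expression in the workflow. The sample text, the label "Invoice No" and the helper name `text_after_label` are all made up for illustration, so adjust them to whatever stays constant on your bills.

```python
from typing import Optional


def text_after_label(ocr_text: str, label: str) -> Optional[str]:
    """Return the rest of the line that follows `label`, or None if the label is absent."""
    start = ocr_text.find(label)
    if start == -1:
        return None  # the constant label was not found in the OCR output
    start += len(label)
    end = ocr_text.find("\n", start)
    if end == -1:
        end = len(ocr_text)  # label sits on the last line
    # Drop the separator (colon, spaces, tabs) around the value.
    return ocr_text[start:end].strip(" :\t")


# Hypothetical OCR output for one scanned bill.
sample = "ACME Ltd\nInvoice No: INV-1042\nDate: 12/03/2019\n"
invoice_no = text_after_label(sample, "Invoice No")
```

This only works if the label itself is read reliably by the OCR engine, so it is worth checking the raw OCR text for a few bills first.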
Can you attach the image, if it doesn't contain confidential data, and point out the text you want?
Usually relative scraping works by using an element or image as a constant anchor; the text is read from a region positioned relative to that anchor, and the region can contain any text or numbers.
So the anchor must always be the same, but here it keeps changing.
Since it is a scanned invoice, it is easy to retrieve the term we want with a simple Read OCR PDF activity, which reads the entire scanned PDF and returns its text as an output variable of type String.
From that string we can get the text we want using string manipulation.
So initially, let's try to fetch the details using the Read OCR PDF activity, store the output in a string variable, and use a simple Assign activity to get the term we want with string manipulation such as Split or a regex.
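As a rough illustration of the regex route, here is a small Python sketch. The pattern and the field name ("Invoice No" / "Invoice Number") are assumptions for the example, not something taken from the actual bill, and `extract_invoice_number` is just a hypothetical helper name. A similar pattern could be used in the workflow once the OCR output is in a string variable.

```python
import re
from typing import Optional

# Assumed label variants: "Invoice No", "Invoice No.", "Invoice Number",
# optionally followed by ":" or "#", then the value itself.
INVOICE_NO = re.compile(
    r"Invoice\s*(?:Number|No)\.?\s*[:#]?\s*([A-Z0-9-]+)",
    re.IGNORECASE,
)


def extract_invoice_number(ocr_text: str) -> Optional[str]:
    """Return the first invoice number found in the OCR text, or None."""
    match = INVOICE_NO.search(ocr_text)
    return match.group(1) if match else None


# Hypothetical OCR output.
ocr_output = "ACME Ltd\nInvoice No: INV-1042\nDate: 12/03/2019\n"
result = extract_invoice_number(ocr_output)
```

A regex tolerates small OCR differences in spacing and punctuation better than a fixed Split, which is the main reason to prefer it here.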
It's as simple as that, buddy.
Kindly try this and let me know if you have any queries or need clarification.
Cheers @Pankit
Fine, it's mentioned that he is trying to get text relative to another text, but it's not clear from where; it could be a web application, a desktop application, etc.
But here we are in a PDF, which can be handled with relative scraping only when the PDF application is installed on the machine. Moreover, the page where the text appears must be known in advance, and we might miss it in some cases: it may be on one page the first time and on another page the next time.
So we can handle this with the Read OCR PDF activity, which fetches all the details in the PDF, and later get the required one with string manipulation.
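To show why reading all the text sidesteps the "which page is it on" problem, here is a small Python sketch that searches every page until the field turns up. It assumes the OCR text is available as one string per page (a hypothetical `pages` list); the "Total Due" label, the pattern and the helper name `find_on_any_page` are made up for the example.

```python
import re
from typing import List, Optional, Tuple


def find_on_any_page(pages: List[str], pattern: str) -> Optional[Tuple[int, str]]:
    """Return (page_number, captured_value) for the first page that matches, else None."""
    rx = re.compile(pattern, re.IGNORECASE)
    for page_number, text in enumerate(pages, start=1):
        match = rx.search(text)
        if match:
            return page_number, match.group(1)
    return None  # field not present on any page


# Hypothetical OCR text, one entry per page; the field is on page 2 this time.
pages = ["ACME Ltd\nCover page", "Items...\nTotal Due: 1,250.00\n"]
hit = find_on_any_page(pages, r"Total\s+Due\s*:?\s*([\d.,]+)")
```

If the OCR output is a single string for the whole document, the same pattern can simply be run once over that string, since the page no longer matters.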