Regarding extracting content from a pdf

Anusha_Makam · December 24, 2018, 12:13pm

Hello,

I am working on PDF automation and am having trouble scraping content from a particular field called ‘Item desc’. The position of this field varies from PDF to PDF, and OCR scraping didn't help. Can someone suggest how I can scrape it reliably?

Thank You,
Anusha

TimK · December 24, 2018, 1:25pm

You can search for the text within the image and use anchors to get the information from the field next to it.

Anusha_Makam · December 24, 2018, 1:29pm

Hello,
Thank you for the quick reply.
Yes, I tried that as well.
But as I said earlier, the position of the field keeps changing across the PDFs, so it wasn't suitable.

Thank you

Vinutha · December 24, 2018, 3:16pm

Use the Find Image activity to locate the image of ‘Item desc’ in the anchor part of the Anchor Base activity.
In the action part of Anchor Base, do the OCR scraping for the value of ‘Item desc’.
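
For readers outside UiPath, the anchor idea can be sketched in plain Python. This is only an illustration, not the Anchor Base activity itself: it assumes an OCR step has already produced words with coordinates, in reading order. The `Word` type, the `text_right_of_anchor` helper, and the sample coordinates are all hypothetical. The value is located relative to the label, so it does not matter where on the page the label ends up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One OCR word with the top-left corner of its bounding box (hypothetical shape)."""
    text: str
    x: float
    y: float


def text_right_of_anchor(words: List[Word], anchor: List[str],
                         row_tolerance: float = 5.0) -> str:
    """Find the anchor phrase, then return the words to its right on the same row."""
    anchor_lower = [a.lower() for a in anchor]
    n = len(anchor_lower)
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        # Compare case-insensitively and ignore a trailing colon on the label.
        if [w.text.lower().rstrip(":") for w in window] == anchor_lower:
            last = window[-1]
            same_row = [w for w in words
                        if abs(w.y - last.y) <= row_tolerance and w.x > last.x]
            return " ".join(w.text for w in sorted(same_row, key=lambda w: w.x))
    return ""  # anchor not found


# Hypothetical OCR output; the label could sit anywhere on the page.
words = [
    Word("Invoice", 10, 10), Word("1001", 80, 10),
    Word("Item", 10, 40), Word("desc:", 50, 40),
    Word("Blue", 100, 41), Word("widget", 140, 40),
    Word("Qty", 10, 70),
]
print(text_right_of_anchor(words, ["Item", "desc"]))
```

The `row_tolerance` value is a guess; real OCR output would need it tuned to the font size and scan quality.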

Anusha_Makam · December 26, 2018, 4:14am

Hello,

I tried that too. It's taking the position of ‘Item Desc’, so it doesn't work.

Thank you

indra · December 26, 2018, 6:23am

@Anusha_Makam Can you share a dummy PDF file? That will make it easier to suggest a solution.

Anusha_Makam · December 26, 2018, 6:38am

Hello,

invoice.pdf (159.4 KB)

Sometimes there is one item and sometimes many.
Thank you!

indra · December 26, 2018, 7:11am

@Anusha_Makam Here you go. It's working using a regular expression.

PdfExtraction.zip (154.3 KB)
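
The attached workflow is not reproduced here. As a rough sketch of the general idea only, the Python below runs a regular expression over text that has already been read from the PDF. It matches the label wherever it appears and returns every occurrence, so it copes with one item or many. The sample text, the pattern, and the `extract_item_descriptions` name are assumptions for illustration; the real invoice layout may need a different pattern.

```python
import re
from typing import List

# Hypothetical text, standing in for what a PDF text-reading step might return.
SAMPLE_TEXT = """Invoice No: 1001
Item desc: Blue widget
Qty: 2
Item desc: Red gadget, large
Qty: 1
"""

# Match the label anywhere in the text (case-insensitive), allow an optional
# colon or dash after it, then capture the rest of that line as the value.
ITEM_DESC_PATTERN = re.compile(r"item\s+desc\w*\s*[:\-]?\s*(.+)", re.IGNORECASE)


def extract_item_descriptions(text: str) -> List[str]:
    """Return every value that follows an 'Item desc' label, in document order."""
    return [value.strip() for value in ITEM_DESC_PATTERN.findall(text)]


print(extract_item_descriptions(SAMPLE_TEXT))
```

Because the match depends on the label text rather than on page coordinates, a field that moves between PDFs is still found, as long as the value stays on the same line as its label.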

kantheshm · December 27, 2018, 6:10am

Hi Indra,

Your method helped me a lot, but my only concern is scraping the amount and grand total out of that PDF. Is there any way to get that data using your method?

Please refer to the attached screenshot.

Regards,
Kanthesh
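
One possible way to extend the regular-expression approach to totals is sketched below in Python. It assumes each figure follows its label on the same line, optionally after a colon and a currency symbol. The sample text, the label strings, and the `extract_labeled_amount` helper are hypothetical and would need adjusting to the actual invoice.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical text, standing in for what a PDF text-reading step might return.
SAMPLE_TEXT = """Amount: 1,200.50
Tax: 49.50
Grand Total: $ 1,250.00
"""


def extract_labeled_amount(text: str, label: str) -> Optional[Decimal]:
    """Return the first number found after `label` on the same line, or None."""
    pattern = re.compile(
        re.escape(label)               # the label itself, matched literally
        + r"\s*[:\-]?\s*"              # optional colon or dash after the label
        + r"[^\d\n]*"                  # skip a currency symbol, staying on this line
        + r"(\d[\d,]*(?:\.\d+)?)",     # digits with optional thousands separators
        re.IGNORECASE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


print(extract_labeled_amount(SAMPLE_TEXT, "Grand Total"))
print(extract_labeled_amount(SAMPLE_TEXT, "Amount"))
```

`Decimal` is used instead of `float` so that currency values keep their exact digits.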
