How to extract the data from https://acme-test.uipath.com/assets/TestData/checks/18/Check-Request-For-19521694.pdf?

shalini23 · April 3, 2019, 7:31am

Hi All,

Please let me know how to extract the data from https://acme-test.uipath.com/assets/TestData/checks/18/Check-Request-For-19521694.pdf. I have tried options such as Get Text, screen scraping, and Read PDF Text, but nothing worked. Kindly provide a solution.

Jan_Brian_Despi · April 3, 2019, 7:36am

Hi @shalini23
You can use Read PDF Text

You just need to add the UiPath.PDF.Activities package.
Thanks and regards,
Despi

shalini23 · April 3, 2019, 7:51am

I have installed the package and used the Read PDF Text activity. I want to extract particular data from the PDF. Apart from the substring method, is there any other way to extract data from a PDF?

Jan_Brian_Despi · April 3, 2019, 7:53am

Yes, you can.
You can use ABBYY FlexiCapture.

Thanks and regards.

shyamm · April 3, 2019, 7:59am

Please try the CV (Computer Vision) activities.

shalini23 · April 3, 2019, 8:01am

Please let me know how to use FlexiCapture inside my XAML file.

shalini23 · April 3, 2019, 8:02am

Hi Shyamm,

I have tried screen scraping as well, but it is not working. Thanks.

Jan_Brian_Despi · April 3, 2019, 8:04am

There is a connector that can be downloaded from UiPath Go!:
https://go.uipath.com/component/abbyy-flexicapture-connector-for-uipath-31cc20
You can give this a try.

shalini23 · April 3, 2019, 8:12am

Hi Shyamm,

Please find attached an image for your reference. Screen scraping is not extracting the particular data I need.

shyamm · April 3, 2019, 8:16am

Try the Computer Vision activities: https://forum.uipath.com/t/ai-computer-vision-is-now-available-for-preview/95730

shyamm · April 3, 2019, 8:21am

Hey, I’m getting output.

shalini23 · April 3, 2019, 8:28am

Hi Shyamm,

That’s great. Please let me know what you did. Can you share that file? I would like to see it. Is there any package I have to install to get the CV activities? Currently they do not appear in my Activities panel.

shyamm · April 3, 2019, 8:41am

I just used screen scraping only.
It is better to try the CV activities. Please go through the link above; you will find full details there.

shalini23 · April 3, 2019, 8:48am

Hi Shyamm,

I have tried the same, but I am not getting the result. Please find attached an image for your reference. Do I have to change anything in the selectors?

shyamm · April 3, 2019, 9:01am

I tried with Chrome, not Adobe.
How many PDF files do you want to read?

shalini23 · April 3, 2019, 9:03am

Hi Shyamm,

In the ACME URL, I have to download the PDF file of type W12 with Open status, and I also have to extract the value from it.

Mariusz · April 3, 2019, 9:15am

@shalini23

I had a similar problem with downloaded PDF files while going through the dev training. If you have a problem scraping partial text in a PDF, there’s a note in the introduction to lesson 10 on how you could solve this issue.

Note 1: If the PDF is opened with Adobe Acrobat Reader DC, there might be a few steps to take before you can extract specific elements using UiPath Studio methods. Start Acrobat and press Ctrl+K to open the Preferences pop-up. Select Reading from the categories in the left panel. Verify that the Reading Order drop-down is set to the Acrobat-recommended option, ‘Infer reading order from document (recommended)’, that ‘Page vs. Document’ is set to ‘Read the entire document’, and that ‘Confirm before tagging documents’ is unchecked. Then, in the left panel, click Accessibility. In the Other Accessibility Options section, check the first two boxes if they are not already checked: ‘Use document structure for tab order when no explicit tab order is specified’ and ‘Enable assistive technology support’. Then click OK.

Note 2: If you still have problems extracting specific elements from a PDF file opened with Acrobat Reader DC, you could try an older version of Acrobat DC (any version starting with 18 should be fine: https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC/index.html# ). Acrobat DC updates itself automatically to the latest available version. In some of the latest versions (starting with 19) there could be problems with accessibility, because Adobe Reader is slowly dropping support for untagged documents. Steps to follow:

1. Uninstall the current version of Acrobat Reader DC.
2. Install the base release of Acrobat Reader DC: https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC/continuous/dccontinuous.html#dccontinuous
3. Install a patch to any of the versions starting with 18.
4. Disable Adobe Reader auto-update (see “How to Disable Automatic Update in Adobe Reader DC”, WinTips.org).

It didn’t work for me, though. As I remember, I used OCR scraping and string formatting to get the data. You could also use the Computer Vision activities that shyamm mentioned earlier. The technology looks pretty neat, but I have not used it in development yet.

shyamm · April 3, 2019, 9:16am

It is better to go with the Read PDF Text activity and get the values with substring operations.
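
As a rough illustration of the substring approach, and of a regular expression as one alternative to it, here is a minimal Python sketch. It is not a UiPath workflow and not taken from the thread: the sample text, the field labels ("Vendor Name:", "Amount:", "Date:") and the helper names are all hypothetical, standing in for whatever string a PDF text reader returns for the actual document.

```python
import re

# Hypothetical sample of the text a PDF text reader might return.
# The labels and values below are invented for illustration only.
SAMPLE_TEXT = """Check Request
Vendor Name: Example Supplies Ltd
Amount: 1,250.00
Date: 2019-04-03
"""


def extract_by_substring(text, label):
    """Return the text between `label` and the end of its line, or None."""
    start = text.find(label)
    if start == -1:
        return None  # label not present in the text
    start += len(label)
    end = text.find("\n", start)
    if end == -1:
        end = len(text)  # label is on the last line
    return text[start:end].strip()


def extract_by_regex(text, label):
    """Same lookup, written as a regular expression anchored on the label."""
    match = re.search(re.escape(label) + r"[ \t]*(.*)", text)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    print(extract_by_substring(SAMPLE_TEXT, "Amount:"))
    print(extract_by_regex(SAMPLE_TEXT, "Vendor Name:"))
```

Both helpers rely on the label appearing on the same line as its value. If the extracted text places values on separate lines, the pattern would need to be adapted to the real layout.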

shyamm · April 3, 2019, 9:36am

BlankProcess.zip (26.7 KB)

I’m attaching a small workflow built using the CV activities.

shalini23 · April 3, 2019, 10:15am

I am not able to download it. Please let me know which package I have to install to get the CV activities in UiPath.
