How to extract data from a financial PDF document and save it to Excel without using Document Understanding

Lalitha_Selvaraj · November 27, 2023, 1:43pm

How can I extract data from a financial PDF document and save it to Excel without using Document Understanding or regular expressions?

rikulsilva · November 27, 2023, 2:14pm

Hi @Lalitha_Selvaraj

I don't think there are any other good options for this without a third-party solution. Extracting information from a PDF requires exception handling, validation, and so on.

You can try this activity

Extract Tables from PDF - RPA Component | UiPath Marketplace | Overview

Lalitha_Selvaraj · November 27, 2023, 2:38pm

Hi @rikulsilva, thanks for the response. I have a financial PDF document (it doesn't contain any tables). I need to extract data from it and save it to Excel. Is there any way to extract the data without using Document Understanding?

postwick · November 27, 2023, 2:42pm

Why do you keep saying without document understanding? DU is how you do this kind of thing.

The only other option is text manipulation, which is tedious and requires that the PDF be a text PDF (i.e. not a scanned document).
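To give a rough idea of what plain text manipulation looks like, here is a minimal Python sketch that uses only string methods (no regex). It assumes the text has already been read out of a text-based PDF, and that the values sit on "Label: value" lines. The sample text and field labels are made up for illustration; a real financial document will need its own rules.

```python
# Hypothetical sample: text as it might come out of a text-based PDF.
SAMPLE_TEXT = """\
ACME Corp Statement
Invoice Number: INV-1001
Invoice Date: 2023-11-27
Total Amount: 1,250.50
"""


def parse_labelled_lines(text):
    """Collect 'Label: value' lines into a dict using plain string methods."""
    fields = {}
    for line in text.splitlines():
        # partition() splits on the first colon only; sep is "" if there is none.
        label, sep, value = line.partition(":")
        if sep and value.strip():
            fields[label.strip()] = value.strip()
    return fields


fields = parse_labelled_lines(SAMPLE_TEXT)
print(fields)
```

Lines without a colon (such as the heading) are skipped. This is also why the approach gets tedious: every layout variation needs another rule.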

rikulsilva · November 27, 2023, 2:50pm

Hi @Lalitha_Selvaraj

Thank you for the clarification.

To extract text from a PDF you can use UiPath.PDF.Activities, but you need to organize the information before sending it to Excel, right? So you need string manipulation or a powerful tool to handle it for you.

There is no activity that extracts exactly what you want and saves it to Excel without some customization effort.
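As a sketch of the "organize before sending to Excel" step, the Python below takes already-extracted fields, validates them, and writes the good rows as CSV (which Excel opens directly). The column names and sample records are hypothetical, and CSV stands in for a real workbook here; in UiPath you would build a DataTable and write that instead.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Hypothetical column layout for the output sheet.
COLUMNS = ["Invoice Number", "Invoice Date", "Total Amount"]


def to_row(fields):
    """Return (row, errors): a row in column order plus the names of bad fields."""
    errors = [name for name in COLUMNS if not fields.get(name)]
    row = {name: fields.get(name, "") for name in COLUMNS}
    if row["Total Amount"]:
        try:
            # Drop thousands separators so the amount is stored as a plain number.
            row["Total Amount"] = str(Decimal(row["Total Amount"].replace(",", "")))
        except InvalidOperation:
            errors.append("Total Amount")
    return row, errors


# Hypothetical extraction results: one clean record, one incomplete record.
extracted = [
    {"Invoice Number": "INV-1001", "Invoice Date": "2023-11-27", "Total Amount": "1,250.50"},
    {"Invoice Number": "INV-1002", "Total Amount": "n/a"},
]

good_rows, rejected = [], []
for fields in extracted:
    row, errors = to_row(fields)
    if errors:
        rejected.append((fields, errors))  # keep for manual review
    else:
        good_rows.append(row)

buffer = io.StringIO()  # swap for open("output.csv", "w", newline="") to write a file
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(good_rows)
print(buffer.getvalue())
```

Keeping the rejected records separate, instead of writing half-filled rows, makes it easier to see which documents need a person to look at them.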

Lalitha_Selvaraj · November 27, 2023, 2:51pm

No @postwick. Our client does not accept Document Understanding, since it requires a license.

Lalitha_Selvaraj · November 27, 2023, 3:12pm

Thanks @rikulsilva. I’ll try with text manipulation.

postwick · November 27, 2023, 3:17pm

No, it doesn’t. You can do it all locally.

rikulsilva · November 27, 2023, 3:25pm

@postwick

Just to check

The activities you mentioned are only for OCR purposes, right? That is, to turn the PDF into machine-readable text. To train a model and make predictions you still need a license, right?

postwick · November 27, 2023, 4:05pm

No, the only thing that is cloud-based is the OCR/digitization, but that can be done locally with the link I shared. Everything else you need to extract data is done locally with the DU activities.

Anil_G · November 27, 2023, 4:13pm

@Lalitha_Selvaraj

Welcome to the community

You can use regex-based extractors; string manipulation can also be handled with regex.
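For illustration only, this is roughly what a regex-based extractor does, written as a small Python sketch instead of the UiPath activity. The field names, patterns, and sample text are assumptions and would need to be adapted to the actual document layout.

```python
import re

# Hypothetical patterns: one capture group per field.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s+Number\s*:\s*(\S+)"),
    "invoice_date": re.compile(r"Invoice\s+Date\s*:\s*(\d{4}-\d{2}-\d{2})"),
    "total_amount": re.compile(r"Total\s+Amount\s*:\s*([\d,]+\.\d{2})"),
}


def extract_fields(text):
    """Return the first match for each field, or None when a field is not found."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


sample = "Invoice Number: INV-1001\nInvoice Date: 2023-11-27\nTotal Amount: 1,250.50\n"
print(extract_fields(sample))
```

Returning None for a missing field (rather than raising) lets the caller decide whether to skip the document or flag it for review.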

cheers

Lalitha_Selvaraj · November 27, 2023, 5:22pm

Thanks @Anil_G . I will check on that.

Lalitha_Selvaraj · November 27, 2023, 5:28pm

Thanks @postwick. For example, if I need to extract data from around 500 PDFs, can I use Document Understanding for that?
There’s no need for a license, right?

postwick · November 27, 2023, 6:29pm

You can use DU for that, yes. Set up your taxonomy (at least one document type) and then use the Classify Document scope with Keyword Classifier to identify the document type. Put your fields in the taxonomy under each document type and use Regex to identify where each data point is.
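To make the idea concrete, here is a plain Python sketch of the same logic: a taxonomy of document types, a keyword check to pick the type, and one regex per field. This is not the DU API, just an illustration of the concept; the document types, keywords, field names, and patterns are all hypothetical.

```python
import re

# Hypothetical taxonomy: document type -> classifier keywords and field regexes.
TAXONOMY = {
    "invoice": {
        "keywords": ["invoice number", "amount due"],
        "fields": {
            "invoice_number": r"Invoice\s+Number\s*:\s*(\S+)",
            "amount_due": r"Amount\s+Due\s*:\s*([\d,]+\.\d{2})",
        },
    },
    "bank_statement": {
        "keywords": ["account number", "closing balance"],
        "fields": {
            "account_number": r"Account\s+Number\s*:\s*(\d+)",
            "closing_balance": r"Closing\s+Balance\s*:\s*([\d,]+\.\d{2})",
        },
    },
}


def classify(text):
    """Pick the type with the most keywords present; None if no keyword matches."""
    lowered = text.lower()
    scores = {
        doc_type: sum(keyword in lowered for keyword in spec["keywords"])
        for doc_type, spec in TAXONOMY.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract(text):
    """Classify the text, then apply that type's field patterns."""
    doc_type = classify(text)
    if doc_type is None:
        return None, {}
    values = {}
    for name, pattern in TAXONOMY[doc_type]["fields"].items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        values[name] = match.group(1) if match else None
    return doc_type, values


statement = "Account Number: 12345678\nClosing Balance: 9,870.00\n"
print(extract(statement))
```

For a batch of several hundred PDFs, the same two steps would run once per file, with unclassified documents set aside for review.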

postwick · November 27, 2023, 6:41pm

I suggest creating a new project from the Document Understanding template to see how to do these things. Just add the local server package, and in Project Settings set local server to true.

Lalitha_Selvaraj · November 28, 2023, 5:34am

Thanks @postwick. I will check on that…

sai_gupta · November 28, 2023, 5:59am

@Lalitha_Selvaraj
Try this:

Use the “Read PDF Text” activity in UiPath to extract text from the PDF, and then use the “Write Range” activity to save the data to an Excel file.

cheers…!

system · December 4, 2023, 9:54am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
