How to convert a PDF file to Excel and save it in one location?

Hi @arivu96
Hi @shankm
Hi @Dominic

Can you please help me solve this issue and guide me through this scenario?

Thanks in advance
Regards
RaviDevaraj

Hi @RaviDevaraj,

Refer to this:
Main (32).xaml (11.7 KB)

I just corrected the errors in your xaml file.

But I have a few questions about your flow.

In the Write Line you have given dt.Rows.count, but I don't see where the DataTable gets its value.

Regards,
Arivu

Hi @arivu96

Yes, I have given dt.Rows.count. As for where the value of the DataTable comes from:

I need to get the data from the PDF file only, not from a DataTable!

As far as I know, from the few examples and the theory I have seen, a PDF file can be converted to Word, and there are many examples of that, but examples of converting to Excel are rare.

So, is it possible to convert a PDF file to an Excel file?

You shared the xaml file; when I run it, it shows an error message like this:

There is also an error in the For Each loop; see the error screenshot below:

I asked you the same question in the previous post too.
OK, now I get it: PDF to Excel, right?

Regards,
Arivu

Hi @arivu96
"OK, now I get it: PDF to Excel, right?" Yes, @arivu96.

Is it possible to do?

Regards
RaviDevaraj

Hi @RaviDevaraj,

Yes, it's possible.

Regards,
Arivu

Hi @arivu96

Wow, great. What steps should I follow, and which controls do I need to use to achieve this: "convert a PDF file to Excel and save it in one location"?

Regards
RaviDevaraj

Hi @RaviDevaraj,

Refer to this xaml file:
Main (32).xaml (9.8 KB)

Regards,
Arivu

Hi @arivu96

The xaml file runs successfully.
But in the project folder I have two Excel files, shown below:

Both Excel files are empty; no data is coming from the PDF file.
Why is no data getting from the PDF file into the Excel file?

Regards
RaviDevaraj

Hi @RaviDevaraj,

Read PDF Text activity → select your PDF file

Write CSV → choose your Excel/CSV file

Regards,
Arivu

Hi @arivu96
These are the details I gave; see the two screenshots:

Again, why are we using the DataTable here? See the output screen too: it shows 234 records.

I am getting the data in a single column only; it is not laid out like the PDF, which is what we expect.

The output Excel sheet is below:
Untitled.xlsx (10.6 KB)
Please find the exact PDF file format: it has 10 columns, two customer names, and only 12 records.
See the attached PDF file too.
1090444_COMS_Commission Statement[1].PDF (148.3 KB)

Hi @arivu96

I gave the file name as "Untitled", but there are two files with that name.

See the file type too; that is also different:

  1. The Microsoft Excel Worksheet has no data [this is the one I created].
  2. The Microsoft Excel Comma Separated Values File has the actual data [all data].

Hi @arivu96
Are you there?

Please see my queries in the previous post and in this one.
Can we make further changes to that xaml file to meet my requirements?
Can you please help me?

Regards
RaviDevaraj.

Hi @arivu96

Are you there?

Can we find a solution for these issues?

I mentioned them in the previous two posts above.
Regards
RaviDevaraj

Hi @arivu96

Please reply.
Do we have a solution for the Excel issues we are facing right now?
I mentioned them in the posts above.
Regards
RaviDevaraj

Hi @RaviDevaraj,

Sorry for the late reply.

Yes, you will only get it like that.

If you want the same layout as the PDF, you need to format the text after reading the PDF and then write it to the Excel file.

Regards,
Arivu
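
A minimal sketch of that formatting step, written in Python rather than as a UiPath workflow, assuming the text read from the PDF has one record per line with columns separated by runs of two or more spaces. The sample text, the column names, and the function name `text_to_rows` are hypothetical and only illustrate the idea; real PDF text usually needs more cleanup than this.

```python
import csv
import io
import re

def text_to_rows(pdf_text):
    """Split each non-empty line into columns on runs of 2+ spaces."""
    rows = []
    for line in pdf_text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows

# Hypothetical sample standing in for the text read from the PDF.
sample = (
    "Policy No    Customer Name    Paid\n"
    "P-1001       John Smith       120.50\n"
    "P-1002       Mary Jones       75.00\n"
)

rows = text_to_rows(sample)

# Write the rows as CSV text; each list item becomes its own column.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Without the split, each line of text would be written as a single value, which is why the data lands in one column.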

Hi @arivu96

As you mentioned in the post above, the text needs to be formatted after reading the PDF and before writing it to the Excel file.

How can we do that process? [Reading the PDF file and writing it in Excel format]

Regards
RaviDevaraj

Hi @arivu96

Can you please help me with this issue?

Regards
RaviDevaraj

Posting this here as well for anyone else seeking ideas on PDF extraction, since I solved this for him in private.


This almost made my head explode, but I got it working.

I’m not entirely sure if it meets all your requirements, but it formats the text into a table with all the headers and places the check number, date, and amount in the filename. Much of it is driven by VB.NET lambda expressions (i.e. .Select, .Where, .Split).

I did my best to make it as dynamic as possible, but with limited testing there could still be some glitches.

I created many variables to make maintenance easier for you, and also annotated all the key coding parts to help describe what each one is doing.

In summary, here are the steps it takes:
— Reads text and fixes any newline characters that will cause manipulation issues
— Extracts all top text with headers
— Removes all top text with headers from the full text to store data to be added to table
— Extracts all Account numbers and names since they only appear once
— Extracts Account Totals and removes from text
— Extracts Check Number at bottom
— Extracts Check Date at bottom
— Extracts headers only and outputs it to CSV, then reads back to a data table
— Removes accounts and names from data text to prepare for For Each loop
— Splits data text by page and loops through each page
— Loops through each account and continues on current one if end of page occurs
— Loops through each line
— Adds items to “|”-delimited string and as item array adds to table
— Moves to next account when Paid amount sum equals Account total
— Once all text has been looped through, Write table to CSV
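
A few of the steps above can be sketched roughly as follows. This is Python, not the VB.NET used in the attached workflow, and it is not taken from that workflow: the sample text, the line formats ("Account ...", "Account Total ...", "Check Number ..."), and every variable name are assumptions made up for illustration.

```python
import csv
import io
import re
from decimal import Decimal

# Hypothetical text standing in for what is read from the PDF.
raw = (
    "Date Policy Paid\r\n"
    "Account A100 Acme\r\n"
    "Account B200 Bolt\r\n"
    "01/02/2018 P-1 10.00\r\n"
    "01/03/2018 P-2 15.50\r\n"
    "Account Total 25.50\r\n"
    "01/04/2018 P-3 7.25\r\n"
    "Account Total 7.25\r\n"
    "Check Number 4455\r\n"
)

# Fix newline characters, then drop empty lines.
text = raw.replace("\r\n", "\n")
lines = [l for l in text.split("\n") if l.strip()]

# Headers come from the first line, plus the two account columns.
headers = ["Account", "Name"] + lines[0].split()

# Accounts and names appear once; totals and check number are pulled out.
accounts = [l.split()[1:3] for l in lines if re.match(r"Account (?!Total)", l)]
totals = [Decimal(l.split()[-1]) for l in lines if l.startswith("Account Total")]
check_number = re.search(r"Check Number (\d+)", text).group(1)

# Only lines that start with a date are data rows.
data_lines = [l for l in lines if re.match(r"\d{2}/\d{2}/\d{4} ", l)]

rows = []
idx = 0                  # index of the current account
running = Decimal("0")   # paid amount summed so far for this account
for line in data_lines:
    date, policy, paid = line.split()
    account, name = accounts[idx]
    joined = "|".join([account, name, date, policy, paid])
    rows.append(joined.split("|"))  # item array added to the table
    running += Decimal(paid)
    if running == totals[idx]:      # account complete, move to the next
        idx += 1
        running = Decimal("0")

# Write headers and rows as CSV text; the file name carries the check number.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(headers)
writer.writerows(rows)
file_name = f"statement_{check_number}.csv"
print(file_name)
print(out.getvalue())
```

Summing the paid amounts with `Decimal` instead of floats keeps the comparison against the account total exact.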

PDFTextExtraction.xaml (46.6 KB)

I have attached the workflow and a screenshot of the output results.

Regards!

Clayton.


It is not possible to convert it into Excel, but you can paste the data into a text file, then link that text file to an Excel sheet and enable auto refresh when the file is opened.