Want to Merge Data from Multiple PDFs into a Single PDF per Account Number

samantha_shah · May 3, 2024, 3:44am

Hello everyone,

I’m working on a UiPath automation where I need to extract and merge data from two PDF files, PDF A and PDF B, into a single PDF file for each account number. The files are input in a non-fixed order and contain scattered data across various pages.

Details:

PDF A: Contains account numbers and related data.
PDF B: Contains account numbers, related data, and a program date.
The data related to account numbers are not consistently located on the same pages in either PDF.

The challenge is how to consolidate this scattered data from both PDFs into one coherent PDF file per account number.

My logic: I use a DataTable to capture the program date from PDF B, intending to merge this with the data from PDF A for the corresponding account number.

However, I am encountering a significant challenge: the DataTable is nullified when the workflow processes the second PDF, which leads to data loss from the first processed PDF.

Is there an alternative approach or strategy that could better handle the data consolidation from these randomly paginated PDFs?

Anil_G · May 3, 2024, 4:24am

@samantha_shah

Use a For Each File in Folder activity to loop through the files.

Then use a Read PDF Text activity to extract the account number. Next, use one more For Each File in Folder on the output folder where the merged files are saved, and check whether a PDF with that account number is already present. If it is, merge the new PDF onto the existing one; if not, just copy the file and rename it with the account number.

This ensures that the first file is copied and named after the account number, and that a second file with the same account number is merged with the first one.
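
As a rough illustration of this copy-or-merge routing outside UiPath, here is a minimal Python sketch. The function name `route_pdf`, the `merge` callback, and the `<account>.pdf` output naming are assumptions made for the example; the actual merging would be done by whatever PDF merge step is available in the workflow.

```python
import shutil
from pathlib import Path
from typing import Callable


def route_pdf(src: Path, account: str, out_dir: Path,
              merge: Callable[[Path, Path], None]) -> str:
    """Copy src to out_dir/<account>.pdf, or merge it into that file if it exists.

    `merge(existing, new)` is a hypothetical callback that appends `new`
    onto `existing`; a real PDF merge step would be plugged in there.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{account}.pdf"
    if target.exists():
        # A file for this account was already saved: merge onto it.
        merge(target, src)
        return "merged"
    # First file seen for this account: copy it under the account name.
    shutil.copyfile(src, target)
    return "copied"
```

Because the output folder itself records which accounts have been seen, nothing has to be kept in memory between files.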

Hope this helps

Cheers

samantha_shah · May 3, 2024, 5:35am

Hi @Anil_G, thanks for the reply.

Please see my workflow steps:

The data in the two PDFs is different and we want all of it; the only common thing is the account number.
**Naming convention (sample):**
FD_044554_Customerlnvoice_________Final Pricing May 2024_________x_

where 0445544 is the account number and May 2024 is the program date.
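
For illustration, the account number and program date could be read back out of a file name like this with a regular expression. This is a minimal Python sketch; the pattern is an assumption inferred only from the single sample name above (a run of digits after `FD_`, and a month and year after `Final Pricing`), and `parse_output_name` is a hypothetical helper name.

```python
import re

# Assumed pattern, based only on the one sample file name:
# "FD_<account digits>_..." followed later by "Final Pricing <Month> <Year>".
NAME_PATTERN = re.compile(
    r"^FD_(?P<account>\d+)_.*?Final Pricing (?P<date>[A-Za-z]+ \d{4})"
)


def parse_output_name(name: str):
    """Return (account_number, program_date), or None if the name does not match."""
    match = NAME_PATTERN.search(name)
    if match is None:
        return None
    return match.group("account"), match.group("date")


sample = "FD_044554_Customerlnvoice_________Final Pricing May 2024_________x_"
print(parse_output_name(sample))  # ('044554', 'May 2024')
```

The same pattern should carry over to a UiPath regex step, but real file names may vary more than this one sample suggests.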

1.) Both PDFs are processed one by one (the order is not fixed) in the same sequence (using regex).
2.) For example, once PDF B is in the sequence, the account number is extracted, the program date is extracted (using regex), and the page numbers are counted (using a counter).
3.) All this data is added to a DataTable, DT. (DT has multiple entries for the same account number, since the data in the PDF is scattered and not present on a single page.)
Please suggest a better way for step 3 if there is one.
4.) Once DT is ready, I run a For Each Row loop over DT.
For every row in DT where the account number is the same, I extract the program date, page number, and that common account number, and try to create a PDF (using Extract pdf).
The range of the Extract pdf is given as, for example, (currentrow(pagenumber)).
5.) Once the complete DT is iterated, a single PDF file is created per account number, named as per the sample naming convention above.
6.) The PDF is moved to the completed folder.
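
One possible way to handle the multiple-rows-per-account situation from step 3 is to group the rows by account number before building any PDF. The Python sketch below is only an illustration: the column names `account`, `page`, and `program_date`, the function `group_pages_by_account`, and the sample values are all assumptions, not taken from the actual workflow.

```python
from collections import defaultdict


def group_pages_by_account(rows):
    """Collapse one-row-per-page records into one entry per account number."""
    grouped = defaultdict(lambda: {"program_date": None, "pages": []})
    for row in rows:
        entry = grouped[row["account"]]
        entry["pages"].append(row["page"])
        # Keep the first non-empty program date seen for this account.
        if entry["program_date"] is None and row.get("program_date"):
            entry["program_date"] = row["program_date"]
    for entry in grouped.values():
        entry["pages"].sort()
    return dict(grouped)


# Hypothetical rows, one per page on which an account number was found.
rows = [
    {"account": "A1", "page": 5, "program_date": "May 2024"},
    {"account": "A2", "page": 2, "program_date": "May 2024"},
    {"account": "A1", "page": 1, "program_date": None},
]
print(group_pages_by_account(rows)["A1"])
# {'program_date': 'May 2024', 'pages': [1, 5]}
```

With one entry per account, the extraction step can run once per account over its page list instead of once per row.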
Now, when the second PDF starts running, the DataTable is reinitialized at the start of the sequence, because this workflow is invoked again from the main file, and all the above steps are repeated.
But I do not have the program date for the account number now, because only PDF B has that data.

My doubts:
1.) How can I use the naming convention of the previously created PDF files to get the program date?
2.) Or else, how can I use the previously created DataTable to extract the program date for that account number?
The program date is only available in one of the PDFs (PDF B).

3.) Or how can I use your idea or expertise?

Kindly help me to think better.

Anil_G · May 3, 2024, 5:43am

@samantha_shah

Pass that DataTable as an argument and set its direction to In/Out, so that the previous data is retained and the new data is added to it.
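
The idea can be sketched in Python as a table that is created once by the caller and handed to each invocation, rather than being created inside the invoked workflow. The names `process_pdf`, `program_date_for`, and `shared_table`, and the sample rows, are hypothetical and only illustrate the pattern.

```python
def process_pdf(page_records, table):
    """Append this PDF's rows to the caller's table instead of creating a new one."""
    table.extend(page_records)


def program_date_for(table, account):
    """Look up the program date for an account from any row that carries one."""
    for row in table:
        if row["account"] == account and row.get("program_date"):
            return row["program_date"]
    return None


# Created once by the caller (the main file), not inside process_pdf.
shared_table = []

# First invocation, e.g. PDF B, which carries the program date.
process_pdf([{"account": "A1", "page": 3, "program_date": "May 2024"}], shared_table)
# Second invocation, e.g. PDF A, which has no program date of its own.
process_pdf([{"account": "A1", "page": 1, "program_date": None}], shared_table)

print(program_date_for(shared_table, "A1"))  # May 2024
```

Since the order of the two PDFs is not fixed, the lookup is only reliable once both have been read, so it may be safer to build the per-account PDFs after the second invocation has finished.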

Cheers

system · May 6, 2024, 5:44am

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
