Extract pdf data table to excel - Error when Table is Empty

Let me begin by saying I am very new to UiPath (my first automation ever) and automation in general. I followed a basic video online and I have reports with a date on the 2nd pdf page and a table I am extracting on the 4rd page of the pdf. I am pulling the data from the pdfs and placing into excel with the date beside each row extracted from the table.

Everything works great until I come to a report where the pdf table is empty (empty table with no data in it). The pdf is a daily report template that is exactly the same each day just some days the table will contain date and some days it is just an empty table on the pdf page. The table can have 20 rows of entries or 0 entries but when I have 0 entries the everything stops and the error cannot find table 2

I would like to have it where if the table is empty on that report it creates a row in the excel file that contains only the date but blank in the fields where the table is empty but am willing to settle for just skipping that pdf and moving on to the next.

I have included a screenshot of part of the process below. The error occurs at the first “Merge Data Table” because it cannot find table 2 when empty. (New users can only upload one item and no attachments…)

What activities can I add and what would the activities need to contain for me to be able to accomplish this goal? You may have to be quite specific because I am new to code. I have about 10,000 identical reports that I need to process so help is going to be greatly appreciated!

I am also going to be creating may future automations so hope to learn quickly! I also only have 50 days left on my trial to prove this is worth the investment to my company and need to learn to write these codes before then…

Hi @chris5163 ,

We could Check the Tables Count present in the Dataset using the Expression :

dataSet.Tables.Count

Moreover, I believe you are Merging all the Tables that were found during the Extraction together,

In that case, We could Loop through the Dataset.Tables using a For Each Loop, and use the Merge Datatable inside the Loop to Merge the tables one by one to the Output datatable.

Also, Could we look at the expanded For Each Before the Merge Datatable ?

1 Like

This For Each was a test that was put in originally that really just shows me what is being pulled in the output window but I have not deleted it since I wanted to see how far it made it through before crashing.

6

You are correct this is creating multiple tables and then merging them together at the end then writing them to the row. I am not sure that this is necessarily the cleanest way to do it but was the best youtube video I found to generally do what I am trying to do.

What you are saying above makes sense to me but I will have to begin looking up how to make a loop to merge together.

Your help is greatly appreciated.

Here is the top half of the process from the original post with everything except the next process below this one is the original screenshot if this helps for context.

@chris5163 , You Can Check the Below Image on Implementing the Loop :

But I would think we would also need to consider the availability of Simple fields as well, so we would first Require to Filter the Datasets with Only the Tables, then use it in the loop to add to Output Datatable.

Currently, Do we have the extracted Data exported to an Excel file ?

If so, Could you provide the Sheetnames of that Excel ?

Yes, the data is working and comes over to excel when the tables contain data. I have not renamed the sheets for right now it is set to Sheet1. I am calling the file with a variable strExcelPath.

The excel file columns generated are :

Date | Item | Quantity | Unit | Location

Hopefully this answered your question?

@chris5163 Could you Implement the Looping Logic as Suggested and Let us know if you are able to handle the Empty Table data values as well.

I am guessing I have not understood everything you are suggesting but I received the error within the look at the top merge data tables when I inserted the For Each. Maybe I am using the wrong For Each as well since I do not have the MaxIterations in my properties?

@chris5163 Change the Source of the Merge Datatable value to item, as item will contain each table from Dataset.Tables

This has allowed the process to move forward and processed the forms and added a blank line with the date for the table that was empty!

However, it is now writing everything to the excel file twice. It is also adding an extra line in the excel file for each report that have data in the table that contains the date and then the work “table” then the next row below it processes the table correctly, just twice.

Any thoughts on what I should delete or rewrite to remove the doubling and or the “table” row?

Thank you so much! I really appreciate your help!!!

@chris5163 Before the Merge Datatable, Could You use a Write Line Activity and use item.TableName as it’s value.

Also, Could you Provide us Screenshots of the Current Output if possible ?

I added the Write Line Activity before merge data table in the loop with item.TableName and it resulted in the same output.

I think I see where the value “Table” is coming from. In the document template the table was “payments” and it is adding a column and putting in “Table” as the result in the spreadsheet.

@chris5163 ,Could you Show us the Output Panel as to what values are being Printed ?

The Payments is a Separate Table I believe and we would require to Separate that Table with the Other Table.

If that’s the case, we would require two Output Datatables to store the data from two different tables.

We were not able to look at your documents, Hence, In this way we were trying to Understand the Document Structure.

So what is being pulled out is simply “Date” as text and then a table called “Payments” with columns named Date | Item | Quantity | Unit | Location.

Below is the output from the run:

11

@chris5163 We do see from the Output, that there are Simple Fields as well, Along with Payments Table.

We may want to Separate the Simple Fields and Payments into Different datatables.

For Keeping only the Payments Table only, in the earlier mentioned For Each value, we need to modify the value from dataSet.Tables to the value below :

dataSet.Tables.Cast(Of Datatable).where(Function(x)x.TableName.ToString.ToUpper.Equals("PAYMENTS"))

Also Let us know, How do you want the Output to appear.

From the Above Output, Do we have all the Extracted Data ? Else Provide us with the Expected Data and we can work on it.

I believe you do have an idea about the Extractio and Export Process now. Do let us know if you require further assistance.

This solved my issue and everything is coming over to the spreadsheet exactly like I need it!

Thank you so much for all of your help today!

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.