How to extract the table names from a PDF

Sample Data1

SlNo DataDetails Nos Amount Reviews
1 example1 1 2500 4
2 example2 2 3200 2
3 example3 1 1200 1
4 example4 1 2300 4

Sample Data2
SlNo DataDetails Nos Amount Reviews
1 example11 1 3500 4
2 example12 2 4200 2
3 example13 1 5200 1
4 example14 1 7300 4
Sample Data1 and Sample Data2 are the names of the tables in my PDF. If I have 5-7 tables in total in my PDF, how can I extract the table names (Sample Data1, etc.) from it? Can anyone please help me solve this issue?

Try Table Extraction; it is available in the Modern Design Experience.

Hi @Chippy_Kolot, I hope you are doing well.
For your use case: unfortunately, as of today there is no activity in UiPath that reads tables directly from PDFs. That is the bad news. The good news is that you can still get at the contents of the PDF: either read the data as flat text directly with UiPath.PDF.Activities.ReadPDFText, or use OCR.

You can proceed as follows:

  1. Extract the text from the PDF document with UiPath.PDF.Activities.ReadPDFText.
  2. Create an array whose elements are the lines of the document (split on Environment.NewLine with the option StringSplitOptions.RemoveEmptyEntries).
  3. Go through the lines in a loop (ForEach) until the table header is found (StartsWith, Contains, etc.).
  4. Each following row belongs to the table as long as it contains a tab; otherwise the table has ended.
  5. Split the current row by tab and store the result in an array: the elements of the array are the individual cells of the row.
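
The steps above can be sketched in a few lines. This is only an illustration in Python, not a UiPath workflow: the function name extract_tables, the header_prefix parameter and the sample text are made up for the example. It assumes the flat text has already been read from the PDF, that cells are separated by tabs, that every table header starts with "SlNo", and that the table name is the non-empty line directly above the header (that last point is an assumption, since the steps above only cover the rows).

```python
def extract_tables(text, header_prefix="SlNo"):
    """Return a list of (table_name, rows) tuples found in flat PDF text."""
    # Step 2: split the text into lines and drop the empty ones.
    lines = [line for line in text.splitlines() if line.strip()]
    tables = []
    i = 0
    while i < len(lines):
        # Step 3: look for a table header line.
        if lines[i].strip().startswith(header_prefix):
            # Assumption: the table name is the non-empty line just above the header.
            name = lines[i - 1].strip() if i > 0 else ""
            rows = []
            i += 1
            # Step 4: following lines belong to the table while they contain a tab.
            while i < len(lines) and "\t" in lines[i]:
                # Step 5: split the row by tab into its individual cells.
                rows.append(lines[i].split("\t"))
                i += 1
            tables.append((name, rows))
        else:
            i += 1
    return tables


# Hypothetical flat text, tab-separated, shaped like the sample in the question.
sample = (
    "Sample Data1\n"
    "\n"
    "SlNo\tDataDetails\tNos\tAmount\tReviews\n"
    "1\texample1\t1\t2500\t4\n"
    "2\texample2\t2\t3200\t2\n"
    "Sample Data2\n"
    "SlNo\tDataDetails\tNos\tAmount\tReviews\n"
    "1\texample11\t1\t3500\t4\n"
    "2\texample12\t2\t4200\t2\n"
)

tables = extract_tables(sample)
for name, rows in tables:
    print(name, len(rows))
```

If the text read from your PDF separates cells with spaces rather than tabs, the tab checks in steps 4 and 5 would need to be adapted, for example by splitting on whitespace and checking the number of cells.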

I hope this idea helps.

Thanks & Regards,
Shubham Dutta
