PDF Table extraction

Cristian_Negulescu · February 28, 2021, 7:36pm

Hello Anand,
In this video, I cover 17 use cases for extracting tables from PDF files and writing the data to Excel. It also includes examples with multi-page PDFs:

45:50 File 10 PDF with multiple columns that have multiple lines + multiple pages
1:17:10 File 19 PDF with multiple pages and columns with multiple lines

Code:

github.com/cristinegulescu/startUiPathFromSalesforce/blob/master/PDFdecode.txt

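        'FILE1
        Dim strtmp As String
        ' Take the text from "Number" up to (not including) "Subtotal",
        ' then use "|" as the column separator instead of spaces.
        strtmp = strin.Substring(strin.IndexOf("Number"), strin.IndexOf("Subtotal") - strin.IndexOf("Number")).Trim
        strout = strtmp.Replace(" ", "|")

        ' Take the rest of the line that follows "Subtotal" (8 characters long).
        strtmp = strin.Substring(strin.IndexOf("Subtotal") + 8)
        strpar = strtmp.Substring(0, strtmp.IndexOf(Environment.NewLine)).Trim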


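        'FILE2
        Dim strtmp As String
        Dim strout As String
        ' Start with a header row; data rows are appended below it.
        strout = "Col1|Col2|Col3|Col4"
        ' Keep only the text that starts 11 characters after the beginning of "Vacancies".
        strtmp = strin.Substring(strin.IndexOf("Vacancies") + 11).Trim
        ' Process the remaining text line by line, ignoring empty lines.
        For Each line As String In strtmp.Split(New String() {Environment.NewLine}, StringSplitOptions.RemoveEmptyEntries)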
            If (line.Length > 3) Then
                If (IsNumeric(line(0))) And (line(1) = " ") And (line(2) = " ") Then
                    strout = strout + Environment.NewLine + line.Replace("  ", "").Replace("  ", "|").Trim
                ElseIf (line(0) = "") And (line(1) = " ") And (line(2) = " ") Then
                    strout = strout + line.Replace("  ", "$").Trim()

This file has been truncated.
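
For anyone who wants to try the FILE1 idea outside a workflow, here is a minimal, self-contained sketch of the same marker-slicing technique. It is not taken from the linked file. The module name, the `SliceBetween` helper, and the sample text are hypothetical and only there for illustration. It also guards against a missing marker or a missing line break, which the snippet above assumes are always present.

```vbnet
Imports System

Module MarkerSliceSketch
    ' Hypothetical helper: returns the text from startMarker (inclusive) up to
    ' endMarker (exclusive), or an empty string when either marker is missing.
    Function SliceBetween(text As String, startMarker As String, endMarker As String) As String
        Dim startIdx As Integer = text.IndexOf(startMarker, StringComparison.Ordinal)
        If startIdx < 0 Then Return String.Empty
        Dim endIdx As Integer = text.IndexOf(endMarker, startIdx, StringComparison.Ordinal)
        If endIdx < 0 Then Return String.Empty
        Return text.Substring(startIdx, endIdx - startIdx).Trim()
    End Function

    Sub Main()
        ' Made-up sample text standing in for the text read from a PDF.
        Dim strin As String = "Invoice" & Environment.NewLine &
            "Number Item Price" & Environment.NewLine &
            "1 Widget 10.00" & Environment.NewLine &
            "Subtotal 10.00" & Environment.NewLine &
            "Thank you"

        ' Table body: everything between the two markers, with "|" as separator.
        Dim strout As String = SliceBetween(strin, "Number", "Subtotal").Replace(" ", "|")
        Console.WriteLine(strout)

        ' Subtotal value: the rest of the line that follows the "Subtotal" label.
        Dim strpar As String = String.Empty
        Dim labelIdx As Integer = strin.IndexOf("Subtotal", StringComparison.Ordinal)
        If labelIdx >= 0 Then
            Dim rest As String = strin.Substring(labelIdx + "Subtotal".Length)
            Dim eol As Integer = rest.IndexOf(Environment.NewLine, StringComparison.Ordinal)
            ' Fall back to the whole remainder when no line break follows the label.
            strpar = If(eol >= 0, rest.Substring(0, eol), rest).Trim()
        End If
        Console.WriteLine(strpar)
    End Sub
End Module
```

With this sample text, the first output is the two table rows with `|` between the columns, and the second is `10.00`. Replacing every space with `|` only works when no cell contains a space, so real files with multi-word cells need a different separator rule.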

Thanks,
Cristian Negulescu
