Read PDF Text changes table order in the last PDF page?

jcab · June 19, 2018, 11:36am

Hi,

I’m using the “Read PDF Text” activity to extract the text of a 6-page PDF file into a string.
When I print this string with Write Line, I notice that the table order on the last PDF page is changed: the last page has Table1 at the top and Table2 at the bottom, but the extracted string shows the text belonging to Table2 first and then Table1.

Does anyone know what is happening here?

Thanks in advance.

Bharat_Kumar · June 19, 2018, 12:05pm

This sometimes happens because the PDF you are using exposes its data in that order.

You can check this by copying all of the text in the PDF and pasting it into Notepad. It should paste in the same order.

Check whether this happens for all of your PDFs. If it does, you can build your logic around the format that the Read PDF Text activity gives you.
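
As a rough illustration of building logic around the extracted order, here is a minimal Python sketch (not UiPath code) that puts known sections back into the order they appear on the page. It assumes each table starts with a unique heading string that survives extraction. The function name `reorder_sections` and the headings `Table1` and `Table2` are hypothetical placeholders, not anything from the activity itself.

```python
def reorder_sections(text, headings):
    """Rearrange the sections of `text` so they follow the order in `headings`.

    Assumption: every heading occurs exactly once in the text and no heading
    is a substring of another (e.g. "Table1" vs "Table10").
    """
    # Find where each heading starts in the extracted text.
    positions = []
    for heading in headings:
        index = text.find(heading)
        if index == -1:
            # A heading is missing, so leave the text untouched.
            return text
        positions.append((index, heading))

    # Sort by position to learn the order the extractor produced.
    positions.sort()

    # Everything before the first heading stays where it is.
    prefix = text[:positions[0][0]]

    # Each section runs from its heading to the next heading (or end of text).
    sections = {}
    for i, (start, heading) in enumerate(positions):
        end = positions[i + 1][0] if i + 1 < len(positions) else len(text)
        sections[heading] = text[start:end]

    # Reassemble the sections in the expected on-page order.
    return prefix + "".join(sections[heading] for heading in headings)


extracted = "Page 6\nTable2\nc d\nTable1\na b\n"
fixed = reorder_sections(extracted, ["Table1", "Table2"])
```

This only helps if the swapped order is consistent and the headings are reliable. If the order varies from file to file, the same function still works, because it looks up each heading's position rather than assuming a fixed swap.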

jcab · June 19, 2018, 1:07pm

@Bharat_Kumar, thanks for your reply.

Copying and pasting the PDF text directly into Notepad gives a completely different result from the “Read PDF Text” activity, so the two are not comparable.
I really don’t understand what is happening on the last page, or whether this will always occur.
