Read PDF Text changes table order on the last page of the PDF?

pdf

#1

Hi,

I’m using the “Read PDF Text” activity to extract the text of a 6-page PDF file into a string.
When I print this string using Write Line, I notice that the table order on the last page of the PDF is changed in the string: say the last page has Table1 at the top and Table2 at the bottom; the extracted string shows the text belonging to Table2 first, and then Table1.

Does anyone know what is happening here?

Thanks in advance.


#2

This sometimes happens because the PDF you are using exposes its data in that order internally.

You can check by copying all of the content of the PDF and pasting it into Notepad. It should paste the data in the same order.

Check whether this happens for all of your PDFs. If it does, you can build your logic around the format that the Read PDF Text activity gives you.
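
As a rough sketch of what "building your logic around that format" could look like: assuming each table can be recognized by a fixed heading string in the extracted text, you could put the sections back in the expected order after reading. The markers `Table1` and `Table2` and the function name `reorder_sections` are hypothetical placeholders, not part of the activity, and the example is in Python purely for illustration.

```python
from typing import Sequence


def reorder_sections(text: str, markers: Sequence[str]) -> str:
    """Arrange the sections introduced by `markers` in the order given.

    A section runs from its marker to the start of the next marker found
    in the text, or to the end of the text. Anything before the first
    marker stays in front. If any marker is missing, the text is returned
    unchanged.
    """
    positions = {m: text.find(m) for m in markers}
    if any(pos < 0 for pos in positions.values()):
        return text  # cannot identify every section; leave as extracted

    # Sorted start offsets tell us where each section ends.
    starts = sorted(positions.values())
    bounds = starts + [len(text)]

    sections = {}
    for marker, pos in positions.items():
        end = bounds[starts.index(pos) + 1]
        section = text[pos:end]
        # Make sure a section moved to the middle still ends its line.
        if not section.endswith("\n"):
            section += "\n"
        sections[marker] = section

    prefix = text[:starts[0]]
    return prefix + "".join(sections[m] for m in markers)


# Hypothetical extracted text in which Table2 comes before Table1.
extracted = "Page 6\nTable2\nc d\nTable1\na b\n"
fixed = reorder_sections(extracted, ["Table1", "Table2"])
```

This only works if the headings are stable across your PDFs and each appears once, so it is worth checking a few files first.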


#3

@Bharat_Kumar, thanks for your reply.

Copying and pasting the PDF text directly into Notepad gives a completely different result from using the “Read PDF Text” activity, so the two are not comparable.
I really don’t understand what is happening on the last page, or whether this will always occur.