Hi all, is it possible to get 100% accurate results when comparing two PDF files to find the unmatched data?
It depends on the PDF quality. If the quality is good, then you can get 100% accuracy.
To match the data you need to do string manipulation.
There are several factors behind this, so let me go through them one by one:
– First, the file format: both PDFs must be of the same format, either fully native (text-based) or fully image (scanned). If one mixes native and image content, the other must do the same.
– Second, the text alignment: even a slight change in alignment will affect the comparison report.
– Third, the number of pages: if the page counts differ, we won't get an exact result.
– Fourth, tabular data: if the PDFs contain tables, we need to make sure both PDFs have the table at exactly the same position and with a similar structure.
– This applies not only to tables but to text position as well, and it helps when we are trying to get the value of a specific term.
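
To make the string manipulation step concrete, here is a minimal sketch in Python (not the actual workflow). It assumes the text of each PDF has already been extracted into a list with one string per page; the names `normalize` and `unmatched_lines` are made up for illustration. It compares the files page by page and line by line, ignoring extra spaces, and reports the page number and line number of every mismatch.

```python
from itertools import zip_longest


def normalize(line):
    # Collapse runs of whitespace so small spacing shifts are not reported as differences.
    return " ".join(line.split())


def unmatched_lines(pages_a, pages_b):
    """Return (page_no, line_no, text_a, text_b) for every line that differs."""
    diffs = []
    paired_pages = zip_longest(pages_a, pages_b, fillvalue="")
    for page_no, (page_a, page_b) in enumerate(paired_pages, start=1):
        paired_lines = zip_longest(page_a.splitlines(), page_b.splitlines(), fillvalue="")
        for line_no, (a, b) in enumerate(paired_lines, start=1):
            if normalize(a) != normalize(b):
                diffs.append((page_no, line_no, a, b))
    return diffs


# Hypothetical sample data: one string per page.
pages_a = ["Invoice 1001\nTotal: 250.00", "Thank you"]
pages_b = ["Invoice 1001\nTotal:  275.00", "Thank you"]

for page_no, line_no, a, b in unmatched_lines(pages_a, pages_b):
    print(f"page {page_no}, line {line_no}: {a!r} != {b!r}")
```

Because the comparison is positional, an extra line or page in one file will shift everything after it, which is why the page count and layout points above matter.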
Hope this helps.
@Palaniyappan Thank you so much for enlightening me on this. I appreciate it.
Do you have any other queries on this that we can discuss, buddy?
@Palaniyappan Yes, I still need your guidance to fix my workflow, which compares PDF files and finds the unmatched data with line number and page number. If you can help me with this, it would be a big relief for me. Thank you once again.
May I know what issue you are facing?
@Palaniyappan I am able to get only 50-60% of the unmatched data. I am also getting the line number, but the line numbers come out alternating, for example 1, 3, 5, 7. Shall I attach my workflow for your better understanding?
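
Without seeing the workflow this is only a guess, but one possible cause of alternating line numbers is splitting the extracted text on "\r" and "\n" as two separate delimiters. Each Windows-style "\r\n" line break then produces an empty entry between the real lines, so the real lines land on positions 1, 3, 5 and so on. A small Python sketch of the effect, with made-up sample text:

```python
import re

# Hypothetical extracted text with Windows-style line breaks.
text = "alpha\r\nbeta\r\ngamma"

# Splitting on "\r" and "\n" separately leaves an empty entry
# between every pair of real lines.
naive = re.split(r"[\r\n]", text)
naive_numbers = [i for i, line in enumerate(naive, start=1) if line]

# splitlines() treats "\r\n" as a single line break.
clean = text.splitlines()
clean_numbers = [i for i, line in enumerate(clean, start=1) if line]

print(naive_numbers)
print(clean_numbers)
```

If this is what is happening, splitting on the full line break (or dropping empty entries before numbering) should bring the numbering back to 1, 2, 3.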