I was wondering if anyone knew an answer to my issue here. I generate PDF documents using XML data from my website. I then send them through a service to be digitally signed by individuals. After that, I have a process that reads each signed document with Google OCR, takes the information it extracts, and inputs it into the appropriate locations. Since there is XML data located in the PDF from when it was generated, and I know you can parse XML data out of a PDF (I don’t know how accurately), is there a way to automate the parsing of XML data from a PDF? I am just not a fan of using OCR of any kind because it can be so unpredictable.
I personally do not know how to do it, but was curious.
Would you try the suggestions from these threads and let us know the results?
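If it helps while you test, here is a minimal sketch of one way to pull XML out of a PDF without OCR, using only the Python standard library. It assumes the XML was stored as a PDF stream (for example an attachment, form data, or XMP metadata), either uncompressed or Flate-compressed, and that the signing service left those bytes intact and unencrypted. I can't confirm any of that for your files. The function name `extract_xml_streams` and the sample element names are made up for illustration.

```python
import re
import zlib
import xml.etree.ElementTree as ET

# Grabs the raw bytes between the "stream" and "endstream" keywords.
# This is a heuristic scan, not a real PDF parser.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def extract_xml_streams(pdf_bytes):
    """Return the root element of every stream that parses as XML."""
    roots = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Flate-compressed stream; decompressobj tolerates the
            # end-of-line bytes that sit before "endstream".
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Assumption: anything that is not zlib data is stored as-is.
            data = raw
        text = data.strip()
        if not text.startswith(b"<"):
            continue  # page content, fonts, images, and so on
        try:
            roots.append(ET.fromstring(text))
        except ET.ParseError:
            continue  # looked like markup but is not well-formed XML
    return roots
```

You would call it with the file's bytes, e.g. `extract_xml_streams(open("signed.pdf", "rb").read())` (the file name is hypothetical), and then read fields with something like `roots[0].findtext("name")`. It will miss XML stored inside compressed object streams, encrypted files, or streams using other filters, so a full PDF library is likely the more robust route if this quick check shows the XML is actually there.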