1-- data – “07-December-2021” after (Purchase Order Date)
2-- “ATPL/PO.NO.039/1112” after (Customer Ref No)
3-- “Pasrt Software India Pvt. Ltd” after (Customer Name :)
4-- “HW296-Go-Global for Windows Remote Access Software Concurrent User License”
If you are not using Document Understanding or some other OCR-based approach, you will need to use Read PDF Text followed by a number of regex statements. Note that this option will only work on this document layout. If you run it against other documents that have a different structure, the regex statements will not match.
You can read the PDF and then use regular expressions to fetch the values, but this will only work for similar PDFs where the static fields stay the same (such as Purchase Order Date, Customer Name :…).
I tried this for a few of the fields and the values are coming through properly.
(?<=Purchase Order Date\s+)([\w-]+)
(?<=Customer Ref No\s+)([\w/.]+)
(?<=Customer Name :\s+)(.*)(?= Shipping)
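For anyone who wants to try the same idea outside UiPath, here is a minimal Python sketch. It is not the workflow itself: `SAMPLE_TEXT` is a made-up stand-in for whatever Read PDF Text returns (the real layout may differ), and `extract_fields` is a hypothetical helper name. Python's `re` module does not accept the variable-width lookbehinds that .NET allows, so the sketch matches each label normally and reads the value from a capture group instead.

```python
import re

# Hypothetical stand-in for the text returned by Read PDF Text.
# The real document layout is an assumption here and may differ.
SAMPLE_TEXT = (
    "Purchase Order Date 07-December-2021\n"
    "Customer Ref No ATPL/PO.NO.039/1112\n"
    "Customer Name : Pasrt Software India Pvt. Ltd Shipping Address\n"
)

# Python's re rejects variable-width lookbehinds such as (?<=Label\s+),
# so the label is matched outside the group and the value is group 1.
PATTERNS = {
    "purchase_order_date": r"Purchase Order Date\s+([\w-]+)",
    "customer_ref_no": r"Customer Ref No\s+([\w/.]+)",
    "customer_name": r"Customer Name :\s+(.*?)(?= Shipping)",
}


def extract_fields(text):
    """Return a dict of field name -> captured value (None when the label is missing)."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        result[name] = match.group(1).strip() if match else None
    return result


fields = extract_fields(SAMPLE_TEXT)
print(fields)
```

A missing label gives `None` rather than an error, which makes it easier to spot a PDF whose structure does not match the expected static fields.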
Another way of fetching the goods details is Data Scraping.
Scrape the data and update the selector so that it accepts all the similar PDFs.
Filter the data using the below condition (Column-5 is “Unit Price”) and create a new DataTable.
This DataTable will have only the goods details from all similar invoices, which can then be fetched easily by another workflow (pass this DataTable as an argument and write the logic to fetch all the information).
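The filtering step can be sketched in Python as below. This is only an illustration under assumptions: the scraped rows, the column names other than Column-5, and the price value are made up, and the rule "keep a row when Column-5 parses as a number" is my guess at a workable condition, not necessarily the one used in the workflow.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical scraped rows; real scraping output will have other columns and noise.
# The price below is a made-up value for illustration only.
scraped_rows = [
    {"Column-1": "Sr", "Column-2": "Description", "Column-5": "Unit Price"},
    {
        "Column-1": "1",
        "Column-2": "HW296-Go-Global for Windows Remote Access Software Concurrent User License",
        "Column-5": "1,250.00",
    },
    {"Column-1": "", "Column-2": "Terms and conditions", "Column-5": ""},
]


def is_price(value):
    """True when value parses as a decimal number once thousands separators are removed."""
    try:
        Decimal(value.replace(",", "").strip())
        return True
    except InvalidOperation:
        return False


# Assumed condition: a goods row is one whose Column-5 (Unit Price) holds a number.
goods_rows = [row for row in scraped_rows if is_price(row.get("Column-5", ""))]

for row in goods_rows:
    print(row["Column-1"], row["Column-2"], row["Column-5"])
```

The header row and the free-text rows drop out because their Column-5 value is not numeric, which leaves only the goods lines to pass on to the next workflow.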