Extract data from PDF: unable to tell which period a value belongs to


I have a sample PDF with two periods of data, one period per column.
pdf.pdf (23.0 KB)

  1. I extract all the text using “read pdf” and get this:

4th Qtr 19 1st Qtr 20

V Sales $

  1. Toy

a) Revenue 189,500
b) Revenue2 200 300

  2. I plan to use RegExp to extract the data for each period.
    For Revenue2, I can easily tell that 4th Qtr 19 = 200 and 1st Qtr 20 = 300.
    But for Revenue, I can’t tell which period 189,500 belongs to.
    I realise that when one period’s value is blank, the extracted text gives no indication of which period the remaining value belongs to. Is there any way to overcome this?
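
To make the problem concrete, here is a minimal Python sketch of the regex approach (not UiPath code; the names `ROW`, `parse_rows` and the `"ambiguous"` key are made up for illustration). It assumes every data row looks like `a) Label value value`, as in the extracted text above, and it can only flag the Revenue row as ambiguous rather than resolve it:

```python
import re

# Text as returned by the "read pdf" step above (blank lines omitted).
EXTRACTED = """\
4th Qtr 19 1st Qtr 20
V Sales $
1. Toy
a) Revenue 189,500
b) Revenue2 200 300
"""

PERIODS = ["4th Qtr 19", "1st Qtr 20"]

# Assumed row shape: "<letter>) <label> <number> [<number> ...]".
ROW = re.compile(
    r"^[a-z]\)\s+(?P<label>\S+)\s+(?P<values>\d[\d,]*(?:\s+\d[\d,]*)*)\s*$"
)


def parse_rows(text, periods):
    """Map each row label to its per-period values, or flag it as ambiguous."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header or section line, not a data row
        values = [int(v.replace(",", "")) for v in match.group("values").split()]
        if len(values) == len(periods):
            # One value per period: positions map to periods unambiguously.
            rows[match.group("label")] = dict(zip(periods, values))
        else:
            # Fewer values than periods: the flat text does not say which is blank.
            rows[match.group("label")] = {"ambiguous": values}
    return rows


result = parse_rows(EXTRACTED, PERIODS)
print(result)
```

Revenue2 maps cleanly to both periods, while Revenue ends up under `"ambiguous"` because nothing in the flattened text records which column the single value came from.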

I also tried to extract specific elements with “Anchor Base”, but it doesn’t work on my PDF files.

Thank you


Can I use the Document_Understanding package to help solve this problem?

Are there any examples of how to use the Document_Understanding package?

Thank you