I have a sample PDF with two periods of data, one period per column.
pdf.pdf (23.0 KB)
- I extract all the text using “Read PDF”, which gives:
4th Qtr 19 1st Qtr 20
V Sales $
a) Revenue 189,500
b) Revenue2 200 300
- I plan to use RegExp to extract the data for each period.
For Revenue2, I can easily tell that 4th Qtr 19 = 200 and 1st Qtr 20 = 300.
But for Revenue, I can’t tell which period 189,500 belongs to.
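To make the problem concrete, here is a minimal sketch in Python (not my actual workflow, just an illustration). The pattern `ROW` and the helper `parse_rows` are names I made up for this example, and the pattern assumes every row looks like `a) Label value [value]`.

```python
import re

# Text as returned by the PDF text extraction; the column positions are lost.
text = """4th Qtr 19 1st Qtr 20
V Sales $
a) Revenue 189,500
b) Revenue2 200 300"""

# Hypothetical pattern: "a) Label" followed by one or two numbers on the same line.
ROW = re.compile(
    r"^[a-z]\)[ \t]+(\w+)[ \t]+([\d,]+)(?:[ \t]+([\d,]+))?$",
    re.MULTILINE,
)

def parse_rows(raw):
    """Return {label: [values found on that row]} for every matching row."""
    rows = {}
    for label, first, second in ROW.findall(raw):
        # An unmatched optional group comes back as an empty string; drop it.
        rows[label] = [v for v in (first, second) if v]
    return rows

rows = parse_rows(text)
for label, values in rows.items():
    if len(values) == 2:
        print(label, "4th Qtr 19 =", values[0], "| 1st Qtr 20 =", values[1])
    else:
        # Only one number survived, so the text alone cannot say which column it came from.
        print(label, "ambiguous:", values[0], "could belong to either period")
```

With two values the order tells me the period, but with a single value the regex has nothing to go on.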
I realise that when the data for one period is blank, we can’t tell which period the remaining value belongs to. Is there any way to overcome this?
I also tried to extract specific elements with “Anchor Base”, but it doesn’t work on my PDF files.