I need your help. I'm reading a PDF invoice and I need to get some items that are included in a table.
Do you have any idea how I can get them? When I split the text, there are a lot of spaces and it is not working.
Thank you in advance.
Handling PDF data varies on a case-by-case basis. If the invoice has the same format every time, there is likely a way to handle this scenario. Is there a way you can share the text you’ve scraped from the PDF with NPI and proprietary data replaced with dummy data?
In this video, I extract tables from PDFs and write the data to Excel:
0:25 Install PDF Activities
1:10 Read PDF text, Get PDF page count, Extract PDF
5:40 Read PDF with OCR
6:55 Join PDF and Manage PDF passwords
9:30 Extract Images From PDF and Export PDF as Image
12:00 Extract table from PDF, use case 1: replace some spaces with | (one column has multiple words)
24:00 Run the robot to see the result
25:40 Extract table from another PDF, use case 2: the delimiter is two spaces, so a simple split works
31:50 Extract table from a complex PDF, use case 3: unstructured data, so the logic is based on IsUpper and IsLower
40:25 Extract the price value from PDF
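For anyone who wants to try the idea from use case 2 outside of the video, here is a minimal Python sketch (not taken from the video, which uses PDF activities). It assumes the text has already been read from the PDF, that columns are separated by at least two spaces, and that words inside one cell are separated by a single space. The function names `split_row` and `parse_table` and the sample invoice lines are made up for illustration.

```python
import re


def split_row(line: str) -> list[str]:
    """Split one line of extracted text into cells on runs of 2+ spaces."""
    # A single space stays inside a cell, so "Blue widget" is kept together.
    return [cell for cell in re.split(r" {2,}", line.strip()) if cell]


def parse_table(text: str) -> list[list[str]]:
    """Turn multi-line extracted text into a list of rows, skipping blank lines."""
    return [split_row(line) for line in text.splitlines() if line.strip()]


# Dummy data standing in for text scraped from an invoice.
sample = (
    "Item           Qty    Price\n"
    "Blue widget    2      10.50\n"
    "\n"
    "Red gadget     1      7.25\n"
)

rows = parse_table(sample)
for row in rows:
    print(row)
```

Splitting on a run of spaces instead of on every single space is what avoids the many empty entries you get from a plain split. If a column can be empty or two columns are only one space apart in your invoice, this will not be enough and you would need something closer to use case 1 or 3.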