Hi, I’m having trouble extracting data from a table in a PDF. I tried the suggestions from here, but I have not been able to solve the issue. Need help!
My data is confidential, so I will use a sample file from another thread to explain - https://aws1.discourse-cdn.com/uipath/original/3X/0/8/08d920acd8924b1c5153f06859df13f22f60cb3b.pdf (see the table on page 2). My PDF table is similar, with a few more columns. Using “Read PDF Text”, I get a string with rows separated by newlines and columns separated by spaces. I can split the string on newlines to separate the rows, but the problem comes when I try to separate the columns: there is no pattern I can use to split them. Using the table above, imagine that some driver names have 2 words but others have 3-4, and the car name has no fixed pattern. Imagine another column called “Team” which also has entries of 2-4 words. So I think that to get the data into a DataTable, I need to split the columns in some other way.
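To make the problem concrete, here is a minimal sketch in Python (not my actual UiPath workflow). The two rows and the helper name `split_rows` are made up for illustration; they are not taken from my file or from the sample PDF. It only shows that splitting on spaces gives a different number of pieces per row, so I cannot tell where one column ends and the next begins.

```python
# Hypothetical rows imitating the kind of text "Read PDF Text" returns:
# one table row per line, columns separated only by single spaces.
SAMPLE_TEXT = (
    "1 Lewis Hamilton Mercedes AMG Petronas\n"
    "2 Juan Pablo Montoya Williams\n"
)


def split_rows(text):
    """Split the text into rows by newline, then each row into tokens by whitespace."""
    return [line.split() for line in text.splitlines() if line.strip()]


rows = split_rows(SAMPLE_TEXT)
for tokens in rows:
    # The token count differs from row to row, so token position
    # does not map to a fixed column.
    print(len(tokens), tokens)
```

Both rows have the same three columns (position, driver, team), but the first splits into 6 tokens and the second into 5, and nothing in the tokens themselves marks the column boundaries.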
I also tried having the robot open the PDF and scrape the data, but data scraping gives me weird output. I am thinking of using a loop with Get Text, changing the selector on each iteration to point to a different cell of the PDF. But in UI Explorer, I am not able to find attributes that point to different cells of the PDF table. I don’t know how to explore the PDF structure in more detail.
Please help! Thank you