How do I read a PDF file in StudioX and extract the data to an Excel file?

In StudioX I need to read a PDF file that contains a table and extract it to Excel. How is this done in StudioX? I’ve been able to use the ‘Read PDF Text’ and ‘Write PDF Text’ activities, but I have been unsuccessful in writing the PDF data to Excel. I have StudioX 2020.4.1 installed.

At this time, the way to work with PDF files in StudioX is to open the file in a PDF reader such as Acrobat and use the App/Web UI Automation activities to extract the data. To open the file you need, provide its name as an argument on the “Use Application” activity.
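
For readers who want to see the "file name as an argument" idea outside StudioX, here is a minimal Python sketch. It is not how the “Use Application” activity is implemented; it only illustrates launching a reader executable with the PDF path passed as its argument. The names `build_open_command`, `open_pdf`, `Acrobat.exe` and `invoice.pdf` are hypothetical and chosen for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def build_open_command(reader_exe: str, pdf_path: str) -> List[str]:
    """Build the command line: reader executable followed by the PDF path."""
    # Use an absolute path so the reader finds the file regardless of
    # the current working directory.
    return [reader_exe, str(Path(pdf_path).resolve())]


def open_pdf(reader_exe: str, pdf_path: str) -> subprocess.Popen:
    """Start the reader with the PDF as its argument (not called here)."""
    return subprocess.Popen(build_open_command(reader_exe, pdf_path))


# Hypothetical usage; the reader name and file name are assumptions:
# open_pdf("Acrobat.exe", "invoice.pdf")
```

Passing the file as an argument means the reader starts with the document already open, so the UI automation steps that follow can begin extracting data immediately.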

Hi Andrew, thank you. That worked.

Is it possible to do the same with a PNG file? For example, the video by Alex Sanz (Reboot Skills Course, Week 1, topic What is RPA) shows an automation that extracts data from a scanned document and records it in a Word file; the data could presumably be recorded in Excel as well. Could this be done with StudioX, and if so, how?
Thank you.

Extracting text from an image requires OCR activities that haven’t yet been designed and optimized for use with StudioX. So at this time, if you need to extract text from an image, you will likely need to add those activities using the Studio profile. If you generally prefer StudioX, you can create the project in StudioX, switch to the Studio profile to add these capabilities and then open it again with the StudioX profile and continue editing it there. See the Get OCR Text documentation.

Thank you very much for the answer!
Excellent recommendation. I will read the documentation and do an exercise to learn.
Thank you.
