Hello UiPath community,
I am currently working on a project that involves extracting text data from a dynamic betting website in real time. I have been having difficulty getting the text to render correctly and maintain its structure.
The best solution I have come up with so far is to save the website as a PDF, open it with ABBYY FineReader 16 OCR, and manually crop and align the text before saving it as an Excel file. However, this process is time-consuming and not ideal for a real-time workflow.
I am hoping someone in the community has experience with similar tasks and could provide guidance on how to automate the process using UiPath. Specifically, I would appreciate any advice on how to extract the text data while preserving its structure and how to incorporate it into my Microsoft 365 environment for further analysis.
Thank you in advance for your help!
I don't know the purpose of your task and I haven't looked into it, but have you tried getting the text from that PDF with just the Read PDF Text activity, with the preserve formatting option turned on?
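If the extracted text still loses its layout, one possible fallback is to rebuild the rows from word positions. The following is only a minimal sketch, assuming your extraction step can give you each word with x/y coordinates. The `(text, x, y)` tuples, the `group_into_rows` helper, the `y_tolerance` value and the sample data are all hypothetical and are not part of UiPath or any specific tool.

```python
from typing import List, Tuple

# Hypothetical input shape: (text, x, y) for each extracted word.
Word = Tuple[str, float, float]


def group_into_rows(words: List[Word], y_tolerance: float = 3.0) -> List[List[str]]:
    """Group words into rows by vertical position, then order each row left to right."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: (w[2], w[1])):
        # Stay on the current row while the word is within y_tolerance
        # of the first word of that row; otherwise start a new row.
        if rows and abs(word[2] - rows[-1][0][2]) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Within each row, sort by x so columns come out in reading order.
    return [[text for text, _, _ in sorted(row, key=lambda w: w[1])] for row in rows]


# Made-up sample data, deliberately out of order.
sample = [
    ("Odds", 120.0, 10.5),
    ("Team", 10.0, 10.0),
    ("2.40", 120.0, 30.2),
    ("Lions", 10.0, 30.0),
]
table = group_into_rows(sample)
```

Each inner list is one row, so the result could be written out line by line to a CSV or Excel file. The tolerance would need tuning to the font size of the real page.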
I've tried a lot of things. What's happening is that the site layers its text about 5 layers deep, then places random invisible text and non-text elements over the top of what's displayed, I guess in an attempt to stop the data from being captured. It's data from behind a paywall. I have the ability to change my layout, which helps, as the less I grab at once, the easier it is to pull out with its structure intact. It's part of a data set that I analyze along with a heap of others. Straight up, I usually do all my work with a pen, pad and a ruler, but it quickly gets exhausting, so sorry if I'm asking something seemingly silly. I'm 3 days solid at this, and maybe it's lack of knowledge and sleep. I just see how nicely it's rendered on the site I save it from, and maybe I'm being too much of a perfectionist for my abilities to handle. Either way, thanks for your reply. I will go do that right now in case I have not already done so. Cheers