Word Document Specific Information Extraction

excel

#1

Hi, I have been using UiPath for 2 weeks now and I am trying to figure out whether UiPath can extract specific information (like module name, module code, etc.) from multiple Word documents and save it in Excel. The problem I am having now is that I am using Get Text, Full Text with OCR, and screen scraping to extract the information I need, and this only works on a single Word document. After that, I would like the program to run through all 63 Word documents I have, which contain similar information that I want to extract. I am able to read 1 Word document, but I can't loop through all 63. Please help, I have been trying for 2 full weeks and I am stuck.


#2

Hi there @Jovian_Low,
If you place each Word document inside the same folder, you can certainly loop through each via:

For Each file As String In Directory.GetFiles("PathToFiles")

    ' Perform the necessary functionality for this file

Next

Within the For Each, you can access each file path via:

file.ToString
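To make the loop concrete outside of Studio, here is a minimal sketch of the same iteration written in Python rather than as a UiPath workflow. The function names (`list_word_files`, `process_all`) and the demo file names are made up for illustration; the only assumptions are that the documents are `.docx` files sitting directly in one folder, and that Word's temporary lock files (names starting with `~$`) should be skipped.

```python
import tempfile
from pathlib import Path


def list_word_files(folder):
    """Return the .docx files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".docx"
        and not p.name.startswith("~$")  # skip Word's temporary lock files
    )


def process_all(folder, handle):
    """Call handle(path) once per document and collect the results."""
    return [handle(path) for path in list_word_files(folder)]


# Demo with placeholder files (hypothetical names, not real documents).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.docx", "b.docx", "~$a.docx", "notes.txt"):
        (Path(tmp) / name).write_text("placeholder")
    names = process_all(tmp, lambda p: p.name)

print(names)
```

The point is that the loop itself decides how many files there are, so no counter or "how many documents" input is needed.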

With that said, it would be better to leverage Orchestrator, iterating through each file and adding a queue item, with the appropriate file path.

Then, use the ReFramework, retrieving each queue item, working it and updating it, before getting the next.

Thanks in advance,
Josh


#3

Hi, could you give me a simple example as an elaboration? I do not quite understand how it works, e.g. how you do the For Each file In Directory.GetFiles("PathToFiles") part. What I currently have is a sequence with: an input dialog to enter how many Word documents to extract, a Build Data Table for creating the columns in the Excel spreadsheet, an assign that sets fileList to the folder directory, and a Do While loop containing all the Get Text with OCR and screen scraping activities. I have a variable val with default 0, and I assign val = val + 1 inside the loop; the condition for the Do While is val<CInt(num), where num is the variable holding the input dialog value. After the Do While loop I have Add Data Row and Write CSV. Could you elaborate from here on what you said?


#4

Hi there @Jovian_Low,
Would you be able to provide your source files?

It may be easier to explain :smiley:

Thanks in advance,
Josh


#5

Hi, I may have to upload the file for you on Monday, as it's saved on another PC at my workplace. Sorry, and thanks for replying, sir!


#6

Hi there @Jovian_Low,
That’s not a problem in the slightest!

Have a fantastic weekend.

Thanks once again,
Josh


#7

Hi @Mr_JDavey ,
New users cannot upload any attachments, so I took screenshots of the entire process.

[image 2]
Hopefully this provides enough information for you to understand! What I am trying to do is go through 63 Word documents in 1 folder and get specific information, like module code and module name, from each of them. The documents are not all formatted the same way; some may not have a module code, for example. After extracting the information, I want to save it as an Excel file so that the information from all 63 Word documents is stored in one file. Hopefully you are able to provide some guidance or help! :slight_smile:
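The extraction step described above can be sketched as follows, again in Python rather than as a UiPath workflow. Everything specific here is an assumption for illustration: that each document's text contains lines such as "Module Code: ..." and "Module Name: ...", the two sample texts, and the helper names `extract_fields` and `rows_to_csv`. A field that is absent from a document is written as an empty cell instead of stopping the run.

```python
import csv
import io
import re

# Assumed label format; the real documents may word these differently.
FIELDS = {
    "module_code": re.compile(r"Module\s+Code\s*:\s*(.+)", re.IGNORECASE),
    "module_name": re.compile(r"Module\s+Name\s*:\s*(.+)", re.IGNORECASE),
}


def extract_fields(text):
    """Return one row per document; missing fields become empty strings."""
    row = {}
    for name, pattern in FIELDS.items():
        match = pattern.search(text)
        row[name] = match.group(1).strip() if match else ""
    return row


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELDS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample texts standing in for the text read from two documents.
docs = [
    "Module Code: CS101\nModule Name: Intro to Programming\n",
    "Module Name: Data Structures\n",  # this one has no module code
]
rows = [extract_fields(text) for text in docs]
print(rows_to_csv(rows))
```

Matching on the text of the document, rather than on screen positions, is what lets one set of rules cope with documents whose layout differs.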


#8

Sorry, I can only upload 1 image at a time. Here's the second image:


#9

Third image:


#10

Variable image:


There is 1 variable missing from the image: num (Int32), which stores the value entered in the input dialog.