First of all, I have just started my RPA journey and have no previous programming experience. That said, I find it very interesting and hope to be able to contribute to this community some day. I would be forever grateful if someone could help me out with my problem.
I am trying to create the following process:
I have a folder with a bunch of documents (“Data/Input”). Some are .pdf and others are .doc/.docx. I want to be able to read all of these documents and search for some specified keywords. If there is a match in the document for the specified keyword(s), I want to save the file into another folder.
To make it easy to change the keywords, I would prefer to store them in a Config file.
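For readers following along, here is a minimal sketch of the keyword idea in Python rather than in UiPath activities. The JSON layout, the `keywords` key, and the function names are assumptions made for illustration; the original post does not say what format the Config file uses. It only shows the logic: load the keywords once, then check each document's text against them without regard to case.

```python
import json

# Hypothetical config content; in practice this would live in a file such as Config.json.
CONFIG_TEXT = '{"keywords": ["invoice", "contract"]}'


def load_keywords(config_text):
    """Return the keyword list from JSON config text, lowercased, skipping blanks."""
    config = json.loads(config_text)
    return [k.strip().lower() for k in config.get("keywords", []) if k.strip()]


def matching_keywords(document_text, keywords):
    """Return the keywords that occur in the document text (case-insensitive)."""
    lowered = document_text.lower()
    return [k for k in keywords if k in lowered]


keywords = load_keywords(CONFIG_TEXT)
print(matching_keywords("Please find the signed Contract attached.", keywords))
```

If the returned list is non-empty, the document would be a candidate for copying to the other folder.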
So far I have been able to get the files, but as there are different activities for reading PDF and .doc/.docx files, I tried to check whether the path string contained ".pdf" or not, and based on that tried to use the correct activity. For some reason, it thinks that none of the files contain ".pdf". Furthermore, I am struggling to find out how to proceed based on this information - how can I select the correct files to read (PDF/Word)?
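The workflow itself is not shown, so the cause of the failed ".pdf" check can only be guessed at. Two common culprits are a case-sensitive comparison (a file named Report.PDF does not contain ".pdf") and testing something other than the file path. The sketch below, in Python and with a hypothetical helper name `file_kind`, shows a check on the actual extension that ignores case. In a .NET expression, `Path.GetExtension` combined with `ToLower` plays the same role.

```python
import os


def file_kind(path):
    """Classify a file as 'pdf', 'word', or 'other' by its extension, ignoring case."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".pdf":
        return "pdf"
    if extension in (".doc", ".docx"):
        return "word"
    return "other"


for example in ("Data/Input/Report.PDF", "Data/Input/notes.docx", "Data/Input/readme.txt"):
    print(example, "->", file_kind(example))
```

Comparing the extension instead of searching the whole string also avoids false matches on names such as my.pdf.backup.txt.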
Again, if someone could help me out, that would be really great.
Could you (or anybody else) by any chance show me how to choose the different files? Let's say that an item in the string array "FullPath" contains .pdf. How do I choose this specific file in the loop? I prefer not to use OCR, but rather the Read PDF Text activity. I have the same question regarding the reading of the Word documents.
You can write Directory.GetFiles("DirectoryPath", "*.pdf"). This fetches only the files with that specific extension. While you are in a loop and have a condition for a specific file, all the operations inside that condition work on that file only.
You can handle the different file types within the same loop.
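To illustrate handling both file types in a single loop, here is a rough Python sketch (not UiPath activities). The two reader functions are placeholders standing in for whatever actually reads a PDF or a Word document, and all names are made up for the example. The point is only the shape of the loop: get every file once, look at each item's extension, and send the item to the matching reader.

```python
import os
import tempfile


def read_pdf_text(path):
    # Placeholder: stands in for the step that reads text from a PDF.
    return "pdf text from " + os.path.basename(path)


def read_word_text(path):
    # Placeholder: stands in for the step that reads text from a Word document.
    return "word text from " + os.path.basename(path)


# Map each supported extension to the reader that handles it.
READERS = {".pdf": read_pdf_text, ".doc": read_word_text, ".docx": read_word_text}


def read_all(folder):
    """Read every supported file in the folder, choosing the reader per item."""
    results = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        reader = READERS.get(os.path.splitext(name)[1].lower())
        if reader is None or not os.path.isfile(path):
            continue  # skip unsupported types and subfolders
        results[name] = reader(path)
    return results


# Demo with a temporary folder holding three empty files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.pdf", "b.docx", "c.txt"):
        open(os.path.join(folder, name), "w").close()
    demo = read_all(folder)
print(demo)
```

Inside the loop, the current item is already the full path of one file, so it is the only thing the reader needs.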
So it's not possible to get all the files first, and then read them accordingly? You first have to get only the .pdf files, and so on? I tried changing my workflow based on your first suggestion, but as you can see I am not able to run it.
No, no, that was just an example for when you want a specific file type. If you don't pass that parameter, you will get all the files in the folder.
You already have the full path of the file when you are inside the For Each loop, so you do not need to call Directory.GetFiles again in that loop (assuming the folder contains only PDF and Word files). You can simply pass the item itself.
Aha - that makes a lot of sense. Sometimes it's really easy to make things more complicated than necessary. Thank you so much!
I have one more question though: in the Copy File activity I tried putting in the location of the folder I want to copy the file to, but for some reason I have not done it correctly. Ideally it would go to the output folder Data\Output.
Looking closer at it, though, it might not actually be the output location that is wrong but rather the input item? Let me know if you want me to post an updated workflow.
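Without the updated workflow this is only a guess, but one thing worth checking is whether the destination is just the folder or a full path that includes the file name. Depending on the activity and its version, a copy step may expect the latter. The Python sketch below (hypothetical function name, temporary folders instead of the real Data folders) builds the destination from the output folder plus the file name of the current item. In a .NET expression, `Path.Combine` and `Path.GetFileName` do the same job.

```python
import os
import shutil
import tempfile


def copy_to_output(source_path, output_folder):
    """Copy a file into output_folder, keeping its file name; return the new path."""
    os.makedirs(output_folder, exist_ok=True)
    destination = os.path.join(output_folder, os.path.basename(source_path))
    shutil.copy2(source_path, destination)
    return destination


# Demo in a temporary folder so nothing real is touched.
with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "Input", "report.pdf")
    os.makedirs(os.path.dirname(source))
    with open(source, "w") as handle:
        handle.write("dummy content")
    copied = copy_to_output(source, os.path.join(root, "Output"))
    copied_name = os.path.basename(copied)
    copied_exists = os.path.isfile(copied)
print(copied_name, copied_exists)
```

The source is the loop item itself (the full path of the matched file), and only the destination needs to be assembled.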