Read PDF from sub folder

I have multiple PDFs in multiple subfolders.
I want to read each PDF with OCR and extract its text. Each subfolder also contains an Excel file, and I want to write the extracted data to a particular sheet of the Excel file in that subfolder.

Please let me know the flow.

Regards

@Satyam_Shrivastava

First, use For Each Folder in Folder.

Inside it, use For Each File in Folder with a pdf filter so that you get only the PDF files.

To get the Excel file, run For Each File in Folder once with xls as the filter. Then you can read from it and write to it.
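
If it helps to see the loop structure outside UiPath, the flow above can be sketched in plain C#. This is only a minimal sketch under assumptions: `SubfolderJob`, `SubfolderPlanner`, and `Plan` are hypothetical names, and it assumes each subfolder has one workbook of interest (the first `.xls`, `.xlsx`, or `.xlsm` file in name order is taken). It only collects paths; the OCR and Excel steps are left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper type, not part of UiPath or any library.
public sealed class SubfolderJob
{
    public string Folder { get; set; }
    public List<string> PdfFiles { get; set; }
    public string ExcelFile { get; set; } // null when the subfolder has no workbook
}

public static class SubfolderPlanner
{
    // Assumption: these are the workbook extensions of interest.
    static readonly string[] ExcelExtensions = { ".xls", ".xlsx", ".xlsm" };

    // Case-insensitive extension check, so "A.PDF" and "a.pdf" are treated alike.
    static bool HasExtension(string path, params string[] extensions)
    {
        return extensions.Contains(Path.GetExtension(path), StringComparer.OrdinalIgnoreCase);
    }

    // Builds one job per immediate subfolder of rootFolder.
    public static List<SubfolderJob> Plan(string rootFolder)
    {
        var jobs = new List<SubfolderJob>();
        var folders = Directory.GetDirectories(rootFolder)
            .OrderBy(f => f, StringComparer.OrdinalIgnoreCase);

        foreach (var folder in folders)
        {
            // Files directly inside this subfolder, in a stable name order
            var files = Directory.GetFiles(folder)
                .OrderBy(f => f, StringComparer.OrdinalIgnoreCase)
                .ToList();

            jobs.Add(new SubfolderJob
            {
                Folder = folder,
                PdfFiles = files.Where(f => HasExtension(f, ".pdf")).ToList(),
                ExcelFile = files.FirstOrDefault(f => HasExtension(f, ExcelExtensions))
            });
        }
        return jobs;
    }
}
```

In UiPath terms, the outer loop plays the role of For Each Folder in Folder, and the two filters play the role of For Each File in Folder with a pdf or xls filter.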

Cheers

Hi @Anil_G,
thanks for your reply.
I am attaching my workflow here.
Please let me know where I have made a mistake.
Main.xaml (29.1 KB)

Regards,

I have multiple PDF files in each folder.
I am merging all the PDFs of each subfolder into one PDF in that subfolder. Then I want to read the merged PDF of each subfolder and write to the Excel file of that subfolder.
Then I want to run a VBA macro on each Excel file.

@Satyam_Shrivastava

check the reference

@Satyam_Shrivastava the code below should help you solve your issue.


    using System;
    using System.IO;

    static class PdfFolderReader
    {
        static void ReadPdfs(string folderPath)
        {
            if (!Directory.Exists(folderPath))
            {
                throw new ArgumentException("Folder path does not exist.", nameof(folderPath));
            }

            // Get all PDF files in the folder and its subfolders, recursively
            var pdfFiles = Directory.EnumerateFiles(folderPath, "*.pdf", SearchOption.AllDirectories);

            foreach (var filePath in pdfFiles)
            {
                // This example only prints the file name, for simplicity
                Console.WriteLine($"Reading PDF: {Path.GetFileName(filePath)}");

                // Use a library like iTextSharp or PDFSharp (or your preferred
                // PDF reading library) here to process the content
                // ...
            }
        }
    }

Hi @Satyam_Shrivastava, to read multiple files from a folder, create a String variable and assign the folder path to it. Then create a variable of type Array of String and, in its value field, write Directory.GetFiles(folderPath), passing the String variable that holds the folder path. This gives you access to all the files in that folder. To loop through the files, use For Each, read each PDF with OCR, and use get text with OCR to extract the data. You can access the Excel file in each subfolder the same way. If the folder contains Excel or other files as well and you want only the PDFs, use Directory.GetFiles(folderPath, "*.pdf").
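
A small sketch of the filtered call, in case it is useful. `PdfLister` and `ListPdfs` are hypothetical names. By default, `Directory.GetFiles` looks only at the top level of the folder, so the sketch exposes `SearchOption` as a flag for the subfolder case. The extra extension check is a precaution, assuming you want exactly `.pdf` and nothing with a longer extension.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of UiPath or any library.
public static class PdfLister
{
    public static string[] ListPdfs(string folderPath, bool includeSubfolders)
    {
        // TopDirectoryOnly is what GetFiles(folderPath, "*.pdf") uses by default
        var option = includeSubfolders
            ? SearchOption.AllDirectories
            : SearchOption.TopDirectoryOnly;

        return Directory.GetFiles(folderPath, "*.pdf", option)
            // Keep only files whose extension is exactly ".pdf" (ignoring case)
            .Where(f => string.Equals(Path.GetExtension(f), ".pdf",
                                      StringComparison.OrdinalIgnoreCase))
            .ToArray();
    }
}
```

For the subfolder scenario in this thread, `ListPdfs(folderPath, true)` would return the PDFs of every subfolder in one array, while calling it with `false` on each subfolder keeps them separate.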
Cheers, happy automation!