C# code for converting a scanned PDF into Excel

Hi everyone,

Can anyone provide C# code to convert a scanned PDF into Excel?

Please help me…

Thanks in advance

Hi @bhanusai_sajja

=> Install the UiPath.PDF.Activities dependency from Manage Packages.
Steps: Go to Manage Packages → All Packages → Type UiPath.PDF.Activities → Click Install → Click Save.

=> Use the Read PDF with OCR activity to read the PDF and store the output in a variable, say str_ExtractedText. You can use any OCR engine, such as Tesseract OCR.

=> Use Regular Expressions to extract the required data from the text.

=> After that, you can write the extracted data to Excel using the Write Range Workbook activity.
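For anyone who wants the same steps (OCR, split the text, write to Excel) in plain C# rather than UiPath activities, a minimal sketch might look like the one below. It rests on several assumptions that are not part of this thread: the PDF pages have already been rendered to PNG images (for example with Poppler's `pdftoppm -png -r 300`), the `Tesseract` NuGet package is used for OCR with English trained data in a `./tessdata` folder, and the `ClosedXML` NuGet package is used to write the .xlsx file. The class and method names are made up for illustration, and each OCR line simply goes into column A.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using ClosedXML.Excel;   // NuGet: ClosedXML (assumed choice for writing .xlsx)
using Tesseract;         // NuGet: Tesseract (assumed choice of OCR wrapper)

public static class ScannedPdfToExcel
{
    // Run OCR on one rendered page image and return the recognized text.
    public static string OcrImage(TesseractEngine engine, string imagePath)
    {
        using var image = Pix.LoadFromFile(imagePath);
        using var page = engine.Process(image);
        return page.GetText();
    }

    // Split OCR text into trimmed, non-empty lines (one future Excel row each).
    public static List<string> SplitIntoRows(string ocrText)
    {
        var rows = new List<string>();
        foreach (var line in Regex.Split(ocrText ?? string.Empty, @"\r?\n"))
        {
            var trimmed = line.Trim();
            if (trimmed.Length > 0)
            {
                rows.Add(trimmed);
            }
        }
        return rows;
    }

    public static void Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.Error.WriteLine("usage: <output.xlsx> <page1.png> [page2.png ...]");
            return;
        }

        string outputPath = args[0];

        // "./tessdata" and "eng" are assumptions; point these at your own trained data.
        using var engine = new TesseractEngine("./tessdata", "eng", EngineMode.Default);
        using var workbook = new XLWorkbook();
        var sheet = workbook.Worksheets.Add("ExtractedText");

        int row = 1;
        for (int i = 1; i < args.Length; i++)
        {
            // Pages are processed one at a time, in the order given on the command line.
            foreach (var line in SplitIntoRows(OcrImage(engine, args[i])))
            {
                sheet.Cell(row, 1).Value = line;
                row++;
            }
        }

        workbook.SaveAs(outputPath);
    }
}
```

If only specific fields are needed instead of all the text, the line splitting in `SplitIntoRows` is the place to swap in a more targeted regular expression.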

If possible, share the PDF so that I can help you create a flow in C#.

Regards

Hi @vrdabberu,

Here is the sample document. Please help me convert it to Excel using C#.
image-based-pdf-sample.pdf (224.0 KB)

Hi @bhanusai_sajja

Can you specify what has to be written as output in Excel?

Regards

@vrdabberu,

I want to extract all the text into Excel; you can ignore any images.

Hi @bhanusai_sajja

Check the zip file below. Let me know if you have any questions.
BlankProcess8.zip (313.9 KB)

You can see the output in Excel. Before running the flow, delete the existing Excel file.

Regards

@vrdabberu,

I need C# code; I want to do this in Visual Studio using C#.