How to get specific text from a PDF

Hello all,
I am trying to get a particular piece of text (QuoteNumber) from a PDF in a folder called ‘Data’ in my current directory. I am using Anchor Base (GetDatafromPDF.xaml, 6.8 KB) to get the text, but I am not getting the desired output. Kindly help me out.

Below is an image of the PDF for reference.


Thanks in advance.

Hi @BXP

Use the Read PDF With OCR activity and store its output in a String variable.

Then use Is Match with the input set to strInput and the pattern set to (?<=QUOTE NUMBER:).* so that only that value is matched.
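
A minimal sketch of what that lookbehind pattern does, written in Python rather than as an UiPath activity so it can be run on its own. The sample text and the variable names are assumptions made up for illustration; they do not come from the actual PDF. It also shows the difference between a yes/no match check and reading the matched text itself.

```python
import re

# Hypothetical sample of text as a PDF reader might return it (not from the thread).
sample_text = "QUOTE NUMBER: Q-12345\nCUSTOMER: Example Ltd"

# Lookbehind: match whatever follows "QUOTE NUMBER:" up to the end of that line.
pattern = re.compile(r"(?<=QUOTE NUMBER:).*")

match = pattern.search(sample_text)

# A yes/no check only tells you whether the pattern was found.
is_match = match is not None

# The matched text itself is what holds the quote number.
quote_number = match.group(0).strip() if match else ""

print(is_match, quote_number)
```

The `.strip()` call removes the space left after the colon. In .NET the matched text is available through the `Value` property of `Regex.Match`, and `Trim()` plays the same role as `strip()`.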

Thanks
Ashwin S

Hi @BXP,
It's tough to find this with Anchor Base; it may not be the right approach.
1. Use the Read PDF With OCR or Read PDF Text activity from the PDF activities package (install it through Manage Packages).
2. Then extract the required text from the output string.
Cheers.
If you find it useful, mark it as the solution and close the thread.
Any queries, ping me bro…
Vashisht.

Hello @AshwinS2,
Thanks for your suggestion. I am working on this: once the Is Match condition is satisfied I get a Boolean result, but how do I get the quote number next to it?

Regards,
BXP

Hi @BXP,
You can use the Split function.
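
A rough sketch of the split idea in Python, assuming the label and the value sit on the same line of the extracted text. The helper name `value_after_label` and the sample text are hypothetical and only serve the illustration.

```python
def value_after_label(text: str, label: str) -> str:
    """Return the rest of the line that follows the first occurrence of label."""
    parts = text.split(label, 1)   # split once, at the first occurrence only
    if len(parts) < 2:
        return ""                  # label not found
    lines = parts[1].splitlines()  # keep only the remainder of that line
    return lines[0].strip() if lines else ""


# Hypothetical extracted text spanning several lines.
sample_text = "Page 1\nQUOTE NUMBER: Q-12345\nDATE: 01/01/2020\nPage 2"

quote_number = value_after_label(sample_text, "QUOTE NUMBER:")
missing = value_after_label(sample_text, "ORDER NUMBER:")
print(quote_number)
```

Splitting on the label first and on line breaks second means the number of pages or lines in the document does not matter, as long as the label text itself is stable.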

@BXP,

I have created a workflow based on your requirement.
Please unzip the folder and check whether it solves your problem.

Thanks,
Mohanraj.S
Split.zip (268.1 KB)

Hello @Vashisht @Mohansadaiyapillai,

My PDF document has two pages with multiple lines, so it's hard to use the Split function.

Thanks,
BXP

Can you share the PDF?

Hey,

I don't have permission to share the PDF, but I can send you the .txt file extracted from it.

Regards,
BXP

Hi @BXP

Try ABBYY FlexiCapture if your PDFs come in multiple formats.

Cheers

Happy learning

Yeah, please share the .txt file.

Hello @Vashisht ,

The issue is resolved. I used the Read PDF Text activity, stored its output as a String, and then used a regex to get that particular text from the PDF text.
In an Assign activity:

QuoteNo = System.Text.RegularExpressions.Regex.Match(PdfData, "(?<=QUOTE NUMBER:).*(?=DATE:)").Value

Then I got my quote number.
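
For anyone who wants to try the same pattern outside UiPath, here is a small Python sketch. The two sample strings and the `extract` helper are assumptions for illustration only. It also covers one caveat: `.` does not match a line break by default, so the pattern as posted only works when QUOTE NUMBER: and DATE: end up on the same line of the extracted text.

```python
import re

# Hypothetical samples: label and value on one line, and spread over several lines.
SAME_LINE = "QUOTE NUMBER: Q-12345 DATE: 01/01/2020"
SPLIT_LINES = "QUOTE NUMBER:\nQ-12345\nDATE: 01/01/2020"

GREEDY = r"(?<=QUOTE NUMBER:).*(?=DATE:)"   # the pattern from the post above
LAZY = r"(?<=QUOTE NUMBER:).*?(?=DATE:)"    # stops at the first DATE: it reaches


def extract(pattern: str, text: str, flags: int = 0) -> str:
    """Return the trimmed text matched by pattern, or an empty string."""
    m = re.search(pattern, text, flags)
    return m.group(0).strip() if m else ""


same_line_result = extract(GREEDY, SAME_LINE)
# Without DOTALL, "." cannot cross the line break, so nothing is found.
no_dotall_result = extract(GREEDY, SPLIT_LINES)
# With DOTALL, "." also matches line breaks; the lazy quantifier limits the span.
dotall_result = extract(LAZY, SPLIT_LINES, re.DOTALL)

print(same_line_result, repr(no_dotall_result), dotall_result)
```

In .NET the counterpart of `re.DOTALL` is `RegexOptions.Singleline`, which can be passed as a third argument to `Regex.Match`; whether it is needed depends on how the PDF text is laid out after extraction.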

Regards,
BP
