Compare two pdf files using Regex

Can anyone guide me on how to compare two PDF files using Regex?


Could you please share more details, such as what you are trying to compare?

Thank you for your quick response. I have two PDF files and need to compare them and find the differences between them. Here are the sample PDFs along with my workflow: pdfcompareNew.xaml (14.1 KB) pdf-sample.pdf (7.8 KB) pdf-sample1.pdf (90.4 KB)


Use the Read PDF Text activity, which gives you its output as a String. Then you can use string manipulation functions to compare the text from both PDF files.
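As a rough illustration of that idea outside UiPath, here is a minimal Python sketch. It assumes the text of each PDF has already been extracted into a plain string (for example by Read PDF Text); the function names `normalize` and `line_diff` and the labels `first.pdf` / `second.pdf` are made up for this example.

```python
import difflib


def normalize(text: str) -> str:
    """Collapse runs of whitespace so spacing differences are ignored."""
    return " ".join(text.split())


def line_diff(text_a: str, text_b: str) -> list[str]:
    """Return unified-diff lines for two texts (empty list if they match)."""
    # Drop blank lines and normalize spacing before comparing line by line.
    a_lines = [normalize(line) for line in text_a.splitlines() if line.strip()]
    b_lines = [normalize(line) for line in text_b.splitlines() if line.strip()]
    return list(
        difflib.unified_diff(
            a_lines, b_lines, fromfile="first.pdf", tofile="second.pdf", lineterm=""
        )
    )


if __name__ == "__main__":
    # Stand-in strings; in practice these come from the PDF extraction step.
    first = "Invoice 1001\nTotal:   250 USD\n"
    second = "Invoice 1001\nTotal: 275 USD\n"
    for line in line_diff(first, second):
        print(line)
```

Lines starting with `-` appear only in the first text and lines starting with `+` only in the second. In a UiPath workflow the same comparison would be built from string operations on the two output variables.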

Usually regex helps only when we are trying to match a specific term or a set of specific terms in the PDF.
So first we need to read the PDF with either the Read PDF Text or the Read PDF With OCR activity, which gives us a string variable as output.
Those string variables can then be passed as input to the Matches activity, which is generally used for regex.
Then use specific expressions to get the output we want.
For more details on MATCHES ACTIVITY

To practice with strings and regex expressions, this will help you a lot: we can paste in our string inputs and test an expression against them, and that expression can then be used in the Matches activity above.
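To make the regex route concrete, here is a small Python sketch of the same flow: pull specific terms out of each extracted string with one expression, then compare what was found. The pattern (amounts with two decimals) and the names `AMOUNT`, `find_terms` and `compare_terms` are assumptions for illustration only; the real expression depends on what needs to be matched in the PDFs.

```python
import re

# Hypothetical pattern: amounts such as "250.00" or "1,275.50".
AMOUNT = re.compile(r"\b\d{1,3}(?:,\d{3})*\.\d{2}\b")


def find_terms(text: str, pattern: re.Pattern = AMOUNT) -> list[str]:
    """Return every match of the pattern in the text, in order."""
    return pattern.findall(text)


def compare_terms(text_a: str, text_b: str, pattern: re.Pattern = AMOUNT) -> dict:
    """Compare the sets of matched terms found in two texts."""
    a = set(find_terms(text_a, pattern))
    b = set(find_terms(text_b, pattern))
    return {
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
        "in_both": sorted(a & b),
    }


if __name__ == "__main__":
    # Stand-in strings; in practice these come from the PDF extraction step.
    first = "Subtotal 250.00 Tax 25.00 Total 275.00"
    second = "Subtotal 250.00 Tax 30.00 Total 280.00"
    print(compare_terms(first, second))
```

This only reports differences in the terms the expression matches, which is why regex suits PDFs with a fixed format better than a general "find all differences" comparison.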

Hope this would help you
Cheers @Rajnish

@Palaniyappan thank you, buddy. I appreciate your help. Could you please help me correct my workflow so that it compares two PDFs and finds the differences? I am attaching my workflow XAML file along with two sample PDF files: pdfcompareNew.xaml (14.8 KB) pdf-sample.pdf (7.8 KB) pdf-sample1.pdf (90.4 KB). Please help me with this; I have been stuck on it for the last week. Thank you.

Hi @Rajnish

As @Palaniyappan and @lakshman indicated, Regex is used to match specific text!

-Do you have anything in particular that needs to be compared?
-Do the PDFs that need to be compared have a fixed format?

If these things are fixed, then yes, it’s possible to compare the text and check whether it matches or not. Hope this helps.

@Shubham_Varshney Thank you for your response. I am attaching the sample PDFs along with the XAML file I am using to compare two PDF files and find the differences. Please go through it and help me correct it, since I have been working on this for the past two weeks and am stuck. Thank you in advance. pdfcompareNew.xaml (14.8 KB) pdf-sample.pdf (7.8 KB) pdf-sample1.pdf (90.4 KB)

After looking at the PDFs, I think it would be best to do a simple compare and then just note down what matches and what doesn’t!

This is doable but would be tedious!

@Shubham_Varshney Thank you for your suggestion. Would you mind explaining what you mean by a simple compare, as I am a beginner?

Split the text you get from each PDF into an array of words and read through the arrays word by word!

If the two words match, move on to the next word in both; otherwise mark what didn’t match and move on to the next word. I hope this helps you out! :)

Cheers and have a good day :)
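The word-by-word comparison described above can be sketched in Python as follows. It assumes both PDF texts are already available as strings; `word_mismatches` is a made-up name for this example, and in UiPath the equivalent would be a Split on each string followed by a loop over the two arrays.

```python
from itertools import zip_longest


def word_mismatches(text_a: str, text_b: str) -> list[tuple]:
    """Compare two texts position by position and list the words that differ.

    Each entry is (index, word_from_first, word_from_second); None means
    that text had no word at that position.
    """
    words_a = text_a.split()
    words_b = text_b.split()
    mismatches = []
    # zip_longest keeps going after the shorter text runs out of words.
    for index, (word_a, word_b) in enumerate(zip_longest(words_a, words_b)):
        if word_a != word_b:
            mismatches.append((index, word_a, word_b))
    return mismatches


if __name__ == "__main__":
    # Stand-in strings; in practice these come from the PDF extraction step.
    first = "The quick brown fox"
    second = "The quick red fox jumps"
    for index, word_a, word_b in word_mismatches(first, second):
        print(f"word {index}: {word_a!r} vs {word_b!r}")
```

One caveat with this positional approach: if a word is inserted or deleted in one PDF, every later word shifts and gets reported as a mismatch. An alignment-based comparison such as Python's `difflib.SequenceMatcher` copes with that better.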

Thank you so much. I will try this and let you know. Meanwhile, could you go through the workflow I attached above as a .xaml file and help me correct it? Thank you once again.