Compare two PDFs and show the differences in the output

I need to compare two files (PDF, text, Word, etc.) and output the data that differs between them. Is this possible? If yes, please provide a solution. For example: I have 2 PDF files and want to read both of them and output the differences between them.

Yes, that's possible.
–Whether the PDF is read as a whole or we extract only specific details, we can compare the results.
–If we are extracting specific details from the PDF, we can use Get Text (if the terms can be selected as individual elements), the Screen Scraping option (if it is an image), or the Anchor Base activity, and store the output in a string variable.
–We can then compare those two string variables using an If condition like this:
in_value1.ToString.Equals(in_value2.ToString)
If it is true, the flow goes to the THEN part, where we can include any activity we want.

OR

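If we are reading the whole PDF with either the Read PDF Text or the Read PDF With OCR activity, we get a string output variable.
–Say in_str1 for the first PDF and in_str2 for the second PDF.
–Then use an Assign activity like this:
out_str1_array = in_str1.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
and another Assign activity like this:
out_str2_array = in_str2.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
where both out_str1_array and out_str2_array are string array variables. RemoveEmptyEntries is needed because splitting on the carriage return and line feed characters separately would otherwise leave an empty entry between every pair of lines.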

–Now use a For Each loop, pass the variable out_str1_array as input, change the TypeArgument to String in the Properties panel, and rename the loop variable from item to item1.
–Inside it, use another For Each loop with out_str2_array as input, change the TypeArgument to String in the Properties panel, and rename the loop variable from item to item2.
–Inside the inner loop, use an If condition like this:
item1.ToString.Equals(item2.ToString)
This checks each line of the first PDF against each line of the second to see whether they are equal. If they are equal, the flow goes to the THEN part, where we can include a Break activity so that the inner loop stops once a match is found; otherwise it goes to the ELSE part.
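
To make the logic of the two loops concrete, here is a minimal sketch in Python rather than in a UiPath workflow. It is only an illustration under assumptions: the function names split_lines and unmatched_lines and the sample text are made up, and it adds one step the description above leaves open, namely collecting a line from the first text when the inner loop finishes without a match.

```python
def split_lines(text):
    """Split text into stripped, non-empty lines (handles both \\n and \\r\\n)."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def unmatched_lines(first_text, second_text):
    """Return the lines of first_text that have no equal line in second_text."""
    second_lines = split_lines(second_text)
    missing = []
    for item1 in split_lines(first_text):      # outer For Each
        found = False
        for item2 in second_lines:             # inner For Each
            if item1 == item2:                 # the If condition
                found = True
                break                          # THEN part: stop at the first match
        if not found:
            missing.append(item1)              # no match anywhere: record the line
    return missing


# Hypothetical sample text standing in for the Read PDF output.
in_str1 = "Invoice 1001\r\nTotal: 250\r\nDue: 1 May"
in_str2 = "Invoice 1001\r\nTotal: 275\r\nDue: 1 May"

print(unmatched_lines(in_str1, in_str2))  # lines found only in the first text
print(unmatched_lines(in_str2, in_str1))  # lines found only in the second text
```

Running the comparison in both directions matters, because a line that exists only in the second file is never visited by a loop over the first one.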

Hope this helps.
Kindly correct me if I'm wrong, and let me know if you have any queries or need clarification.
Cheers @Rajnish

@Palaniyappan thanks buddy, I will try it and let you know. I really appreciate your help.

Sure
Cheers @Rajnish

@Palaniyappan, I am able to read the documents and find out whether they are similar or not, but unfortunately I am unable to get the dissimilar data in the output. If possible, could you please help me with a flow graph so that I can solve my issue? I will be thankful to you.

@Rajnish,

If you are expecting a word-by-word difference, try this:
StringDifferences.xaml (7.9 KB)
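
For readers who cannot open the attachment, the idea of a word-by-word difference can be sketched as follows. This is not the content of StringDifferences.xaml, which is not shown in this thread; it is an assumed approach written in Python with the standard difflib module, and the function name word_differences and the sample sentences are made up for illustration.

```python
import difflib


def word_differences(old_text, new_text):
    """Return (kind, old_words, new_words) for every run of words that differs."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is 'replace', 'delete' or 'insert'
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes


# Hypothetical sample sentences.
for change in word_differences(
    "Total amount due is 250 USD",
    "Total amount payable is 275 USD",
):
    print(change)
```

Because the texts are aligned as sequences of words, an inserted or deleted word does not shift every later word out of position, which a plain position-by-position comparison would do.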

Hope this helps!

@Madhavi, thank you, I will try it and let you know. I appreciate your help.

@Palaniyappan, with your help I am able to find the differences, but not 100% of them: I am getting only 50-60% of the unmatched data. I am attaching the flow graph for your reference; please guide me on further corrections.
pdfcompare.xaml (14.0 KB)

Hi,

Could you please help with this? I am facing difficulty with the mismatched data. There are two PDFs (old and new); I want to compare them and highlight the mismatched data in the new PDF.

Is there an optimal solution or set of instructions for this?
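
One possible direction, offered only as a sketch and not as a confirmed solution from this thread: once both PDFs have been read into strings, a sequence-based diff can mark the lines of the new text that were changed or added compared with the old one. The Python example below uses the standard difflib module. The function name mark_changes, the ">> " marker and the sample text are assumptions made for illustration, and it marks lines of extracted text rather than drawing a highlight inside the PDF file itself.

```python
import difflib


def mark_changes(old_text, new_text):
    """Return the new text's lines, prefixing changed or added lines with '>> '."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    marked = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # 'equal' runs are unchanged; 'replace' and 'insert' runs are new content.
        # 'delete' runs contribute nothing because j1 == j2 for them.
        prefix = "   " if tag == "equal" else ">> "
        for line in new_lines[j1:j2]:
            marked.append(prefix + line)
    return marked


# Hypothetical text standing in for the old and new PDF contents.
old_pdf_text = "Name: A\nPlan: Basic\nFee: 10"
new_pdf_text = "Name: A\nPlan: Premium\nFee: 10\nNote: upgraded"

print("\n".join(mark_changes(old_pdf_text, new_pdf_text)))
```

The quality of the result depends on how cleanly the text was extracted; differences in whitespace or line breaks between the two reads will show up as mismatches unless the text is normalized first.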