I need to process a large number of PDFs, and I want to compare each new PDF against a list of previously processed PDFs to check that it is not a duplicate.
I figured a checksum/hash algorithm would work here: compute a hash of each PDF file to use as its identifier, store the hashes in a CSV file or text document, and then compare each new PDF's hash against that list.
- Is this the best way to implement this?
- I tried an Invoke code block mentioned in another thread for MD5 comparison, but I think I have implemented it incorrectly. Could anyone explain how to do it a little more clearly?
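
To make the idea concrete, here is a rough sketch of the logic I have in mind, written in Python rather than in the Invoke code block I tried. It is only an illustration under some assumptions: it uses SHA-256 instead of MD5, it keeps the hashes in a plain text file with one hash per line, and the names `file_hash`, `load_hashes`, and `is_duplicate` are made up for this example.

```python
import hashlib
from pathlib import Path


def file_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_hashes(store):
    """Load previously recorded hashes from a text file (one hash per line)."""
    store = Path(store)
    if not store.exists():
        return set()
    return {line.strip() for line in store.read_text().splitlines() if line.strip()}


def is_duplicate(pdf_path, store):
    """Return True if the file's hash is already recorded.

    If the hash is new, append it to the store and return False.
    """
    h = file_hash(pdf_path)
    if h in load_hashes(store):
        return True
    with open(store, "a") as f:
        f.write(h + "\n")
    return False
```

As far as I understand, this would only catch byte-for-byte identical files. Two PDFs that look the same but were saved separately (different metadata or timestamps, for example) would produce different hashes.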