We have a use case that requires extracting sensitive personal data from legal documents. That data then needs to be anonymized, meaning the sensitive personal data is masked.
The documents are a mix of structured and unstructured. Ideally there is also a document management system in place to store and share them.
Ideally the solution would automatically suggest which personal information to mask, while leaving as much of the document as possible readable and understandable for the recipient. Could this be done with a supervised ML model? If so, any suggestions on which one? Or should I look for NLP solutions that integrate well with UiPath?
For the masking itself I was considering an RPA workflow that updates the original documents to anonymize them: find the relevant elements, capture the text region (if needed, depending on the document type), and apply the update. Any thoughts on this approach, and what challenges do you foresee?
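To make the masking step concrete, here is a minimal Python sketch of what I have in mind. It is only an illustration under my own assumptions: the patterns, the `mask_text` name, and the `[LABEL]` placeholder format are hypothetical, and simple regexes like these would only catch well-formatted items such as emails and phone numbers. Names and addresses would presumably need a trained entity-recognition model instead.

```python
import re

# Hypothetical, illustrative patterns only; not a complete PII definition.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask_text(text):
    """Replace each match with a [LABEL] placeholder.

    Returns the masked text and a list of (label, original value) pairs,
    so a reviewer can check the suggestions before they are applied.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        # Record what was found before replacing it.
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
        text = pattern.sub(f"[{label}]", text)
    return text, findings


masked, findings = mask_text(
    "Contact John at john.doe@example.com or +31 6 1234 5678."
)
print(masked)
print(findings)
```

Note that the name "John" is left untouched in this example, which is exactly the part where I expect an ML or NLP component to be needed.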
To what extent would UiPath be able to provide the capabilities requested above?
Thanks for your input!