I have been using UiPath for a few weeks now, but I am stuck on something I need to do.
I am trying to upload documents through a CMS form and add additional metadata.
Most of the data is in an Excel file, but I need to extract the date the document was published from the document itself. The document is a PDF, and I also have HTML files that are derived from the PDFs.
I am trying to extract the date from each document, so I can add it as metadata to the CMS form. I need the last date that appears in the document. Furthermore, the PDFs all differ in length, varying between 5 and 20 pages, and the date is always somewhere near the end, but not in exactly the same place.
The date I need (above the blue line) always comes after "ter openbare zitting van " in each document. (I put a red line underneath it.)
The text is in Dutch, as is the date format. I do not need OCR to get the text; I can copy it.
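To make the rule above concrete, here is a minimal sketch in Python (not UiPath code). It assumes the text has already been copied out of the PDF or HTML as a plain string, and it assumes the date is written as day, Dutch month name, four-digit year (for example "12 maart 2019"); that format is a guess and is not confirmed by the documents. The function name last_hearing_date is made up for illustration.

```python
import re
from typing import Optional

# Assumption: the date is written as "<day> <Dutch month name> <year>".
DUTCH_MONTHS = (
    "januari|februari|maart|april|mei|juni|juli|"
    "augustus|september|oktober|november|december"
)

# The anchor phrase from the documents. \s+ between the words tolerates
# line breaks that can appear when text is copied out of a PDF.
DATE_AFTER_ANCHOR = re.compile(
    r"ter\s+openbare\s+zitting\s+van\s+"
    r"(\d{1,2}\s+(?:" + DUTCH_MONTHS + r")\s+\d{4})",
    re.IGNORECASE,
)


def last_hearing_date(text: str) -> Optional[str]:
    """Return the last date that follows the anchor phrase, or None."""
    matches = DATE_AFTER_ANCHOR.findall(text)
    if not matches:
        return None
    # Take the last occurrence and collapse any internal whitespace.
    return " ".join(matches[-1].split())


if __name__ == "__main__":
    sample = (
        "Eerder ter openbare zitting van 3 januari 2018 behandeld. "
        "Uitgesproken ter openbare zitting van\n12 maart 2019."
    )
    print(last_hearing_date(sample))
```

Taking the last match rather than the first reflects the requirement that the wanted date is the final one in the document. Because the pattern uses only common regular expression syntax, the same idea would presumably carry over to a .NET regular expression inside a UiPath workflow, though that is untested here.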
Does anyone have an idea how I can assign the date to a string variable, so I can eventually paste it into the form I need to upload the document in?
I already tried Anchor Base together with “Get Text”, but so far without success. If you can help me, I will be very happy!
Thank you in advance!