Read PDF text contains unprocessable characters


#1

Hi guys,
the goal is to extract certain elements from free-form text using regex. The source of the text is a PDF. When the “Read PDF” activity is used, all hyphen characters (-) in the string are replaced by the Unicode character U+FFFE (hex FFFE). I can no longer match this character with the UiPath regex engine, although in Regex101 I can match it with \x{FFFE}. Because the character shows up at unpredictable positions, all my expressions fail to match. Please help!

I’ll try to include a sample text in this post. But I don’t know if the character will be accepted by the forum.
Example:
Elisabeth Wenger, Gehrenstrasse 36, 8266 Steckborn, hat folgendes Baugesuch eingereicht:
The character is in the German word “Bau[here!]gesuch”.
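
To make the problem reproducible without the forum eating the character, here is a minimal Python sketch. It assumes the stray character really is the single code point U+FFFE; the names `STRAY`, `BAUGESUCH`, and `normalize_hyphens` are made up for illustration and are not part of UiPath. As far as I know, UiPath's regex activities run on the .NET regex engine, where the PCRE-style `\x{FFFE}` is not supported and the four-digit escape `\uFFFE` is the usual way to write this character, so the same idea should carry over.

```python
import re

# Assumption: the stray character is the single code point U+FFFE.
STRAY = "\ufffe"

# Hypothetical pattern: accepts a real hyphen, the stray character, or nothing.
BAUGESUCH = re.compile(r"Bau[-\uFFFE]?gesuch")


def normalize_hyphens(text: str, replacement: str = "-") -> str:
    """Replace every U+FFFE with `replacement` (pass "" to join the word)."""
    return text.replace(STRAY, replacement)


# The escape keeps the character intact even if a forum or editor strips it.
sample = "hat folgendes Bau\ufffegesuch eingereicht:"

print(BAUGESUCH.search(sample) is not None)  # True
print(normalize_hyphens(sample))             # hat folgendes Bau-gesuch eingereicht:
print(normalize_hyphens(sample, ""))         # hat folgendes Baugesuch eingereicht:
```

Cleaning the string once before running any regex keeps the actual extraction patterns free of special cases; tolerating the character inside each pattern is the alternative if the raw text has to stay untouched.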


#2

It seems the forum doesn’t accept this character, and I’m not able to upload a file either. So here is a link to a sample text phrase: