[REGEX] - expression works correctly in regexstorm.net but UiPath output is empty

Hi everyone,

I’m building an automation that reads the bodies of a set number of e-mails and then uses regex patterns to extract the hyperlink contained in each body.

The following regular expression works in Regex Storm (regexstorm.net), a .NET regex tester, but the output in UiPath comes up empty. I used regexstorm.net because I’ve read it’s the most reliable way to test expressions for compatibility with UiPath.

Regular expression:
(?<=endereço: \r\n\r\n).*?(?<=3d%3d)

Output of e-mail body:
\r\n\r\n \r\n\r\nSaphetyDoc \r\n\r\nElectronic Invoicing \r\n\r\nFatura : ZF2E 1/1190001179 \r\n\r\n https://doc-server-qa.saphety.com/Doc.WebApi.Services/api/Entity/logo/stream/PT500498601/20211011154009.3c508e68-1f95-4096-a7bc-3ce4f0153255 \r\n\r\n \r\n\r\nCaro(a) Utilizador(a), \r\n\r\nA entidade CP - Comboios de Portugal (PT500498601) enviou o seguinte documento eletrónico via SaphetyDoc: \r\n\r\nDocumento \r\n\r\nFatura : ZF2E 1/1190001179 \r\n\r\nData do documento \r\n\r\n2021-10-11 \r\n\r\nPode obter o documento no seguinte endereço: \r\n\r\nDocClient http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qDuniWNDSjbM8atB-2FM80jq0qgPRFnKwHl0sm-2B9bCpVp9leGBDPGS5s9Z5kZx5SV6PnnC0kxN-2FARlmAVn-2FwM9ffx1eqPXcZFLOMYk7mzRJRHRi8TfXLbguJPDEy9sNv8Uh8x8MkBfzY0hDqSjXOQqjCYBQZ5x1IBqF6ucu-2BzhvDa0IgdYuUIOIcnbZQNoPjeeVNvjWVPBYC47myOzSrUdVgOAQbRkzrW3nfxqIow3AzAELTdLUShDngu6vVeDlipmWVU3D1TmokkFuiEp3KvgcxkDhIolXmM7j8ihHmWRUAitdeD1HJPymb22nl0zsUxmumXLD13J7tgDFcXxbnTMdxSGK8x0lpBkHb2-2F8VkkCPzVKrvetUwnxttChUPwo95ndU6CkhB7f3t6tW8EVxS9QDFqp8UZIRXnXXGNyteDYSNdOC3Se0Xr5oLdlKbc6mJD-2F4vvtxIvWoXgpRzb3iJJPSOJENk6gvSbHSfWQd8OFLggrW03nBTI4436rkExjrF6bA-3D-3DYB1i_gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z9FsNfnYR8JtxmarJTc-2FbWChCg-2B5BdV0cJ3tqGhUmTxCY-2FfICdLTATs9HDd0T2TAPn-2Fcrl6XYAVZNfWBPVhRcpoMv2nunx56j2nGtQFv3FNZOhqZzfhu39vWFPRQIL8DYIa-2FgjvC49kwkRHqEf-2Br5Opg-3D-3D \r\n\r\n\r\n\r\n \r\n\r\n 
http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qHPRrpMvX9mJF-2B86ehsUAqq3lAfCQ4llJN8AFpfaMotYpg87_gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z9bSWyEp4VZUhLDtPYqUBOwPaUgcOlJ3UPgWFj88-2Bipbi9yQEwyr8Ewf1BFKWGKzGxpqjCgS9paSdDTxuHeEGOHsjq-2BFM8TX90p9RbxyLoNJE-2FLI-2BBQcXazSnButbrEENCwpHUGvT2L1INSUelQH0Z3Q-3D-3D http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qOLzO4uYtLtQOPV-2F4FeQlJ0wyQisR1cH57KlfDoYHotUZcg__gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z988F62ltdka5TDz-2B8SdzA-2B5IvR9hcfPYtTgeTqk-2FY8W2IE8PpUEnmxzh4CSUkHFZc6FriCowAuYVL39U-2BgXeChv5o-2Bp1luhSA55rwAOLmCC7dazdwJEIYW53HF8GPk9hZJC2-2F2FFLb8d2CwTmd-2BAK-2Bw-3D-3D http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qIYWtOhOCRVEphjcTCjgFUD45gg-2FuXnt4rVJMTtHKEzpX57nrM-2F5ziLOVpEgTle9RQ-3D-3DFZ7__gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z90TGmozpbgnnc59BNqVieZ-2Bod0b87qd11YQTXBYm5K2BCUOG-2F89spq0pnlPMhiCK0UQrh39gw-2FzOn3GRBeNqL-2FT8y-2BpRdccPFFiwBZZ8c-2Fk-2FmIOnTY0XGgbBXlbF89vZ1e5SuvP2rtFtCClLL2oqdcA-3D-3D http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qOjeuNOFUEnlPKySFmoKDFKGvVQ5hJd-2FSui7Zw85BmhDw0pO_gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z9R7xRBjzwAMI62rWbwCmOpLjEYxWWSkwhZFlvGG9oIZUmxYBo4wDG0ria7ZUAbmd6FCSHcnXJYhky5Z7hx89WmrJmPOP3xOxG1wgFeZ14r4Pva6tuYbsIOAKnM9SPEOwCeS5qdPpi-2BZZ4CSzI5tqsEQ-3D-3D 
http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qKbrZqFS8-2F-2FdOZO2Cq1LiKwAxiuFYg-2Fv-2BXVUaFFOQKZ14Dyr_gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z9RiRmeBw8JPtcCqj9e8Qn80D2g0DGpuhuHIDFjLlVImO23nNsr61qbvOune-2FbG1TQx6svqJfBKvdpUd4aylRob-2BiN1SxPR5d1ZfS8l4KNlh3TAdUovLTeDrweQWDEUmwjiV1IR5S3LGEiQFBCSTosSw-3D-3D \r\n\r\nSaphety- Electronic Invoicing \r\n\r\nsaphety.com http://url3710.saphety.com/ls/click?upn=qCkLLb9cYeFFz87dkwx3qNqF1SClfMN2GAPn8T-2FLkuzfDl2GkERAUUFFOk399cODWqHD_gC44qKU9iQWarZN-2FIpDhJa25YRzjf-2BlziikefbClqRj8jyHIFy-2B7qhO1a7XvTlFW-2BNk7BgvmsARNW4oiOOhpxXexemX8kYLVkoqbCGcbvORJ4GGxpuTTQa-2B4CDWyBXnRvLXlus-2FwiNYKoBB3ga8JkOx0T29k6oAPvV5hnRYRAtrN3dyTllKT6q75dn3-2Fi6z9Uob8ynVYZ3iXKfuIqpHkSnBEnvnhItPq0lkUN21oxWaPNZqTLU8NqEQeKU8SlO7FrZPHxR1Ukk9DF5mAhsMNcqaZiTXi2hoCRqCBrh-2FzJ06AROzuKWSz5uIcGCTOwXzsvEgUiITbzaxQg5LQ7X80QA-3D-3D \r\n\r\n \r\n\r\n \r\n

Am I missing something obvious?

Thanks in advance.

Hey @andre.f.pires

Can you please provide the sample, the expected output and as much information on the pattern as you can?

In the meantime, I believe the problem is that you are not escaping the “\r” and “\n” sequences. To escape a character, insert a “\” in front of it. In a regex, an unescaped “\r” means a carriage return and “\n” means a new line, and things get more confusing when they appear inside square brackets [ ]; see my Potential Solutions 2 and 3, where I handle the characters r, n and \ individually. The source file converts “\r” and “\n” into real line breaks, but once imported into UiPath they are literal text values.

Potential Solution 1:
Preview this pattern:
(?<=endereço: \\r\\n\\r\\n).*?(?<=3d%3d)
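
For anyone following along, here is a minimal sketch of the difference between real line breaks and literal \r\n text. It uses Python's re module only so it can be run outside UiPath; the \r, \n and \\ escapes shown should behave the same way in .NET. The two sample strings and the first_match helper are hypothetical stand-ins, not the real e-mail body.

```python
import re

# Hypothetical, shortened stand-ins for the e-mail body (not the real message).
real_breaks = "endereço: \r\n\r\nhttp://example.com/x"     # actual CR/LF characters
literal_text = r"endereço: \r\n\r\nhttp://example.com/x"   # backslash + letter as plain text

# In a regex, \r and \n match the CR and LF control characters.
unescaped = r"(?<=endereço: \r\n\r\n)\S+"
# \\r matches a backslash followed by the letter "r" (same idea for \\n).
escaped = r"(?<=endereço: \\r\\n\\r\\n)\S+"


def first_match(pattern: str, text: str) -> str:
    """Return the first match, or an empty string when nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else ""


for name, text in (("real breaks", real_breaks), ("literal text", literal_text)):
    print(name, "| unescaped:", repr(first_match(unescaped, text)),
          "| escaped:", repr(first_match(escaped, text)))
```

If the unescaped pattern comes back empty in UiPath while the escaped one matches, that would suggest the variable holds literal backslash sequences rather than real line breaks.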

Potential Solution 2:
(?<=endereço: [rn\\]*)[^rn\\].*?(?<=3d%3d)
Comments: With this solution it does not matter how many ‘\r’ or ‘\n’ there are in the text, AS LONG as the link doesn’t begin with r, n or \, which it could…

Potential Solution 3:
(?<=endereço: [rn\\]*[\w])[\w]{2,}.*?(?<=3d%3d)
This pattern will work regardless of the first letter in the hyperlink.
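
A possible variation on Solutions 2 and 3, sketched in Python under the same assumption that the body contains literal \r and \n text: skipping whole \r and \n tokens, instead of the individual characters r, n and \, avoids eating the first letter of the link. A capture group replaces the lookbehind because Python's re only allows fixed-width lookbehinds. The sample string and the names below are made up for illustration.

```python
import re

# Hypothetical body where the text after the label happens to start with "r".
literal_text = r"endereço: \r\n\r\nrelatorio.example/x"

# Character-class version: r, n and \ are skipped one character at a time.
BY_CHARACTER = re.compile(r"endereço: [rn\\]*(\S+)")
# Token version: only the two-character sequences \r and \n are skipped.
BY_TOKEN = re.compile(r"endereço: (?:\\r|\\n)*(\S+)")


def captured(pattern: re.Pattern, text: str) -> str:
    """Return capture group 1, or an empty string when nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else ""


print(captured(BY_CHARACTER, literal_text))
print(captured(BY_TOKEN, literal_text))
```

The character-class version loses the leading "r" here, which is the limitation noted under Solution 2; the token version keeps it.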

Hi,

I think you want to extract the URL after the keyword endereço, right?
I recommend using the following expression instead, because your pattern relies on 3d%3d appearing at the end of the URL. %3d is the URL-encoded form of “=”, which is the base64 padding character, so if the length of the original data changes, the number of padding characters may also change (anywhere from 0 to 2).

System.Text.RegularExpressions.Regex.Match(yourString,"(?<=endereço:[\s\S]+?)https?://\S+").Value
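
To illustrate both points, here is a rough Python sketch. The first part shows how the number of base64 padding characters depends on the input length. The second part extracts the first URL after the label with a capture group, since Python's re does not support the variable-length lookbehind that .NET accepts. The sample body, the example.com URLs and the function names are all hypothetical, and accepting both http:// and https:// is an assumption based on the sample body above.

```python
import base64
import re
from urllib.parse import unquote

# "%3d" is the percent-encoded form of "=", the base64 padding character.
EQUALS = unquote("%3d")


def padding_count(data: bytes) -> int:
    """Number of "=" padding characters base64 appends for this input."""
    encoded = base64.b64encode(data).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# Capture the first http:// or https:// URL that follows the label.
URL_AFTER_LABEL = re.compile(r"endereço:[\s\S]+?(https?://\S+)")


def url_after_label(body: str) -> str:
    """Return the first URL after the label, or an empty string."""
    match = URL_AFTER_LABEL.search(body)
    return match.group(1) if match else ""


# Hypothetical, shortened body; the URLs are placeholders.
sample_body = (
    "Logo https://example.com/logo \r\n\r\n"
    "Pode obter o documento no seguinte endereço: \r\n\r\n"
    "DocClient http://example.com/ls/click?upn=abc-3D-3D \r\n\r\n"
    "http://example.com/other"
)

print([padding_count(b"a" * n) for n in (1, 2, 3)])
print(url_after_label(sample_body))
```

Because the URL is delimited by whitespace rather than by a fixed suffix, the extraction does not depend on how many padding characters the link ends with.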

Regards,

Much appreciated, that works just fine.

Thank you!
