I have this line in a PDF file:

Leg 1 of 2 | Singapore (SIN) to Dubai (DXB) | Operated by Emirates (equipment owner - Emirates)

What expression do I need in a UiPath Assign activity to extract Singapore (SIN) and get rid of the text beside it?


This is the image from the PDF, but how do I extract Singapore (SIN) and Dubai (DXB)?

@HongRui_Zhang

Use Read PDF Text, or Read PDF With OCR if the PDF is scanned.
Then use regex or string manipulation to get the data you want.

Hi @HongRui_Zhang

Welcome to the Community!!

1. Use the Read PDF Text activity.
2. Use regex to extract those fields.

I hope it helps!!

@HongRui_Zhang

Can you provide the PDF file so that we can extract those fields?

Hey @HongRui_Zhang!! Can you please share the PDF so that we can work on a solution?

Hi @HongRui_Zhang

Singapore

(?<=Leg 1 of 2\s*\|\s+).*(?=\s+to\s*[A-Z]+)

Dubai

(?<=Leg 1 of 2.*\s*\|\s+.*\s+to\s+).*(?=\s+\|)

I hope it helps!!
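
For anyone who wants to check the logic outside UiPath first, here is a minimal Python sketch of the same idea, using one pattern with two capture groups instead of lookarounds. The name `extract_route` and the assumption that every airport code is three capital letters in parentheses are mine, not from the PDF. The pattern uses only basic constructs, so it should carry over to .NET regex in UiPath, but test it against the real Read PDF Text output first.

```python
import re

# Capture "City (CODE)" on each side of " to ", between the two pipes.
# Assumption: codes are three capital letters in parentheses, e.g. (SIN).
ROUTE = re.compile(r"\|\s*(.+?\([A-Z]{3}\))\s+to\s+(.+?\([A-Z]{3}\))\s*\|")

def extract_route(line):
    """Return (origin, destination), or None if the line has no route."""
    m = ROUTE.search(line)
    return (m.group(1), m.group(2)) if m else None

sample = ("Leg 1 of 2 | Singapore (SIN) to Dubai (DXB) | "
          "Operated by Emirates (equipment owner - Emirates)")
print(extract_route(sample))
```

Group 1 holds the origin and group 2 the destination, so a single match gives both values.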

Hey @HongRui_Zhang

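You can try this regex expression (the pipe in the lookahead must be escaped, and the groups need to allow the parenthesised airport code):
(\w+(?: \w+)* \(\w+\))\s+to\s+(\w+(?: \w+)* \(\w+\))\s+(?=\|)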

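To get Singapore (SIN) and Dubai (DXB) as separate values, we can use the following Assign statements. Split on " to " with the surrounding spaces, so that a "to" inside a city name is not treated as the separator:

assign origin = text.Split({" to "}, StringSplitOptions.None)(0).Trim
assign destination = text.Split({" to "}, StringSplitOptions.None)(1).Trim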
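
The choice of delimiter matters for this split. As a rough illustration in Python (the sample route below is hypothetical, not taken from the PDF), a bare "to" also matches inside a city name, while " to " with spaces matches only the separator word:

```python
route = "Toronto (YYZ) to Tokyo (NRT)"  # hypothetical sample, not from the PDF

# Bare "to" also matches the end of "Toronto", so the pieces come out wrong.
broken = route.split("to")
print(broken)

# " to " with surrounding spaces only matches the separator word.
origin, destination = route.split(" to ")
print(origin, "|", destination)
```

The same reasoning applies to the String.Split call in an Assign activity.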

Hi @HongRui_Zhang ,

You can use the Split function to get this done. First split the string by "|" and then by " to ".
Please follow the steps below:

  • Assign the above text to the str variable.
str = "Leg 1 of 2 | Singapore (SIN) to Dubai (DXB) | Operated by Emirates (equipment owner - Emirates)"
  • Split by "|", then by " to ", and take index 0 to get "Singapore (SIN)". Trim removes the spaces left around the pipe.
Split(str.Split("|"c)(1), " to ")(0).Trim
  • Use the same expression with index 1 to get "Dubai (DXB)".
Split(str.Split("|"c)(1), " to ")(1).Trim

This solution will also work when the origin and destination names change.

Below is the screenshot for your reference.

Regards,
Ashutosh Gupta
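
The two-step split described in this reply can be checked outside UiPath with a small Python sketch. The function name `split_route` is only for illustration, and the sketch assumes the route is always the second "|"-separated field, as in the sample line.

```python
def split_route(line):
    """Split on '|' to isolate the route field, then on ' to '."""
    route = line.split("|")[1]  # " Singapore (SIN) to Dubai (DXB) "
    origin, destination = route.split(" to ", 1)
    # strip() drops the spaces that surrounded the pipes.
    return origin.strip(), destination.strip()

sample = ("Leg 1 of 2 | Singapore (SIN) to Dubai (DXB) | "
          "Operated by Emirates (equipment owner - Emirates)")
print(split_route(sample))
```

Without the final strip, the origin would keep a leading space and the destination a trailing one, which is why a Trim on the UiPath side is worth adding.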
