Unable to extract these values from PDF

Hi Everyone,

I have a PDF file and want to extract certain fields from it.

The format stays the same, but the values differ.

  1. I read the PDF and saved it to a text file.
  2. The regex for the 1st field is done; I need help with the 2nd one, “Platform Type”.

Please help me

Regards,
Seema

@Seema_S - Is it possible to share the text? When you read the PDF, please set preserve format to true; that will give a better result.

Preserve format is for the PDF read activity, right? Okay, done.
Sure, I will share the text file.
SCR_PDF.txt (13.9 KB)

@Seema_S - Thanks for sharing the text file. So you just want the Platform Type from this, right? If yes, then please check below…

@prasath17
Thank you for the regex.
Is that a regex tool? Could you please tell me where I can get it? That would be very helpful.
I want the other fields as well.

I also have to extract all of these values:
Position A, Position B, Position C,
Position A Type, Position B Type, Position C Type.

Regards,
Seema

@Seema_S - It’s .NET Regex Tester - Regex Storm … I always use this for my regex, and it works perfectly with UiPath…

Let me look at your other fields…

@Seema_S - Could you please see below… I did not find the same text as shown in your screenshot… it looks like it’s a different file…

@prasath17
Oh, okay.
Yes, it’s a different file, but you have still found the EFEM Type. However, I want the 1st one, i.e.
EFEM Type :6.4E

I went to the .NET Regex link, but how do I find the regex there? Could you please help me with how to use this tool?

@Seema_S - You have to paste your text and write the regex…
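
As a rough illustration for the first “EFEM Type” line quoted above, a pattern like the one below could be tried against the pasted text. This is only a sketch: it uses Python's `re` module to stay runnable (the thread itself uses .NET regex in UiPath, where this pattern syntax is also valid), and the sample text is made up apart from the `EFEM Type :6.4E` line.

```python
import re

# Hypothetical sample text; only the "EFEM Type :6.4E" line comes from the thread.
sample_text = """Order Configuration
EFEM Type :6.4E
Platform Type : ABC
EFEM Type :7.1X
"""

# Label, optional spaces or tabs around the colon, then capture the rest of the line.
efem_pattern = re.compile(r"EFEM Type[ \t]*:[ \t]*(.+)")

# search() returns only the first occurrence, which is the one asked for here.
first_match = efem_pattern.search(sample_text)
efem_type = first_match.group(1).strip() if first_match else None
print(efem_type)
```

Taking only the first match is what separates the wanted `6.4E` value from any later “EFEM Type” line in the same file.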

Please refer to this post on regex…

https://itnext.io/regular-expressions-tricks-you-should-know-2976c7bd1be3

Best video on YouTube


@Seema_S - Here you go…

@prasath17
Hey, I want them one by one, not as a whole:
Position A,
Position B,
Position C,
Position A Type,
Position B Type,
Position C Type,
Chamber Code A

The values corresponding to each of them should be saved in separate strings.
Example:
Position A : XYZ
Here I want only XYZ.

I don’t know how to write the regex :(

You didn’t mention this anywhere in your requirement before. If you provide a clear requirement next time, it will be easier for us to provide the right solution…

Sure, I will extract only the values for the above strings…
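
For reference, one way to pull each value into its own string is one small pattern per label. The sketch below is in Python purely for illustration (the thread uses .NET regex in UiPath). It assumes each “Label : value” pair sits on its own line; the sample values and the helper name `extract_value` are made up.

```python
import re

# Hypothetical layout: one "Label : value" pair per line. All values are made up.
sample_text = """Position A : XYZ
Position B : PQR
Position C : LMN
Position A Type : Etch
Position B Type : Strip
Position C Type : Clean
Chamber Code A : C-100
"""

labels = [
    "Position A", "Position B", "Position C",
    "Position A Type", "Position B Type", "Position C Type",
    "Chamber Code A",
]

def extract_value(text, label):
    """Return the text after 'label :' on the same line, or None if absent."""
    # The colon must follow the label directly (spaces allowed), so the
    # "Position A" pattern does not match the "Position A Type" line.
    pattern = re.escape(label) + r"[ \t]*:[ \t]*(.*)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None

# One separate string per field, keyed by its label.
values = {label: extract_value(sample_text, label) for label in labels}
position_a = values["Position A"]
print(position_a)
```

If the PDF text places two fields on the same line, the capture group would need to stop earlier than the end of the line, so the pattern would have to be adjusted to the real file.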

So sorry, I will take care next time.
Thanks Prasanth

Hi @Seema_S … No worries… Cool…

Here you go…

Regex approach: First I took everything between “Order Configuration” and “Chamber Code B”, then I took that output and wrote another regex to extract all the strings after the “:”.

Using System.Text.RegularExpressions.Regex.Matches, I wrote an Assign statement to extract the match values.

Then I used a For Each to iterate through the matches and print the required output.
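
The two-stage idea described above can be sketched roughly as follows. This is not the attached XAML; it is a Python stand-in for illustration, with made-up sample text and values. In .NET, the equivalent of `re.DOTALL` would be `RegexOptions.Singleline`.

```python
import re

# Hypothetical text; only the two marker phrases come from the thread.
sample_text = """Header : ignored
Order Configuration
Position A : XYZ
Position A Type : Etch
Chamber Code A : C-100
Chamber Code B : C-200
Footer : ignored
"""

# Stage 1: take everything between the two markers.
# DOTALL lets "." cross line breaks; "*?" stops at the first "Chamber Code B".
block_match = re.search(
    r"Order Configuration(.*?)Chamber Code B", sample_text, re.DOTALL
)
block = block_match.group(1) if block_match else ""

# Stage 2: within that block, capture whatever follows the colon on each line.
values = []
for match in re.finditer(r":[ \t]*(.+)", block):
    values.append(match.group(1).strip())

print(values)
```

Narrowing the text first keeps colons elsewhere in the document (the header and footer lines in this sample) out of the second-stage matches.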


XAML: Regex_Seema.zip (37.9 KB)

Hope this helps…