How to Extract Data from a Word File

Please see this PNG file.

After reading text from a PDF or an image, the content may change or differ from the original.
That is the reason we are asking you to show a sample!
Anyway, you can extract the firm name like this: (?<=Firm Name).*(?=Read Office:)
and the address like this: (?<=Address:).*
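
For anyone who wants to try these two patterns outside UiPath first, here is a minimal Python sketch. The sample string is made up (the real text of the Word file is not shown in this thread), so treat the labels and values as assumptions and adjust them to your own output.

```python
import re

# Hypothetical sample text; the real layout of the Word file is not shown in the thread.
sample = "Firm Name ABC Traders Read Office: Mumbai\nAddress: 12 Market Road, Pune"

# Text between the label "Firm Name" and the label "Read Office:" on the same line.
firm = re.search(r"(?<=Firm Name).*(?=Read Office:)", sample)

# Everything after "Address:" up to the end of that line.
address = re.search(r"(?<=Address:).*", sample)

# Trim the surrounding spaces that the lookarounds leave in the match.
firm_name = firm.group().strip() if firm else None
address_text = address.group().strip() if address else None

print(firm_name)
print(address_text)
```

Note that `.` does not cross line breaks by default, so both patterns only work if the label and its value sit on the same line.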

Hi, the first image I attached shows a file that is in Word format. I attached it as an image because the forum has no option to attach a Word file.

Convert it to a string and use the regex above!

Hi
Did you try the method suggested here?

Cheers @sachin_sharma

Hi @sachin_sharma,

Use (?<=Address)\n.* to get the Name value and put it into a variable.
Then use that variable's trimmed value in the lookbehind, (?<=" + variable.Trim + ")\n.* and you will get the office name.
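
One way to read the two patterns above is that the value matched by the first one is placed into the lookbehind of the second one. Here is a minimal Python sketch of that reading. The sample text and the names `name_value` and `office_value` are hypothetical, and it assumes each value sits on the line directly below the previous one.

```python
import re

# Hypothetical text: the label is followed by its value on the next line.
text = "Address\nABC Traders\nHead Office Pune\n"

# Step 1: the line right after the label "Address".
first = re.search(r"(?<=Address)\n.*", text)
name_value = first.group().strip() if first else ""

# Step 2: reuse the captured value as the lookbehind for the following line.
# re.escape keeps any regex metacharacters in the value literal.
second = None
if name_value:
    second = re.search(r"(?<=" + re.escape(name_value) + r")\n.*", text)
office_value = second.group().strip() if second else ""

print(name_value)
print(office_value)
```

The match includes the leading line break, which is why the result is trimmed before it is used.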

Hi, thanks for the reply, but first see the output format of the Word document.

This is the output I get from the Word file. Now please suggest what I can do.

Hi @mitesh_parmar
Please see the output file I attached.

No buddy
Actually, this method converts the Word document to Excel, and from Excel we can access the row value we want.
Did you try that method?

Cheers @sachin_sharma

Hi @Palaniyappan
I tried using Send Hot Key with ctrl+g and ctrl+c, but how do I get the output of the ctrl+c command?

Fine, use a START PROCESS activity and pass the file path of the Word document as input to the FileName property.

Now use a SEND HOT KEY activity and set the key to ctrl+a.
Then add another SEND HOT KEY activity with the key set to ctrl+c.

After copying the content of the Word document, use another START PROCESS activity and pass the file path of the new Excel file you have created to the FileName property.
This will open the Excel file. Now use a SEND HOT KEY activity with the key set to ctrl+v.

Cheers @sachin_sharma

No output file is attached, @sachin_sharma. Please attach it again.

Hi @mitesh_parmar, please see the message box image I attached.

Hi Palaniyappan, thanks for giving your time, but I don't understand the line in your 3rd paragraph. What do you mean?

Fine
Here you go with a XAML file.
Hope this helps you.
ss.zip (9.7 KB)

Cheers @sachin_sharma

@sachin_sharma
Use this; it allows for the non-ASCII (Unicode) character at the start of the line: (?<=address:)\n[^\u0000-\u007F].*
For the next value, use the same approach: (?<=variable:)\n[^\u0000-\u007F].*

Or use Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty) to replace all non-ASCII characters with an empty string, and then use the earlier regex to get your data.
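
To show what the Regex.Replace suggestion does, here is a minimal Python sketch of the same idea: strip everything outside the ASCII range first, then run a plain lookbehind pattern. The sample string with a bullet character is an assumption, since the actual Word output is only shown as an image, and `re.sub` stands in for the .NET Regex.Replace call.

```python
import re

# Hypothetical sample: a non-ASCII bullet (U+2022) sits in front of the value.
raw = "Address:\n\u2022 12 Market Road\n"

# Python counterpart of Regex.Replace(s, @"[^\u0000-\u007F]+", string.Empty):
# remove every run of characters outside the ASCII range.
ascii_only = re.sub(r"[^\u0000-\u007F]+", "", raw)

# With the non-ASCII characters gone, a simple pattern finds the next line.
address = re.search(r"(?<=Address:)\n.*", ascii_only)
address_text = address.group().strip() if address else None

print(address_text)
```

Keep in mind that this also removes legitimate non-ASCII letters (accented names, for example), so it only fits when the values you need are plain ASCII.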

Hi @mitesh_parmar, I attached my XAML file. Please have a look and correct me where I am wrong.

Hi mitesh_parmar, PFA Extract_Data.xaml (10.0 KB)

Hi Palaniyappan, I tried your code and added some delay. It selects all the data in the Word file, but it does not paste it into the Excel file.

Fine
Does it also happen when you do it manually?
@sachin_sharma