This is the data extracted from the PDF and stored in a text file. I am not able to get the data into a separate column for each name in the Excel file.
xxxx xxxx xxxx N.
S/o. xxxxxxx,
3/109, xxxxxxxx Post,
xxxxxxxxx District.
Roll No. 2569/2005
DOE : 28-12-2005
DOB: BG:
xxxx xxxx xxxx N.
S/o. xxxx xxxx xxxx,
16/6, xxxx xxxx xxxx,
xxxx.
Ph xxxx-xxxxxx
Roll No. 218/72
DOE :15-11-1972
DOB: BG:
Please show the output you want to extract.
Hi @Renejit_Vs
Provide sample input and expected output. I can help you out with regular expressions.
Regards
Input
Abdul Raheem S.
42C, Colony, West Cross Street,
Ramavarmapuram, Nagercoil.
Cell : 9489320924
Roll No. 607/78
DOE : 20-12-1978
DOB : 21-05-1954
BG:
Do you want to extract the whole text? Please specify.
[A-Za-z]+[\s\S]*?BG\:
You can use the above pattern in the Find Matching Patterns activity, then run a For Each loop over the matches and print each currentItem.
Regards
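To illustrate what that pattern does, here is a minimal Python sketch (the thread itself uses UiPath, so treat this only as a way to see the matching behaviour). The sample text is modelled on the records posted above, and the names `RECORD` and `records` are made up for the example. Each match runs from the first letter of a record up to the next "BG:".

```python
import re

# Two records modelled on the ones posted in this thread.
text = """Abdul Raheem S.
42C, Colony, West Cross Street,
Ramavarmapuram, Nagercoil.
Cell : 9489320924
Roll No. 607/78
DOE : 20-12-1978
DOB : 21-05-1954
BG:
xxxx xxxx xxxx N.
S/o. xxxx xxxx xxxx,
16/6, xxxx xxxx xxxx,
xxxx.
Ph xxxx-xxxxxx
Roll No. 218/72
DOE :15-11-1972
DOB: BG:
"""

# Same pattern as above: start at a letter, lazily consume anything
# (including newlines) until the first "BG:".
RECORD = re.compile(r"[A-Za-z]+[\s\S]*?BG:")

records = RECORD.findall(text)

for record in records:
    print(record)
    print("-----")
```

One caveat: this assumes "BG:" is always empty, as in the samples. If a blood group is filled in after "BG:", that value would be picked up as the start of the next match, so the pattern would need adjusting.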
Abdul Raheem S.
42C, Colony, West Cross Street,
Ramavarmapuram, Nagercoil.
Cell : 9489320924
Roll No. 607/78
DOE : 20-12-1978
DOB : 21-05-1954
BG:
This is the input. In Excel, I want the values in separate cells: Name in one cell, S/o in another, and so on. I have around 1000 of these records, one after the other.
The output I would like to have:
PDF.xlsx (8.5 KB)
Hi Friend,
You can use a regular expression for each value to extract the data from your PDF file. Assign each value to a separate variable and then write them to the Excel file under their respective columns.
For example, first use the Read PDF Text activity and assign its output to a variable, then use an Assign activity with the expression below.
System.Text.RegularExpressions.Regex.Match(PDF,"(?<=Roll No. ).*(?=\n)").ToString.Trim
Hope it works!
Let me know if you need more help.
Thanks.
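As a rough sketch of the "one regex per value, one column per value" idea, here it is in Python rather than as UiPath Assign activities. The function names `parse_record` and `to_csv`, the column names, and the individual patterns are all assumptions made for illustration; they are based only on the sample records in this thread and may need adjusting for other layouts. The result is written as CSV text, which Excel can open.

```python
import csv
import io
import re


def parse_record(record):
    """Extract labelled fields from one record (column names are illustrative)."""
    lines = [ln.strip() for ln in record.strip().splitlines() if ln.strip()]

    def grab(pattern):
        # Return the first capture group, or "" when the field is absent.
        m = re.search(pattern, record)
        return m.group(1).strip() if m else ""

    return {
        "Name": lines[0] if lines else "",          # first line of the record
        "Father": grab(r"S/o\.\s*([^,\n]*)"),       # text after "S/o."
        "Phone": grab(r"\b(?:Cell|Ph)\b\s*:?\s*([\d-]+)"),
        "Roll No": grab(r"Roll No\.\s*(\d+/\d+)"),  # stops before any DOE text
        "DOE": grab(r"DOE\s*:\s*(\d{2}-\d{2}-\d{4})"),
        "DOB": grab(r"DOB\s*:\s*(\d{2}-\d{2}-\d{4})"),
    }


def to_csv(rows):
    """Serialise a list of field dictionaries as CSV text, one row per record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


sample = """Abdul Raheem S.
42C, Colony, West Cross Street,
Ramavarmapuram, Nagercoil.
Cell : 9489320924
Roll No. 607/78
DOE : 20-12-1978
DOB : 21-05-1954
BG:
"""

row = parse_record(sample)
csv_text = to_csv([row])
print(csv_text)
```

The address lines are left out here because they carry no label to match on. In a UiPath workflow the same patterns could go into Assign activities, with the values added to a DataTable and written out with Write Range.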
The regex is returning the DOE along with the roll number.
In this expression, replace PDF with the output variable you created in the Read PDF Text activity.
System.Text.RegularExpressions.Regex.Match(PDF,"(?<=Roll No. ).*(?=\n)").ToString.Trim
Can you show me the expression you are using? Also show me the output variable of the Read PDF Text activity as set in its Properties panel.
Ok. Also show me the assign expression
System.Text.RegularExpressions.Regex.Match(input,“(?<=Roll No. ).*(?=\n)”).ToString.Trim
Hi @Renejit_Vs
Retype the double quotes in the expression by hand. They may have been converted to curly quotes when the expression was copied from the forum.
Is there a way to get all the roll number values instead of just one? I am also getting the DOE along with the Roll No. I have multiple roll numbers in this single file:
Roll No. 2569/2005 DOE : 28-12-2005
Roll No. 218/72 DOE :15-11-1972 and so on.
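One possible way to handle both issues, sketched in Python for illustration: match only digits and the slash after "Roll No." so that the match stops before "DOE", and collect every match rather than only the first. The names `ROLL`, `ROLL_AND_DOE`, `rolls` and `pairs` are made up for this example. In .NET the equivalent call would be `Regex.Matches` with the same pattern, reading `Groups(1).Value` from each match.

```python
import re

# The two lines quoted above, where Roll No and DOE share a line.
text = (
    "Roll No. 2569/2005 DOE : 28-12-2005\n"
    "Roll No. 218/72 DOE :15-11-1972\n"
)

# Capture only "digits/digits" after "Roll No.", so the DOE is not included.
ROLL = re.compile(r"Roll No\.\s*(\d+/\d+)")
rolls = ROLL.findall(text)
print(rolls)  # ['2569/2005', '218/72']

# Optionally capture the roll number and its DOE together as a pair.
ROLL_AND_DOE = re.compile(
    r"Roll No\.\s*(\d+/\d+)\s*DOE\s*:\s*(\d{2}-\d{2}-\d{4})"
)
pairs = ROLL_AND_DOE.findall(text)
print(pairs)  # [('2569/2005', '28-12-2005'), ('218/72', '15-11-1972')]
```

The original expression used `.*`, which runs to the end of the line, so it picks up the DOE whenever the PDF text puts both fields on one line.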




