Regular expression for extracting table data from a PDF

Hi all,

The PDF file contains table data. How can I extract that data from the text file?
test.txt (275 Bytes)
Demo file has been attached. Please help me.

Regards,
Lakshmi

Hi @lakshmi.mp

Can you share the expected output?

Regards
Gokul

Hi @Gokul001 ,

Engine model PW217E, Serial No. BC0127, TSN 4381.6, Workscope Overheat.
Reason for removal: Overheat/Major Referbishment (ARO).

If row 1 and row 2 are the same, the output is as shown above.
If row 1 and row 2 are different, the output is:
Engine model PW217E / PW217B, Serial No. BC0127 / AC0125, TSN 4381.6 / 1425.9, Workscope Overheat / Cold.
Reason for removal: Overheat/Major Referbishment (ARO).
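
The rule above (one value per field when both rows match, values joined with " / " when they differ) can be sketched roughly as follows. This is a Python illustration only; the function name `summarize`, the `FIELDS` list, and the assumption that each row is already split into four values are hypothetical and not part of the workflow.

```python
# Field labels in the order they appear in the expected output.
FIELDS = ["Engine model", "Serial No.", "TSN", "Workscope"]

def summarize(row1, row2):
    """Build the summary line from two table rows (lists of four strings)."""
    if row1 == row2:
        # Identical rows: print each value once.
        parts = [f"{name} {value}" for name, value in zip(FIELDS, row1)]
    else:
        # Different rows: show both values per field, separated by " / ".
        parts = [f"{name} {a} / {b}" for name, a, b in zip(FIELDS, row1, row2)]
    return ", ".join(parts) + "."

same = summarize(["PW217E", "BC0127", "4381.6", "Overheat"],
                 ["PW217E", "BC0127", "4381.6", "Overheat"])
different = summarize(["PW217E", "BC0127", "4381.6", "Overheat"],
                      ["PW217B", "AC0125", "1425.9", "Cold"])
print(same)
print(different)
```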

Thanks,
Lakshmi

Hi @Gokul001 ,
The PDF data has been written to a text file. How can I fetch the fields mentioned above from it?
Please help me.

Regards,
Lakshmi

Hi @lakshmi.mp

Use the Read PDF activity and store its output in a variable named ReadPDF.

How about this expression?

To Get Engine model

System.Text.RegularExpressions.Regex.Match(ReadPDF.Trim,"(?<=Engine\smodel\s)\S+").Tostring

To Get Serial No

System.Text.RegularExpressions.Regex.Match(ReadPDF.Trim,"(?<=Serial\sNo\.\s)\S+").ToString

To Get TSN

System.Text.RegularExpressions.Regex.Match(ReadPDF.Trim,"(?<=TSN\s)\S+").Tostring

To Get Reason For Removal

System.Text.RegularExpressions.Regex.Match(ReadPDF.Trim,"(?<=Reason\sfor\sremoval:\s)\S.+").Tostring
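
As a quick way to see what these lookbehind patterns assume, here is a small Python sketch that runs them on a sample string. The sample text is hypothetical (the real file layout is not shown in this thread) and puts each label on the same line as its value, which is the case these patterns are written for. The helper `first_match` is an assumption that mimics `Regex.Match(...).ToString` by returning an empty string when nothing matches.

```python
import re

# Hypothetical sample: every label is followed by its value on the same line.
sample = (
    "Engine model PW217E\n"
    "Serial No. BC0127\n"
    "TSN 4381.6\n"
    "Reason for removal: Overheat/Major Referbishment (ARO)."
)

def first_match(pattern, text):
    """Return the first match as a string, or "" when there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else ""

engine_model = first_match(r"(?<=Engine\smodel\s)\S+", sample)
serial_no = first_match(r"(?<=Serial\sNo\.\s)\S+", sample)
tsn = first_match(r"(?<=TSN\s)\S+", sample)
reason = first_match(r"(?<=Reason\sfor\sremoval:\s)\S.+", sample)

print(engine_model, serial_no, tsn, reason, sep="\n")
```

If a label and its value are not separated by exactly one whitespace character, the lookbehind will not match and the result is empty.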

Regards
Gokul

Hi @Gokul001 ,

I am not able to extract Engine model, Serial No., TSN, or Workscope.
I am able to extract Reason for removal.
The values for Engine model, Serial No., TSN, and Workscope are on the line after their labels.
What changes do I need to make?

Thanks,
Lakshmi

Can you share the PDF file, @lakshmi.mp?

Hey!

Could you please paste the exact Input here…

Regards,
NaNi

@Gokul001 , @THIRU_NANI
I can’t share the PDF, but I can share a sample text file.
test.txt (275 Bytes)
Please take a look at it.
Regards,
Lakshmi

This is the expected output.
Regards,
Lakshmi

Hey!

Can you check this and let me know?

System.Text.RegularExpressions.Regex.Match(StrInput,"(?<=Engine model )[A-Za-z0-9]+").ToString

I will give you the remaining expressions.

Regards,
NaNi

@THIRU_NANI ,

Engine
Model
The label is split across two lines like this, with "Model" on the next line. How can I match it?
The expression above does not match.

Thanks,
lakshmi
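
The mismatch described above can be reproduced with a short Python sketch. The sample text is hypothetical: it assumes the label is split as "Engine" and "Model" on separate lines with the value directly after, which may not be the exact layout of the real file. A literal space in the pattern does not match a line break, whereas `\s+` matches any run of whitespace, including newlines. The second pattern uses a capture group instead of a lookbehind; in .NET the same value would presumably be read from the first group of the match rather than from the whole match.

```python
import re

# Hypothetical text: label split over two lines, value on the line after.
sample = "Engine\nModel\nPW217E\n"

# Pattern with literal spaces, as in the expression above: no match here.
literal = re.search(r"(?<=Engine model )[A-Za-z0-9]+", sample)

# Whitespace-tolerant pattern: \s+ also spans line breaks,
# and IGNORECASE lets "model" match "Model".
flexible = re.search(r"Engine\s+model\s+([A-Za-z0-9]+)", sample, re.IGNORECASE)

literal_result = literal.group(0) if literal else ""
flexible_result = flexible.group(1) if flexible else ""

print(repr(literal_result))
print(repr(flexible_result))
```

If the file instead lists all headers on one row and all values on the next row, this pattern would capture the next header rather than the value, so the row would need to be split into columns first.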