How can I get the number from the strings below?

First format: I REPAIR ORDER FORM SERIAL NO: 93911937623 I**

Here I need only "93911937623".

Second format: | REPAIR ORDER FORM 93912033045 |

Here I need only "93912033045".

Please help me get the serial number in both formats. We are not sure which format will appear in each loop iteration, but one of the two should be there. We then need to get the serial number as a string from the text file.

Use a regular expression, something like "([0-9])\w+", with the Matches activity.

@sambana_karunakar If the number always has the same number of digits, use the regex below with the Matches activity.

"[\d]{11}"

Or use an assign activity as follows:

result = System.Text.RegularExpressions.Regex.Match(inputString, "\d+").ToString
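For anyone who wants to check the pattern outside UiPath, here is a minimal Python sketch of the same idea. It assumes the serial number is always exactly 11 digits, as in the two samples above. The function name `extract_serial` and the digit guards around the pattern are my own additions, not something from the workflow.

```python
import re
from typing import Optional


def extract_serial(text: str) -> Optional[str]:
    """Return the first standalone run of exactly 11 digits, or None."""
    # The guards (?<!\d) and (?!\d) are an assumption: they stop the
    # pattern from matching 11 digits inside a longer number.
    match = re.search(r"(?<!\d)\d{11}(?!\d)", text)
    return match.group(0) if match else None


samples = [
    "I REPAIR ORDER FORM SERIAL NO: 93911937623 I**",
    "| REPAIR ORDER FORM 93912033045 |",
]
for line in samples:
    print(extract_serial(line))
```

The same pattern handles both formats because it ignores everything around the digits, so there is no need to know which format the current line uses.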

There is also a split string activity…

Can someone help me? I need to write the extracted string to an Excel file under a specified column.

@sambana_karunakar You can write data under a specified column. But I need to know how the data is laid out in Excel, where you are getting the above values from, and how you want to write those values to Excel (one by one as you process them, etc.).

@Manjuts90 Hey, I'm extracting the PDF and saving it in a string variable. If the string variable contains the above serial number, I need to write the serial number under the serial number column in an Excel file (please see below). I am also searching for another text in the same string variable; depending on whether I find it, I write Yes or No under the other column, POHStatus.


@sambana_karunakar Create a DataTable with the same 2 columns as the Excel file.
Using the Add Data Row activity, add those 2 values as a data row. Once all your data has been processed, write that DataTable to Excel using the Write Range activity.
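The row-building logic can be sketched in Python as below. This is only an illustration under assumptions: it writes CSV text instead of a real Excel file, the column header "Serial Number" is a guess at the sheet's header, and `poh_marker` is a hypothetical stand-in for whatever text decides the POHStatus value. The function names are made up for the example.

```python
import csv
import io
import re

COLUMNS = ["Serial Number", "POHStatus"]  # assumed header names


def build_row(pdf_text: str, poh_marker: str) -> dict:
    """Build one row: the 11-digit serial plus Yes/No for the marker text."""
    serial = re.search(r"\d{11}", pdf_text)
    return {
        "Serial Number": serial.group(0) if serial else "",
        "POHStatus": "Yes" if poh_marker in pdf_text else "No",
    }


def rows_to_csv(rows: list) -> str:
    """Serialize all collected rows at once, like a single Write Range."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [build_row("| REPAIR ORDER FORM 93912033045 | POH", "POH")]
print(rows_to_csv(rows))
```

As in the DataTable approach, rows are collected first and written in one go at the end, rather than writing to the file once per PDF.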

Hi @Manjuts90,
Here I have a table in a PDF. My requirement is to get the part number if the POH column has a tick mark in the corresponding row.


Please help me do this both without opening the PDF and with opening it. 87911932740.pdf (28.8 KB)

@sambana_karunakar I don't have any idea on this, bro. You'll have to wait or ask someone else for help.

Hello,

So, here is the thing… this is a bit “touchy”

Use an Anchor Base, and if the image exists, look only to the left and read the text from the specified position. That way, when it finds a check mark under DESC it won't read anything, and when the check mark is under DD Test the result will be blank.

Also zoom your PDF document to 150% so Google OCR can read it.

Hope it helps.

Cheers,
Radomir

@evangemert Could you please help me with a regex for text like 2 DS23D, 3 FG23D, 1DRE3W, etc.? As per the example, I need the text that follows 2 and a space, the text that follows 3 and a space, and likewise for 1, 5, and so on. These can appear anywhere in the text file.
Note: the numbers 1, 2, 3, 4, 5, etc. are like index numbers, but after extracting the PDF data to a txt file it will not be in a defined format, so I need a regex to get the text after the index numbers. I need logic that gets the text for one specific index number, which could be 1 or 2 or 3 or 4 or any other. I can change it as per my requirement later, so I need the logic for one index now and can then modify it for other index numbers. Please help.

My first idea is making a dynamic pattern by using the index as a variable. Making a solid Regex pattern is gonna be hard though if you only have one number and a space to work with. Is the info you need to extract always 5 characters with only caps and digits? In that case, try this:

pattern = "(?<=" + index + "\s?)[0-9A-Z]{5}"

Then use the pattern as a variable in another assign activity, just like I mentioned above.

The first part of the pattern indicates a lookbehind. In there you can drop the index you’re looking for.
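Here is a rough Python sketch of the dynamic-pattern idea, in case it helps to test it outside UiPath. One difference to be aware of: Python's `re` module only accepts fixed-width lookbehinds, so `(?<=2\s?)` is rejected there, while .NET accepts it. The sketch therefore uses a capture group instead of a lookbehind. The function name `part_after_index` and the guards that keep the index from matching inside another code are my own assumptions.

```python
import re
from typing import Optional


def part_after_index(text: str, index: int) -> Optional[str]:
    """Return the 5-character code following the given index number."""
    pattern = (
        r"(?<![0-9A-Z])"          # assumption: index is not part of a longer token
        + re.escape(str(index))   # the index, inserted as text
        + r"\s?"                  # optional single space after the index
        + r"([0-9A-Z]{5})"        # the code itself: 5 capitals or digits
        + r"(?![0-9A-Z])"         # assumption: the code is exactly 5 characters
    )
    match = re.search(pattern, text)
    return match.group(1) if match else None


sample = "2 DS23D, 3 FG23D, 1DRE3W"
for idx in (1, 2, 3):
    print(idx, part_after_index(sample, idx))
```

Without the leading guard, index 3 could also be picked up from the middle of a code such as DS23D if enough valid characters followed it, which is the weakness mentioned above about having only a number and a space to work with.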


@evangemert Can you please check whether this is correct or not?
System.Text.RegularExpressions.Regex.Match(output, "(?<=" + 2 + "\s?)[0-9A-Z]{5}").ToString

I'm getting this:

You need quotation marks around the 2, because you’re treating it as a string in this case:

System.Text.RegularExpressions.Regex.Match(output, "(?<=" + "2" + "\s?)[0-9A-Z]{5}").ToString

Also I’m not sure if it allows adding strings together inside the argument. A safe way to make sure it works is by assigning the pattern in a different assign activity, and then use the variable pattern inside the actual method, like this:

pattern = "(?<=" + "2" + "\s?)[0-9A-Z]{5}"

partnum = System.Text.RegularExpressions.Regex.Match(output, pattern).ToString

@evangemert It doesn't seem to give the required result. Can you try it on this text?
“replace part K8Y8C .Run DD Test system is working fine.”
The rest of the above text is standard, and K8Y8C is dynamic: it could be any five characters, in caps only. That is what I need to retrieve. Can you please help me with this?

Well yeah, that’s kind of obvious, as the current pattern only looks for characters that follow the index. If that info always follows ‘replace part’, then you just have to edit the pattern for that:

pattern = "(?<=replace part )[0-9A-Z]{5}"

@evangemert Two things are mandatory here: the first is "replace part" and the second is "working fine". I need the text between "replace part" and ".Run DD Test system is working fine." Sometimes the text would be "Run DD Test system is not fine" or "not working fine".

“replace part K8Y8C .Run DD Test system is working fine.”

In that case you need to add a lookahead, like this:

pattern = "(?<=replace part )[0-9A-Z]{5}(?= \.Run DD Test system)"
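A quick way to try the lookbehind plus lookahead combination is the Python sketch below. It assumes the surrounding text is exactly "replace part " before the code and " .Run DD Test system" after it, as in the sample sentence. The dot is escaped so it only matches a literal period. The name `part_between` is made up for this example.

```python
import re
from typing import Optional

# Lookbehind and lookahead assert the surrounding text without consuming it,
# so the match itself is only the 5-character code.
PATTERN = re.compile(r"(?<=replace part )[0-9A-Z]{5}(?= \.Run DD Test system)")


def part_between(text: str) -> Optional[str]:
    """Return the code between 'replace part ' and ' .Run DD Test system'."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


print(part_between("replace part K8Y8C .Run DD Test system is working fine."))
print(part_between("replace part AB12C .Run DD Test system is not working fine."))
```

Because the lookahead stops at "system", the same pattern covers both the "is working fine" and the "is not working fine" endings.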