jjes
(Jeppe Jespersen)
January 17, 2023, 6:50am
1
So, I have this Document Text that I’m trying to extract data from. Text contains order lines looking something like this:
When testing my regex in regexr (and others), it goes fine:
And when I build the expression in Studio’s Regex Builder, everything looks good, too:
Problem is… nothing after the line break gets extracted and saved to my results.
Anyone with any ideas?
Anil_G
(Anil Gorthi)
January 17, 2023, 6:58am
2
@jjes
There is a property called Regex Options; set it to Multiline. That should solve it.
In your screenshot the option is currently set to IgnoreCase; that is the place where you set Multiline.
Cheers
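For reference, here is a minimal Python sketch of what a multiline option usually changes. It is not UiPath or .NET code, and the shortened sample text is made up for illustration. Python's `re.MULTILINE` and `re.DOTALL` are used as stand-ins for the .NET Multiline and Singleline options: the first only changes where `^` and `$` can match, while the second is what lets `.` cross a line break.

```python
import re

# Shortened, made-up stand-in for the order lines in this thread.
text = "1 Motor\nYour articleno.: 225521\n2 Rotor\nYour articleno.: 443212"

# Without MULTILINE, ^ only matches at the very start of the string.
starts_default = re.findall(r"^\d+ \w+", text)

# With MULTILINE, ^ also matches right after each line break.
starts_multiline = re.findall(r"^\d+ \w+", text, flags=re.MULTILINE)

# MULTILINE does not let "." cross a line break, so this finds nothing.
crosses_multiline = re.search(r"Motor.*225521", text, flags=re.MULTILINE)

# DOTALL is the flag that lets "." match "\n" as well.
crosses_dotall = re.search(r"Motor.*225521", text, flags=re.DOTALL)

print(starts_default)
print(starts_multiline)
print(crosses_multiline is None, crosses_dotall is not None)
```

So if the goal is to capture text on the following line, the pattern itself has to consume the line break (or use a dot-matches-newline option); a multiline option alone may not be enough.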
Yoichi
(Yoichi)
January 17, 2023, 6:59am
3
Hi,
It might be a line-break issue.
Can you try using (\r?\n) instead of \n?
Regards,
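To illustrate why `\r?\n` can matter, here is a small Python sketch (not the .NET engine Studio uses) on a made-up two-line string with Windows-style CRLF line breaks. The character class `[^\r\n]+` is my own choice for the demo, so that the capture stops before the `\r`.

```python
import re

# Made-up sample with a Windows-style CRLF line break.
crlf_text = "line one\r\nYour articleno.: 225521"

# A bare "\n" cannot match here: after "line one" the next character is "\r".
strict = re.search(r"([^\r\n]+)\n(Your articleno\.: \d+)", crlf_text)

# "\r?\n" accepts both LF and CRLF line breaks.
tolerant = re.search(r"([^\r\n]+)\r?\n(Your articleno\.: \d+)", crlf_text)

print(strict)
print(tolerant.group(1), "|", tolerant.group(2))
```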
jjes
(Jeppe Jespersen)
January 17, 2023, 7:28am
4
Tried it. No difference. But thanks
jjes
(Jeppe Jespersen)
January 17, 2023, 7:28am
5
Nope, same result. But thanks
Yoichi
(Yoichi)
January 17, 2023, 7:29am
6
Hi,
Can you share your input text and current pattern as a text file?
Regards,
Hi @jjes
Try this pattern:
(?m)^\d+.*(.*?)Your articleno.:\s\d+
or
(?m)^\d+.*\r?\n(.*?)Your articleno.:\s\d+
Regards
Sudharsan
jjes
(Jeppe Jespersen)
January 17, 2023, 7:59am
8
Pattern:
\d+\s+.{1,10}\s+(.+)\s+\d{2}-\d{2}-\d{2}.+\d+,\d{2}(.+)((?:.))(\r?\n?)(. )
Input:
Yada,yada…
1 BB22B-D Motor, 10Nm, 20-12-21 11 200,15 190,10 * 2.091,10 on/off/1-pkt., dim 8x8, 150s.
Your articleno.: 225521
2 BM21-a11.2 Rotor, 4Nm, on/off, 20-12-21 5 200,00 150,00 * 750,00 multipak, dim 8x8, spring-return.
Your articleno.: 443212
bla…bla…
Yoichi
(Yoichi)
January 17, 2023, 8:03am
9
Hi,
Can you try the following pattern?
\d+\s+.{1,10}\s+(.+)\s+\d{2}-\d{2}-\d{2}.+\d+,\d{2}(.+)((?:.)*)((\r?\n)*)(.*)
Regards,
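As a quick check outside Studio, the pattern above can be run against the sample input from post 8. This is a Python sketch, not the .NET engine that Studio uses, although the pattern only relies on basic constructs. The whitespace in the sample is assumed to be exactly as pasted in the forum (single spaces, `\n` line breaks), which may differ from the real document text.

```python
import re

# Pattern copied from the post above.
pattern = (
    r"\d+\s+.{1,10}\s+(.+)\s+\d{2}-\d{2}-\d{2}.+\d+,\d{2}"
    r"(.+)((?:.)*)((\r?\n)*)(.*)"
)

# The four order lines from post 8, joined with plain "\n" (an assumption).
sample = "\n".join([
    "1 BB22B-D Motor, 10Nm, 20-12-21 11 200,15 190,10 * 2.091,10 on/off/1-pkt., dim 8x8, 150s.",
    "Your articleno.: 225521",
    "2 BM21-a11.2 Rotor, 4Nm, on/off, 20-12-21 5 200,00 150,00 * 750,00 multipak, dim 8x8, spring-return.",
    "Your articleno.: 443212",
])

matches = list(re.finditer(pattern, sample))

# Group 1 is the description before the date, group 6 is the following line.
descriptions = [m.group(1) for m in matches]
second_lines = [m.group(6) for m in matches]

for description, second_line in zip(descriptions, second_lines):
    print(description, "->", second_line)
```

If the last group comes back empty in Studio while a check like this returns the second line, the difference is more likely in the input text or in how the results are stored than in the pattern itself.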
Yoichi
(Yoichi)
January 17, 2023, 8:09am
10
Hi,
FYI, I’m attaching a sample that uses the above pattern.
Sample20220117-4L.zip (2.9 KB)
Regards,
jjes
(Jeppe Jespersen)
January 17, 2023, 8:12am
11
So strange, still nothing from the second line of each order.
Yoichi
(Yoichi)
January 17, 2023, 8:17am
12
Hi,
Does the above sample (Sample20220117-4L.zip) work in your environment?
If yes, your input string probably differs from the one in the sample.
Can you share the input text as a file, using the Write Text File activity?
Regards,
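One way to compare two input strings that look identical is to make the line-break characters visible. The sketch below is in Python rather than a UiPath activity, `count_line_breaks` is a hypothetical helper name, and the sample string is made up to contain one of each line-break style.

```python
def count_line_breaks(text: str) -> dict:
    """Count CRLF pairs, lone LF and lone CR characters in text."""
    crlf = text.count("\r\n")
    return {
        "CRLF": crlf,
        "LF": text.count("\n") - crlf,  # "\n" not preceded by "\r"
        "CR": text.count("\r") - crlf,  # "\r" not followed by "\n"
    }

# Made-up sample mixing all three line-break styles.
sample = "1 Motor\r\nYour articleno.: 225521\n2 Rotor\rYour articleno.: 443212"

# repr() shows the escape sequences instead of rendering the line breaks.
print(repr(sample))
counts = count_line_breaks(sample)
print(counts)
```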
jjes
(Jeppe Jespersen)
January 17, 2023, 8:30am
13
Yes, your example spits out both lines.
I have attached the text file from the Digitize step, but anonymized it a bit.
input.txt (265 Bytes)
Yoichi
(Yoichi)
January 17, 2023, 8:45am
14
Hi,
In my environment it works as is, even with the above input.txt as input.
Sample20220117-4Lv2.zip (3.2 KB)
Is there any difference between the above and your workflow?
Regards,
jjes
(Jeppe Jespersen)
January 17, 2023, 5:43pm
15
If I use the regex from your example on that same text in my automation, the last line is still not saved in my result (I export the dataset by iterating through the Tables collection and doing a Write Range on each one). So that is very strange.
What is also odd: if I check the ExtractionResult object’s ResultsDocument->Fields->Raw View->(3), I can see that the last line is actually NOT in the ExtractionResult, even though I’m using the same regex on the same target text as in your example.
I’m puzzled.
jjes
(Jeppe Jespersen)
January 17, 2023, 6:46pm
16
No luck with these, but thanks for trying
Anil_G
(Anil Gorthi)
January 17, 2023, 7:45pm
17
jjes:
1 BB22B-D Motor, 10Nm, 20-12-21 11 200,15 190,10 * 2.091,10 on/off/1-pkt., dim 8x8, 150s.
Your articleno.: 225521
2 BM21-a11.2 Rotor, 4Nm, on/off, 20-12-21 5 200,00 150,00 * 750,00 multipak, dim 8x8, spring-return.
Your articleno.: 443212
bla…bla…
@jjes
Can you please try this:
.*\n(?=Your articleno\.: \d{6}).*
Cheers
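For anyone wanting to try the lookahead pattern above outside Studio, here is a Python sketch (not the .NET engine) run against the sample order lines quoted in this post. It assumes plain `\n` line breaks and single spaces, as pasted in the forum. Each match spans an order line plus the "Your articleno." line that follows it.

```python
import re

# Pattern copied from the post above.
pattern = r".*\n(?=Your articleno\.: \d{6}).*"

# Sample order lines as quoted, joined with plain "\n" (an assumption).
sample = "\n".join([
    "1 BB22B-D Motor, 10Nm, 20-12-21 11 200,15 190,10 * 2.091,10 on/off/1-pkt., dim 8x8, 150s.",
    "Your articleno.: 225521",
    "2 BM21-a11.2 Rotor, 4Nm, on/off, 20-12-21 5 200,00 150,00 * 750,00 multipak, dim 8x8, spring-return.",
    "Your articleno.: 443212",
])

# No capture groups, so findall returns the full text of each match.
orders = re.findall(pattern, sample)

# Split each two-line match into the order line and the article line.
pairs = [tuple(order.split("\n")) for order in orders]

for order_line, article_line in pairs:
    print(article_line, "<-", order_line[:20])
```

Note that with CRLF line breaks the bare `\n` would still match, but the order line would end with a stray `\r`.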