QUESTION AUTOMATION (STUDIO OR STUDIOX)

How can I copy the following text from Word into my Excel file?

What I need to copy are the numbers indicated in the image (into the “ID” column of my Excel file) and their respective texts (into the “I want” column of the same file).

Attached screenshots:

image

The Excel file into which I want to incorporate that information, in both “ID” and “I WANT”

Hello,

I’m going to draft the concept here. It can be fine-tuned depending on how far these steps can be simplified.

If your input is an image, I would scrape the entire text into a single paragraph.

  • The challenge will be the quality of the text read from the image
  • If your input is a bulleted text list then it might not be much of an issue

Assuming you got the text into a paragraph:

  • Read the text line by line if your source allows it.
  • If not, split the paragraph into an Array of lines, using the line break as the splitting character
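To make the splitting step concrete, here is a minimal sketch in Python (not the UiPath workflow itself, just the same idea). The function name and the sample text are made up for the example.

```python
def split_into_lines(text: str) -> list[str]:
    """Split a scraped paragraph into trimmed, non-empty lines."""
    # splitlines() breaks on \n, \r\n and \r, so text copied from
    # Windows or Unix sources is handled the same way.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical sample text, standing in for what was read from Word.
sample = "1. Requirements\r\n1.1. Login\r\n\r\n1.1.1. Reset password\n"
lines = split_into_lines(sample)
```

Blank lines are dropped here on the assumption that they carry no bullet; keep them if your row positions depend on them.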

Next, loop through the collection and process each line within an Excel Application Scope:
Use RegEx to find the position of the first letter. This is where the text starts after the bullet number. For example, in "1.1. Login" the first letter is L.

  • The reason I’m saying RegEx is that your bullet numbering varies with the nesting level of your text.
  • If you prefer not to use RegEx, the other way is to find the last occurrence of the period “.” character, which usually sits right before the bullet text begins. This may not be reliable if the bullet text itself contains periods
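A rough Python sketch of both options, to show why the RegEx route is the safer one. The function names and the pattern are my own choices for illustration, not something taken from a UiPath activity.

```python
import re

# Roughly "any letter": a word character that is not a digit or underscore.
FIRST_LETTER = re.compile(r"[^\W\d_]")


def split_at_first_letter(line: str) -> tuple[str, str]:
    """Return (bullet number, bullet text), split at the first letter."""
    match = FIRST_LETTER.search(line)
    if match is None:
        # No letter at all: treat the whole line as the bullet number.
        return line.strip(), ""
    pos = match.start()
    return line[:pos].strip(), line[pos:].strip()


def split_at_last_period(line: str) -> tuple[str, str]:
    """Alternative without RegEx: split after the last period in the line."""
    pos = line.rfind(".")
    if pos == -1:
        return "", line.strip()
    return line[: pos + 1].strip(), line[pos + 1 :].strip()
```

For "1.1. Login" both functions give ("1.1.", "Login"). For a line such as "1.2. Use v2.0 of the API", the last-period version splits inside "v2.0", which is the unreliability mentioned above.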

Once you get the position:

  • You read anything before this position as the bullet number
  • Read anything after this position as the bullet text
  • Use Write Cell to write bullet number to cell A{X}
  • Use Write Cell to write bullet text to cell E{X}
  • Increment counter {X} by 1 to move to the next line
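
The loop and counter can be sketched like this in Python. A plain dictionary stands in for the Write Cell activity, and the starting row of 2 is an assumption (a header in row 1), so adjust both to your sheet.

```python
import re

# Roughly "any letter": a word character that is not a digit or underscore.
FIRST_LETTER = re.compile(r"[^\W\d_]")


def build_cell_writes(lines: list[str], start_row: int = 2) -> dict[str, str]:
    """Map cell addresses to values: bullet number in A{X}, text in E{X}."""
    writes: dict[str, str] = {}
    row = start_row  # the counter {X}
    for line in lines:
        match = FIRST_LETTER.search(line)
        if match is None:
            continue  # no bullet text on this line, skip it
        pos = match.start()
        writes[f"A{row}"] = line[:pos].strip()  # before the position
        writes[f"E{row}"] = line[pos:].strip()  # from the position onward
        row += 1  # move to the next Excel row
    return writes


cells = build_cell_writes(["1. Requirements", "1.1. Login"])
```

In the workflow, each entry of the dictionary would correspond to one Write Cell call.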

Thank you very much for the quick answer !!

in my case it is a text with bullets !!!
Would it be possible to show the implementation in code of what you explained to me?

again, thank you very much!

I tried this to keep the text, but it only keeps the first value and discards the rest of the text. How could I fix it?

I have something set up quickly. It picks up after you get the text into a string. In my example, I have mocked up the text by writing it to a notepad file.

You can see the output is as you expect. The bullets are in Column A and the text goes into Column E. The solution covers different kinds of bullets, provided they use periods as separators. Other kinds of bullets will need modifications to the code.

You have to change the file paths accordingly for this example to work on your end.

Test_RegEx_Bulleted_Text.xaml (13.1 KB)
AgustinInputFile.txt (234 Bytes)
AgustinOutputFile.xlsx (9.0 KB)

Hope this helps.

great, thank you very much!

Could you explain the following 3 images to me, that I do not quite understand?

image

image

File.ReadAllLines() is a way to read multi-line text into an array. Each line of text becomes one element of the array.
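
If it helps to see the same idea outside the workflow, this is roughly what that call does, written in Python. The helper name and the UTF-8 default are assumptions for the sketch.

```python
from pathlib import Path


def read_all_lines(path: str, encoding: str = "utf-8") -> list[str]:
    """Read a text file and return its lines without line-break characters."""
    return Path(path).read_text(encoding=encoding).splitlines()
```

You would call it with the path to your own input file, for example the notepad file from my mock-up.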

matchBullet is a regular expression string that helps identify numbers separated by periods in each bulleted line. The detail is beyond the scope of this conversation.

Simply put, if your line is “1.2.1. This is my Sample Bullet”, then the expression in matchBullet identifies just the "1.2.1. " portion of your bulleted text.

regexBullet turns that pattern into a Regex object the code can use, because matchBullet by itself is simply a string and can’t do much.

In the Body, you pick each line and run it through the regular expression. The function regexBullet.Match returns only the first match it encounters in the line. From the above example it returns "1.2.1. " into the bulletMatch variable.
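
Here is a small Python sketch of the same three pieces. The variable names mirror the ones in the workflow, but the pattern is an assumption written for this example; the actual matchBullet string in the attached .xaml may differ.

```python
import re

# ASSUMPTION: one or more "digits + period" groups at the start of the line,
# followed by optional whitespace. Not copied from the attached workflow.
match_bullet = r"^(?:\d+\.)+\s*"

# Compile the string into a pattern object, like regexBullet in the workflow.
regex_bullet = re.compile(match_bullet)

line = "1.2.1. This is my Sample Bullet"

# match() looks only at the start of the line and returns the first match,
# or None when the line does not begin with a bullet number.
bullet_match = regex_bullet.match(line)
s_bullet = bullet_match.group(0) if bullet_match else ""
```

With this pattern, a line without a leading bullet number leaves s_bullet empty instead of raising an error.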

Next, you pull that bullet number into a string variable sBullet

Lastly, a convenient way to get the bullet text is by using the bullet number. The text is anything that is a portion of the string (or substring) that follows the bullet. In other words, the starting position of the bullet text is the length of the bullet number.

Consequently, the substring function pulls “This is my Sample Bullet” portion into a variable named sBulletText
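
The substring step is a one-liner in most languages. A Python sketch, with a hypothetical helper name:

```python
def bullet_text(line: str, s_bullet: str) -> str:
    """Return everything after the bullet number.

    Assumes s_bullet was matched at the very start of line, so its length
    is also the starting position of the bullet text.
    """
    return line[len(s_bullet):]


s_bullet_text = bullet_text("1.2.1. This is my Sample Bullet", "1.2.1. ")
```

If the bullet could appear after leading spaces, trim the line first so the length still lines up with the start of the text.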

I hope this helps you get a bit more clarity on how the code works.

thanks

