Extract specific data from .html file

What have you implemented on your end so far? Can you share screenshots or the XAML?

As you can see in the screenshot, I'm doing something wrong. I created a variable for the Do sequence (InputData) to hold the result of “Matches”, but I get an error for it (shown in red). It seems to have beaten me :frowning:


I started to change the strategy a bit (as I saw above from @Surya_Narayana_Korivipadu):

1 - Read the HTML file with the “Read text” activity;
2 - “Matches” activity with the regex “\b\d{10}\b”;
3 - In the “For Each” activity I put a “Write Line” in the body to see what data it displays, and I saw that it displays every 10-digit value (because of the regex);
4 - I tried “(?<=Invoice posted:\s?)\d{10}(?=/)”, but I can't tell whether it matches anything or not (second picture), so I don't think I can use it :thinking:
5 - My question is: once I solve this extraction, how can I save the data to an Excel file? As I said, I only need the values that start with 51. Sorry for so many questions.
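
One way to tell whether a pattern such as the one in step 4 matches anything is to run it on a small sample and print the match count. The snippet below is only a sketch in Python, outside UiPath, and the sample text is made up. It uses a capture group instead of the lookbehind, because Python's `re` module accepts only fixed-width lookbehinds (the .NET regex engine used by UiPath does not have that restriction).

```python
import re

# Hypothetical sample text; the real HTML file may be laid out differently.
sample = "Invoice posted: 5100001234/2021, ref 4500009876"

# Broad pattern from step 2: any standalone 10-digit number.
broad = re.findall(r"\b\d{10}\b", sample)

# Label-anchored variant of step 4, written with a capture group.
anchored = re.findall(r"Invoice posted:\s?(\d{10})(?=/)", sample)

# Printing the counts makes it obvious whether a pattern matched at all.
print(len(broad), broad)
print(len(anchored), anchored)
```

If the anchored count is 0 on the real file, the text between the label and the number probably differs from what the pattern expects (for example, HTML tags or extra whitespace), though that is a guess without seeing the file.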

Thank you very much for your time!


@ovidiu_2088 - Here you go…

  1. Read your HTML file - I saved the output as “StrInput”
  2. Matches - “\b51\d{8}\b” - this returns only the numbers that start with 51
  3. Build Data Table - I saved the output as DtReport
  4. For Each

     4.1 Add Data Row

  5. Outside the loop, use the Write Range activity to write DtReport to the Excel file.
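
For readers who want to check the logic outside UiPath, the steps above can be sketched in Python. This is not the workflow itself: the sample HTML, the helper names, and the `InvoiceNumber` column header are assumptions, and CSV text stands in for the Excel file written by Write Range.

```python
import csv
import io
import re

# Hypothetical stand-in for the text read from the HTML file (StrInput).
str_input = """
<tr><td>Invoice posted: 5100001234/2021</td></tr>
<tr><td>Order: 4500009876</td></tr>
<tr><td>Invoice posted: 5100005678/2021</td></tr>
"""

def extract_invoice_numbers(text):
    """Return every standalone 10-digit number that starts with 51."""
    return re.findall(r"\b51\d{8}\b", text)

def rows_to_csv(values, header="InvoiceNumber"):
    """Build a one-column table (like DtReport) and serialise it as CSV."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])
    for value in values:  # one row per match, like Add Data Row in the loop
        writer.writerow([value])
    return buffer.getvalue()

invoice_numbers = extract_invoice_numbers(str_input)
report_csv = rows_to_csv(invoice_numbers)  # written once, after the loop
print(report_csv)
```

The order number starting with 45 is skipped because the pattern requires the first two digits to be 51.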

Hope this helps…

Perfect, thank you so much, it is working as you said. The only issue now is that, as you can see in the attachment, each invoice number appears twice in the table. How can I save each number only once in the Excel file? At the moment I have every one of them twice.


@ovidiu_2088 … so instead of 6 rows you want only 3 rows in the Excel file, i.e. distinct values, right?

Yes, I need only 3 rows.

@ovidiu_2088 - Please make the three changes…

  1. For Each - Type argument - change it to String.

  2. For Each - Values - use the expression below:

    IEnRegex.Select(Function(x) x.ToString).Distinct().ToArray

  3. Add Data Row - this time it should be only {Item}
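
The effect of the distinct step can be illustrated with a short Python sketch. It is not UiPath code; the sample text and the helper name `distinct_in_order` are made up for the illustration. It keeps the first occurrence of each value and preserves the original order.

```python
import re

# Hypothetical text in which each invoice number appears twice.
str_input = (
    "5100001234/2021 5100001234 "
    "5100005678/2021 5100005678 "
    "5100009999/2021 5100009999"
)

matches = re.findall(r"\b51\d{8}\b", str_input)

def distinct_in_order(values):
    """Drop repeats while keeping the first occurrence of each value."""
    return list(dict.fromkeys(values))

unique_numbers = distinct_in_order(matches)
print(len(matches), len(unique_numbers))
print(unique_numbers)
```

With this sample the six matches collapse to three values, which mirrors the 6 rows versus 3 rows discussed above.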


Please try this…

Yes, it's working. Thank you so much for your time. As they say: happy automation! :slight_smile:


@ovidiu_2088 - Once you are done with your testing, please mark my post as the solution so that it will benefit others…
