Scraping the web for information based on a user input field (invoice number, etc.)

Hi Team,
Hope you are well. I am working on an automation problem.
Objective: To scrape a website (the relevant URL) and extract the information from that page.
User input: The user will have a predefined invoice number or tax number.

The bot has to pick up the user-defined list of invoice numbers or tax numbers, search for each one at the URL, and scrape the information into an Excel or Word file, etc.

Please let me know how to go about this,

Thanks and Regards,
Sri

You can use RegEx for this.
If you only need to extract data from the URL itself, the task is easier.
If you first need to scrape data from the web page, I would suggest scraping only a limited section of the page and then applying RegEx to that text.
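As a rough illustration of "limit the section, then apply RegEx", here is a minimal Python sketch. The page text, the two section markers, and the helper names `limit_section` and `find_numbers` are all made up for the example, and the 11-digit pattern is only an assumption based on the tax number shared later in this thread.

```python
import re

# Hypothetical page text; a real page will look different.
PAGE_TEXT = """Header and navigation links 99999999999
== Results ==
Tax number: 20515239112
== End of results ==
Footer
"""


def limit_section(text, start_marker, end_marker):
    """Return only the text between the two markers ("" if start is missing)."""
    start = text.find(start_marker)
    if start == -1:
        return ""
    start += len(start_marker)
    end = text.find(end_marker, start)
    return text[start:end] if end != -1 else text[start:]


def find_numbers(text, digits=11):
    """Find standalone runs of exactly `digits` digits (assumed number format)."""
    return re.findall(rf"\b\d{{{digits}}}\b", text)


section = limit_section(PAGE_TEXT, "== Results ==", "== End of results ==")
numbers = find_numbers(section)
print(numbers)
```

Running the pattern on the whole page would also pick up the unrelated number in the header, which is why narrowing the section first helps.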

If you can share some examples of the invoice/tax number formats here, other forum members can post a RegEx for you.

Hi @jivankumar.kedar ,

Thank you for the reply and the suggestion. Please find below the details.

Website - SUNAT - Consulta Ruc
Tax number - 20515239112

Steps

  1. We need to enter the tax number at the above URL.
  2. It will return the necessary information.
  3. We need to scrape all the information from that page into Excel.

So my questions are:

  1. Can we create a list of tax numbers in the bot so that it automatically searches for each one on the web page?
  2. Can we extract all the relevant information from the web page for the corresponding tax number and put it in an Excel file?

Please let me know if this is clear,

Thanks,

Thanks for providing the details. The requirement is much clearer now.
To answer your points:

  1. Yes, it’s possible to create a list of tax numbers. I assume you are getting your tax numbers from an input file or some other data source. Rather than creating a list, though, I would suggest creating a data table with Tax Number as the first column, and later dynamically adding one more column to hold the data scraped from the website.

  2. You can then write this whole table into an Excel file.

This assumes you don’t need additional columns in Excel for the individual scraped fields, e.g. Número de RUC, Tipo Contribuyente, Fecha de Inscripción, etc. In that case, you can write all of the scraped data for a tax number into a single cell.
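The data table approach described above, with all scraped text for a tax number going into one cell, could be sketched roughly like this. It is Python rather than a workflow file, purely to show the logic. `fetch_page_text` is a hypothetical placeholder for the real scraping step, the column names are assumptions, and the output is CSV (which Excel can open) rather than a native Excel file.

```python
import csv
import io


def fetch_page_text(tax_number):
    """Hypothetical stand-in for the scraping step; returns canned text."""
    return f"(scraped text for {tax_number} would go here)"


def build_table(tax_numbers, scrape=fetch_page_text):
    # Start with a single column: one row per tax number.
    table = [{"Tax Number": number} for number in tax_numbers]
    # Add the second column afterwards and fill it row by row.
    for row in table:
        row["Scraped Data"] = scrape(row["Tax Number"])
    return table


def write_csv(table, stream):
    # The header is taken from the first row's column names.
    writer = csv.DictWriter(stream, fieldnames=list(table[0].keys()))
    writer.writeheader()
    writer.writerows(table)


table = build_table(["20515239112"])
buffer = io.StringIO()  # swap for open("output.csv", "w", newline="") to write a file
write_csv(table, buffer)
print(buffer.getvalue())
```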

If you need to extract the fields separately and write them into separate columns in Excel, you need to apply RegEx to the scraped data.
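For the separate-columns case, a minimal Python sketch of the RegEx step might look like the one below. The sample text is made up (the company name and date are placeholders, and the real page layout may differ), and it assumes each field appears on its own line as "label: value". `extract_fields` is a hypothetical helper name.

```python
import re

# Made-up scraped text; the real page layout may differ.
SAMPLE_TEXT = """Número de RUC: 20515239112 - EXAMPLE COMPANY S.A.C.
Tipo Contribuyente: SOCIEDAD ANONIMA CERRADA
Fecha de Inscripción: 01/01/2000
"""

FIELD_LABELS = ["Número de RUC", "Tipo Contribuyente", "Fecha de Inscripción"]


def extract_fields(text, labels=FIELD_LABELS):
    """Return {label: value} for each label; "" when a label is not found."""
    fields = {}
    for label in labels:
        # Match "<label>: <value>" on a single line.
        pattern = rf"^{re.escape(label)}\s*:\s*(.+)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        fields[label] = match.group(1).strip() if match else ""
    return fields


row = extract_fields(SAMPLE_TEXT)
for label, value in row.items():
    print(f"{label} -> {value}")
```

Each key of the returned dictionary would become one column in the output table, with the tax number's values on its row.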

Thank you @jivankumar.kedar. I really appreciate your points and help.
Yes, I need to extract the fields and write each one in a separate column.

As I am new to this, it would be great if you could help me with a sample workflow.

I am attaching a sample that I have tried, but it just scrapes the website. Please let me know how to incorporate both of the points you mentioned.

  1. Create a data table with the list of tax numbers and dynamically add one more column to hold the scraped data.
  2. Use RegEx to put each field in a separate column.

Thanks for the support again,
Sai_test_2.xaml (16.7 KB)

Hi @jivankumar.kedar ,

Please let me know your thoughts on the XAML. If you can help, that would be great.

Cheers,
Sri