Extracting XML Data from a Website

Hi team,

I was trying to extract data (in xml format) from a website and take to data table and I did a couple of unsuccessful attempts.

Here is a little background of what I want to accomplish. I want to extract data of over 1000 people list which is in xml format and move it to data table. One person’s data have five attributes (Employee Id, Full Name, Primary Email, Hire Date, Business Title). My final goal is extracting list email of each person. If there are 1000 people, I need their respective 1000 primary emails.

Can anyone explain the steps needed to be followed? Here is the screenshot of xml file format for your reference.

Thanks a lot for your help!

You can try using Get text or data scraping and read the entire text from website, then try using deserialize XML activity or refer the link attached to convert to datatable How to read xml data into datatable

@Quenton_Wayne_Rebello

Thanks for your response. The Get Text activity can get me xml data in text format.
I am not too familiar with xml. Could you please layout the activities involved until getting the data into data tables?

Appreciate your help!

@Sisay_Dinku
Step 2: Deserialize XML
Use the “Deserialize XML” activity to convert the XML string into an XML document (XDocument).

  1. Make sure to set the “SourceString” property of the “Deserialize XML” activity to the string variable where you stored the XML response.

Step 3: Create DataTable 5. Create a new DataTable to store the extracted data. You can use the “Build DataTable” activity to define the columns (Employee Id, Full Name, Primary Email, Hire Date, Business Title).

Step 4: Extract Data and Fill DataTable 6. Use the “For Each” activity to loop through the XML elements that contain information about each person.

Within the loop, use XPath or LINQ queries to extract the data (Employee Id, Full Name, Primary Email, Hire Date, Business Title) from each XML element and add a new DataRow to the DataTable with this information.

Step 5: Output the DataTable 8. The DataTable will now contain data for all 1000 people. You can further process this DataTable to obtain the list of primary emails for each person.

Here’s a basic outline of the workflow:

<Root>
  <Person>
    <EmployeeId>12345</EmployeeId>
    <FullName>John Doe</FullName>
    <PrimaryEmail>john.doe@example.com</PrimaryEmail>
    <HireDate>2023-07-01</HireDate>
    <BusinessTitle>Software Engineer</BusinessTitle>
  </Person>
  <!-- More Person elements here -->
</Root>

Did this work? This block looks good to me…

Dear @ryan.brown and @raja.arslankhan
Thanks a lot for the help. I’d greatly appreciate if you send the linq query used to convert xml string into xml document.

HI

Can anyone help with linq query in here

@raja.arslankhan Step four

Thanks!

@Sisay_Dinku
The XML is defining namespaces. You can handle it as described here:

A general flow could look like this

  • Deserialze XML - out: xDoc

  • Assign Activity: xnsWD | DataType: XNamespace = the XML Namespace for the WD prefix

  • Build Datatable - dtResult, Configure the columns as it is within the XML

  • For each Actvitiy | item in xDoc.Descendants(xnsWD + “Report_Entry”) | TypeArgument: XElement

    • Add Data Row Activity:
      ArrayRow = item.Elements().Select(Function (x) x.Value).Cast(of Object).toArray
      DataTable: dtResult

As An alternate:

  • Deserialze XML - out: xDoc
  • Assign Activity: xnsWD | DataType: XNamespace = the XML Namespace for the WD prefix
  • Build Datatable - dtResult, Configure 1 Email column
  • Assign Activity:
    dtResult =
(From xe in xDoc.Descendants(xnsWD + "primaryWorkEmail")
Let ra = new Object(){xe.Value}
Select r = dtResult.Rows.Add(ra)).CopyToDataTable

Hi @ppr

I am getting an error. Can someone show with an example?
Here is the xml sample attached in text.

Test.txt (866 Bytes)
Thanks!

As described we are doing the XML Namespace Anaylisis:

grafik
And can prepare for us the xnsDefault / xnsWD XNamespace Variable by:

grafik

grafik

So as a quick check we can do:
grafik

Better to tell us what was done, what was failing.

We corrected some ending tasks from your XML sample. But as you can see the mentioned main building blocks from above are working successfully.

@ppr
Working successfully. Thanks a lot!

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.