Extracting multiple tables from email body

Hi There,

I am trying to extract tables from an email body that contains 3 tables, using the “Extract data tables from HTML” activity, but it throws the error “Object reference not set to an instance of an object”.

When I extract a table from an email body that has only one table, the extraction works fine.

Kindly help me resolve this so that I can extract all 3 tables from a single email body, since that is my input requirement.

TIA
Ramya A

Hi @ramya_anbazhagan
Try this method.

  • Read Email Body:
    • Use Get IMAP Mail Messages or Get Outlook Mail Messages to retrieve the email.
    • Extract the email body as a string.
  • Load HTML Document:
    • Use Invoke Code or Invoke Method to load the HTML document using HTML Agility Pack.

  • Extract Tables Using HTML Agility Pack:

  • Add Dependencies:

    • Ensure you have the HtmlAgilityPack package installed in UiPath.
  • Invoke Code Activity:

// Import these namespaces through the Imports panel rather than with `using`
// lines: the Invoke Code body is compiled as a method body, where `using`
// directives are typically rejected.
//   HtmlAgilityPack, System.Collections.Generic, System.Data

// Initialize the HTML document
var doc = new HtmlDocument();
doc.LoadHtml(htmlContent ?? string.Empty); // htmlContent is your email body (HTML)

// Find all table nodes; SelectNodes returns null when nothing matches
var tableNodes = doc.DocumentNode.SelectNodes("//table");

// Initialize list to hold data tables
List<DataTable> tables = new List<DataTable>();

// Loop through each table node and extract data
if (tableNodes != null)
{
    foreach (var table in tableNodes)
    {
        // Create a new data table for each HTML table
        DataTable dataTable = new DataTable();

        // Extract rows (note: ".//tr" also matches rows of nested tables)
        var rows = table.SelectNodes(".//tr");
        if (rows != null)
        {
            foreach (var row in rows)
            {
                var cells = row.SelectNodes(".//th|.//td");
                if (cells == null)
                {
                    continue; // skip rows without cells
                }

                // Add columns until the table is as wide as this row;
                // naming by the current column count avoids duplicate names
                while (dataTable.Columns.Count < cells.Count)
                {
                    dataTable.Columns.Add($"Column{dataTable.Columns.Count + 1}");
                }

                // Create the row only after the columns exist, then fill it
                DataRow dataRow = dataTable.NewRow();
                for (int i = 0; i < cells.Count; i++)
                {
                    dataRow[i] = cells[i].InnerText.Trim();
                }

                dataTable.Rows.Add(dataRow);
            }
        }

        tables.Add(dataTable);
    }
}

// Output the list of data tables (empty if the body contains no tables)
outputTables = tables;
  • Variables:
    • htmlContent (String, In argument): The email body HTML content.
    • outputTables (List<DataTable>, Out argument): The list to store the extracted data tables.
  • Process Extracted Tables:
    • You can now use the outputTables list in your workflow; each item is a DataTable representing one of the HTML tables from the email.
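
As a rough illustration of the "Process Extracted Tables" step, here is a minimal standalone sketch (a console program, not an Invoke Code body) of how the outputTables list could be consumed. The sample data and the PromoteHeaders helper are hypothetical and not part of the steps above. The sketch assumes the first extracted row of each table holds the header cells, which is how the extraction code stores th cells.

// Standalone sketch; the sample data and PromoteHeaders are illustrative assumptions.
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical stand-in for the outputTables list produced by the extraction step
var outputTables = new List<DataTable>();
var sample = new DataTable();
sample.Columns.Add("Column1");
sample.Columns.Add("Column2");
sample.Rows.Add("Name", "Amount"); // header row stored as data
sample.Rows.Add("Widget", "10");
outputTables.Add(sample);

// Walk every extracted table and print its contents
for (int t = 0; t < outputTables.Count; t++)
{
    DataTable table = PromoteHeaders(outputTables[t]);
    Console.WriteLine($"Table {t + 1}: {table.Rows.Count} row(s), {table.Columns.Count} column(s)");
    foreach (DataRow row in table.Rows)
    {
        Console.WriteLine(string.Join(" | ", row.ItemArray));
    }
}

// Hypothetical helper: copies a table, using its first row as column names
DataTable PromoteHeaders(DataTable source)
{
    var result = new DataTable();
    if (source.Rows.Count == 0)
    {
        return result;
    }

    DataRow header = source.Rows[0];
    for (int c = 0; c < source.Columns.Count; c++)
    {
        string name = header[c].ToString()?.Trim() ?? "";
        if (name.Length == 0)
        {
            name = $"Column{c + 1}"; // positional name for empty header cells
        }

        // Append a numeric suffix until the column name is unique
        string unique = name;
        int suffix = 2;
        while (result.Columns.Contains(unique))
        {
            unique = $"{name}_{suffix++}";
        }
        result.Columns.Add(unique);
    }

    // Copy the remaining rows as data
    for (int r = 1; r < source.Rows.Count; r++)
    {
        result.Rows.Add(source.Rows[r].ItemArray);
    }
    return result;
}

Inside a UiPath workflow the same idea would more likely be a For Each activity over outputTables with TypeArgument set to DataTable. The console loop above is only meant to show the shape of the data.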

Hope it works.
Cheers

Hi @ramya_anbazhagan

Check the thread below:

Regards

Hi @ramya_anbazhagan

You can add an idx attribute, a table name, or a table number to your selector to extract the details of each table from a single email.

Hope it helps!!!