I am not able to extract a table using Form Extractor

Hi all,
I am trying to use the Document Understanding activities to read invoices. It works perfectly with the “Form Extractor”, but my issue is extracting a table with more than one row. I have tried the ML model, but it is not working. Is it not possible to extract a table with the “Form Extractor”? Please find the attached workflow; it has everything. I would appreciate it if anyone could help me figure this out. Thanks! DocumentUnderstanding_Invoices.zip (2.3 MB)

Hello @menna_almahdy,

You can use the Form Extractor to extract tables as well, provided they have a fixed height, fixed columns, and a fixed number of rows.

All you need to do is click on the table field and mark the table as you would for regular processing (use the three dots on the right side of the header row to access Extract New Table).

Then click Save, and your template should be saved.

Ioana


I cannot get the table extractor in the Form Extractor to work effectively. It does not pick up all the values and, for some reason, seems to reposition the capture area for the table incorrectly, missing the bottom third. Is anyone an expert in this function?

Hi,
Does that mean you can’t use the Form Extractor for invoices, since the number of items differs per invoice?

Yes, unless you want to configure very specific invoices that have a fixed table area. In that case, either the Form Extractor (if the lines have the same height) or the Regex Extractor could be used for the line items.

Do remember that the Form Extractor is designed to process fixed-form documents at this stage; imagine an IRS W-4 form, for example. To process invoices or other document types of variable format, we strongly recommend using the Machine Learning Extractor.
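
To illustrate the regex route for line items mentioned above, here is a minimal Python sketch. It is not the UiPath Regex Extractor itself, only an illustration of the idea. The line layout (description, quantity, unit price, line total), the `LINE_ITEM` pattern, the `extract_line_items` helper, and the sample text are all assumptions made for this example; a real invoice would need its own pattern.

```python
import re

# Assumed (hypothetical) layout of one line item in the OCR text:
#   "<description> <quantity> <unit price> <line total>"
LINE_ITEM = re.compile(
    r"^(?P<description>.+?)\s+(?P<quantity>\d+)\s+"
    r"(?P<unit_price>\d+\.\d{2})\s+(?P<total>\d+\.\d{2})$"
)


def extract_line_items(table_text):
    """Return one dict per text line that matches the assumed layout."""
    items = []
    for line in table_text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if match:  # header rows and blank lines simply do not match
            items.append(match.groupdict())
    return items


# Made-up sample text standing in for the OCR output of the table area.
sample = """Description Qty Unit Price Total
Widget A 2 10.00 20.00
Service fee 1 5.50 5.50
"""

rows = extract_line_items(sample)
for row in rows:
    print(row)
```

Because the pattern is applied line by line, the number of rows does not need to be known in advance, which is the main difference from a fixed-height table template.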

Could you share a sample file and template? (Please include the project's .local folder in the archive as well, so that the templates can be seen.)

Hi, I didn’t see the reply, as it came several days after my post. Are you able to help with the Form Extractor? The document has a standard, unchanging structure, but I cannot get the Form Extractor to read it. At the moment I work around the problem by ingesting the document into Adobe or ABBYY and exporting it as HTML.
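
For anyone taking the same HTML export route, the rows can be read back with Python's standard `html.parser` module. This is only a sketch: the `TableRows` class, the `read_table` helper, and the sample markup are hypothetical, and it assumes the export contains a plain `<table>` built from `<tr>`, `<th>`, and `<td>` elements, which may not hold for every tool's output.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def read_table(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up markup standing in for the exported HTML.
sample_html = (
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Widget A</td><td>2</td></tr></table>"
)
table = read_table(sample_html)
print(table)
```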