Accurately extracting checkbox values within a grid

I’m working within the Document Understanding section of UiPath and training models for extracting data from standard Acord forms, specifically Acord 28. This form has a tabular layout with a vertical grid of checkboxes through the center.

Currently, for each row of checkboxes, I’ve set up three separate fields for extraction, such as:

  • Limited Fungus Coverage Yes
  • Limited Fungus Coverage No
  • Limited Fungus Coverage N/A

The issue I’m encountering is that whenever annotations are processed, the model incorrectly identifies every checked box as “Yes,” even when the checkmark clearly corresponds to the “No” or “N/A” column. Although I haven’t processed extensive training data yet, this issue has consistently appeared, and the model hasn’t correctly identified any checkbox marked “No” or “N/A” so far.

  1. Should I structure the fields differently to improve accuracy, such as using a single field with three possible values (Yes/No/N/A) rather than separate fields for each?
  2. Is there a different extraction method I should be using for these fields? If so, can you point me in the direction of some documentation for that?

I’d appreciate any advice as this is my first Document Understanding project and my first time with Machine Learning.

Thanks,
Matthew

Hello @Matthew_M ,

You can create separate labels such as:

  • Limited Fungus Coverage Yes
  • Limited Fungus Coverage No
  • Limited Fungus Coverage N/A

but you need to train the extractor with multiple images that have different checkboxes checked.

In the example above, the checkboxes have no labels of their own. In this case, the extractor returns the string value of the checkbox, which is one of these two characters:

  • if the extractor returns ☒, this corresponds to YES;
  • if the extractor returns ☐, this corresponds to NO.
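
If you keep three separate fields per row, the three checkbox strings can be collapsed into a single Yes/No/N/A answer after extraction. The snippet below is only a rough sketch of that idea in plain Python: the function name, the dictionary layout, and the option names are assumptions for illustration, not part of any UiPath API. It assumes each field comes back as one of the two characters listed above.

```python
# Hypothetical post-processing step, not a UiPath API.
# Assumes each checkbox field is returned as "☒" (checked) or "☐" (unchecked).
CHECKED = "\u2612"    # ☒
UNCHECKED = "\u2610"  # ☐


def resolve_choice(extracted):
    """Return the one option whose checkbox is checked.

    `extracted` maps an option name (e.g. "Yes", "No", "N/A") to the
    string returned for that checkbox. Returns None when zero or more
    than one box is checked, so the row can be sent for manual review.
    """
    checked = [
        option
        for option, value in extracted.items()
        if value.strip() == CHECKED
    ]
    return checked[0] if len(checked) == 1 else None


# Example row: "Limited Fungus Coverage" with the "No" box checked.
row = {"Yes": UNCHECKED, "No": CHECKED, "N/A": UNCHECKED}
print(resolve_choice(row))
```

Returning None for rows with no box or several boxes checked is a deliberate choice here, so ambiguous rows get flagged instead of silently defaulting to "Yes".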

Reference: Document Understanding - Checkboxes and Signatures

thanks,
Darshan :wink: