I’m currently training an ML Extractor for invoice processing and I’ve run into a problem that I can’t seem to fix.
I have two types of invoices:
Type A – Works perfectly
These invoices have bigger gaps between each item row, and the ML Extractor handles them without any issues. Every row is detected cleanly.
Type B – Problematic
These invoices have very tight spacing between item rows, and this is where the problem starts.
The ML Extractor keeps merging multiple lines into a single row or mixing up fields between rows. Basically, the model can’t “see” where one row ends and the next begins.
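To put a number on "tight spacing", one option is to compare the vertical gap between consecutive rows to the text height. The following is only a minimal sketch: the `(top, bottom)` line format, the `gap_ratios` helper, and the sample coordinates are hypothetical illustrations, not output from UiPath Document OCR and not a description of how the ML Extractor works internally.

```python
from typing import List, Tuple

# Hypothetical format: (top, bottom) y-coordinates in pixels for one text line.
Line = Tuple[float, float]


def gap_ratios(lines: List[Line]) -> List[float]:
    """Return the gap-to-height ratio for each pair of consecutive lines.

    A ratio near 0 means the next row starts almost immediately below the
    previous one; larger values mean more white space between rows.
    """
    ordered = sorted(lines, key=lambda box: box[0])  # top-to-bottom order
    ratios = []
    for (top_a, bottom_a), (top_b, _) in zip(ordered, ordered[1:]):
        height = bottom_a - top_a
        if height <= 0:
            raise ValueError("line height must be positive")
        ratios.append((top_b - bottom_a) / height)
    return ratios


# Made-up coordinates, for illustration only.
type_a_rows = [(100, 120), (150, 170), (200, 220)]  # wide gaps
type_b_rows = [(100, 120), (122, 142), (144, 164)]  # tight gaps

print("Type A ratios:", gap_ratios(type_a_rows))
print("Type B ratios:", gap_ratios(type_b_rows))
```

Running something like this on the real line boxes of both invoice types would show how much smaller the Type B gaps are relative to the font height, which may help when describing the issue or deciding whether rescaling the scan is worth trying.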
What I’ve tried so far
- Added more training samples (including this problematic type)
- Carefully re-labeled the table rows in Data Manager
- Tried different bounding box shapes
- Double-checked taxonomy
None of this helped: the issue persists, and it only occurs when the spacing between rows is very small.
What I’m hoping to get advice on
- Is this a known limitation of the ML Extractor or the OCR (I’m using UiPath Document OCR)?
- Any tricks for helping the model separate rows when they’re very close together?
If anyone has run into this before, I’d really appreciate your tips.
Thanks in advance!