Validation Document vs Training Document

Hi @frankie.amendola, welcome to the Community.

Validation documents are a subset of the training dataset that is used to assess the performance of an ML model during training. These documents are not used to update the model’s parameters; instead, they are used to evaluate performance metrics, such as accuracy or F1 score, at different stages of the training process.

Typically, a portion of the training dataset is held out as a validation dataset, and the remaining documents are used to update the model parameters. The model is trained on the training dataset, and after each training iteration, it is evaluated on the validation dataset to determine whether it is overfitting or underfitting the training data.
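To make the hold-out idea concrete, here is a minimal sketch in plain Python. It is not a UiPath API: the function name `split_documents`, the 20% validation fraction, the seed, and the file names are all assumptions chosen for illustration.

```python
import random


def split_documents(documents, validation_fraction=0.2, seed=42):
    """Return (training_docs, validation_docs) with no overlap.

    Hypothetical helper: the fraction and seed are illustrative defaults,
    not values taken from any product documentation.
    """
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")

    # Shuffle a copy so the caller's list is left untouched and the
    # held-out documents are not simply the first ones in the folder.
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)

    # Hold out at least one document for validation.
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    validation_docs = shuffled[:n_validation]
    training_docs = shuffled[n_validation:]
    return training_docs, validation_docs


# Example with made-up file names.
docs = [f"invoice_{i:03d}.pdf" for i in range(10)]
training_docs, validation_docs = split_documents(docs)
print(len(training_docs), "training /", len(validation_docs), "validation")
```

Only `training_docs` would be used to update the model's parameters; `validation_docs` would be used purely for scoring.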

If the model is overfitting, it will perform well on the training dataset but poorly on the validation dataset. In that case, training is usually repeated after a change, for example adding more varied training documents. If the model is underfitting, it will perform poorly on both the training and validation datasets, and the model architecture may need to be modified.
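The comparison described above can be sketched as a small check. This is only an illustration: the function name `diagnose_fit` and the thresholds `min_score` and `max_gap` are assumptions, and sensible values depend on your documents and on the metric you use.

```python
def diagnose_fit(train_score, validation_score, min_score=0.8, max_gap=0.1):
    """Classify a pair of scores (for example accuracy or F1, between 0 and 1).

    The thresholds are hypothetical defaults for illustration only.
    """
    # Poor on both datasets: the model has not learned the task.
    if train_score < min_score and validation_score < min_score:
        return "underfitting"
    # Clearly better on training than on validation: the model has
    # memorised the training documents instead of generalising.
    if train_score - validation_score > max_gap:
        return "overfitting"
    return "ok"


print(diagnose_fit(0.98, 0.70))
print(diagnose_fit(0.55, 0.52))
print(diagnose_fit(0.92, 0.90))
```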

The following resources have more information on these topics:

  1. Training-Evaluation dataset & balanced datasets section of this doc: https://docs.uipath.com/document-understanding/automation-cloud/latest/user-guide/training-high-performing-models

  2. https://docs.uipath.com/document-understanding/standalone/2020.10/user-guide/training-and-evaluation-pipelines

Hope this helps,
Best Regards.