Train extraction dataset

ajay.malhi · December 1, 2023, 11:31am

I have 3 invoice layouts, with 50 examples of each. I have completed the labeling session for them. Now I want to run an automated extraction training. See image below.

When I run the automated training, it gives an error. See error logs below:

[
  {
    "entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
    "timeStamp": "2023-12-01T09:50:16.693Z",
    "level": "INFO",
    "messageKey": "EXPORT_STARTED",
    "messageParams": {}
  },
  {
    "entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
    "timeStamp": "2023-12-01T09:50:25.6473644",
    "level": "Information",
    "messageKey": "Extraction Export Event: Status is: started",
    "messageParams": null
  },
  {
    "entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
    "timeStamp": "2023-12-01T09:56:23.187Z",
    "level": "ERROR",
    "messageKey": "GENERIC_ERROR",
    "messageParams": {}
  },
  {
    "entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
    "timeStamp": "2023-12-01T09:56:23.4729785",
    "level": "Information",
    "messageKey": "Extraction Export Event: Status is: error",
    "messageParams": null
  },
  {
    "entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
    "timeStamp": "2023-12-01T09:56:23.4935431",
    "level": "Error",
    "messageKey": "Export failed: ",
    "messageParams": null
  }
]

I am not sure what the issue is. Any advice?

mohank · December 1, 2023, 11:44am

The error logs you provided indicate a generic error during the extraction export process. Unfortunately, the logs don’t provide specific details about the nature of the error. However, here is some general advice on how to troubleshoot and address such issues:

Check Data Quality:

Ensure that the labelled data is accurate and representative of the actual invoices. If there are inaccuracies in the labelling, it can affect the model’s performance.

Review Labeling Schema:

Verify that the labelling schema used during the labelling session is consistent with the expectations of the automated training system. Check if the labels and entities match the requirements of the extraction model.

Input Format:

Confirm that the input format (image or other data) provided for training matches the format the automated training system expects. The error might be due to a mismatch in the data format.

Model Configuration:

Check the configuration settings for the automated training process. Ensure that the model parameters, hyperparameters, and other settings are appropriate for your task.

Data Volume:

Assess if the amount of training data is sufficient for the model to learn the extraction patterns. If the dataset is too small or lacks diversity, the model may struggle to generalize.

Resource Availability:

Ensure that there are no resource constraints during the training process. Insufficient memory, processing power, or disk space can lead to errors.

Review Documentation:

Consult the documentation provided by the automated training platform for any specific requirements or troubleshooting steps. There might be platform-specific considerations that need attention.

Contact Support:

If the issue persists and you can’t identify the root cause, consider reaching out to the support team of the automated training platform. They may be able to provide specific insights based on the platform’s internals.

Logs and Debugging:

Enable additional logging or debugging options if available. This might provide more detailed information about the error, helping you pinpoint the issue.

Iterative Testing:

If possible, conduct iterative testing by making small changes to the input data or training parameters to identify when the error occurs and under what conditions.
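As a starting point for the log review suggested above, here is a minimal Python sketch that loads two of the entries from the original post, filters for errors regardless of how the level is capitalized, and measures how long the export ran before failing. The helper names (parse_timestamp, summarize) are made up for illustration and are not part of any platform API.

```python
import json
from datetime import datetime

# Two entries copied from the log in the original post (straight quotes restored).
LOG_TEXT = '''
[
  {"entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
   "timeStamp": "2023-12-01T09:50:16.693Z",
   "level": "INFO", "messageKey": "EXPORT_STARTED", "messageParams": {}},
  {"entityId": "8cbad905-2f90-ee11-8925-0022489a7f9f",
   "timeStamp": "2023-12-01T09:56:23.187Z",
   "level": "ERROR", "messageKey": "GENERIC_ERROR", "messageParams": {}}
]
'''

def parse_timestamp(value: str) -> datetime:
    # The log mixes timestamp styles (trailing "Z", 3 or 7 fractional digits).
    # Truncating to whole seconds keeps datetime.fromisoformat happy for all of them.
    return datetime.fromisoformat(value.rstrip("Z").split(".")[0])

def summarize(entries):
    # Levels appear as "ERROR" and "Error" in the log, so compare case-insensitively.
    errors = [e for e in entries if e["level"].lower() == "error"]
    times = [parse_timestamp(e["timeStamp"]) for e in entries]
    elapsed_seconds = (max(times) - min(times)).total_seconds()
    return errors, elapsed_seconds

entries = json.loads(LOG_TEXT)
errors, elapsed_seconds = summarize(entries)

for e in errors:
    # An empty or null messageParams means the log carries no further detail.
    print(e["timeStamp"], e["messageKey"], "params:", e["messageParams"])
print("seconds between first and last entry:", elapsed_seconds)
```

For the log above, the only error entry has empty messageParams, which confirms that the export failed roughly six minutes after it started without reporting a specific cause.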

Anas-p-v · December 1, 2023, 12:20pm

Go to the labelling session and make sure you have at least 10 samples marked for each field (that is, each field is marked in at least 10 documents). Also, there is an export button in the data labelling session. Try exporting once from there and see if it throws any error.
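If you keep your own record of which fields are labelled in which documents, the 10-documents-per-field check can be sketched in Python as below. This assumes a hypothetical dictionary mapping each document name to the fields labelled in it; it does not read the labelling tool's actual export format, and the field names are invented for the example.

```python
from collections import Counter

MIN_DOCS_PER_FIELD = 10  # threshold suggested in the reply above

def fields_below_threshold(labels_by_doc, all_fields, minimum=MIN_DOCS_PER_FIELD):
    """Return {field: document count} for fields labelled in fewer than `minimum` documents."""
    counts = Counter()
    for fields in labels_by_doc.values():
        # set() ensures a field labelled twice in one document is counted once.
        counts.update(set(fields))
    return {f: counts[f] for f in all_fields if counts[f] < minimum}

# Hypothetical data: 12 invoices, all with invoice_no and total labelled,
# but only one with due_date labelled.
labels_by_doc = {f"invoice_{i}.pdf": ["invoice_no", "total"] for i in range(12)}
labels_by_doc["invoice_0.pdf"].append("due_date")

under = fields_below_threshold(labels_by_doc, ["invoice_no", "total", "due_date"])
print(under)
```

Any field reported here would need more labelled documents before retrying the training.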

ajay.malhi · December 20, 2023, 1:51pm

The issue resolved itself. I could not find a clear reason why or how.
