Couldn't label the table data


I am using Data Labeling under Document Understanding. I have table data that I am able to label, but when I label it I get an error because many columns are empty. I tried hiding the empty columns, but a column that has a value in one PDF may be empty in another PDF, where some other column has a value instead.

@farook3

Ideally, you need to indicate the full row.

Also try the latest IXP, which is more specialized in extracting tables.

cheers

I indicated the full row, but I am still getting this error: The following fields are labelled in less than 10 pages each: 20, 17, 16, 22, 26, 21, 4, 29, 25, 1, 6, 12, 11, tape-id, 24, 8, 30, 14, 31, 19, 18, 5, 2, 9, 10, 27, 28, 13, 7, 23, 3, 15. Mark the fields as ‘Hidden’, delete them, or label them in at least 10 pages.

Can you explain the solution in detail? This is a customized PDF, and I am using the ML Extractor with AI Center so that I can train the model.

@farook3

As mentioned in the error, find at least 10 samples for each column you have. That way you don't need to delete the columns, and deleting them would not fix the problem anyway.

cheers

As it is a customized table, only the values that exist for that day will be in the table; the columns for the rest of the dates will be empty. So kindly give me a working solution.

"Even with customized tables where certain columns are often empty due to the nature of the data, the ML Extractor requires a minimum of 10 pages where a specific column does have a value labeled. This is essential for the model to learn and accurately identify that column across various documents.

If a column is frequently empty, but occasionally contains data that you need to extract, you must find at least 10 different documents in your dataset where that specific column has a value, and then label it in those 10+ documents.

Alternatively, if a column is genuinely empty in almost all instances and its data is not critical for extraction, you can mark it as ‘Hidden’ in the Data Labeling session to prevent it from being considered for training and avoid this error. However, if you need to extract the data when it does appear, the ‘Hidden’ option is not suitable."
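To decide which columns still need more labelled samples, it can help to count, per field, how many pages it is labelled on. The snippet below is only a rough sketch in plain Python and does not use any UiPath API. The `labels_by_page` mapping (page identifier to the set of field names labelled on that page) and the file names are hypothetical; you would have to build that mapping yourself from your own labelling notes. The threshold of 10 comes from the error message above.

```python
from collections import Counter

MIN_PAGES = 10  # minimum quoted in the error message; not read from any UiPath setting


def pages_per_field(labels_by_page):
    """Count, for each field, the number of pages on which it is labelled."""
    counts = Counter()
    for fields in labels_by_page.values():
        counts.update(set(fields))  # set(): count a field at most once per page
    return counts


def fields_below_minimum(labels_by_page, all_fields, min_pages=MIN_PAGES):
    """Return {field: page_count} for fields labelled on fewer than min_pages pages."""
    counts = pages_per_field(labels_by_page)
    return {f: counts.get(f, 0) for f in all_fields if counts.get(f, 0) < min_pages}


# Hypothetical data: 12 pages; "tape-id" and column "1" are labelled on every page,
# column "2" on only 3 pages, and column "3" on none.
labels_by_page = {f"doc{i}.pdf#p1": {"tape-id", "1"} for i in range(12)}
for i in range(3):
    labels_by_page[f"doc{i}.pdf#p1"].add("2")

short = fields_below_minimum(labels_by_page, ["tape-id", "1", "2", "3"])
print(short)  # {'2': 3, '3': 0}
```

Any field reported here either needs more documents where that column actually has a value, or has to be marked as ‘Hidden’ if you do not need to extract it.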