Feedback about the Model Training Deep Dive: Refine and Maintain Academy course.
I think these explanations of when Check Entity and Missed Entity are used are backward. I would expect us to need to use Check Entity when the model has assigned a certain entity at a high rate, meaning that a large quantity of those assigned the entity are not correct. For example, suppose the entity “Claim Amount” has been assigned to every number in every email. Thus, high recall because every Claim Amount has been correctly identified, but low precision because the majority of those Claim Amounts are incorrectly assigned.
Similarly, suppose we have exactly 1 Claim Amount assigned correctly. Then, high precision because we are 1 for 1, but low recall because we are only 1 for X number of Claim Amounts in all the emails.
So these are backward, right? We use Check Entity to improve low precision by unassigning incorrectly assigned entities, and we use Missed Entity to improve low recall by assigning missed entities.
You're correct in your understanding of precision and recall, which are standard terms in information retrieval and machine learning. Here’s a brief explanation:
Precision: The ratio of correctly predicted positive observations to the total predicted positives. High precision relates to a low false positive rate. You use precision when you want to be more confident about your positive predictions.
Check Entity: If the model is assigning an entity label too liberally, so that many of the assigned labels are wrong (which means low precision), then “Check Entity” would be used to review the assigned labels and remove the incorrect ones, thus improving precision by reducing false positives.
Recall: The ratio of correctly predicted positive observations to all observations in the actual class. High recall means most of the actual positives are correctly recognized (low false negative rate).
Missed Entity: If many actual entities are not being labeled by the model (which means low recall), you would use “Missed Entity” to find the true entities that the model is currently missing and assign them, thus improving recall by reducing false negatives.
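To make the two definitions concrete, here is a minimal Python sketch that works through the two “Claim Amount” scenarios from the question. The counts (10 real Claim Amounts among 100 numbers) and the function name `precision_recall` are assumptions chosen for illustration; they are not taken from the course or from the product.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw counts, guarding against division by zero."""
    predicted = true_positives + false_positives   # everything the model labeled
    actual = true_positives + false_negatives      # everything that really is the entity
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Assumed data: the emails contain 100 numbers, 10 of which are real Claim Amounts.

# Scenario 1: the model labels every number as Claim Amount.
# All 10 real ones are found (no misses), but 90 labels are wrong.
over_precision, over_recall = precision_recall(
    true_positives=10, false_positives=90, false_negatives=0
)

# Scenario 2: the model labels exactly one Claim Amount, and it is correct.
# No wrong labels, but 9 real Claim Amounts are missed.
under_precision, under_recall = precision_recall(
    true_positives=1, false_positives=0, false_negatives=9
)

print(f"Label everything: precision={over_precision:.2f}, recall={over_recall:.2f}")
print(f"Label one only:   precision={under_precision:.2f}, recall={under_recall:.2f}")
```

Under these assumed counts, the first scenario has perfect recall and poor precision, and the second has perfect precision and poor recall, which matches the reasoning in the question.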
In practical terms:
Use Check Entity when you want to reduce the number of incorrect labels assigned to non-entities (thus addressing cases of low precision).
Use Missed Entity when you want the model to identify more actual entities that it currently misses (thus addressing cases of low recall).
Based on the explanation provided, it does appear that there may be a mix-up in the course material regarding when to use “Check Entity” and “Missed Entity”. The feedback given is logically sound, and the user’s understanding aligns with standard definitions:
Check Entity should be used when you’re getting too many false positives (low precision).
Missed Entity should be used when the model is missing out on correctly identifying true entities (low recall).
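The rule of thumb above can be written down as a small Python sketch. The function name `suggest_training_mode` and the 0.8 threshold are hypothetical and only illustrate the logic; they are not part of the product, and the right cutoff would depend on your own use case.

```python
def suggest_training_mode(precision, recall, threshold=0.8):
    """Map low precision / low recall to the training mode discussed in this thread.

    The threshold is an arbitrary illustrative value, not a product default.
    """
    suggestions = []
    if precision < threshold:
        # Too many false positives: review assigned labels and remove wrong ones.
        suggestions.append("Check Entity")
    if recall < threshold:
        # Too many false negatives: find and assign the entities the model missed.
        suggestions.append("Missed Entity")
    return suggestions


# Low precision, high recall (entity assigned far too often).
print(suggest_training_mode(precision=0.10, recall=1.00))
# High precision, low recall (most real entities are missed).
print(suggest_training_mode(precision=1.00, recall=0.10))
```

An entity that scores low on both measures gets both suggestions, and one that scores above the threshold on both gets an empty list.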
Thanks for confirming what I thought, @srinivasmarneni. I wish we could trust everything in the training, but at least I know I have been paying attention to the meaning of these words and understand them.