We want to predict who will receive a vaccination. Although the final outcome is binary (the patient did or did not receive a vaccination), the model predicts one of several classes, so we can still build a confusion matrix by treating each class in turn as its own one-vs-rest binary problem, as sketched in the code after the list below. Starting with the booked_vaccinated class, our confusion matrix looks like:
true positive: prediction == booked_vaccinated AND label == booked_vaccinated
false positive: prediction == booked_vaccinated AND label != booked_vaccinated
true negative: prediction != booked_vaccinated AND label != booked_vaccinated
false negative: prediction != booked_vaccinated AND label == booked_vaccinated
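A minimal sketch of those four counts, assuming labels and predictions are arrays of class names. The function name is my own, and only booked_vaccinated comes from the post; not_vaccinated is a made-up class for illustration.

```python
import numpy as np

def one_vs_rest_confusion(labels, preds, positive_class):
    """Count TP/FP/TN/FN, treating `positive_class` as positive and every other class as negative."""
    labels, preds = np.asarray(labels), np.asarray(preds)
    return {
        "tp": int(np.sum((preds == positive_class) & (labels == positive_class))),
        "fp": int(np.sum((preds == positive_class) & (labels != positive_class))),
        "tn": int(np.sum((preds != positive_class) & (labels != positive_class))),
        "fn": int(np.sum((preds != positive_class) & (labels == positive_class))),
    }

# Hypothetical data for illustration.
labels = ["booked_vaccinated", "not_vaccinated", "booked_vaccinated", "not_vaccinated"]
preds  = ["booked_vaccinated", "booked_vaccinated", "not_vaccinated", "not_vaccinated"]
print(one_vs_rest_confusion(labels, preds, "booked_vaccinated"))
# {'tp': 1, 'fp': 1, 'tn': 1, 'fn': 1}
```

Repeating the same calculation for each remaining class gives a full one-vs-rest breakdown of the multi-class predictions.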
AUC or Log-Likelihood
Here are some miscellaneous notes from a data scientist who spent a good chunk of his PhD optimizing models:
- AUC is good for ascertaining whether the model is well specified and realistic. It can tell you which choices of features are better than others.
- Log-likelihood tells us how well the model fits the data. If the data is held fixed and the model structure is the same, the log-likelihood can tell us which choices of hyperparameters are better than others. A sketch of both calculations follows these notes.
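A minimal sketch of both metrics for a binary classifier, assuming scikit-learn is available and using made-up labels and predicted probabilities. Note that scikit-learn's log_loss returns the negative mean log-likelihood, so the sign is flipped to recover the log-likelihood itself.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, log_loss

# Hypothetical labels and predicted probabilities of the positive class.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.8, 0.6, 0.9, 0.3, 0.7, 0.2])

# AUC measures ranking quality: useful for comparing feature sets.
auc = roc_auc_score(y_true, y_prob)

# log_loss is the negative *mean* log-likelihood, so flip the sign;
# multiply by n to get the total log-likelihood of the dataset.
mean_ll = -log_loss(y_true, y_prob)
total_ll = mean_ll * len(y_true)

print(f"AUC = {auc:.3f}, mean LL = {mean_ll:.3f}, total LL = {total_ll:.3f}")
```

Comparing two models on the same data with the same structure, the one with the higher (less negative) log-likelihood fits better, while AUC is the more natural yardstick when the candidate models differ in their features.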