Model Evaluation: Why Accuracy Isn’t Enough
Suppose we have a model that predicts the colour of a ball. We have 5 red balls and 5 blue balls, and we ask the model to predict the colour of each one. How can we evaluate the model’s success?
One approach is to count the predictions that the model gets right. Treating red as the positive class, we count the red balls that the model predicts to be red (the True Positives) and the blue balls that it predicts to be blue (the True Negatives).
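The counting described above can be sketched in a few lines of Python. This is only an illustration: the true colours follow the example (5 red, 5 blue), but the list of predictions is invented, the helper name `count_correct` is hypothetical, and red is assumed to be the positive class because red balls are the ones counted as True Positives.

```python
# True colours follow the example: 5 red balls, then 5 blue balls.
actual = ["red"] * 5 + ["blue"] * 5

# Hypothetical predictions, invented purely for illustration.
predicted = ["red", "red", "red", "red", "blue",
             "blue", "blue", "blue", "red", "red"]


def count_correct(actual, predicted, positive="red"):
    """Return (true_positives, true_negatives) for the given positive class."""
    pairs = list(zip(actual, predicted))
    # True Positive: a positive-class ball predicted as the positive class.
    tp = sum(a == positive and p == positive for a, p in pairs)
    # True Negative: a negative-class ball predicted as the negative class.
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, tn


tp, tn = count_correct(actual, predicted)

# Accuracy is the share of all predictions that were correct.
accuracy = (tp + tn) / len(actual)

print(tp, tn, accuracy)  # prints: 4 3 0.7
```

With these made-up predictions the model gets 4 red balls and 3 blue balls right, so 7 of its 10 predictions are correct.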