AUC can be intuitively understood as: “the probability that the classifier will assign a higher score to a randomly chosen positive example than to a randomly chosen negative example.” – Wikipedia

Yeah ok nice but what does that really mean? Actually the previous intuition is a bit tricky to understand. So let’s try to understand it.

Suppose we have a **binary classification** problem scenario as the following: we have a dataset $latex X$ with instances that have either $latex 0$ or $latex 1$ as labels. You divide the dataset into two parts: 1- training set, 2-test test. Next you train a classifier with the training set.