AUC can be intuitively understood as: “the probability that the classifier will assign a higher score to a randomly chosen positive example than to a randomly chosen negative example.” – Wikipedia

Nice, but what does that really mean? The intuition above is a bit tricky to grasp on a first read, so let’s work through it step by step.

Suppose we have a **binary classification** problem: a dataset $latex X$ whose instances are labeled either $latex 0$ or $latex 1$. You divide the dataset into two parts: 1- a training set, 2- a test set. Next you train a classifier on the training set.

Now we want to test the performance of the classifier. Normally your test set has the following form: $latex X_{test}$ is the matrix of instances and $latex Y_{test}$ is a vector that says whether each instance is $latex 0$ or $latex 1$. You feed $latex X_{test}$ into the classifier, let it classify each instance, and ask it for a confidence score for each class. For example, if instance number 15 in $latex X_{test}$ has the true label $latex 1$, then a good classifier will probably give confidence scores like these: $latex 0.97$ for class $latex 1$ and $latex 0.03$ for class $latex 0$. So after feeding $latex X_{test}$ into the classifier, each instance has two confidence scores. Let’s assume that $latex Probs1$ and $latex Probs0$ are vectors that hold the confidence scores for all the instances, so $latex Probs1$ holds the scores of all instances for class $latex 1$ and $latex Probs0$ holds them for class $latex 0$.

Let’s recap what we have so far: $latex X_{test}$, $latex Y_{test}$, $latex Probs1$, $latex Probs0$. When you compute the AUC you **only** use $latex Y_{test}$ and $latex Probs1$. Why? When the two scores of an instance are probabilities that sum to $latex 1$, $latex Probs0$ is just $latex 1 - Probs1$, so it carries no extra information. Again, $latex Y_{test}$ holds the true labels and $latex Probs1$ holds the confidence, or probability, that each instance belongs to class $latex 1$ (the positive class). Let’s assume you have a perfect classifier that classifies everything correctly. Then in $latex Probs1$ all instances whose true label is $latex 1$ should have really high scores, while those whose true label is $latex 0$ should have really low scores.
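To make these objects concrete, here is a small sketch with hand-made numbers (they are invented for illustration and do not come from a real classifier). It assumes the two scores of each instance sum to 1, and shows that ranking the instances by $latex Probs1$ gives the same order as ranking them by $latex Probs0$ in reverse, which is why one vector is enough.

```python
# Hand-made toy values (not from a real classifier).
y_test = [1, 0, 1, 0, 1]                  # true labels
probs1 = [0.97, 0.10, 0.80, 0.35, 0.60]   # score for class 1
probs0 = [1.0 - p for p in probs1]        # score for class 0, assuming scores sum to 1

# Sorting by descending probs1 gives the same order as sorting by
# ascending probs0, so probs0 adds no ranking information.
order_by_probs1 = sorted(range(len(y_test)), key=lambda i: -probs1[i])
order_by_probs0 = sorted(range(len(y_test)), key=lambda i: probs0[i])

print(order_by_probs1)
print(order_by_probs1 == order_by_probs0)
```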

Now suppose you pick two instances from the test set $latex X_{test}$ at random: $latex x_1$, which has the true label $latex 1$, and $latex x_2$, which has the true label $latex 0$. Then you look up the scores of these two in $latex Probs1$. You might see that $latex x_1$ has $latex 0.94$ and $latex x_2$ has $latex 0.1$. Notice that $latex x_1$ has a higher score than $latex x_2$ because $latex x_1$ is truly positive and $latex x_2$ is truly negative. Now suppose you keep picking one positive and one negative instance at random from $latex X_{test}$ and each time check whether the truly positive instance has a higher score than the truly negative one. Since our classifier is perfect, it always gives higher scores to truly positive instances than to truly negative instances, so its AUC is $latex 1.0$.
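The repeated random picking can be sketched in a few lines of Python. The function name `sampled_auc`, the toy labels, and the scores are all made up for illustration, and the sketch does not handle tied scores.

```python
import random

def sampled_auc(y_true, scores, n_pairs=10_000, seed=0):
    """Estimate AUC by repeatedly drawing one positive and one negative at random."""
    rng = random.Random(seed)
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0
    for _ in range(n_pairs):
        # A "win" means the positive instance got the higher score.
        if rng.choice(pos) > rng.choice(neg):
            wins += 1
    return wins / n_pairs

# A "perfect" toy classifier: every positive scores above every negative.
y_test = [1, 1, 1, 0, 0, 0]
probs1 = [0.94, 0.88, 0.75, 0.30, 0.10, 0.05]
print(sampled_auc(y_test, probs1))  # 1.0
```

Because every positive outscores every negative here, every sampled comparison is a win, whichever pairs happen to be drawn.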

Now consider a classifier that is not perfect. Some instances that are truly positive will have low scores in $latex Probs1$, and some instances that are truly negative will have high scores in $latex Probs1$. In this case some of our comparisons will be misses: for a randomly picked positive instance and a randomly picked negative instance from $latex X_{test}$, the positive instance will sometimes have a **lower** score than the negative one. Every such miss pulls the AUC of the imperfect classifier below $latex 1.0$.
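As a small illustration, the sketch below lists the misses for an imperfect toy classifier by checking every positive/negative pair instead of sampling. The helper name `misordered_pairs` and the numbers are hypothetical.

```python
def misordered_pairs(y_true, scores):
    """Return (positive_index, negative_index) pairs where the positive scores lower."""
    pos = [(i, s) for i, (y, s) in enumerate(zip(y_true, scores)) if y == 1]
    neg = [(j, s) for j, (y, s) in enumerate(zip(y_true, scores)) if y == 0]
    return [(i, j) for i, sp in pos for j, sn in neg if sp < sn]

# Toy imperfect classifier: the positive at index 1 gets a low score.
y_test = [1, 1, 1, 0, 0]
probs1 = [0.9, 0.4, 0.8, 0.6, 0.1]

misses = misordered_pairs(y_test, probs1)
n_pairs = y_test.count(1) * y_test.count(0)   # 3 positives x 2 negatives = 6
print(misses)                                  # [(1, 3)]
print((n_pairs - len(misses)) / n_pairs)       # fraction of correct comparisons
```

Here one of the six possible comparisons is a miss, so five out of six comparisons are correct.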

Suppose you made $latex n$ comparisons: $latex n'$ of them are correct (the positive instance has the higher score) and $latex n''$ of them are ties (both instances have the same score), while the remaining ones are incorrect. Then you can compute the AUC using the following formula:

$latex AUC = \frac{n' + 0.5\, n''}{n}$
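Here is a minimal sketch of this pairwise computation, looping over every positive/negative pair rather than sampling. A correct pair counts fully, a pair with equal scores gets half credit, and an incorrect pair counts as zero. The function name `pairwise_auc` and the toy data are assumptions made for this example.

```python
def pairwise_auc(y_true, scores):
    """AUC as (correct pairs + 0.5 * tied pairs) / all positive-negative pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    n_correct = sum(1 for sp in pos for sn in neg if sp > sn)
    n_ties = sum(1 for sp in pos for sn in neg if sp == sn)
    n = len(pos) * len(neg)
    return (n_correct + 0.5 * n_ties) / n

# Toy data: 3 correct pairs and 1 tie (0.5 vs 0.5) out of 4 pairs.
y_test = [1, 1, 0, 0]
probs1 = [0.8, 0.5, 0.5, 0.2]
print(pairwise_auc(y_test, probs1))  # 0.875
```

This loop needs one comparison per positive/negative pair, so it is only meant for small examples; libraries usually compute the same quantity from sorted scores.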

Now go back and read the intuition at the beginning of the post; hopefully it makes full sense now. (Always keep in mind that we only use the probability vector for the positive class to compute the AUC.)

P.S. This post was written very quickly without editing, so I hope it is not confusing and does not contain serious mistakes.