Metrics used in deep learning

Last updated on:9 months ago

As deep learning researchers, we rely on metrics to measure how well our models perform. In this post I'll introduce the most commonly used ones.

Error matrices

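Positive and negative describe your judgement, that is, the model's prediction. True and false say whether that judgement agrees with the ground truth. A true positive (TP), for example, is an example judged positive that really is positive, while a false negative (FN) is an example judged negative that is actually positive.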

  • Accuracy = (true positives + true negatives) / (total examples)
  • Precision = (true positives) / (true positives + false positives)
  • Recall (or Sensitivity) = (true positives) / (true positives + false negatives)
  • F1 score (F score) = $2\frac{PR}{P+R}$ or $\frac{2}{1/P+1/R}$, where $P$ is precision and $R$ is recall
  • Specificity = $\frac{TN}{TN + FP}$
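To make the formulas above concrete, here is a minimal sketch in plain Python. The function name `classification_metrics` and the counts are made up for illustration, and the sketch assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the metrics listed above from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,  # also called sensitivity
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts: 100 examples, 50 actually positive and 50 actually negative.
m = classification_metrics(tp=30, fp=10, tn=40, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 0.7, precision 0.75, recall 0.6 and specificity 0.8, so a single accuracy number hides the fact that the model misses many positives.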

Biomedical meaning

Specificity: correctly identify the healthy people among everyone tested, so that they do not receive a misleading message.
Sensitivity: correctly identify the ill people among everyone tested, so that they receive a timely message.

Metrics Rate

A rate is the proportion of one outcome within a group. Which fraction a rate uses follows from its name.

| Rate name | Denominator | Numerator | Formula |
|---|---|---|---|
| True positive rate (TPR) | TP + FN | TP | $\frac{TP}{TP + FN}$ |
| False positive rate (FPR) | FP + TN | FP | $\frac{FP}{FP + TN}$ |
| True negative rate (TNR) | TN + FP | TN | $\frac{TN}{TN + FP}$ |
| False negative rate (FNR) | FN + TP | FN | $\frac{FN}{FN + TP}$ |

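The numerator is the quantity in the rate's name: for the true positive rate it is TP. The denominator is the number of examples that share the numerator's ground truth. A true positive is an example whose ground truth is positive, so the denominator counts every example that is actually positive, whether it was judged correctly (TP) or missed (FN). That gives TP $+$ FN. Likewise, a false positive is actually negative, so the denominator of FPR is FP $+$ TN.

Two rates with the same ground truth but opposite judgements share a denominator and together cover every example in it, so they sum to 1.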

$$\text{FPR} + \text{TNR} = 1$$

$$\text{TPR} + \text{FNR} = 1$$
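The two identities can be checked numerically. The sketch below counts the four outcomes from a pair of label lists in plain Python. The helper name `rates` and the toy labels are my own choices for illustration, with 1 meaning positive and 0 meaning negative.

```python
def rates(y_true, y_pred):
    """Return TPR, FPR, TNR and FNR for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "TPR": tp / (tp + fn),
        "FNR": fn / (fn + tp),
        "FPR": fp / (fp + tn),
        "TNR": tn / (tn + fp),
    }


# Toy data: four actual positives and four actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

r = rates(y_true, y_pred)
print(r)
print(r["TPR"] + r["FNR"], r["FPR"] + r["TNR"])  # both sums equal 1.0
```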


TPR: Sensitivity, Recall

FPR: Fall-out

TNR: Specificity, selectivity

FNR: Miss rate


A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.

The ROC curve plots TPR (on the $y$ axis) against FPR (on the $x$ axis), so we need to compute TPR and FPR first.

AUC stands for area under the curve. It is the area enclosed by the ROC curve, the $x$ axis, and the vertical line $x = 1$. A perfect classifier has an AUC of 1, while random guessing gives about 0.5.
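One common way to compute this area from a list of ROC points is the trapezoidal rule. The sketch below is a plain-Python illustration under that assumption. The function name `auc_trapezoid` and the sample points are hypothetical, and the points must be sorted by increasing FPR.

```python
def auc_trapezoid(fpr, tpr):
    """Area under a piecewise-linear ROC curve; points must be sorted by FPR."""
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        mean_height = (tpr[i] + tpr[i - 1]) / 2
        area += width * mean_height
    return area


# Sample ROC points, running from (0, 0) to (1, 1).
fpr_points = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr_points = [0.0, 0.5, 0.5, 1.0, 1.0]

area = auc_trapezoid(fpr_points, tpr_points)
print(area)  # 0.75
```

Vertical segments (where FPR does not change) have zero width, so they add nothing to the area.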

You may wonder how we can plot a curve at all: evaluating a model gives one TPR and one FPR, which is just a single point. The answer is the threshold, that is, how large a predicted probability must be before it counts as a positive judgement. Each threshold gives one (FPR, TPR) point, and sweeping the threshold from high to low traces out the whole curve.
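Here is a small sketch of that threshold sweep in plain Python. It is not how any particular library implements it. The function name `roc_points` is made up, labels are assumed to be 1 for positive and 0 for negative, and a score counts as a positive judgement when it is greater than or equal to the threshold.

```python
def roc_points(y_true, scores):
    """Return one (FPR, TPR) point per distinct threshold, from strict to lenient."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # A threshold above every score judges nothing positive: the point (0, 0).
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Lowering the threshold can only add positive judgements, so both coordinates never decrease and the curve always ends at (1, 1).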

For multi-class problems, we plot one ROC curve per class. Treat the class of interest as positive and all the other classes together as negative (one-vs-rest), and the problem reduces to plotting a two-class curve.
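The binarization step can be sketched in a few lines of plain Python. The helper name `binarize` and the class names are invented for this example. In practice you would pair each binarized label list with the model's score for that class before computing the curve.

```python
def binarize(labels, positive_class):
    """One-vs-rest: 1 for the class of interest, 0 for every other class."""
    return [1 if label == positive_class else 0 for label in labels]


labels = ["cat", "dog", "bird", "dog", "cat"]

# One binary problem (and therefore one ROC curve) per class.
one_vs_rest = {cls: binarize(labels, cls) for cls in sorted(set(labels))}
for cls, binary_labels in one_vs_rest.items():
    print(cls, binary_labels)
```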

import matplotlib.pyplot as plt
import numpy as np
from sklearn import metrics

# Ground-truth labels and predicted scores; class 2 is treated as positive.
y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve returns one (FPR, TPR) point per threshold.
fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)

# With a recent scikit-learn release the arrays should be:
#   fpr        = [0, 0, 0.5, 0.5, 1]
#   tpr        = [0, 0.5, 0.5, 1, 1]
#   thresholds = [inf, 0.8, 0.4, 0.35, 0.1]
# The leading (0, 0) point and the first threshold depend on the version.
print(fpr, tpr, thresholds)

# Named roc_auc so that it does not shadow sklearn.metrics.auc.
roc_auc = metrics.auc(fpr, tpr)
print(roc_auc)  # 0.75

lw = 2
plt.plot(fpr, tpr, color='darkorange',
         lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)
# The diagonal is the ROC curve of random guessing.
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc="lower right")
plt.show()


Receiver operating characteristic

Machine Learning Basics (1): Understanding the ROC Curve (机器学习基础(1)- ROC曲线理解)