The Equal Error Rate (EER) is a performance metric commonly used to evaluate binary classifiers.
The EER is defined as the point where the False Acceptance Rate (FAR) and False Rejection Rate (FRR) are equal, providing a single scalar value that balances the two types of errors.
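As a concrete illustration, here is a minimal numpy sketch that estimates the EER from raw trial scores by sweeping a threshold; the function name `eer` and the synthetic Gaussian scores are assumptions made for this example only.

```python
import numpy as np

def eer(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores."""
    tar = np.sort(target_scores)
    non = np.sort(nontarget_scores)
    thresholds = np.concatenate([tar, non])
    # P_miss(t): fraction of target scores below t (falsely rejected).
    p_miss = np.searchsorted(tar, thresholds, side="left") / len(tar)
    # P_fa(t): fraction of non-target scores at or above t (falsely accepted).
    p_fa = 1.0 - np.searchsorted(non, thresholds, side="left") / len(non)
    # The EER is the point where the two error rates cross.
    i = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[i] + p_fa[i])

rng = np.random.default_rng(0)
scores_tar = rng.normal(2.0, 1.0, 10_000)  # hypothetical target-trial scores
scores_non = rng.normal(0.0, 1.0, 10_000)  # hypothetical non-target-trial scores
print(eer(scores_tar, scores_non))         # close to Phi(-1) ~ 0.159 for this model
```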
The inner minimization $\min_t P_\mathrm{error}(\pi, t)$, where $P_\mathrm{error}(\pi, t) = \pi \cdot P_\mathrm{miss}(t) + (1-\pi) \cdot P_\mathrm{fa}(t)$, yields the Bayes error rate (BER) for a given prior $\pi$.
The outer maximization finds the prior $\pi$ that makes BER as large as possible.
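This max-min structure can be checked numerically. The sketch below is an illustration under an assumed Gaussian score model (target scores $\sim N(2,1)$, non-target scores $\sim N(0,1)$), scanning a grid of priors for the one that maximizes the BER:

```python
import numpy as np
from math import erf, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF

# Analytic error rates for a hypothetical model: target scores ~ N(2, 1),
# non-target scores ~ N(0, 1), decision rule "accept if score >= t".
t = np.linspace(-5.0, 7.0, 2001)
p_miss = np.array([Phi(x - 2.0) for x in t])       # P(target score < t)
p_fa = np.array([1.0 - Phi(x) for x in t])         # P(non-target score >= t)

# BER(pi) = min_t [ pi * P_miss(t) + (1 - pi) * P_fa(t) ]
pi = np.linspace(0.0, 1.0, 1001)
ber = np.min(pi[:, None] * p_miss + (1.0 - pi[:, None]) * p_fa, axis=1)

i = np.argmax(ber)
print(pi[i], ber[i])   # worst-case prior and its BER; the BER equals the EER here
```

For this symmetric model the maximizing prior is $\pi = 0.5$ and the maximal BER coincides with the EER, $\Phi(-1) \approx 0.159$.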
To find the worst-case error, $\max_{\pi \in [0,1]} \min_{t} P_\mathrm{error}(\pi, t)$, let's first note that the theoretical DET (or ROC) curve is the boundary of a convex set (see footnotes 3 and 4 for details).
Since a DET curve is convex, the minimum of the dot product $\langle (\pi, 1-\pi), (P_\mathrm{miss}(t), P_\mathrm{fa}(t)) \rangle$ will be achieved at a point $(P_\mathrm{miss}(t), P_\mathrm{fa}(t))$ where the hyperplane (line) orthogonal to $[\pi, 1 - \pi]$ supports the curve.
Note that a point (vertex of the convex hull) on the empirical DET curve may correspond to multiple points on the right plot.
The right-hand side $\min_{t} \max_{\pi \in [0,1]} P_\mathrm{error}(\pi, t)$ asks: for a fixed threshold $t$, what is the worst prior $\pi$?
Since $P_\mathrm{error}(\pi, t) = \pi \cdot P_\mathrm{miss}(t) + (1-\pi) \cdot P_\mathrm{fa}(t)$ is linear in $\pi$, the maximum occurs at $\pi = 0$ or $\pi = 1$, depending on whether $P_\mathrm{fa}(t)>P_\mathrm{miss}(t)$:

$\max_{\pi \in [0,1]} P_\mathrm{error}(\pi, t) = \max\left( P_\mathrm{miss}(t), P_\mathrm{fa}(t) \right)$
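A small numeric check of this endpoint argument, under an assumed Gaussian score model (targets $\sim N(2,1)$, non-targets $\sim N(0,1)$, used purely for illustration):

```python
import numpy as np
from math import erf, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF
t = np.linspace(-5.0, 7.0, 2001)
p_miss = np.array([Phi(x - 2.0) for x in t])   # hypothetical N(2,1) targets
p_fa = np.array([1.0 - Phi(x) for x in t])     # hypothetical N(0,1) non-targets

# For fixed t, the linear function pi -> P_error(pi, t) is maximized at an
# endpoint of [0, 1], so max_pi P_error(pi, t) = max(P_miss(t), P_fa(t)).
worst = np.maximum(p_miss, p_fa)
i = np.argmin(worst)
print(t[i], worst[i])   # minimized where P_miss(t) = P_fa(t), i.e. at the EER
```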
Hence, EER is the worst-case Bayes error when the prior $\pi$ is unknown.
This means: if a binary classifier is trained by minimizing the EER (the worst-case BER), concavity of BER ensures that the error rates at all operating points are pushed down.
Validity of the theorem's application
BER can be written as follows: $\mathrm{BER}(\pi) = \min_{t} \left( \pi \cdot P_\mathrm{miss}(t)+(1-\pi) \cdot P_\mathrm{fa}(t) \right)$. The objective $P_\mathrm{error}(\pi, t)$ is linear, hence quasi-concave, in $\pi$; and for a convex DET curve it is unimodal, hence quasi-convex, in $t$. This ensures that Sion's minimax theorem applies, allowing us to swap the min and max.
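The min-max swap can be sanity-checked numerically. The sketch below compares both orderings on an assumed Gaussian score model (targets $\sim N(2,1)$, non-targets $\sim N(0,1)$); the model is an illustrative assumption:

```python
import numpy as np
from math import erf, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF
t = np.linspace(-5.0, 7.0, 2001)
pi = np.linspace(0.0, 1.0, 1001)
p_miss = np.array([Phi(x - 2.0) for x in t])   # hypothetical N(2,1) targets
p_fa = np.array([1.0 - Phi(x) for x in t])     # hypothetical N(0,1) non-targets

# P_error on a (prior, threshold) grid, via broadcasting: shape (pi, t).
P_err = pi[:, None] * p_miss + (1.0 - pi[:, None]) * p_fa
maximin = P_err.min(axis=1).max()   # max_pi min_t P_error
minimax = P_err.max(axis=0).min()   # min_t max_pi P_error
print(maximin, minimax)             # the two sides agree; both equal the EER
```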
Alternative derivation by differentiating the Bayes error rate
We derive the Equal Error Rate (EER) as the worst-case Bayes error rate and seek the prior $\pi$ that maximizes BER.
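The differentiation step can be sketched as follows (assuming the minimizing threshold $t^*(\pi)$ is well defined, so that the envelope theorem applies):

```latex
\mathrm{BER}(\pi) = \min_{t}\left( \pi \cdot P_\mathrm{miss}(t) + (1-\pi)\cdot P_\mathrm{fa}(t) \right),
\qquad
\frac{d\,\mathrm{BER}}{d\pi} = P_\mathrm{miss}\left(t^*(\pi)\right) - P_\mathrm{fa}\left(t^*(\pi)\right).

% Setting the derivative to zero yields the EER condition:
P_\mathrm{miss}(t^*) = P_\mathrm{fa}(t^*).
```

Moreover, $\mathrm{BER}(\pi)$ is a pointwise minimum of affine functions of $\pi$ and is therefore concave.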
This shows that $\mathrm{BER}(\pi)$ is concave in $\pi$, so the critical point is indeed a maximum.
Geometric interpretation
Minimizing a linear function over a convex set:
Consider a convex set $\mathcal{C}$ and a vector $P$ whose endpoint lies on the line segment between points $A$ and $B$. For each such $P$, define $f(P) = \min_{Z \in \mathcal{C}} \langle P, Z \rangle$, the dot product minimized over all points $Z$ in $\mathcal{C}$. The task is to find the $P$ that maximizes $f(P)$.
Let's start by expressing $P$ as:
$P(\pi) = A + \pi \cdot (B - A)$,
where $\pi$ is a number between $0$ and $1$.
Then, the problem can be formulated as follows:
$\max_{\pi \in [0,1]}\min_{Z \in \mathcal{C}} \langle P(\pi), Z \rangle$
Since $\mathcal{C}$ is convex, the minimum dot product over $Z \in \mathcal{C}$ will be achieved at a point where the hyperplane orthogonal to $P$ supports the set $\mathcal{C}$.
The objective function for the outer optimization can be rewritten as:
$f(\pi) = f(P(\pi)) = \min_{Z \in \mathcal{C}} \langle A + \pi \cdot (B - A), Z \rangle = \min_{Z \in \mathcal{C}} \left( \langle A, Z \rangle + \pi \cdot \langle B - A, Z \rangle \right)$
For each fixed $Z$, the expression $\langle A, Z \rangle + \pi \cdot \langle B - A, Z \rangle$ is a straight line in $\pi$. The minimum of a family of straight lines is a concave function in $\pi$.
Since $f(\pi)$ is concave, its maximum occurs at a point where the derivative with respect to $\pi$ vanishes (if such a point exists in $[0, 1]$). By the envelope theorem, that derivative equals $\langle B - A, Z^* \rangle$ for the minimizing point $Z^*$, so the maximum occurs where $\langle B - A, Z \rangle = 0$.
In the 2D case, the condition $\langle Z, B - A \rangle = 0$ means that the vector $Z$ is perpendicular to the line $AB$.
Let's recall that a theoretical DET curve is convex (equivalently, the ROC curve is concave). Hence, the inner minimization over its epigraph (a convex set) can be replaced by minimization over a scalar $t$: for a fixed $t$, we seek the point $Z = (P_\mathrm{miss}(t), P_\mathrm{fa}(t))$ on the DET curve that minimizes the dot product. The outer maximization can be seen as finding a point $P = (\pi, 1 - \pi)$ on the line segment between the points $A = (0, 1)$ and $B = (1, 0)$. This formulation matches the general result obtained above and lets us conclude that the optimal point lies on the intersection of the DET curve with the line along the direction $(1, 1)$, which is exactly the EER point $(\mathrm{EER}, \mathrm{EER})$.
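The geometric conclusion can be verified numerically. In this sketch the convex curve comes from an assumed Gaussian score model (targets $\sim N(2,1)$, non-targets $\sim N(0,1)$), and $A$, $B$ are the segment endpoints from the construction above:

```python
import numpy as np
from math import erf, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal CDF
t = np.linspace(-5.0, 7.0, 2001)
# Points Z(t) = (P_miss(t), P_fa(t)) on a hypothetical convex DET curve.
Z = np.stack([np.array([Phi(x - 2.0) for x in t]),
              np.array([1.0 - Phi(x) for x in t])])

A, B = np.array([0.0, 1.0]), np.array([1.0, 0.0])
pi = np.linspace(0.0, 1.0, 1001)
P = A + pi[:, None] * (B - A)              # P(pi) = (pi, 1 - pi)

f = (P @ Z).min(axis=1)                    # f(pi) = min_Z <P(pi), Z>
best_P = P[np.argmax(f)]                   # maximizing P
z_star = Z[:, (best_P @ Z).argmin()]       # supporting point on the curve
print(z_star)                              # lies on the diagonal: (EER, EER)
```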
References
Brummer, N. (2010). Measuring, refining and calibrating speaker and language information extracted from speech.
Brummer, N., Ferrer, L., Swart, A. (2021). Out of a hundred trials, how many errors does your speaker verifier make?
Cali, C., Longobardi, M. (2015). Some mathematical properties of the ROC curve and their applications.
Gneiting, T., Vogel, P. (2022). Receiver operating characteristic (ROC) curves: equivalences, beta model, and minimum distance estimation.