(Strictly) Proper Scoring Rules


A strictly proper scoring rule is one under which a forecaster maximizes (or minimizes, depending on the orientation of the score) the expected score only by forecasting exactly his or her true beliefs about the situation.

A proper scoring rule is one under which the forecaster gets the best expected score by forecasting his or her true beliefs, although it may be possible to get the same expected score by forecasting something else.

To test if a score is strictly proper, we calculate the expected score a forecaster will receive for a forecast and find what forecast, f, yields the best expected score as a function of his or her true belief, p. If the score is strictly proper, then the expected score will be maximized (minimized) if and only if f = p.

Let's test the family of scores that look like the half Brier score, except for the exponent, defined by

B_n = |f - x|^n,

where f is the forecast probability and x=0 or 1, depending on whether the event does not or does occur, respectively. The expected value is given by

E[B_n] = p(1-f)^n + (1-p)f^n

where p is the forecaster's true belief. To find the value of f that minimizes E[Bn], we take the partial derivative with respect to f and set it equal to 0. That is,

dE/df = -np(1-f)^{n-1} + n(1-p)f^{n-1} = 0.

[note that the d here should be the partial, but I'm doing this in ASCII]


p(1-f)^{n-1} = (1-p)f^{n-1}

Thus, we want

p[(1-f)^{n-1} + f^{n-1}] = f^{n-1}

so that

p = f^{n-1} / [(1-f)^{n-1} + f^{n-1}]
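The first-order condition can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names (expected_score, best_forecast, implied_belief) and the grid resolution are illustrative assumptions. It searches a grid of forecasts for the one with the lowest expected score and then plugs that forecast back into the condition, which should return roughly the true belief p.

```python
def expected_score(f, p, n):
    """Expected value of B_n = |f - x|^n when the event occurs with probability p."""
    return p * (1 - f) ** n + (1 - p) * f ** n


def best_forecast(p, n, steps=10_000):
    """Grid search over f in [0, 1] for the forecast with the lowest expected score.

    The grid resolution (steps) is an arbitrary choice for this sketch.
    """
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda f: expected_score(f, p, n))


def implied_belief(f, n):
    """The belief p for which forecast f satisfies the first-order condition."""
    return f ** (n - 1) / ((1 - f) ** (n - 1) + f ** (n - 1))


p = 0.7  # hypothetical true belief
for n in (2, 3, 4):
    f_star = best_forecast(p, n)
    # Columns: exponent, best forecast found on the grid, belief implied by it
    print(n, f_star, round(implied_belief(f_star, n), 3))
```

For exponents above 1 the expected score is convex in f, so the grid minimum should sit at the stationary point. For n=1 the stationary-point formula does not identify a minimum, and the search lands on an endpoint of the interval instead.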


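For n=1 (a linear scoring rule), this gives p=1/2 regardless of f, so for any other belief the expected score has no interior optimum and is minimized by forecasting the extreme values of f, 0 or 1, if p<0.5 or p>0.5, respectively. The linear rule is therefore not proper.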
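Writing out the n=1 case makes the reason visible. This is a worked step in LaTeX notation, using only the symbols already defined:

```latex
\[
E[B_1] = p(1-f) + (1-p)f = p + (1-2p)f
\]
\[
\frac{dE[B_1]}{df} = 1 - 2p
\]
```

The slope does not depend on f. It is positive when p<0.5, so the expected score is smallest at f=0, and negative when p>0.5, so the expected score is smallest at f=1.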

For n=2 (quadratic scoring rule--the Brier score), this gives p=f, so the rule is strictly proper.

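For n>2, solving for f yields a more complicated function that, in effect, says that a forecaster should forecast values closer to 0.5 than his or her true belief, so these rules are not proper. The optimal forecast coincides with p only at p = 0, 0.5, and 1, and the gap shrinks as p gets very close to 0 or 1.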
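To see the size of this hedging, the stationarity condition p(1-f)^{n-1} = (1-p)f^{n-1} can be solved for f, giving f = p^k / (p^k + (1-p)^k) with k = 1/(n-1). The sketch below tabulates that closed form for a few beliefs. The function name optimal_forecast and the chosen values of p and n are illustrative assumptions, and the formula applies only for n > 1.

```python
def optimal_forecast(p, n):
    """Forecast that minimizes E[B_n] for true belief p (assumes n > 1).

    Obtained by solving p(1-f)^(n-1) = (1-p)f^(n-1) for f.
    """
    k = 1.0 / (n - 1)
    a = p ** k
    b = (1 - p) ** k
    return a / (a + b)


beliefs = (0.6, 0.8, 0.9, 0.99)  # hypothetical true beliefs
for n in (2, 3, 5):
    row = [round(optimal_forecast(p, n), 3) for p in beliefs]
    # Columns: exponent, optimal forecast for each belief in `beliefs`
    print(n, row)
```

For n=2 the row reproduces the beliefs themselves. For larger n each entry moves toward 0.5, and it moves further as n grows.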