
Life comes at you fast. Last week, I wrote a blogpost extolling the virtues of the following scoring rule, which I called the enhanced log rule: $$\mathfrak{l}^\star_1(x) = -\log x + x \ \ \ \ \ \mbox{and}\ \ \ \ \ \ \ \mathfrak{l}^\star_0(x) = x$$I noted that it is strictly proper and therefore furnishes an accuracy dominance argument for Probabilism. And I showed that, if we restrict attention to credence functions defined over partitions, rather than full algebras, it is the unique strictly proper scoring rule that delivers Conditionalization when you ask for the posterior that minimizes expected inaccuracy with respect to the prior, under the constraint that the posterior credence in the evidence must be 1. But then Catrin Campbell-Moore asked the natural question: what happens when you focus instead on full algebras rather than partitions? Looking into this revealed that things don't look so rosy for the enhanced log score. Indeed, if we focus just on the algebra built over three possible worlds, we see that every strictly proper scoring rule delivers the same updating rule, and it is not Conditionalization.
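One can spot-check the strict propriety claim numerically: by the lights of a credence $q$, the expected enhanced-log inaccuracy $q\,\mathfrak{l}^\star_1(x) + (1-q)\,\mathfrak{l}^\star_0(x) = -q\log x + x$ should be uniquely minimized at $x = q$. Here is a minimal Python sketch (a grid search, not a proof):

```python
import math

# Enhanced log rule: l1 scores a credence in a truth, l0 in a falsehood.
def l1(x):
    return -math.log(x) + x

def l0(x):
    return x

def expected_score(q, x):
    """Expected enhanced-log inaccuracy of credence x, by the lights of credence q."""
    return q * l1(x) + (1 - q) * l0(x)

# For each q, the grid minimizer should sit exactly at x = q
# (each q below lies on the grid, and the function is strictly convex).
grid = [i / 1000 for i in range(1, 1000)]
for q in [0.1, 0.25, 0.5, 0.9]:
    best = min(grid, key=lambda x: expected_score(q, x))
    assert abs(best - q) < 1e-9
```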

Let's see this in more detail. First, let $\mathcal{W} = \{w_1, w_2, w_3\}$ be our set of possible worlds. And let $\mathcal{F}$ be the algebra over $\mathcal{W}$. That is, $\mathcal{F}$ contains the singletons $\{w_1\}$, $\{w_2\}$, $\{w_3\}$, the pairs $\{w_1, w_2\}$, $\{w_1, w_3\}$, and $\{w_2, w_3\}$, and the tautology $\{w_1, w_2, w_3\}$ (along with the contradiction $\emptyset$). Now suppose that your prior credence function is $(p_1, p_2, p_3 = 1-p_1-p_2)$. And suppose that you learn evidence $E = \{w_1, w_2\}$. Then we want to find the posterior, among those that assign credence 1 to $E$, that minimizes expected inaccuracy. Such a posterior will have the form $(x, 1-x, 0)$. Now let $\mathfrak{s}$ be the strictly proper scoring rule by which you measure inaccuracy. Then you wish to minimize:
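For concreteness, here is a small Python sketch that enumerates this algebra as the power set of $\mathcal{W}$ and computes the credence a posterior of the form $(x, 1-x, 0)$ assigns to each of its members:

```python
from itertools import chain, combinations

worlds = ("w1", "w2", "w3")

# The full algebra over three worlds is the power set: 2^3 = 8 propositions.
algebra = [frozenset(c) for c in
           chain.from_iterable(combinations(worlds, k)
                               for k in range(len(worlds) + 1))]
assert len(algebra) == 8

# Six of these are contingent: the three singletons and the three pairs.
contingent = [a for a in algebra if 0 < len(a) < len(worlds)]
assert len(contingent) == 6

# A probabilistic posterior (x, 1-x, 0) assigns to each proposition the sum
# of the credences of the worlds in it; e.g. the evidence E = {w1, w2}
# gets x + (1-x) = 1, as required.
def credence(prop, x):
    c = {"w1": x, "w2": 1 - x, "w3": 0.0}
    return sum(c[w] for w in prop)

assert credence(frozenset({"w1", "w2"}), 0.25) == 1.0
```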

\begin{eqnarray*}
&& p_1[\mathfrak{s}_1(x) + \mathfrak{s}_0(1-x) + \mathfrak{s}_0(0) + \mathfrak{s}_1(x+(1-x)) + \mathfrak{s}_1(x+0) + \mathfrak{s}_0((1-x)+0)] + \\
&& p_2[\mathfrak{s}_0(x) + \mathfrak{s}_1(1-x) + \mathfrak{s}_0(0) + \mathfrak{s}_1(x+(1-x)) + \mathfrak{s}_0(x+0) + \mathfrak{s}_1((1-x)+0)] + \\
&& p_3[\mathfrak{s}_0(x) + \mathfrak{s}_0(1-x) + \mathfrak{s}_1(0) + \mathfrak{s}_0(x+(1-x)) + \mathfrak{s}_1(x+0) + \mathfrak{s}_1((1-x)+0)]
\end{eqnarray*}

Now, ignore the terms that do not involve $x$, since they do not affect the minimization; replace $p_3$ with $1-p_1-p_2$; and group the remaining terms together. Then we get:

\begin{eqnarray*}
&& \mathfrak{s}_1(x)(1+p_1 - p_2) + \mathfrak{s}_1(1-x)(1-p_1 + p_2) + \\
&& \mathfrak{s}_0(x)(1-p_1 + p_2) + \mathfrak{s}_0(1-x)(1+p_1 - p_2)
\end{eqnarray*}

Now, divide through by 2, which again doesn't affect the minimization, and note that$$\frac{1+p_i-p_j}{2} = p_i + \frac{1-p_i-p_j}{2}$$Then we have:

\begin{eqnarray*}
&& (p_1 + \frac{1-p_1-p_2}{2})\mathfrak{s}_1(x) + (p_2 + \frac{1-p_1-p_2}{2})\mathfrak{s}_0(x) + \\
&& (p_2 + \frac{1-p_1-p_2}{2})\mathfrak{s}_1(1-x) + (p_1 + \frac{1-p_1-p_2}{2})\mathfrak{s}_0(1-x)
\end{eqnarray*}
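If you'd like to double-check the grouping and halving steps, here is a small numerical sanity check, with the Brier score $\mathfrak{s}_1(x) = (1-x)^2$, $\mathfrak{s}_0(x) = x^2$ standing in (purely for illustration) for a generic strictly proper rule: the original expectation and twice the final grouped expression should differ only by an amount that does not depend on $x$.

```python
# Brier score as a concrete strictly proper rule, for illustration only.
def s1(x): return (1 - x) ** 2
def s0(x): return x ** 2

def original(p1, p2, x):
    """Expected inaccuracy of (x, 1-x, 0) over the full algebra, as first displayed."""
    p3 = 1 - p1 - p2
    return (p1 * (s1(x) + s0(1-x) + s0(0) + s1(1) + s1(x) + s0(1-x))
          + p2 * (s0(x) + s1(1-x) + s0(0) + s1(1) + s0(x) + s1(1-x))
          + p3 * (s0(x) + s0(1-x) + s1(0) + s0(1) + s1(x) + s1(1-x)))

def grouped(p1, p2, x):
    """The final grouped form, after halving and dropping constants."""
    a = p1 + (1 - p1 - p2) / 2
    b = p2 + (1 - p1 - p2) / 2
    return a * s1(x) + b * s0(x) + b * s1(1 - x) + a * s0(1 - x)

# The difference should be constant in x (it is the dropped constant terms).
p1, p2 = 0.5, 0.25
diffs = [original(p1, p2, x) - 2 * grouped(p1, p2, x) for x in (0.0, 0.25, 0.5, 1.0)]
assert all(abs(d - diffs[0]) < 1e-12 for d in diffs)
```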

Now, $\mathfrak{s}$ is strictly proper. And $p_2 + \frac{1 -p_1 -p_2}{2} = 1 - (p_1 + \frac{1-p_1-p_2}{2})$. So, noting that $p_1 + \frac{1-p_1-p_2}{2}$ and $p_2 + \frac{1-p_1-p_2}{2}$ both lie in $[0, 1]$, the posterior that minimizes expected inaccuracy from the point of view of the prior and that assigns credence 1 to $E$ is $(x, 1-x, 0)$, where:$$x = p_1 + \frac{1-p_1-p_2}{2}\ \ \ \ \mbox{and}\ \ \ \ 1-x = p_2 + \frac{1-p_1-p_2}{2}$$And this is very much not Conditionalization, which would instead demand $x = \frac{p_1}{p_1+p_2}$. It turns out, then, that no strictly proper scoring rule gives Conditionalization on full algebras in this manner.
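To see the gap concretely, here is a numerical sketch with illustrative prior values of my choosing, and with the Brier score $\mathfrak{s}_1(x) = (1-x)^2$, $\mathfrak{s}_0(x) = x^2$ again standing in for a generic strictly proper rule. For $p_1 = 1/2$ and $p_2 = 1/4$, the expected-inaccuracy minimizer is $x = p_1 + \frac{1-p_1-p_2}{2} = 0.625$, whereas Conditionalization would give $x = p_1/(p_1+p_2) = 2/3$:

```python
# Brier score as a concrete strictly proper rule, for illustration only.
def s1(x): return (1 - x) ** 2
def s0(x): return x ** 2

def expected_inaccuracy(p1, p2, x):
    """Expected inaccuracy of the posterior (x, 1-x, 0) over the full algebra."""
    p3 = 1 - p1 - p2
    return (p1 * (s1(x) + s0(1-x) + s0(0) + s1(1) + s1(x) + s0(1-x))
          + p2 * (s0(x) + s1(1-x) + s0(0) + s1(1) + s0(x) + s1(1-x))
          + p3 * (s0(x) + s0(1-x) + s1(0) + s0(1) + s1(x) + s1(1-x)))

p1, p2 = 0.5, 0.25
grid = [i / 10000 for i in range(10001)]
best = min(grid, key=lambda x: expected_inaccuracy(p1, p2, x))

assert abs(best - (p1 + (1 - p1 - p2) / 2)) < 1e-9  # 0.625: the rule derived above
assert abs(best - p1 / (p1 + p2)) > 0.01            # not 2/3: not Conditionalization
```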


(This is Leszek Wroński here; I can't figure out my accounts on BlogSpot.)

So: Fascinating! But there's one thing bugging me about this. I last thought about these issues sometime in 2017, I think, but I desperately want to return to them this year. Minimizing your expected score according to the only local proper scoring rule, that is, the log rule, delivers Conditionalization: this can be seen, e.g., in Chapter 15 of your book, Richard.

(To get here from a different direction: take 'M(I)RE' to be 'Minimise (Inverse) Relative Entropy' in the sense of Douven & Romeijn (2011). Theorem 8.6 of Paris' book shows that MRE delivers Conditionalization, and one can tweak the same argument to show that this holds for MIRE as well.)

Doesn't this contradict the result from this post, since the log rule is strictly proper?

(Uh, the MIRE bit makes sense when you note that it is equivalent to 'minimize inaccuracy as measured by the log rule'.)

