The Accuracy Dominance Argument for Probabilism without the Additivity assumption

For a PDF of this post, see here.

One of the central arguments in accuracy-first epistemology -- the one that gets the project off the ground, I think -- is the accuracy-dominance argument for Probabilism. This started life in a more pragmatic guise in de Finetti's proof that, if your credences are not probabilistic, there are alternatives that would lose less than yours would if they were penalised using the Brier score, which levies a price of $(1-x)^2$ on every credence $x$ in a truth and $x^2$ on every credence $x$ in a falsehood. This was then adapted into an accuracy-based argument by Roger Rosenkrantz, who interpreted the Brier score as a measure of inaccuracy, not a penalty score. Interpreted thus, de Finetti's result says that any non-probabilistic credences are accuracy-dominated by some probabilistic credences. Jim Joyce then noted that this argument only establishes Probabilism if you have a further argument that inaccuracy should be measured by the Brier score. He thought there was no particular reason to think that's right, so he greatly generalized de Finetti's result to show that, relative to a much wider range of inaccuracy measures, all non-probabilistic credences are accuracy-dominated. One problem with this, which Al Hájek pointed out, was that Joyce gave no converse argument -- that is, he didn't show that, for each of his inaccuracy measures, no probabilistic credence function is accuracy-dominated. Joel Predd and his Princeton collaborators then addressed this concern and proved a very general result: for any additive, continuous, and strictly proper inaccuracy measure, any non-probabilistic credences are accuracy-dominated, while no probabilistic credences are.
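To make de Finetti's Brier-score observation concrete, here is a minimal numerical sketch. The particular propositions and numbers are invented for illustration: an agent has credence 0.6 in a proposition X and 0.6 in its negation (non-probabilistic, since they sum to 1.2), and the probabilistic credences (0.5, 0.5) are strictly less Brier-inaccurate at both worlds.

```python
# Toy illustration of de Finetti's observation, using the Brier score.
# F contains one proposition X and its negation; the two worlds are
# w1 (X true) and w2 (X false). Credences are pairs (cr(X), cr(not-X)).

def brier(cr, world):
    """Brier inaccuracy of credences cr at a world, where world gives
    the truth values (v(X), v(not-X)): (1-x)^2 for truths, x^2 for falsehoods."""
    return sum((v - x) ** 2 for v, x in zip(world, cr))

w1, w2 = (1, 0), (0, 1)   # X true; X false

c = (0.6, 0.6)            # non-probabilistic: credences sum to 1.2
p = (0.5, 0.5)            # a probabilistic alternative

# p is strictly less Brier-inaccurate than c at *every* world:
print(brier(p, w1) < brier(c, w1))   # True
print(brier(p, w2) < brier(c, w2))   # True
```

Here (0.5, 0.5) is in fact the Euclidean projection of (0.6, 0.6) onto the probabilistic credence functions, which is no accident in the Brier case.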

That brings us to this blogpost. Additivity is a controversial claim. It says that the inaccuracy of a credence function is the (possibly weighted) sum of the inaccuracies of the individual credences it assigns. So the question arises: can we do without additivity? In this post, I'll give a quick proof of the accuracy-dominance argument that doesn't assume anything about the inaccuracy measures other than that they are continuous and strictly proper. Anyone familiar with the Predd, et al. paper will see that the proof strategy draws very heavily on theirs. But it bypasses the construction of the Bregman divergence that corresponds to the strictly proper inaccuracy measure. For that, you'll have to wait for Jason Konek's forthcoming work...

Suppose:
  • $\mathcal{F}$ is a set of propositions;
  • $W = \{w_1, \ldots, w_n\}$ is the set of possible worlds relative to $\mathcal{F}$;
  • $\mathcal{C}$ is the set of credence functions on $\mathcal{F}$;
  • $\mathcal{P}$ is the set of probability functions on $\mathcal{F}$. For each $w$ in $W$, let $v_w$ be the omniscient credence function at $w$, which assigns 1 to each proposition true at $w$ and 0 to each proposition false at $w$. So, by de Finetti's theorem, $\mathcal{P} = \{v_w : w \in W\}^+$, the closed convex hull of the omniscient credence functions. If $p$ is in $\mathcal{P}$, we write $p_i$ for $p(w_i)$.
Recall that an inaccuracy measure $I$ is strictly proper if, for each $p$ in $\mathcal{P}$, the expected inaccuracy $\sum_i p_i I(x, w_i)$ is uniquely minimized, as a function of $x$, at $x = p$.
Theorem. Suppose $I$ is a continuous and strictly proper inaccuracy measure on the credence functions in $\mathcal{C}$. Then if $c$ is not in $\mathcal{P}$, there is $c^\star$ in $\mathcal{P}$ such that, for all $w_i$ in $W$,
$$I(c^\star, w_i) < I(c, w_i)$$

Proof. We begin by defining a divergence $D : \mathcal{P} \times \mathcal{C} \rightarrow [0, \infty]$ that takes a probability function $p$ and a credence function $c$ and measures the divergence from the former to the latter:
$$D(p, c) = \sum_i p_i I(c, w_i) - \sum_i p_i I(p, w_i)$$
Three quick points about D.

(1) $D$ is a divergence. Since $I$ is strictly proper, $D(p, c) \geq 0$, with equality iff $c = p$.

(2) $D(v_{w_i}, c) = I(c, w_i) - I(v_{w_i}, w_i)$, for all $w_i$ in $W$.
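Points (1) and (2) can be spot-checked numerically in the special case where $I$ is the Brier score, which is continuous and strictly proper (though of course also additive, so this is only a sanity check, not a test of the general result). The setup below, with one proposition and its negation, is invented for illustration; note that for the Brier score $I(v_{w_i}, w_i) = 0$.

```python
# Numerical spot-check of properties (1) and (2) of D, with I the Brier
# score over F = {X, not-X}; worlds w1 (X true) and w2 (X false).

def brier(cr, world):
    return sum((v - x) ** 2 for v, x in zip(world, cr))

worlds = [(1, 0), (0, 1)]

def D(p, c):
    """D(p, c) = sum_i p_i I(c, w_i) - sum_i p_i I(p, w_i), where the
    probability function p = (p1, p2) is identified with the credence
    function (cr(X), cr(not-X)) = (p1, p2)."""
    expected = lambda q: sum(pi * brier(q, w) for pi, w in zip(p, worlds))
    return expected(c) - expected(p)

c = (0.6, 0.6)  # an arbitrary non-probabilistic credence function

# (1) D is a divergence: D(p, c) >= 0, with D(p, p) = 0.
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    p = (t, 1 - t)
    assert D(p, c) >= 0
    assert abs(D(p, p)) < 1e-12

# (2) D(v_wi, c) = I(c, wi) - I(v_wi, wi); for Brier, I(v_wi, wi) = 0.
for w in worlds:
    v = w   # omniscient credences at w, read as a credence function
    assert abs(D(v, c) - (brier(c, w) - brier(v, w))) < 1e-12
```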

(3) $D$ is strictly convex in its first argument. Suppose $p$ and $q$ are in $\mathcal{P}$ with $p \neq q$, and suppose $0 < \lambda < 1$. Let $r = \lambda p + (1-\lambda) q$, so that $r \neq p$ and $r \neq q$. Then, since $\sum_i p_i I(x, w_i)$ is uniquely minimized, as a function of $x$, at $x = p$, and $\sum_i q_i I(x, w_i)$ is uniquely minimized, as a function of $x$, at $x = q$, we have
$$\sum_i p_i I(p, w_i) < \sum_i p_i I(r, w_i)$$
$$\sum_i q_i I(q, w_i) < \sum_i q_i I(r, w_i)$$
Thus
$$-\lambda\left[\sum_i p_i I(p, w_i)\right] - (1-\lambda)\left[\sum_i q_i I(q, w_i)\right] > -\lambda\left[\sum_i p_i I(r, w_i)\right] - (1-\lambda)\left[\sum_i q_i I(r, w_i)\right] = -\sum_i r_i I(r, w_i)$$
Now, adding
$$\lambda \sum_i p_i I(c, w_i) + (1-\lambda) \sum_i q_i I(c, w_i) = \sum_i (\lambda p_i + (1-\lambda) q_i) I(c, w_i) = \sum_i r_i I(c, w_i)$$
to both sides gives
$$\lambda\left[\sum_i p_i I(c, w_i) - \sum_i p_i I(p, w_i)\right] + (1-\lambda)\left[\sum_i q_i I(c, w_i) - \sum_i q_i I(q, w_i)\right] > \sum_i r_i I(c, w_i) - \sum_i r_i I(r, w_i)$$
That is,
$$\lambda D(p, c) + (1-\lambda) D(q, c) > D(\lambda p + (1-\lambda) q, c)$$
as required.
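Here is a quick numerical spot-check of this strict convexity claim, again in the Brier special case and with invented numbers:

```python
# Spot-check of (3): strict convexity of D(., c) in its first argument,
# with the Brier score standing in for I (a sanity check only).

def brier(cr, world):
    return sum((v - x) ** 2 for v, x in zip(world, cr))

worlds = [(1, 0), (0, 1)]

def D(p, c):
    expected = lambda q: sum(pi * brier(q, w) for pi, w in zip(p, worlds))
    return expected(c) - expected(p)

c = (0.6, 0.6)                         # fixed, arbitrary credence function
p, q, lam = (0.9, 0.1), (0.2, 0.8), 0.3
r = tuple(lam * a + (1 - lam) * b for a, b in zip(p, q))

# The mixture lies strictly below the chord:
assert lam * D(p, c) + (1 - lam) * D(q, c) > D(r, c)
```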

Now, suppose $c$ is not in $\mathcal{P}$. Then, since $\mathcal{P}$ is a closed convex set and $D(x, c)$ is continuous and strictly convex in $x$, there is a unique $c^\star$ in $\mathcal{P}$ that minimizes $D(x, c)$ as a function of $x$. Now, suppose $p$ is in $\mathcal{P}$. We wish to show that
$$D(p, c) \geq D(p, c^\star) + D(c^\star, c)$$
We can see that this holds iff
$$\sum_i (p_i - c^\star_i)(I(c, w_i) - I(c^\star, w_i)) \geq 0$$
After all,
$$D(p, c) - D(p, c^\star) - D(c^\star, c) = \left[\sum_i p_i I(c, w_i) - \sum_i p_i I(p, w_i)\right] - \left[\sum_i p_i I(c^\star, w_i) - \sum_i p_i I(p, w_i)\right] - \left[\sum_i c^\star_i I(c, w_i) - \sum_i c^\star_i I(c^\star, w_i)\right] = \sum_i (p_i - c^\star_i)(I(c, w_i) - I(c^\star, w_i))$$
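It is worth noting that this chain of equalities is pure algebra: it uses only the definition of $D$, not propriety or continuity. The following check runs it with a deliberately arbitrary, non-additive function $J$ in place of $I$ ($J$ is hypothetical, chosen only to stress-test the identity):

```python
# The identity D(p,c) - D(p,c*) - D(c*,c) = sum_i (p_i - c*_i)(I(c,w_i) - I(c*,w_i))
# follows from the definition of D alone; check it with an arbitrary J.
import math

worlds = [0, 1]   # index the two worlds

def J(cr, i):
    # deliberately arbitrary and non-additive
    return math.exp(cr[0] + i) + (cr[1] - i) ** 4

def D(p, c):
    expected = lambda q: sum(pi * J(q, i) for pi, i in zip(p, worlds))
    return expected(c) - expected(p)

p, cstar, c = (0.3, 0.7), (0.5, 0.5), (0.6, 0.6)

lhs = D(p, c) - D(p, cstar) - D(cstar, c)
rhs = sum((pi - ci) * (J(c, i) - J(cstar, i))
          for pi, ci, i in zip(p, cstar, worlds))
assert abs(lhs - rhs) < 1e-9
```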
Now we prove this inequality. If $p = c^\star$, it holds trivially, so suppose $p \neq c^\star$. Since $p$ and $c^\star$ are in $\mathcal{P}$, since $\mathcal{P}$ is convex, and since $D(x, c)$ is minimized uniquely at $x = c^\star$, if $0 < \varepsilon < 1$, then
$$\frac{1}{\varepsilon}\left[D(\varepsilon p + (1-\varepsilon) c^\star, c) - D(c^\star, c)\right] > 0$$
Expanding that, we get
$$\frac{1}{\varepsilon}\left[\sum_i (\varepsilon p_i + (1-\varepsilon) c^\star_i) I(c, w_i) - \sum_i (\varepsilon p_i + (1-\varepsilon) c^\star_i) I(\varepsilon p + (1-\varepsilon) c^\star, w_i) - \sum_i c^\star_i I(c, w_i) + \sum_i c^\star_i I(c^\star, w_i)\right] > 0$$
So
$$\frac{1}{\varepsilon}\left[\sum_i (c^\star_i + \varepsilon(p_i - c^\star_i)) I(c, w_i) - \sum_i (c^\star_i + \varepsilon(p_i - c^\star_i)) I(\varepsilon p + (1-\varepsilon) c^\star, w_i) - \sum_i c^\star_i I(c, w_i) + \sum_i c^\star_i I(c^\star, w_i)\right] > 0$$
So
$$\sum_i (p_i - c^\star_i)\left(I(c, w_i) - I(\varepsilon p + (1-\varepsilon) c^\star, w_i)\right) + \frac{1}{\varepsilon}\left[\sum_i c^\star_i I(c^\star, w_i) - \sum_i c^\star_i I(\varepsilon p + (1-\varepsilon) c^\star, w_i)\right] > 0$$
Now, since $I$ is strictly proper and $\varepsilon p + (1-\varepsilon) c^\star \neq c^\star$,
$$\frac{1}{\varepsilon}\left[\sum_i c^\star_i I(c^\star, w_i) - \sum_i c^\star_i I(\varepsilon p + (1-\varepsilon) c^\star, w_i)\right] < 0$$
So, for all $0 < \varepsilon < 1$,
$$\sum_i (p_i - c^\star_i)\left(I(c, w_i) - I(\varepsilon p + (1-\varepsilon) c^\star, w_i)\right) > 0$$
So, letting $\varepsilon \rightarrow 0$, since $I$ is continuous,
$$\sum_i (p_i - c^\star_i)(I(c, w_i) - I(c^\star, w_i)) \geq 0$$
which is what we wanted to show. So, by the above,
$$D(p, c) \geq D(p, c^\star) + D(c^\star, c)$$
In particular, since each $v_{w_i}$ is in $\mathcal{P}$,
$$D(v_{w_i}, c) \geq D(v_{w_i}, c^\star) + D(c^\star, c)$$
But, since $c^\star$ is in $\mathcal{P}$ and $c$ is not, and since $D$ is a divergence, $D(c^\star, c) > 0$. So, by (2),
$$I(c, w_i) - I(v_{w_i}, w_i) = D(v_{w_i}, c) > D(v_{w_i}, c^\star) = I(c^\star, w_i) - I(v_{w_i}, w_i)$$
and so $I(c^\star, w_i) < I(c, w_i)$, as required. $\Box$
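To see the theorem in action, here is a sketch in the Brier special case. For the Brier score, $D(p, c)$ reduces to the squared Euclidean distance between the corresponding credence functions, so $c^\star$ is the Euclidean projection of $c$ onto $\mathcal{P}$; the code below (invented numbers, a crude grid search in place of a proper optimizer) finds the minimizer of $D(x, c)$ and confirms that it strictly accuracy-dominates $c$.

```python
# Find the c* in P that minimizes D(x, c) for the Brier score, and
# confirm that c* strictly accuracy-dominates the non-probabilistic c.

def brier(cr, world):
    return sum((v - x) ** 2 for v, x in zip(world, cr))

worlds = [(1, 0), (0, 1)]

def D(p, c):
    expected = lambda q: sum(pi * brier(q, w) for pi, w in zip(p, worlds))
    return expected(c) - expected(p)

c = (0.8, 0.5)   # non-probabilistic: credences sum to 1.3

# Crude grid search over P = {(t, 1 - t) : 0 <= t <= 1}.
grid = [(k / 1000, 1 - k / 1000) for k in range(1001)]
cstar = min(grid, key=lambda p: D(p, c))

# c* is strictly less inaccurate than c at every world:
assert all(brier(cstar, w) < brier(c, w) for w in worlds)
print(cstar)   # for Brier, c* is the Euclidean projection of c onto P
```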




Comments

  1. What do you mean when you say that the inaccuracy measure is continuous, and how do you know this to be true? Is it an additional assumption, or does it follow from strict propriety somehow?

    Replies
    1. Sorry, you’re right — I should have included this explicitly. The result covers all continuous strictly proper inaccuracy measures.

  2. In the proof of the Theorem, in part (3), where you claim that D is strictly convex in its first argument, I don't see how the first pair of offset strict inequalities could hold. In the argument, c is supposed to be fixed and arbitrary. So for all you've said, we could have c=r there, and the inequalities would not be strict.

  3. I think the proof in (3) that $D$ is strictly convex in its first argument might be incorrect, and I don't see why this result should hold in general. In the part of the proof where you say "Now adding...to both sides gives...," the quantities you're adding could be infinite, in which case the strict inequality will not be preserved. In general, in order for $D$ to be strictly convex in its first argument, there can be no $c$ such that $I(c, w) = \infty$ for every $w$ (otherwise $D$ is a constant function ($= \infty$) in its first argument, and therefore not strictly convex). But I don't see that continuity and strict propriety rule out the possibility that there is such a $c$.

  4. Also, even if $D$ were strictly convex in its first argument, that wouldn't imply (as you claim it does in your paper) that it attains a minimum on a closed convex set. For example, the real function on $[0,1]$ defined by $f(x) = (1-x)^2$ if $x \in [0,1)$ and $f(1) = 1$ is strictly convex, but it does not attain a minimum due to the discontinuity on the boundary. In order to show that $D$ attains a minimum, you need to appeal to some kind of continuity property for $D$.

