Saturday, 20 April 2013

The Probability of a Ramsey Sentence

This post is inspired by a recent very interesting talk, "Theoretical Terms and Induction", by Hannes Leitgeb at a conference on "Theoretical Terms" in Munich a couple of weeks ago (April 3-5th, 2013).

Hannes's talk is a response to the debate about whether a Ramsey sentence for a theory $\Theta$ can account for the inductive systematization of evidence given by $\Theta$ itself. This debate goes back to earlier works by Carl Hempel and Israel Scheffler (The Anatomy of Inquiry, 1963) and, in particular, a 1968 Journal of Philosophy paper, "Reflections on the Ramsey Method", by Scheffler. The debate has recently been revived in an interesting 2012 Synthese paper, "Ramsification and Inductive Inference", by Panu Raatikainen. The conclusion of the argument running through these works is that ramsification of a theory $\Theta$ damages the inductive systematization that $\Theta$ provides. I recommend that interested readers consult Panu's 2012 paper on this.

On Hannes's approach, one assigns a probability to a Ramsey sentence $\Re(\Theta)$, on the assumption that the corresponding Carnap sentence
$\Re(\Theta) \to \Theta$
has probability 1. Since Carnap himself insisted that the Carnap sentence of a theory is analytic, it seems reasonable, on his perspective, to assign it probability 1. On this Carnapian assumption, it can then be shown that the probabilities of a theory and its Ramsey sentence are the same. (Hannes's discussion also related these probabilistic conclusions to the notion of logical probability, counting models of a theory, over a finite domain.)

To explain what's going on, note first that it is well known that $\Theta$ and $\Re(\Theta)$ are deductively equivalent with respect to the observation language $L_O$. That is, for any $\phi \in L_O$, we have
$\Theta \vdash \phi$ if and only if $\Re(\Theta) \vdash \phi.$ 
But suppose the Carnap sentence has probability 1. Then we can show that $\Theta$ and $\Re(\Theta)$ are probabilistically equivalent.

First, we give a lemma in probability theory:
$Pr(A) + Pr(A \to B) = Pr(B) + Pr(B \to A)$.
Proof. Reasoning using probability axioms,
$Pr(A \to B) = Pr(\neg A \vee B)$
= $Pr(\neg A) + Pr(B) - Pr(\neg A \wedge B)$
= $1 - Pr(A) + Pr(B) - Pr(\neg (B \to A))$
= $1 + Pr(B) - Pr(A) - 1 + Pr(B \to A)$
= $Pr(B) - Pr(A) + Pr(B \to A)$.
Adding $Pr(A)$ to both sides gives $Pr(A) + Pr(A \to B) = Pr(B) + Pr(B \to A)$.
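
As a sanity check on the Lemma, here is a minimal Python sketch that evaluates both sides exactly for one probability assignment over the four truth-value combinations of $A$ and $B$. The helper names (`pr`, `lemma_sides`) and the particular weights are illustrative assumptions, not part of the argument; a single instance illustrates the identity but of course does not prove it.

```python
from fractions import Fraction
from itertools import product

# The four truth-value assignments ("worlds") for the pair (A, B),
# in the order (T,T), (T,F), (F,T), (F,F).
WORLDS = list(product([True, False], repeat=2))

def pr(event, weights):
    """Probability of an event: total weight of the worlds where it holds."""
    return sum(w for (a, b), w in zip(WORLDS, weights) if event(a, b))

def lemma_sides(weights):
    """Return (Pr(A) + Pr(A -> B), Pr(B) + Pr(B -> A)) for the given weights."""
    left = pr(lambda a, b: a, weights) + pr(lambda a, b: (not a) or b, weights)
    right = pr(lambda a, b: b, weights) + pr(lambda a, b: (not b) or a, weights)
    return left, right

# An arbitrary illustrative distribution; the weights sum to 1.
weights = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
left, right = lemma_sides(weights)
print(left, right)
```

Exact fractions are used instead of floats so that the two sides can be compared with `==` without rounding issues.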

Next, let $\Theta$ be a theory and let $\Re(\Theta)$ be its Ramsey sentence. Note that
$\Theta \vdash \Re(\Theta)$.
(That is, one can deduce $\Re(\Theta)$ from $\Theta$ in a system of second-order logic, using comprehension.)

It follows that,
$Pr(\Theta \to \Re(\Theta)) = 1$.
Suppose that the Carnap sentence, $\Re(\Theta) \to \Theta$, has probability 1. That is,
$Pr(\Re(\Theta) \to \Theta) = 1$.
Then the Lemma above gives:
$Pr(\Re(\Theta)) = Pr(\Theta)$.
So, given a theory $\Theta$, the probability of its Ramsey sentence equals the probability of the theory itself, on the assumption that its Carnap sentence has probability 1.
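
To make the role of the Carnap sentence concrete, here is a small Python sketch of a toy model. It assumes just three kinds of possible world: those where both $\Theta$ and $\Re(\Theta)$ hold, those where only $\Re(\Theta)$ holds, and those where neither holds (the fourth combination is excluded because $\Theta$ entails $\Re(\Theta)$). The function name and the weights are hypothetical choices made for illustration only.

```python
from fractions import Fraction

def probabilities(both, ramsey_only, neither):
    """Toy model: weights for worlds where Theta and its Ramsey sentence
    both hold, where only the Ramsey sentence holds, and where neither holds.
    Returns (Pr(Theta), Pr(Ramsey sentence), Pr(Carnap sentence))."""
    if both + ramsey_only + neither != 1:
        raise ValueError("weights must sum to 1")
    pr_theta = both
    pr_ramsey = both + ramsey_only
    # The Carnap sentence (Ramsey -> Theta) fails only in worlds where
    # the Ramsey sentence holds and Theta does not.
    pr_carnap = 1 - ramsey_only
    return pr_theta, pr_ramsey, pr_carnap

# Carnap sentence has probability 1: no weight on "Ramsey sentence only" worlds.
pr_theta, pr_ramsey, pr_carnap = probabilities(
    Fraction(3, 5), Fraction(0), Fraction(2, 5)
)
print(pr_theta, pr_ramsey, pr_carnap)
```

In this toy model, putting any positive weight on the "Ramsey sentence only" worlds lowers the probability of the Carnap sentence below 1 and opens a gap of exactly that size between $Pr(\Re(\Theta))$ and $Pr(\Theta)$.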

[UPDATE 20 April: I have made a few changes and modified the Lemma used to a slightly stronger one.]


  1. Panu Raatikainen 20 April 2013 at 14:41

    This is very interesting, and I'll have to think about it more carefully.

    But I wonder how all this relates to my observation, in the sister paper "On Carnap sentences" (Analysis 2011), where I point out that in some cases, (allegedly analytic) Carnap sentences can in fact become empirically disconfirmed?

    My intuitive first reaction is that it seems problematic to stipulate that the probability of such sentences is 1...

  2. Yes, I know the Analysis paper (very good, btw!).

    I agree - I told Hannes that I think Pr(Carnap sentence) shouldn't be 1 ...


  3. Panu Raatikainen 20 April 2013 at 15:35

    BTW, the Synthese paper was written first, and the Analysis paper later, but Analysis was so much quicker in publishing it that it got out in 2011 and the former only in 2012. :)

  4. As far as I remember, in order for Hannes' argument to succeed, the probability of the Carnap sentence just needs to be *high* (and not necessarily 1).

  5. Albert, yes, I think you're right - there was a somewhat different version from the argument I'm giving here (which is entirely probabilistic), and I think that allowed Pr(Carnap) to be just high rather than 1.

    However, suppose we have $\Theta$ and $\Re(\Theta)$. Then since $\Theta$ implies $\Re(\Theta)$, we have

    (i) $Pr(Theta \to \Re(\Theta)) = 1$

    But suppose the Carnap sentence has probability less than 1. E.g.,

    (ii) $Pr(\Re(\Theta) \to \Theta) = 1 - \epsilon$.

    It follows that:

    (iii) $Pr(\Theta) = Pr(\Re(\Theta)) - \epsilon$.

    That is, the theory has slightly lower probability than its ramsification does.

    (Maybe I'm confused.)


  6. Oops. LaTeX typo.
    The "Theta" above in line (i) should be "$\Theta$".

  7. Jeff,

    I'm sorry, but I don't remember exactly what Hannes' claim was (let alone its proof).

    But is this your reasoning? Instantiating your Lemma from above we get

    (1) $Pr(\Theta) + Pr(\Theta \rightarrow R(\Theta)) = Pr(R(\Theta)) + Pr(R(\Theta) \rightarrow \Theta)$

    By using the assumptions (i) and (ii)

    (2) $Pr(\Theta) + 1 = Pr(R(\Theta)) + 1 - \epsilon$

    We'd immediately get

    (3) $Pr(\Theta) = Pr(R(\Theta)) - \epsilon$

    Here the Lemma seems to play an important role. In its proof however, unfortunately, I don't understand how to get from line 3 to line 4. I expected there to be

    (line 3) $\ldots - Pr(\neg(B \rightarrow A))$
    (line 4) $\ldots - (1 - Pr(B \rightarrow A))$

    But it's more likely that I'm the confused one ;)

  8. Hi Albert,

    Isn't that ok?

    (3) $\dots - Pr(\neg(B \to A))$
    $\text{ }$ $\dots - (1 - Pr(B \to A))$
    (4) $\dots - 1 + Pr(B \to A)$


  9. Thanks Jeff, of course it is -- see, I told you, I'm the confused one (I should have used my brain and not just stubbornly applied the definition. But at this time of day, even definitions are hard ;).

    So what your argument shows is that if the probability of the Carnap sentence is high (and not 1), the probability of the theory cannot equal the probability of its ramsification. In my (totally naive and uninformed) opinion this makes perfect sense: I thought of a Carnap sentence with a probability < 1 implying that the ramsification of the theory is somewhat weaker than the theory itself and this makes ramsification more probable than the theory.

    Do you remember Hannes' claim and argument? If I remember correctly, he seemed somehow to get Pr(theory) = Pr(R(theory)) (also for Pr(theory) high), but I can't remember how he did it...

    Thanks again, Jeff!

    Good night,

  10. Hannes's talk also discussed models and logical probability, with a finite domain, but I cannot quite recall any details involved ... though the existence of distinct empirically equivalent models played some role.

