Sleeping Beauty – a “halfer” approach

If you read the last post on the Sleeping Beauty problem, you may recall I did not pledge allegiance to either the “halfer” or the “thirder” camp, because I was still thinking my position through. More than a month later, I still can’t say I am satisfied. Mathematically, the thirder position seems to be the most coherent, but intuitively, it doesn’t seem quite right.

Mathematically the thirder position works well because it is the same as a simpler problem. Imagine the director of the research lab drops in to see how things are going. The director knows all of the details of the Sleeping Beauty experiment, but does not know whether today is day one or two of the experiment. Looking in, she sees Sleeping Beauty awake. To what degree should she believe that the coin toss was Heads? Here there is no memory-wiping and the problem fits neatly into standard applications of probability and the answer is 1/3.
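The director’s calculation can be checked by brute-force enumeration. The sketch below assumes the director is equally likely to drop in on Monday or Tuesday, independently of the coin toss; that uniform choice and the function name are assumptions added for illustration, not part of the original problem statement.

```python
from fractions import Fraction
from itertools import product


def director_credence_heads():
    """P(Heads | director sees Beauty awake), by enumerating four cases.

    Assumption: the coin and the day of the visit are independent and
    each (coin, day) pair has probability 1/4.
    """
    quarter = Fraction(1, 4)
    awake_total = Fraction(0)
    awake_and_heads = Fraction(0)
    for coin, day in product(("Heads", "Tails"), ("Mon", "Tue")):
        # Beauty is awake on Monday always, and on Tuesday only after Tails.
        awake = (day == "Mon") or (coin == "Tails")
        if awake:
            awake_total += quarter
            if coin == "Heads":
                awake_and_heads += quarter
    return awake_and_heads / awake_total


print(director_credence_heads())  # 1/3
```

Three of the four equally likely cases have Beauty awake, and only one of those three involves Heads, which is where the 1/3 comes from.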

My intuitive difficulty with the thirder is better expressed with a more extreme version of the Sleeping Beauty problem. Instead of flipping the coin once, the experimenters flip the coin 19 times. If there are 19 tails in a row (which has a probability of 1 in 524,288), Sleeping Beauty will be woken 1 million times. Otherwise (i.e. if there was at least one Heads tossed), she will only be woken once. Following the standard argument of the thirders, when Sleeping Beauty is awoken and asked for her degree of belief that the coin tosses turned up at least one Heads, she should say approximately 1/3 (or more precisely, 524287/1524287). Intuitively, this doesn’t seem right. Notwithstanding the potential for 1 million awakenings, I would find it hard to bet against something that started off as a 524287/524288 chance. Surely when Sleeping Beauty wakes up, she would be quite confident that at least one Heads came up and she is in the single awakening scenario.
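For anyone who wants to check the arithmetic, here is a small sketch of the thirder-style calculation for this extreme version. It weights each outcome by its probability times the number of awakenings it produces; the function and parameter names are hypothetical.

```python
from fractions import Fraction


def thirder_credence_some_heads(tosses=19, wakings_if_all_tails=1_000_000):
    """Thirder credence in 'at least one Heads' on being woken."""
    p_all_tails = Fraction(1, 2 ** tosses)
    p_some_heads = 1 - p_all_tails
    # Weight each outcome by (probability) x (number of awakenings).
    weight_some_heads = p_some_heads * 1
    weight_all_tails = p_all_tails * wakings_if_all_tails
    return weight_some_heads / (weight_some_heads + weight_all_tails)


credence = thirder_credence_some_heads()
print(credence)         # 524287/1524287
print(float(credence))  # a little over 1/3
```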

Despite the concerns my intuition throws up, the typical thirder argues that Sleeping Beauty should assign 1/3 to Heads on the basis that she and the director have identical information. For example, here is an excerpt from a comment by RSM on the original post:

I want to know if halfers believe that two people with identical information about a problem, and with an identical set of priors, should assign identical probabilities to a hypothesis. I see the following possibilities:

  1. The answer is no –> could be a halfer (but not necessarily).
  2. The answer is yes, but the person holds that conditionalization is not a valid procedure –> could be a halfer.
  3. The answer is yes and the person accepts conditionalization, but does not accept that the priors for the four possibilities in the Sleeping Beauty puzzle should be equal –> could be a halfer.
  4. Otherwise, must be a thirder.

My intuition suggests, in a way I struggle to make precise, that Sleeping Beauty and the director do not in fact have identical information. All I can say is that Sleeping Beauty knows she will be awake on Monday (even if she subsequently forgets the experience), but the director may not observe Sleeping Beauty on Monday at all.

Nevertheless, option 2 raises interesting possibilities, ones that have been explored in a number of papers. For example, in D.J. Bradley’s “Self-location is no problem for conditionalization”, Synthese 182, 393–411 (2011), it is argued that learning about temporal information involves “belief mutation”, which requires a different approach to updating beliefs than “discovery” of non-temporal information, which makes use of conditionalisation.

All of this serves as a somewhat lengthy introduction to an interesting approach to the problem developed by Giulio Katis, who first introduced me to the problem. The Stubborn Mule may not be a well-known mathematical imprint, but I am pleased to be able to publish his paper, Sleeping Beauty, the probability of an experiment being in a state, and composing experiments, here on this site. In this post I will include excerpts from the paper, but I encourage those interested in a mathematical framing of a halfer’s approach to the problem to read it in full. I am sure that Giulio will welcome comments on the paper.

Giulio begins:

The view taken in this note is that the contention between halfers and thirders over the Sleeping Beauty (SB) problem arises primarily for two reasons. The first reason relates to exactly what experiment or frame of reference is being considered: the perspective of SB inside the experiment, or the perspective of an external observer who chooses to randomly inspect the state of the experiment. The second reason is that confusion persists because most thirders and halfers have not explicitly described their approach in terms of generally defining a concept such as “the probability of an experiment being in a state satisfying a property P conditional on the state satisfying property C”.

Here Giulio harks back to Bob Walters’ distinction between experiments and states. In the context of the Sleeping Beauty problem, the “experiment” is a full run from the coin toss through Monday and Tuesday, and a “state” is a particular point in the experiment. As an example, P could be the property that the coin toss was Heads and C the property that Sleeping Beauty is awake.

From here, Giulio goes on to describe two possible “probability” calculations. The first would be familiar to thirders and Giulio notes:

What thirders appear to be calculating is the probability that an external observer randomly inspecting the state of an experiment finds the state to be satisfying P. Indeed, someone coming to randomly inspect this modified SB problem (not knowing on what day it started) is twice as likely to find the experiment in the case where tails was tossed. This reflects the fact that the reference frame or ‘timeframe’ of this external observer is different to that of (or, shall we say, to that ‘inside’) the experiment they have come to observe. To formally model this situation would seem to require modelling an experiment being run within another experiment.

The halfer approach is then characterised as follows:

The halfers are effectively calculating as follows: first calculate for each complete behaviour of the experiment the probability that the behaviour is in a state satisfying property P; and then take the expected value of this quantity with respect to the probability measure on the space of behaviours of the experiment. Denote this quantity by ΠX(P) .

An interesting observation about this definition follows:

Note that even though at the level of each behaviour the ‘probability of being in a state satisfying P’ is a genuine probability measure, the quantity ΠX(P) is not in general a probability measure on the set of states of X . Rather, it is an expected value of such probabilities. Mathematically, it fails in general to be a probability measure because the normalization denominators n(p) may vary for each path. Even though this is technically not a probability measure, I will, perhaps wrongly, continue to call ΠX(P) a probability.

I think that this is an important observation. As I noted at the outset, the mathematics of the thirder position “works”, but typically halfers end up facing all sorts of nasty side-effects. For example, an incautious halfer may be forced to conclude that, if the experimenters tell Sleeping Beauty that today is Monday, then she should update her degree of belief that the coin toss came up Heads to 2/3. In the literature there are some highly inelegant attempts to avoid these kinds of conclusions. Giulio avoids these issues by embracing the idea that, for the Sleeping Beauty problem, something other than a probability measure may be more appropriate for modelling “credence”:

I should say at this point that, even though ΠX(P) is not technically a probability, I am a halfer in that I believe it is the right quantity SB needs to calculate to inform her degree of ‘credence’ in being in a state where heads had been tossed. It does not seem ΞX(P) [the thirders’ probability] reflects the temporal or behavioural properties of the experiment. To see this, imagine a mild modification of the SB experiment (one where the institute in which the experiment is carried out is under cost pressures): if Heads is tossed then the experiment ends after the Monday (so the bed may now be used for some other experiment on the Tuesday). This experiment now runs for one day less if Heads was tossed. There are two behaviours of the experiment: one we denote by pTails which involves passing through two states S1 = (Mon, Tails), S2 = (Tue, Tails); and the other we denote by pHeads which involves passing through one state S3 = (Mon, Heads). Let P = {S3}, which corresponds to the behaviour pHeads. That is, to say the experiment is in P is the same as saying it is in the behaviour pHeads. Note π(pHeads) = 1/2, but ΞX(P) = 1/3. So the thirders’ view is that the probability of the experiment being in the state corresponding to the behaviour pHeads (i.e. the probability of the experiment being in the behaviour pHeads) is actually different to the probability of pHeads occurring!
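A minimal Python sketch, not taken from the paper, may help make the two quantities concrete for this cost-pressure variant. It assumes that ΠX(P) is the expected per-behaviour fraction of states lying in P, and that ΞX(P) is the expected number of visits to P divided by the expected number of states visited; both formulas are a reading of the verbal descriptions quoted in this post, and the function names are illustrative.

```python
from fractions import Fraction

# Each behaviour: (probability of the behaviour, states it passes through).
BEHAVIOURS = {
    "pTails": (Fraction(1, 2), [("Mon", "Tails"), ("Tue", "Tails")]),
    "pHeads": (Fraction(1, 2), [("Mon", "Heads")]),
}


def halfer_pi(behaviours, prop):
    """Expected value, over behaviours, of the fraction of states in prop."""
    total = Fraction(0)
    for prob, states in behaviours.values():
        hits = sum(1 for s in states if s in prop)
        total += prob * Fraction(hits, len(states))
    return total


def thirder_xi(behaviours, prop):
    """Expected visits to prop divided by expected number of states visited."""
    expected_hits = Fraction(0)
    expected_visits = Fraction(0)
    for prob, states in behaviours.values():
        expected_hits += prob * sum(1 for s in states if s in prop)
        expected_visits += prob * len(states)
    return expected_hits / expected_visits


P = {("Mon", "Heads")}
print(halfer_pi(BEHAVIOURS, P))   # 1/2
print(thirder_xi(BEHAVIOURS, P))  # 1/3
```

Under these assumptions the halfer quantity agrees with π(pHeads) = 1/2, while the thirder quantity gives 1/3, which is the discrepancy the excerpt points to.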

This halfer “probability” has some interesting characteristics:

There are some consequences of the definition for ΠX(P) above that relate to what some thirders claim are inconsistencies in the halfers’ position (to do with conditioning). In fact, in the context of calculating such probabilities, a form of ‘interference’ can arise for the series composite of two experiments (i.e. the experiment constructed as ‘first do experiment 1, then do experiment 2’), which does not arise for the probabilistic join of two experiments (i.e. the experiment constructed as ‘with probability p do experiment 1, with probability 1-p do experiment 2’).

In a purely formal manner (and, of course, not in a deeper physical sense) this ‘non-locality’, and the importance of defining the starting and ending states of an experiment when calculating probabilities, reminds me of the interference of quantum mechanical experiments (as, say, described by Feynman in the gem of a book QED). I have no idea if this formal similarity has any significance at all or is completely superficial.

Giulio goes on to make an interesting conjecture about composition of Sleeping Beauty experiments:

We could describe this limiting case of a composite experiment as follows. You wake up in a room with a white glow. A voice speaks to you. “You have died, and you are now in eternity. Since you spent so much of your life thinking about probability puzzles, I have decided you will spend eternity mostly asleep and only be awoken in the following situations. Every Sunday I will toss a fair coin. If the toss is tails, I will wake you only on Monday and on Tuesday that week. If the toss is heads, I will only wake you on Monday that week. When you are awoken, I will say exactly the same words to you, namely what I am saying now. Shortly after I have finished speaking to you, I will put you back to sleep and erase the memory of your waking time.” The voice stops. Despite your sins, you can’t help yourself, and in the few moments you have before being put back to sleep you try to work out the probability that the last toss was heads. What do you decide it is?

In this limit, Giulio argues that a halfer progresses to the thirder position, assigning 1/3 to the probability that the last toss was heads!
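One way to see how such a limit could arise is sketched below. This is only an illustration under stated assumptions, not the construction in the paper: it treats n weeks run in series as a single experiment, takes a behaviour to be a sequence of n fair tosses, and computes the expected fraction of awakenings at which the most recent toss was heads. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def halfer_credence_last_toss_heads(weeks):
    """Expected fraction of awakenings that follow a Heads toss.

    Assumption: each week contributes 1 awakening after Heads and
    2 awakenings after Tails, and all toss sequences are equally likely.
    """
    total = Fraction(0)
    for heads in range(weeks + 1):
        tails = weeks - heads
        # Probability of any sequence with this many heads.
        p_heads_count = Fraction(comb(weeks, heads), 2 ** weeks)
        wakings = heads + 2 * tails
        total += p_heads_count * Fraction(heads, wakings)
    return total


for n in (1, 2, 10, 200):
    print(n, float(halfer_credence_last_toss_heads(n)))
```

With a single week this gives 1/2, the usual halfer answer; as the number of weeks grows, the value approaches 1/3 under these assumptions.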

These brief excerpts don’t do full justice to the framework Giulio has developed, but I do consider it a serious attempt to encompass all of the temporal/non-temporal, in-experiment/out-of-experiment subtleties that the Sleeping Beauty problem throws up. This paper is only for the mathematically inclined and, like so much written on this subject, I doubt it will convince many thirders, but if nothing else I hope it will put Giulio’s mind at rest having the paper published here on the Mule. Over recent weeks, his thoughts have been as plagued by this problem as have mine.

58 thoughts on “Sleeping Beauty – a “halfer” approach”

  1. Giulio Katis

    I am loath to get involved in this personal exchange, but as my name has come into it, I should clarify something. Yannis, the ‘thirders’ formula I wrote and which you describe as absurd is, as RSM has pointed out, the frequentist definition of Manley’s. For the record, I see nothing absurd with this probability measure per se: it says the likelihood of being in state x is greater than the likelihood of being in state y iff the expected number of times the experiment visits state x > the expected number of times the experiment visits state y. Of course, I don’t believe it is the right definition to use from the point of view of a subject inside an experiment, since you can construct simple Markov Chains where the frequentist measure is not consistent with the measure on the set of behaviours (in the precise sense that the frequentist probability of being in a subset of states that completely characterize a behaviour can be different from the probability of that behaviour occurring).

  2. Ioannis Mariolis

    @ Giulio Katis
    Although I have addressed my previous post to RSM, I don’t regard my discussion with him/her as a personal exchange of posts. I believe one can find interesting arguments on the main subject in them, and I would be very pleased if others commented on those arguments.
    I would not like to repeat my objections on why, in my opinion, it makes no sense to compute the probability “that an external observer randomly inspecting the state of an experiment finds the state to be satisfying P” using the formula Giulio associates with the thirders’ approach on the SB problem. You can find them in my post on October 3, 2014 at 11:53 pm (No. 14). In my opinion it all comes down to how one expects the random selection of the state (by an external observer) to take place, and the “thirders’” formula is suggesting (in the way I see it) an absurd way to do this selection. Although it is a very interesting topic, I am not sure whether we should start debating it extensively in this thread.

  3. RSM

    I have finished reading the Briggs article I linked to earlier. Here is a summary.

    After introducing the SB problem, Briggs defines a “halfer rule” and a “thirder rule” in mathematical terms. She then goes into a discussion about evidential decision theory vs. causal decision theory. I have to admit that this was a distinction I was previously unaware of. Essentially, expected utilities are calculated differently depending on whether or not one considers different observer instances to be causally linked.

    Briggs then proceeds to evaluate the halfer and thirder rules with respect to three criteria.

    The first criterion is the familiar Dutch book argument. Briggs shows that accepting or rejecting bets based on whether the bet is at least as good as “thirder odds” is the only sure way to avoid a Dutch book. But what does this mean? It turns out, she demonstrates, that thirders will bet at thirder odds iff they are causal decision theorists, and halfers will bet at thirder odds iff they are evidential decision theorists. Thirders who are evidential decision theorists, and halfers who are causal, will be susceptible to Dutch books — but not the same Dutch book arrangement in each case.

    Thus the desirability of the halfer rule or the thirder rule hinges on which type of decision theory one accepts. In my mind, this is analogous to factoring a number in two different ways, neither of which is unequivocally right or wrong: halfer x evidential = correct decision, or thirder x causal = correct decision.

    The second criterion Briggs investigates is that of scoring rules. For this, she favors the quadratic (Brier) scoring rule, although she suggests that any proper scoring rule will probably suffice. Again, the halfer and thirder rules are evaluated, under evidential and causal decision theories, with respect to minimizing the expected inaccuracy according to Brier scores. And once again, it turns out that (halfer rule x evidential decision theory) is equivalent in performance to (thirder rule x causal decision theory).

    So, it seems at this point that either rule is equally pragmatic, both with respect to avoidance of Dutch books and minimizing expected inaccuracy, provided one is using the decision theory that goes best with that rule.

    As a side note, I recall Ioannis, in his paper, mentioning that the halfer and thirder calculate betting odds differently but “coincidentally” reach the same result. Under Briggs’ analysis, the agreement is not at all coincidental, but follows inevitably from the choice of a decision theory to match one’s credence rule (halfer or thirder).

    Finally, Briggs looks at a third criterion, stability. And here she finds a difference between the halfer and thirder rule. For, according to her analysis, if irrelevant de dicto information is introduced (a la the Technicolor Beauty variation), the halfer rule may lead to a change in credence, while the thirder rule does not produce any change in credence. This conclusion seems to parallel my observation that the thirder rule preserves the notion that two agents with identical beliefs and priors should have identical credences, while the halfer rule forces the notion to be abandoned. So in the end, we have one criterion that favors the thirder rule, independent of decision theory, and two criteria that favor the thirder rule provided that we are causal decision theorists (suggesting that causal decision theory is to be preferred, as well).

    I did not check Ms. Briggs’ mathematical derivations, although perhaps I will do so at a future date if time permits. I also did not verify that she has given accurate mathematical formulations of causal and evidential decision theories. However, the logical reasoning appears to be sound, and if both of these mathematical concerns check out okay, then Briggs has found a mathematical justification of the thirder approach. To me, that is more satisfying than a purely philosophical or psychological justification (though to be sure, there is a philosophical assumption, but a reasonable one I think, that “stability” is itself a desirable thing).

  4. RSM

    Ioannis, I have not wanted to start a flame war here, nor am I trying to win an argument. I am just trying to give an exposition of my views. In doing so, I am finding it necessary to defend them against straw-man arguments. In particular, your characterization of the propositions I have abbreviated as M and Tu as not being mutually exclusive follows from an inaccurate paraphrase of what these symbols represent. In fact, I have always used these symbols to designate “It is Monday” and “It is Tuesday”, respectively, which are mutually exclusive propositions: At most one of them can be true in any given instance (the corner case of midnight between the two days can safely be treated as an irrelevant infinitesimal).

  5. Ioannis Mariolis

    RSM wrote:
    “Ioannis, I have not wanted to start a flame war here, nor am I trying to win an argument. I am just trying to give an exposition of my views. ”

    Dear RSM, I never thought otherwise, and the same applies to me. Our exchange of posts is all about giving an exposition of our views. However, our disagreement is based on some very subtle points. I agree with you that “It is Monday” and “It is Tuesday” are mutually exclusive propositions. Throughout the conducted experiment SB could either be on a Monday or on a Tuesday but never on both. However, in order for “It is Monday” and “It is Tuesday” to be considered as mutually exclusive events, you have to describe a situation that, when it arises, results in either “It is Monday” or “It is Tuesday”. The conducted experiment does not qualify as such a situation, since in case of Tails in the same trial you can first observe that “It is Monday” and the next day observe that “It is Tuesday”. In other words, it does not include a mechanism for selecting one of the two awakenings in case of Tails. This is why I have proposed SBRE, which includes by its definition such a mechanism and performs a random selection between “It is Monday” and “It is Tuesday” in case of Tails. Although such a situation never occurs in the real world, SB can use it to model her own situation upon awakening, since she is uncertain about the day of the week.
    For a better illustration of my point I would like to ask, how would you simulate the experiment you are using to compute P(Heads) upon awakening? Do you agree with Stubborn Mule’s position on how thirders’ simulation should be? Even in that case it is evident that a selection between “It is Monday” and “It is Tuesday” is made. The problem with that simulation is that the selection of the day is considered independent from the coin toss, which contradicts the fact that in case of Heads a Monday awakening is always selected. Thus, SBRE is not as irrelevant as you consider. It is just like the thirders’ simulation Stubborn Mule has presented, with the subtle difference that it takes into account the dependency between the day selection and the coin toss result.

    Regarding betting odds you commented:
    “As a side note, I recall Ioannis, in his paper, mentioning that the halfer and thirder calculate betting odds differently but “coincidentally” reach the same result. ”
    I would like to clarify that I explain why this happens. I characterize it as “coincidence”, because if one computes the expected gain for any other values of wager/payoff the two approaches produce different results and I argue that halfer’s approach estimates the correct gain. After all, one can simulate the betting setup and estimate the expected gain to test my analysis.

  6. Pingback: The Role of Cycles in Charting the Unknown

  7. vote for pedro

    I thought I would share my own interpretation. I think the problem is about a distinction between “believing/inferring” and “guessing/predicting”.

    I think it is helpful to start with “uncontroversial” assumptions. In particular, everyone agrees that
    p(Heads | I)=1/2
    before the experiment starts.

    Now the normal Bayesian process is to calculate p(Heads | D , I) where D is whatever data has been observed. So the question then becomes “what is the data D that has been observed?”
    I think the data is “awake on day 1 or day 2”. But this is already implied by the experiment, so p(D | I)=1, which means
    p(Heads | D , I)=p(Heads | I)=1/2

    But what about the gambling perspective that the thirders insist on? Well, we need to establish the decision framework for this problem. Suppose we adopt the decision that if asked, we will bet 1 dollar that it’s heads. If correct, we get ‘g’ dollars. What is the value for ‘g’ that has zero expected profit?

    If the coin was heads, then on day 1 we will have (g-1) profit. However, if the coin was tails, we will have (-2) profit. It is a 2, because we play/bet twice (once each day). Suppose we set p(Heads|D,I)=q. Then the expected profit is

    q(g – 1) – 2(1 – q)

    Setting this to zero and solving for g, we get

    g = 2/q – 1

    So, the “halfer” should set q=1/2, which gives g=3. But this is the “thirder” argument!
    I think this is the source of the paradox. We do not see the usual connection between fair bets and probability. Typically we would see the fair bet to be g=1/q, which for “halfers” means g=2.
    But this game has a random/uncertain cost to play, something not usually seen in betting. The betting framework is usually fixed cost, and uncertain gains. Properly working through the decision space makes it clear.
