Logics of Imprecise Comparative Probability

Yifeng Ding†, Wesley H. Holliday†, and Thomas F. Icard, III‡
† University of California, Berkeley and ‡ Stanford University

Preprint of April, 2021. Forthcoming in International Journal of Approximate Reasoning.

Abstract

This paper studies connections between two alternatives to the standard probability calculus for representing and reasoning about uncertainty: imprecise probability and comparative probability. The goal is to identify complete logics for reasoning about uncertainty in a comparative probabilistic language whose semantics is given in terms of imprecise probability. Comparative probability operators are interpreted as quantifying over a set of probability measures. Modal and dynamic operators are added for reasoning about epistemic possibility and updating sets of probability measures.

Keywords: imprecise probability, comparative probability, logic and probability.

1 Introduction

While the standard probability calculus remains the dominant formal framework for representing uncertainty across numerous disciplines, a small but significant tradition in philosophy, economics, computer science, and statistics has contended that the precision inherent in assigning "sharp" probabilities to uncertain events is often inappropriate. The reasons are several. One obvious concern is the psychological reality of arbitrarily precise real-valued judgments (Boole 1854; Keynes 1921; Koopman 1940; Good 1962; Suppes 1974). As Suppes (1974) expresses the concern, "Almost everyone who has thought about the problems of measuring beliefs in the tradition of subjective probability or Bayesian statistical procedures concedes some uneasiness with the problem of always asking for the next decimal of accuracy in the prior estimation of a probability" (p. 160). Another quite distinct concern is that even for a certain kind of idealized agent free of computational or representational limitations, in many important cases the available evidence somehow underdetermines the "right" probability function to have, and it would be epistemically unfitting to opt for any one of them (Carnap 1936; Levi 1974; Joyce 2005; Konek Forthcoming).

A number of alternative formal frameworks have been advanced (see, e.g., Halpern 2003). Our focus here is on two especially prominent alternatives. Some authors favor a sort of generalization of the probability calculus, allowing uncertainty to be measured by sets of probability functions (Good 1962; Levi 1974; Walley 1991; Seidenfeld et al. 2012; see Bradley 2019 for a philosophical overview). This imprecise probability framework retains many of the benefits of standard Bayesian representation and reasoning—indeed allowing the standard picture to emerge as a special case—while also affording a wider range of epistemic attitudes. Philosophical questions about imprecise probability have generated a great deal of discussion in recent years (see, e.g., Joyce 2005; Schoenfield 2012; Rinard 2013; Bradley and Steele 2014; Moss 2020).
A second line of work renounces the demand for explicit numerical judgments altogether, arguing that qualitative, especially comparative, judgments should be the primitive building blocks for the theory of uncertainty (Keynes 1921; Koopman 1940; Fine 1973; Hawthorne 2016; see Konek 2019 for a philosophical overview). Aside from being intuitively simpler and arguably closer to "ordinary" expressions of uncertainty, some authors have argued that this setting of comparative probability is perhaps uniquely suited to solving notable epistemic puzzles (Fine 1977; DiBella 2018; Eva 2019). Others have sought more ameliorative reconciliations between the quantitative and qualitative approaches so as to capitalize on the advantages of each (see, e.g., Suppes and Zanotti 1976 and Elliott 2020).

Our aim in this paper is neither to weigh in on the debate between precise and imprecise versions of probabilism, nor to adjudicate between the quantitative and the qualitative alternatives, but rather to shed light on the connections between them. Only quite recently have even the most basic questions about such connections been clarified (Ríos Insua 1992; Alon and Lehrer 2014; Alon and Heifetz 2014; Harrison-Trainor et al. 2016). This is of interest from all perspectives. If one takes sets of probability measures as primitive, it would nevertheless be desirable to understand some of the core qualitative commitments implicit in this representation, including how such commitments relate to those of precise probability and other frameworks. Most conspicuously, the generalization to sets of measures brings with it a rejection of the infamous comparability principle (also sometimes called opinionation or totality), according to which every two events ought to be compared in probability. Indeed, rejection of this principle has served as one of the primary arguments against precise probabilism. As Keynes (1921) expressed it a century ago:

    Is our expectation of rain, when we start out for a walk, always more likely than not, or less likely than not, or as likely as not? I am prepared to argue that on some occasions none of these alternatives hold, and that it will be an arbitrary matter to decide for or against the umbrella. If the barometer is high, but the clouds are black, it is not always rational that one should prevail over the other in our minds, or even that we should balance them. (p. 30)

Aside from the rejection of comparability, are there other differences between the precise and imprecise probabilistic frameworks that surface in this qualitative setting? Likewise, we can ask about various additional qualitative notions aside from the usual "weak" comparison 'at least as likely as'. For example, whereas the strict version of this judgment, 'more likely than', is easily definable in the precise setting in terms of weak comparison, this is no longer the case in the imprecise setting (see Section 2 below), raising new questions about the qualitative principles characterizing this distinctive kind of unanimity operator.

If, on the other hand, one takes qualitative judgments as primitive, this has the potential advantage of discarding principles forced upon us by (even imprecise) probabilistic representations. This may be desirable, e.g., if one is solely concerned with certain epistemic virtues such as maximizing accuracy (Fitelson and McCarthy 2014).
At the same time, there are also arguments that purport to show why an agent who maintains only comparative judgments would not want to violate qualitative probabilistic principles (Fishburn 1986; Fitelson and McCarthy 2014; Icard 2016). For example, suppose that we operationalize a judgment of the form 'A is more likely than B' in terms of a disposition to opt for a prospect that pays some positive dividend conditional on A over one that pays the same amount conditional on B. Moreover, suppose that satisfying this preference is worth some cost, while judgments of the form 'A and B are equally likely' engender no such disposition. Then one can show that an agent will be forced into choosing strictly dominated actions (worse than some other available option no matter how the world turns out) if and only if the agent's judgments fail to comport with any set of probability measures (Icard 2016). Arguments like these highlight the importance of gaining a better understanding of what compatibility of comparative judgments with imprecise probability means.

In the present paper we take a logical approach, studying a sequence of increasingly expressive qualitative formal systems, all interpreted over sets of probability measures. To illustrate the type of reasoning we would like to systematize, consider the following examples.

Example 1.1. A patient learns from her doctor of the existence of a gland in the human body and of a disease previously unknown to her.¹ The doctor informs her that if her gland is swollen, then it is more likely than not that she has the disease. Subsequently the patient's gland is examined, and she learns that it is swollen. As a result, she comes to think it is more likely than not that she has the disease.

¹ This example is inspired by van Benthem's (2011, p. 164, p. 166) example of the hypochondriac.

How should we model the patient's evolving uncertainty? A natural approach is to represent her relevant uncertainty using the following set of four possible states:

{⟨swollen, disease⟩, ⟨swollen, no disease⟩, ⟨not swollen, disease⟩, ⟨not swollen, no disease⟩}.

Initially, the patient knows nothing about the gland or the disease. We represent this ignorance using the set P of all probability measures on the state space above. Next, when her doctor informs her that if her gland is swollen, then it is more likely than not that she has the disease, we eliminate from her set of measures all measures except those for which the probability of disease conditional on a swollen gland is greater than the probability of no disease conditional on a swollen gland. This gives us a new set P′ of measures. Finally, when she has the gland examined and learns that it is swollen, we condition each measure in P′ on the information that the gland is swollen, giving us a final set P′′ of measures. All measures in P′′ give a higher probability to disease than no disease.
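To make the three stages concrete, here is a minimal Python sketch of the multi-measure update just described. The grid discretization (standing in for the full set of measures) and all names are our own illustration, not part of the formal development in Section 6.

```python
from itertools import product

# States: (gland swollen?, has disease?)
STATES = [(s, d) for s in (True, False) for d in (True, False)]
SWOLLEN = {s for s in STATES if s[0]}
DISEASE = {s for s in STATES if s[1]}

def grid_measures(steps=10):
    """A finite grid of probability measures on the four states,
    standing in for the set P of all measures."""
    ticks = [i / steps for i in range(steps + 1)]
    for probs in product(ticks, repeat=len(STATES)):
        if abs(sum(probs) - 1) < 1e-9:
            yield dict(zip(STATES, probs))

def condition(mu, event):
    """Condition mu on an event (a set of states); None if it has measure 0."""
    z = sum(p for s, p in mu.items() if s in event)
    if z == 0:
        return None
    return {s: (p / z if s in event else 0.0) for s, p in mu.items()}

P = list(grid_measures())  # initial ignorance: (a grid over) all measures

# The doctor's information: conditional on a swollen gland, disease is
# more likely than no disease.  (Measures giving the swollen gland zero
# probability are set aside here for simplicity.)
P1 = [mu for mu in P
      if (c := condition(mu, SWOLLEN)) is not None
      and sum(c[s] for s in DISEASE) > 0.5]

# The exam: she learns the gland is swollen; condition each measure.
P2 = [condition(mu, SWOLLEN) for mu in P1]

# Every measure in P'' gives disease higher probability than no disease.
assert all(sum(mu[s] for s in DISEASE) > 0.5 for mu in P2)
```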
How should one model the example using the standard representation of an agent's uncertainty with a single probability measure? First, the standard representation forces the agent to have sharp probabilities that her gland is swollen and that she has the disease, even when she just learns of their existence and knows nothing else about them. It also forces her to have a sharp conditional probability for having the disease conditional on her gland being swollen, before the doctor tells her anything about the connection between the two. Suppose she thinks that disease and no disease are equally likely conditional on her gland being swollen. What do we then do with her probability measure when the doctor informs her that if her gland is swollen, then it is more likely than not that she has the disease? One idea would be to replace her probability measure with the "closest" measure for which the conditional probability of disease given a swollen gland is greater than that of no disease given a swollen gland; but the existence of a unique closest such measure is clearly problematic. Another idea is that we must give up the simple state space above. Instead, we must use a complicated state space involving possibilities for what her doctor might say to her. On this approach, the patient must start out with a sharp conditional probability for having the disease conditional on her doctor uttering at time t the words "if your gland is swollen, then it is more likely than not that you have the disease." Assuming this conditional probability is greater than .5, it follows that conditional on the doctor not uttering those words at time t, the probability she assigns to having the disease will be less than .5. In order to allow that time t may pass in silence without the patient changing her probability for disease, we must introduce still further distinctions in the state space, beyond the distinction that the doctor may or may not utter the indicated words at t.

Though we will not argue that the modeling approach with a single probability measure is unworkable, in this paper we wish to explore the multi-measure approach sketched above. We will fully formalize the swollen gland example in Section 6.2. There we will even model the patient's becoming aware of the distinction between having a swollen gland and not having a swollen gland and of the distinction between having the disease and not having the disease, creating the state space and set P of measures above.

The next example is one in which it is essential to consider the possibilities for what an informant may say. It was made famous by vos Savant (1991) in the Monty Hall version of the puzzle posed by Selvin (1975). We will present the earlier but mathematically equivalent Three Prisoners version of the puzzle from Gardner (1959a; 1959b).

Example 1.2. The following is Diaconis and Zabell's (1986, p. 30) description of the Three Prisoners puzzle (also see Diaconis 1978 and Halpern 2003):

    Of three prisoners a, b, and c, two are to be executed, but a does not know which. He therefore says to the jailer, "Since either b or c is certainly going to be executed, you will give me no information about my own chances if you give me the name of one man, either b or c, who is going to be executed." Accepting this argument, the jailer truthfully replies, "b will be executed." Thereupon a feels happier because before the jailer replied, his own chance of execution was two-thirds, but afterward there are only two people, himself and c, who could be the one not executed, and so his chance of execution is one-half.

Under what conditions could a's reasoning possibly be sound? Imagine there are four relevant ways the world could be: w_{ab}, w_{ac}, w_{bc}, and w_{cb}, where in w_{ij} prisoner i is the one who lives and prisoner j is the one who the jailer says will be executed. Assuming that each prisoner is equally likely to be spared, we can assume w_{bc} and w_{cb} both have probability one-third, and the disjunction "w_{ab} or w_{ac}" has probability one-third.
Concerning the relative probability of w_{ab} and w_{ac}, we could apply a principle of indifference and proclaim that the jailer is equally likely to announce b or announce c, in case a is the one to be spared. It is then easy to compute that the conditional probability of being spared after learning that b will be executed (and thus w_{ac} and w_{bc} can be eliminated as possibilities) is still one-third. In this case a learns nothing from the jailer's announcement. By contrast, if for whatever reason a thinks the jailer is certain to tell him it is b who will be executed when a is the one to be spared, then learning b will be executed does rationally lead a to conclude that he now has a one-half chance of survival.

There is an intuition in this scenario that the right way to respond to the evidence is to leave the relative likelihood of w_{ab} and w_{ac} open: to represent a's uncertainty in terms of the set of all probability measures that assign one-third to each of w_{bc}, w_{cb}, and the disjunction "w_{ab} or w_{ac}". In this case the probabilities of w_{ab} and w_{ac} each range from zero to one-third, under the constraint that their sum is one-third. Updating each such measure by eliminating w_{ac} and w_{bc} results in a range of posterior probability values for a surviving, from zero to one-half. Thus, the probability that a is spared (the disjunction "w_{ab} or w_{ac}") has dilated (Walley 1991) from precisely one-third to the entire interval [0, 1/2].
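The dilation is easy to verify directly. In the following small Python sketch (our own illustration), each prior in the set is determined by t = µ(w_{ab}) ∈ [0, 1/3], and conditioning on the jailer's announcement spans the posterior interval [0, 1/2].

```python
# Each prior is fixed by t = mu(w_ab) in [0, 1/3]:
# mu(w_bc) = mu(w_cb) = 1/3 and mu(w_ac) = 1/3 - t.
def prior(t):
    return {"wab": t, "wac": 1/3 - t, "wbc": 1/3, "wcb": 1/3}

def posterior_survival(mu):
    """P(a lives | jailer names b): condition on {w_ab, w_cb}."""
    return mu["wab"] / (mu["wab"] + mu["wcb"])

ts = [i / 300 for i in range(101)]  # t ranges over [0, 1/3]
posts = [posterior_survival(prior(t)) for t in ts]

# Before the announcement, P(a lives) = mu(w_ab) + mu(w_ac) = 1/3 for every t.
# The indifference prior t = 1/6 yields posterior 1/3, as computed above:
print(posterior_survival(prior(1/6)))  # 0.333...
# But over the whole set of priors, the posterior dilates to [0, 1/2]:
print(min(posts), max(posts))          # 0.0 0.5
```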
Examples 1.1 and 1.2 illustrate some important aspects of imprecise probabilistic reasoning, which surface already in a purely qualitative setting. By the end of this paper, we will be able to formalize Examples 1.1 and 1.2 in a dynamic logic of updating imprecise comparative probability (Examples 6.5 and 6.21).

The outline of the paper is as follows. In Section 2, we consider the pure order-theoretic setting of comparative probability and prove a representation theorem extending previous results in the literature. The theorem concerns both a weak and a strict comparative relation together represented by a set of probability measures (Theorem 2.7). In Section 3, we turn to the logical setting and review some completeness theorems for logics of precise and imprecise probability with a single weak comparative relation (Theorems 3.7, 3.9). In Section 4, we consider a logical language that includes both weak and strict comparative relations and, using the representation in Theorem 2.7, prove a corresponding completeness theorem (Theorem 4.4). Section 5 explores the addition of a primitive "possibility" operator asserting the existence of a probability measure with a given property, culminating again in a complete axiomatization (Theorem 5.5), plus an analysis of complexity (Theorem 5.12). In Section 6, we turn to modeling the dynamics of learning. In Section 6.1, we add to our language an update operator whose semantics is given by a process of discarding from one's set of measures any measure assigning zero probability to the learned proposition and then conditioning the remaining measures on the learned proposition. With this we can model updating on pure comparative probability formulas (through the discarding part), as well as non-probabilistic (ontic) formulas (through the conditioning part) and mixed probabilistic-ontic formulas. The language also allows the formalization of basic comparative conditional probabilities. Yet we prove that the extended language is in fact no more expressive than the previous system from Section 5: the extended language can be completely axiomatized by a set of "reduction axioms" (Theorem 6.8). Finally, in Section 6.2, we add a second dynamic operator for becoming aware of a new proposition (recall how the patient becomes aware of the existence of the gland and disease in Example 1.1). When an agent becomes aware of a new proposition, we form a new state space by splitting each state in her old state space in two, one where the new proposition is true and the other where it is false, and we form a new set of probability measures by taking all measures on the new set of propositions that when restricted to just the old propositions coincide with some old measure. We show that this language is more expressive than our previous languages, allowing us to express any linear inequality with integer coefficients about the probability of formulas.

What emerges is a landscape of increasingly expressive logical systems, consistent with both precise and imprecise probabilistic representations, simple but sufficiently powerful to model sophisticated reasoning about uncertainty. Perhaps surprisingly, the computational complexity of reasoning (e.g., determining validity or consistency) in each of the "static" systems is no worse than for the classical propositional calculus. The complexity of reasoning in the dynamic logic of updating sets of probability measures is an open problem, as is the complexity and axiomatization of the dynamic logic of becoming aware.

2 Representation

Before introducing any explicit logical calculus, in this section we consider the pure order-theoretic setting of comparative probability. A comparative notion of probability is most naturally formalized as a binary relation on an algebra of events. However, not all binary relations can be intuitively interpreted as comparing how likely events are, just as not all functions from events to [0, 1] can be interpreted as assigning quantitative probabilities. Taking the usual axiomatization of quantitative probability for granted, a natural question—posed early on by de Finetti (1949)—is what would be a set of axioms that are intuitive and in harmony with those quantitative axioms. This question was first solved for finite event algebras by Kraft et al. (1959). Given a binary relation % on ℘(W), where W is a finite set, and a probability measure µ on ℘(W), we say that % is precisely represented by µ if for all X, Y ⊆ W, X % Y iff µ(X) ≥ µ(Y).

Theorem 2.1 (Kraft et al. 1959). Let W be a nonempty finite set and % a binary relation on ℘(W). Then % is precisely represented by some probability measure on ℘(W) if and only if:

• it is not the case that ∅ % W; {w} % ∅ for all w ∈ W; and for all A, B ∈ ℘(W), A % B or B % A; and

• % satisfies the finite cancellation condition (FC): letting 1_X denote the characteristic function of X, for any two finite sequences ⟨A_i⟩_{i=1}^n, ⟨B_i⟩_{i=1}^n of events in ℘(W) such that ∑_{i=1}^n 1_{A_i} = ∑_{i=1}^n 1_{B_i} (additions are done in the vector space ℝ^W), if for all i < n, A_i % B_i, then B_n % A_n.
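To see (FC) in action, note that it already yields the transitivity of %: if A % B and B % C, apply (FC) with n = 3 to the sequences ⟨A, B, C⟩ and ⟨B, C, A⟩. The required identity of characteristic functions holds trivially, since 1_A + 1_B + 1_C = 1_B + 1_C + 1_A, and the comparisons A_1 % B_1 and A_2 % B_2 are just the assumptions A % B and B % C; so (FC) delivers B_3 % A_3, i.e., A % C.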
Following the same paradigm, we can consider a comparative notion of imprecise probability and ask the following question: which binary relations on a finite algebra of events can be naturally interpreted as an imprecise version of the at-least-as-likely-as relation? More precisely, given a binary relation % on ℘(W), where W is a finite set, and a set P of probability measures on ℘(W), we say that % is imprecisely represented as the weak relation by P if for all X, Y ⊆ W, X % Y iff for all µ ∈ P, µ(X) ≥ µ(Y). The following analogue of Theorem 2.1 was proved by Ríos Insua (1992) (also see Alon and Lehrer 2014).

Theorem 2.2 (Ríos Insua 1992). Let W be a nonempty finite set and % a binary relation on ℘(W). Then % is imprecisely represented as the weak relation by some set P of probability measures on ℘(W) if and only if:

• it is not the case that ∅ % W, and {w} % ∅ for all w ∈ W, and

• % satisfies the generalized finite cancellation condition (GFC): for any two finite sequences ⟨A_i⟩_{i=1}^n, ⟨B_i⟩_{i=1}^n of events in ℘(W) and k ∈ ℕ \ {0} such that ∑_{i=1}^{n−1} 1_{A_i} + k·1_{A_n} = ∑_{i=1}^{n−1} 1_{B_i} + k·1_{B_n}, if for all i < n, A_i % B_i, then B_n % A_n.²

² Note that n can be 1, in which case the condition simply expresses the reflexivity of %.

Remark 2.3. Harrison-Trainor et al. (2016) prove that there are relations % satisfying the conditions of Theorem 2.1 except for the comparability principle (that for all A, B ∈ ℘(W), A % B or B % A) and which fail to satisfy the GFC condition in Theorem 2.2. Thus, it is necessary to strengthen FC to GFC when dropping comparability to obtain Theorem 2.2.

A subtlety not covered by Theorem 2.2 is that given a set P of probability measures, there are two natural ways to generate a strict relation, corresponding to the strict and the weak dominance relation in game theory:

• X strictly dominates Y in P iff for all µ ∈ P, µ(X) > µ(Y);

• X weakly dominates Y in P iff for all µ ∈ P, µ(X) ≥ µ(Y), and there is a µ ∈ P such that µ(X) > µ(Y).

When % is represented as the weak relation by P, it is easy to see that X weakly dominates Y iff X % Y but not Y % X. However, we cannot pin down the strict dominance relation simply from the weak relation % or vice versa, as shown by the following example.

Example 2.4. Let W = {w, v} and consider the four binary relations %_1, %_2, ≻_1, ≻_2 pictured below from left to right (for dashed arrows, reflexive and transitive arrows are omitted; for solid arrows, transitive arrows are omitted).

[Figure: four Hasse-style diagrams of the relations %_1, %_2, ≻_1, ≻_2 on the events ∅, {w}, {v}, W.]

If all we know about a set P of probability measures on ℘(W) is that its weak relation is %_1, then both ≻_1 and ≻_2 may be P's strict dominance relation. For example, we can define a probability measure µ_{w<v} on ℘(W) that favors v so that µ_{w<v}({w}) = 1/3. Then let µ_{w=v} be the uniform distribution on ℘(W): µ_{w=v}({w}) = µ_{w=v}({v}) = 1/2. Then for both {µ_{w<v}, µ_{w=v}} and {µ_{w<v}}, their weak relation is %_1. Yet the strict dominance relation of the former is ≻_1 while the strict dominance relation of the latter is ≻_2.

Similarly, if all we know about P is that its strict dominance relation is ≻_1, then both %_1 and %_2 may be its weak relation. For this, define a probability measure µ_{w>v} that favors w so that µ_{w>v}({w}) = 2/3. Then we see that the strict dominance relation of both {µ_{w<v}, µ_{w=v}} and {µ_{w<v}, µ_{w>v}} is ≻_1 while the weak relation of the former is %_1 and the weak relation of the latter is %_2.
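Both dominance relations are straightforward to compute for finite sets of measures. The following Python sketch (our own illustration) verifies the first claim of Example 2.4: the two sets of measures share the weak relation but differ in their strict dominance relations.

```python
from itertools import chain, combinations

W = ["w", "v"]
EVENTS = [frozenset(s) for s in chain.from_iterable(
    combinations(W, r) for r in range(len(W) + 1))]

def mass(mu, X):
    return sum(mu[x] for x in X)

def weak(P):
    """X % Y iff every measure in P gives X at least as much mass as Y."""
    return {(X, Y) for X in EVENTS for Y in EVENTS
            if all(mass(mu, X) >= mass(mu, Y) for mu in P)}

def strict(P):
    """X strictly dominates Y iff every measure in P gives X strictly more."""
    return {(X, Y) for X in EVENTS for Y in EVENTS
            if all(mass(mu, X) > mass(mu, Y) for mu in P)}

mu_lt = {"w": 1/3, "v": 2/3}  # mu_{w<v}: favors v
mu_eq = {"w": 1/2, "v": 1/2}  # mu_{w=v}: uniform

assert weak([mu_lt, mu_eq]) == weak([mu_lt])      # same weak relation
assert strict([mu_lt, mu_eq]) != strict([mu_lt])  # different strict relations
```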
In light of these considerations, we introduce the following definition that accounts for both relations; cf. Konek (2019, p. 275, footnote 4), who suggests that the study of comparative probability ought to start with pairs ⟨%, ≻⟩ because an agent who judges that X is at least as likely as Y but withholds judgment about whether Y is at least as likely as X does not necessarily judge that X is strictly more likely than Y.

Definition 2.5. Given a pair ⟨%, ≻⟩ of binary relations on ℘(W) and a set P of probability measures on ℘(W), we say that ⟨%, ≻⟩ is represented by P iff for all X, Y ⊆ W,

• X % Y iff for all µ ∈ P, µ(X) ≥ µ(Y), and

• X ≻ Y iff for all µ ∈ P, µ(X) > µ(Y).

Remark 2.6. Define X < Y as not Y ≻ X, i.e., there is some µ ∈ P such that µ(X) ≥ µ(Y) (cf. the notion of justifiable preference in Lehrer and Teper 2011). Then the pair ⟨%, <⟩ of weak relations is what Giarlotta and Greco (2013) call a necessary and possible preference.

The following theorem characterizes the representable relation pairs.

Theorem 2.7. Let W be a nonempty finite set and %, ≻ two binary relations on ℘(W). Then ⟨%, ≻⟩ is represented by a set P of probability measures on ℘(W) if and only if:

• ≻ is irreflexive and ≻ ⊆ %;

• W ≻ ∅, and {w} % ∅ for all w ∈ W;

• % satisfies (GFC) and ≻ satisfies the strict generalized finite cancellation condition (SGFC): for any two finite sequences ⟨A_i⟩_{i=1}^n, ⟨B_i⟩_{i=1}^n of events in ℘(W) and k ∈ ℕ \ {0} such that ∑_{i=1}^{n−1} 1_{A_i} + k·1_{A_n} = ∑_{i=1}^{n−1} 1_{B_i} + k·1_{B_n}, if for all i < n, A_i % B_i and there is i < n with A_i ≻ B_i, then B_n ≻ A_n.

The rest of this section is devoted to the proof of Theorem 2.7. The proof is adapted from the proof of Theorem 2.2 above in Alon and Lehrer 2014, which also generalizes the proof in Scott 1964 for Theorem 2.1 (also see Mierzewski 2018, § 3.3 for a representation theorem concerning ⟨%, ≻⟩ in the setting of precise probability). For this, pick a nonempty finite set W and a pair ⟨%, ≻⟩ satisfying the conditions (the necessity of the conditions is easy). The main strategy is to reframe the representability of ⟨%, ≻⟩ in terms of the existence of solutions to some systems of homogeneous linear inequalities in the vector space ℝ^W. Hence we use vectors in

∆(W) = {µ ∈ ℝ^W | µ · 1_W = 1 and for all w ∈ W, µ(w) ≥ 0}

as probability measures. Define

D_% = {1_A − 1_B | A, B ⊆ W, A % B} and D_≻ = {1_A − 1_B | A, B ⊆ W, A ≻ B}.

Intuitively, D_% contains vectors that always receive non-negative measures and D_≻ contains vectors that always receive positive measures. Given the conditions satisfied by % and ≻, we can prove the following lemmas.

Lemma 2.8. If f ∈ {−1, 0, 1}^W is a non-negative linear combination of vectors in D_%, then f ∈ D_%.

Proof. Suppose f ∈ {−1, 0, 1}^W is a non-negative linear combination of vectors in D_%. Since all the vectors are in {−1, 0, 1}^W, we can assume that all coefficients are rational, since a system of linear inequalities with rational coefficients has a solution if and only if it has a rational solution. Then we can clear the denominators and obtain a k ∈ ℕ \ {0} such that kf is simply a sum ∑_{i=1}^n g_i of vectors in D_%, possibly with repetitions. Since f is in {−1, 0, 1}^W and the g_i's are in D_%, we can find subsets A_i, B_i of W for i = 1 . . . n + 1 such that

• g_i = 1_{A_i} − 1_{B_i} for i = 1 . . . n and f = 1_{B_{n+1}} − 1_{A_{n+1}} (take B_{n+1} = f⁻¹(1) and A_{n+1} = f⁻¹(−1)), and

• A_i % B_i for i = 1 . . . n.

Then given that kf = ∑_{i=1}^n g_i, we have ∑_{i=1}^n 1_{A_i} + k·1_{A_{n+1}} = ∑_{i=1}^n 1_{B_i} + k·1_{B_{n+1}}. Hence we can apply (GFC) to ⟨A_i⟩_{i=1}^{n+1} and ⟨B_i⟩_{i=1}^{n+1} and see that B_{n+1} % A_{n+1}. Therefore, f = 1_{B_{n+1}} − 1_{A_{n+1}} ∈ D_%.

Lemma 2.9. If f ∈ {−1, 0, 1}^W is a non-negative linear combination of vectors in D_% ∪ D_≻ with a coefficient for a vector in D_≻ being positive, then f ∈ D_≻.

Proof. Similar to the proof of the previous lemma.
The only change in this case is that when we find k and express kf as a sum of vectors in D_% ∪ D_≻, at least one vector in D_≻ must figure in the sum, since initially the non-negative linear combination resulting in f has a positive coefficient for a vector in D_≻. Then we can find sets A_i's and B_i's similarly and apply (SGFC) to see that f must be in D_≻ already.

Now define P = {µ ∈ ∆(W) | ∀f ∈ D_%, µ · f ≥ 0 and ∀f ∈ D_≻, µ · f > 0}. Our goal is to show that ⟨%, ≻⟩ is represented by this P. Note that one direction is done already: for any A, B ⊆ W,

• if A % B, then by the definition of P, for all µ ∈ P, µ · (1_A − 1_B) ≥ 0, which means that µ · 1_A ≥ µ · 1_B;

• similarly, if A ≻ B, then for all µ ∈ P, µ · (1_A − 1_B) > 0, which means that µ · 1_A > µ · 1_B.

Hence it remains to prove the following two claims:

(a) If not A % B, then there is a µ ∈ P such that µ · (1_A − 1_B) < 0;

(b) If not A ≻ B, then there is a µ ∈ P such that µ · (1_A − 1_B) ≤ 0.

For (a), it is enough to prove that for all f ∈ {−1, 0, 1}^W, if f ∉ D_%, then there is µ ∈ P such that µ · −f > 0, since for any A, B ⊆ W, we have 1_A − 1_B ∈ {−1, 0, 1}^W. Hence take such an f ∈ {−1, 0, 1}^W \ D_%. We need to find a µ such that (i) µ ∈ P and (ii) µ · −f > 0. Given the definition of P, this amounts to the existence of a solution to the following system of homogeneous linear inequalities (where we write [D] for the matrix containing as columns the vectors in a set D of vectors):

[D_%]^⊤ x ≥ 0, [D_≻ ∪ {−f}]^⊤ x > 0.   (1)

The existence of a µ satisfying (i) and (ii) is equivalent to the existence of a solution to the above system of inequalities because, by assumption, W ≻ ∅ and {w} % ∅ for all w ∈ W, which means that 1_W ∈ D_≻ and 1_{{w}} ∈ D_% for all w ∈ W, so any solution can be scaled to be an element in P. A criterion for the existence of a solution is given by a special case of Motzkin's Transposition Theorem (see Motzkin 1951).

Theorem 2.10 (Motzkin's Transposition Theorem). The linear inequality system M_1 x ≥ 0, M_2 x > 0 has a solution if and only if there is no solution to the system M_1^⊤ y_1 + M_2^⊤ y_2 = 0, y_1 ≥ 0, y_2 ≥ 0, y_2 ≠ 0.

Suppose toward a contradiction that there is no solution to (1). Then by Motzkin's Transposition Theorem, there are non-negative y_1, y_2 with y_2 non-trivial such that [D_%] y_1 + [D_≻ ∪ {−f}] y_2 = 0. In other words, 0 is a non-negative linear combination of vectors in D_% ∪ D_≻ ∪ {−f} with one of the vectors in D_≻ ∪ {−f} having a positive coefficient. Now there are two possibilities: either −f has a positive coefficient or not. If not, then 0 is a non-negative linear combination of vectors in D_% ∪ D_≻ with a vector in D_≻ having a positive coefficient. Then, by Lemma 2.9, 0 ∈ D_≻. This contradicts the assumption that ≻ is irreflexive. If −f has a positive coefficient, then f is a non-negative linear combination of vectors in D_% ∪ D_≻ = D_%, since ≻ ⊆ %. By Lemma 2.8, f ∈ D_%, but we picked f specifically from outside D_%. Hence, either way, we have a contradiction. This completes the proof of (a).
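Systems like (1) can also be decided computationally. Here is a minimal Python sketch, assuming scipy is available (the encoding and all names are our own illustration, not part of the proof): since the system is homogeneous, we may bound the variables and maximize a slack ε beneath the strict inequalities.

```python
import numpy as np
from scipy.optimize import linprog

def strictly_feasible(D_weak, D_strict):
    """Is there an x with d . x >= 0 for all rows d of D_weak and
    d . x > 0 for all rows d of D_strict?"""
    n = D_weak.shape[1] if len(D_weak) else D_strict.shape[1]
    # Variables (x_1, ..., x_n, eps); objective: maximize eps.
    c = np.zeros(n + 1)
    c[-1] = -1.0
    A_ub, b_ub = [], []
    for d in D_weak:                   # -d . x <= 0
        A_ub.append(np.append(-d, 0.0))
        b_ub.append(0.0)
    for d in D_strict:                 # -d . x + eps <= 0, i.e. d . x >= eps
        A_ub.append(np.append(-d, 1.0))
        b_ub.append(0.0)
    bounds = [(-1, 1)] * n + [(0, 1)]  # harmless: the system is homogeneous
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.status == 0 and -res.fun > 1e-9

# Vectors 1_A - 1_B over W = {w1, w2}; f = 1_{{w1}} - 1_{{w2}}.
D_weak = np.array([[1.0, 0.0], [0.0, 1.0]])     # {w1} % empty, {w2} % empty
D_strict = np.array([[1.0, 1.0], [-1.0, 1.0]])  # W strictly above empty; -f
print(strictly_feasible(D_weak, D_strict))
# True: a measure with all mass on w2 makes {w2} strictly more likely than {w1}.
```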
The proof of (b) is almost identical. It is enough to show that for any f ∈ {−1, 0, 1}^W \ D_≻, the following system has a solution:

[D_% ∪ {−f}]^⊤ x ≥ 0, [D_≻]^⊤ x > 0.

If there is no solution, then by Motzkin's Transposition Theorem, 0 is a non-negative linear combination of vectors in D_% ∪ {−f} ∪ D_≻ with at least one vector in D_≻ having a positive coefficient. Again, we consider whether −f has a positive coefficient or not. If not, then 0 should again be in D_≻, which is not the case. If indeed −f has a positive coefficient, then f is a non-negative linear combination of vectors in D_% ∪ D_≻ with at least one vector in D_≻ having a positive coefficient. By Lemma 2.9, f ∈ D_≻, contradicting the way we picked f. Hence (b) is also proved, which completes the proof of Theorem 2.7.

Remark 2.11. The sets D_% and D_≻ used in the proof above are reminiscent of an alternative but also prominent way of modelling uncertainty in the imprecise probability literature: sets of desirable gambles (see Walley 2000 and chapters in Augustin et al. 2014 for introductions). For any event A ⊆ W, we may interpret it as a gamble that returns a unit of utility for the states in A and returns nothing for states outside A. In other words, we can understand comparing the likelihoods of two events A and B as comparing the two corresponding gambles 1_A and 1_B, which in turn reduces to the question of whether the gamble 1_A − 1_B is acceptable/desirable. However, there are two important differences between our setting and the desirable gambles approach commonly presented in the literature.

First, since we are only comparing propositions, we do not need to appeal to the desirability of gambles not in {−1, 0, 1}^W. In the literature on desirable gambles, all gambles in ℝ^W are considered, and that is partly the reason for succinct axioms for coherent sets of desirable gambles such as closure under positive scaling and pairwise addition. The same cannot be done when restricting to {−1, 0, 1}^W, since for example 1_W + 1_W is no longer in {−1, 0, 1}^W. Also, it is not hard to see that different coherent sets of desirable gambles can have the same intersection with {−1, 0, 1}^W. This means that using sets of desirable gambles in ℝ^W, we may encode more information than needed for comparing propositions.

Second, we model an agent's uncertainty with a pair of binary relations, and hence when translated to sets of desirable gambles, we use a pair of sets of desirable gambles instead of a single one. This can be easily seen from the proof above: we constructed a pair ⟨D_%, D_≻⟩ from ⟨%, ≻⟩. If we disregard the previous difference and consider all gambles in ℝ^W, our approach can be understood as generalizing representation by a single set of almost desirable gambles (using the terminology in Couso and Moral 2011) by pairing it with another set of gambles that can be interpreted as strictly desirable gambles. However, the axiomatic requirement for this set is weaker than the requirement for "sets of strictly desirable gambles" in Couso and Moral 2011. More importantly, our axiomatic requirement concerns two sets jointly, as can be seen from Lemma 2.9. In this way, we achieve greater generality (expressivity) than merely using a set of almost desirable gambles. We leave further comparison between these two approaches to imprecise probability for future work.

3 The Logic IP(%)

In this section and the following sections, we turn to the formalization of imprecise comparative probabilistic reasoning in logical systems. The representation theorems of Section 2 lead to completeness theorems for these logical systems. The logics we consider form a hierarchy of increasing expressive power of their languages. The least expressive language we will consider is the following.

Definition 3.1. The language L(%), generated from a nonempty set Prop of propositional variables, is defined by the following grammar:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ)

where p ∈ Prop. A propositional (or Boolean) formula is a formula generated from Prop using only ¬ and ∧.
We define the other propositional connectives ∨, →, ↔, ⊤, and ⊥ as usual. Finally, we define ϕ ⋗ ψ as (ϕ % ψ) ∧ ¬(ψ % ϕ) and ϕ ≈ ψ as (ϕ % ψ) ∧ (ψ % ϕ).

We will consider several semantics for this language, each of which builds on the standard possible world models for propositional logics.

Definition 3.2. A propositional model is a pair M = ⟨W, V⟩ where W is a nonempty set and V : Prop → ℘(W). We may abuse notation and write 'w ∈ M' to mean w ∈ W.

The first semantics for L(%) that we will consider, which may be considered its "intended semantics," equips a propositional model with one or more probability measures, as follows.

Definition 3.3. An imprecise probabilistic model (IP model) is a pair ⟨M, P⟩ where M = ⟨W, V⟩ is a propositional model and P is a set of finitely additive probability measures on a field F of subsets of W such that V(p) ∈ F for each p ∈ Prop. A precise probabilistic model is an imprecise probabilistic model ⟨M, P⟩ such that |P| = 1.

The key part of the truth definition of formulas of L(%) in IP models matches the notion of imprecise representation from Section 2: ϕ % ψ is true just in case according to all the probability measures in P, the probability of the set of worlds where ϕ is true is at least as great as the probability of the set of worlds where ψ is true.

Definition 3.4. Given an IP model ⟨M, P⟩, w ∈ M, and ϕ ∈ L(%), we define M, P, w ⊨ ϕ and ⟦ϕ⟧_{M,P} = {w ∈ W | M, P, w ⊨ ϕ} as follows:

1. M, P, w ⊨ p iff w ∈ V(p);
2. M, P, w ⊨ ¬ϕ iff M, P, w ⊭ ϕ;
3. M, P, w ⊨ (ϕ ∧ ψ) iff M, P, w ⊨ ϕ and M, P, w ⊨ ψ;
4. M, P, w ⊨ ϕ % ψ iff for all µ ∈ P, µ(⟦ϕ⟧_{M,P}) ≥ µ(⟦ψ⟧_{M,P}).

If α is a propositional formula, we may write 'V(α)' for ⟦α⟧_{M,P} to emphasize that the set of worlds where α is true does not depend on the set P of probability measures. Finally, given a class K of IP models, ϕ is valid with respect to K iff for all ⟨M, P⟩ ∈ K and w ∈ M, we have M, P, w ⊨ ϕ.

An easy induction shows that for any formula ϕ, the set of worlds where ϕ is true belongs to the algebra F of measurable sets.

Lemma 3.5. For every IP model ⟨M, P⟩ and ϕ ∈ L(%), we have ⟦ϕ⟧_{M,P} ∈ F.
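As a concrete illustration of Definition 3.4, here is a minimal Python model checker for L(%) over a finite IP model; the tuple encoding of formulas and all names are our own illustration.

```python
# Formulas: "p" (atom), ("not", f), ("and", f, g), ("geq", f, g) for f % g.
def extension(phi, worlds, V, P):
    """The set of worlds where phi is true, following Definition 3.4."""
    if isinstance(phi, str):                       # propositional variable
        return {w for w in worlds if w in V.get(phi, set())}
    op = phi[0]
    if op == "not":
        return worlds - extension(phi[1], worlds, V, P)
    if op == "and":
        return extension(phi[1], worlds, V, P) & extension(phi[2], worlds, V, P)
    if op == "geq":                                # phi % psi is a global formula
        X = extension(phi[1], worlds, V, P)
        Y = extension(phi[2], worlds, V, P)
        holds = all(sum(mu[w] for w in X) >= sum(mu[w] for w in Y) for mu in P)
        return set(worlds) if holds else set()
    raise ValueError(op)

# Two worlds; p true at w1 only; two measures that disagree about p vs. not-p.
worlds = frozenset({"w1", "w2"})
V = {"p": {"w1"}}
P = [{"w1": 0.3, "w2": 0.7}, {"w1": 0.6, "w2": 0.4}]

# Neither p % not-p nor not-p % p holds: comparability fails in this IP model.
print(extension(("geq", "p", ("not", "p")), worlds, V, P))  # set()
print(extension(("geq", ("not", "p"), "p"), worlds, V, P))  # set()
```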
Below we define logics that are sound and complete with respect to the classes of imprecise probabilistic models and precise probabilistic models, respectively. To do so, we first need to define a syntactic abbreviation that allows us to express the finite cancellation condition of Theorem 2.1 using formulas of our language. Given formulas ϕ_1, . . . , ϕ_n, ψ_1, . . . , ψ_n ∈ L(%) and 0 ≤ k ≤ n, define C_k to be the disjunction of all conjunctions

f_1 ϕ_1 ∧ · · · ∧ f_n ϕ_n ∧ g_1 ψ_1 ∧ · · · ∧ g_n ψ_n

where exactly k of the f's and k of the g's are the empty string, and the rest are ¬. Thus, C_k is true at a state w ∈ W iff exactly k of the ϕ's and k of the ψ's are true at w. Then let

(ϕ_1, . . . , ϕ_n) ≡ (ψ_1, . . . , ψ_n) := C_0 ∨ C_1 ∨ · · · ∨ C_n,

which is true at a state w ∈ W iff the number of ϕ's true at w is exactly the same as the number of ψ's true at w. Using these abbreviations, we can express the finite cancellation condition with the axiom schema (A4) below.

Definition 3.6. The set of theorems of SP(%) (the logic of sharp probability) is the smallest subset of L(%) that contains all tautologies of propositional logic, is closed under modus ponens (if ϕ ∈ SP(%) and ϕ → ψ ∈ SP(%), then ψ ∈ SP(%)) and necessitation (if ϕ ∈ SP(%), then ϕ % ⊤ ∈ SP(%)), and contains all instances of the following axiom schemas for all n ∈ ℕ:³

(A0) (ϕ % ψ) ∨ (ψ % ϕ);
(A1) ϕ % ⊥;
(A2) ϕ % ϕ;⁴
(A3) ¬(⊥ % ⊤);
(A4) ((ϕ_1 % ψ_1) ∧ · · · ∧ (ϕ_n % ψ_n) ∧ (((ϕ_1, . . . , ϕ_n, ϕ′) ≡ (ψ_1, . . . , ψ_n, ψ′)) % ⊤)) → (ψ′ % ϕ′);
(A5) (ϕ % ψ) → ((ϕ % ψ) % ⊤);
(A6) ¬(ϕ % ψ) → (¬(ϕ % ψ) % ⊤).

³ The labeling of axioms here follows Alon and Heifetz 2014.
⁴ Axiom (A2) is redundant given (A0), but below we consider a logic that drops (A0). In fact, (A2) is also derivable from the n = 0 case of (A4) and (A4′), but we include (A2) to match Alon and Heifetz 2014.

The representation result in Theorem 2.1 may be used to prove the following completeness theorem for SP(%).

Theorem 3.7 (Segerberg 1971; Gärdenfors 1975). For all ϕ ∈ L(%): ϕ is a theorem of SP(%) if and only if ϕ is valid with respect to the class of all precise probabilistic models.

To obtain a complete logic for imprecise probabilistic models, we express the generalized finite cancellation conditions of Theorem 2.2 using formulas of our language as follows.

Definition 3.8. The logic IP(%) (the logic of imprecise probability) is defined in the same way as SP(%) except without axiom (A0) and with (A4) replaced by the following schema, for all k ∈ ℕ \ {0}, where ϕ′ and ψ′ each occur k times in the tuples:

(A4′) ((ϕ_1 % ψ_1) ∧ · · · ∧ (ϕ_n % ψ_n) ∧ (((ϕ_1, . . . , ϕ_n, ϕ′, . . . , ϕ′) ≡ (ψ_1, . . . , ψ_n, ψ′, . . . , ψ′)) % ⊤)) → (ψ′ % ϕ′).

The representation result in Theorem 2.2 may be used to prove the following completeness theorem for IP(%).

Theorem 3.9 (Alon and Heifetz 2014). For all ϕ ∈ L(%): ϕ is a theorem of IP(%) if and only if ϕ is valid with respect to the class of all imprecise probabilistic models.

In Section 4 we will give a completeness proof that shows how the proof of Theorem 3.9 goes as well.

4 The Logic IP(%, ≻)

Our first step beyond the existing literature on logics of imprecise comparative probability is to add to our formal language the primitive strict operator ≻ from Section 2.

Definition 4.1. The language L(%, ≻) is defined by the following grammar:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ)

where p ∈ Prop. As before, we define ϕ ⋗ ψ as (ϕ % ψ) ∧ ¬(ψ % ϕ). Let L(≻) be the fragment of L(%, ≻) in which % does not occur.

Definition 4.2. We extend the semantics of Definition 3.4 to L(%, ≻) as follows:

• M, P, w ⊨ ϕ ≻ ψ iff for all µ ∈ P, µ(⟦ϕ⟧_{M,P}) > µ(⟦ψ⟧_{M,P}).

It follows from Example 2.4 that the formula p ≻ q is not equivalent to any formula of L(%), including p ⋗ q, while the formula p % q is not equivalent to any formula of L(≻). In the following, we first present a sound and complete logic for L(%, ≻) whose axioms match the conditions of the representation result in Theorem 2.7. Then we discuss the expressivity of this language, including how it is more expressive than L(%).

4.1 Logic

Definition 4.3. The logic IP(%, ≻) is the smallest subset of L(%, ≻) that contains all tautologies of propositional logic, is closed under modus ponens (if ϕ ∈ IP(%, ≻) and ϕ → ψ ∈ IP(%, ≻), then ψ ∈ IP(%, ≻)) and necessitation (if ϕ ∈ IP(%, ≻), then ϕ % ⊤ ∈ IP(%, ≻)), and contains all instances of the following axiom schemas for all n ∈ ℕ and k ∈ ℕ \ {0}, where ϕ′ and ψ′ each occur k times in the tuples in (B5) and (B6):

(B1) ϕ % ⊥;
(B2) ⊤ ≻ ⊥;
(B3) (ϕ ≻ ψ) → (ϕ % ψ);
(B4) ¬(ϕ ≻ ϕ);
(B5) (((ϕ_1, . . . , ϕ_n, ϕ′, . . . , ϕ′) ≡ (ψ_1, . . . , ψ_n, ψ′, . . . , ψ′)) % ⊤) → ((⋀_{i=1}^n (ϕ_i % ψ_i)) → (ψ′ % ϕ′));
(B6) (((ϕ_1, . . . , ϕ_n, ϕ′, . . . , ϕ′) ≡ (ψ_1, . . . , ψ_n, ψ′, . . . , ψ′)) % ⊤) → (((⋀_{i=1}^n (ϕ_i % ψ_i)) ∧ (⋁_{i=1}^n (ϕ_i ≻ ψ_i))) → (ψ′ ≻ ϕ′));
(B7) (ϕ % ψ) → ((ϕ % ψ) % ⊤);
(B8) ¬(ϕ % ψ) → (¬(ϕ % ψ) % ⊤);
(B9) (ϕ ≻ ψ) → ((ϕ ≻ ψ) % ⊤);
(B10) ¬(ϕ ≻ ψ) → (¬(ϕ ≻ ψ) % ⊤).

The rest of this section is devoted to the proof of the following theorem.
Theorem 4.4 (Soundness and Completeness). For all ϕ ∈ L(%, ≻): ϕ is a theorem of IP(%, ≻) if and only if ϕ is valid with respect to the class of all imprecise probabilistic models.

The soundness direction is easy to check. For completeness, as usual, pick an arbitrary formula γ consistent in IP(%, ≻), and let p be the set of propositional variables appearing in γ and L0 the restriction of L(%, ≻) to p. Then extend {γ} to a set Γ that is maximally consistent in IP(%, ≻) with respect to L0. Now our goal is to build an IP model of γ by extracting information from Γ. To this end, we view L0 as a term algebra of the type of Boolean algebras expanded with two binary operations. Then define □ϕ as an abbreviation for ϕ ∧ (ϕ % ⊤), let F = {ϕ ∈ L0 | □ϕ ∈ Γ}, and define a binary relation ∼ on L0 by ϕ ∼ ψ iff (ϕ ↔ ψ) ∈ F.

Lemma 4.5. F contains ⊤ and is closed under deduction in L0: whenever ϕ → ψ ∈ L0 is a theorem of IP(%, ≻) and ϕ ∈ F, then ψ ∈ F too. Also, ∼ is an equivalence relation extending the provable equivalence relation on L0 and is congruential over ¬, ∧, %, and ≻: for all ϕ, ψ, χ ∈ L0, if ϕ ∼ ψ, then ¬ϕ ∼ ¬ψ, (ϕ ∧ χ) ∼ (ψ ∧ χ), (χ ∧ ϕ) ∼ (χ ∧ ψ), (ϕ % χ) ∼ (ψ % χ), (χ % ϕ) ∼ (χ % ψ), (ϕ ≻ χ) ∼ (ψ ≻ χ), and (χ ≻ ϕ) ∼ (χ ≻ ψ).

Proof. When n = 0, (B5) together with necessitation shows that for every ϕ, ϕ % ϕ is a theorem. Then clearly ⊤ ∈ F. To show that F is closed under deduction in L0, noting that Γ is clearly closed under deduction in L0 due to its being a maximally consistent set, it is enough to show that whenever ϕ → ψ ∈ IP(%, ≻), we have (ϕ % ⊤) → (ψ % ⊤) ∈ IP(%, ≻) too. For this, apply (B5) to ⟨ϕ, ψ ∧ ¬ϕ, ⊤⟩ and ⟨⊤, ⊥, ψ⟩.

Since F is closed under deduction in L0 and contains ⊤, F also contains all theorems of IP(%, ≻) in L0. Hence it is easy to show that ∼ is an equivalence relation extending the provable equivalence relation on L0 that is congruential over ¬ and ∧. To show that ∼ is congruential over % and ≻, using again that Γ is closed under deduction in L0, we only need to show that the following are derivable:

• ((ϕ ↔ ψ) % ⊤) → ((ϕ % χ) ↔ (ψ % χ));
• ((ϕ ↔ ψ) % ⊤) → (((ϕ % χ) ↔ (ψ % χ)) % ⊤);
• ((ϕ ↔ ψ) % ⊤) → ((ϕ ≻ χ) ↔ (ψ ≻ χ));
• ((ϕ ↔ ψ) % ⊤) → (((ϕ ≻ χ) ↔ (ψ ≻ χ)) % ⊤).

In fact, the second and the fourth follow from the first and the third using (B7) to (B10), the closure of (· % ⊤) under deduction, and Boolean reasoning. The first and the third are again simple exercises using (B5) and (B6), respectively.

Lemma 4.6. B = L0/∼ is a Boolean algebra expanded with two binary operations, which we denote again by % and ≻. Moreover, by axioms (B7) to (B10), for any a, b ∈ B, a % b is either the top element or the bottom element, and so is a ≻ b. In addition, B is finite.

Proof. Since ∼ is a congruence extending the provable equivalence relation and IP(%, ≻) has all Boolean reasoning principles, B is a Boolean algebra. To see that a % b is either the top element or the bottom element, pick any ϕ, ψ ∈ L0 such that [ϕ]_∼ = a and [ψ]_∼ = b. Then note that either ϕ % ψ ∈ Γ or ¬(ϕ % ψ) ∈ Γ. In the former case, given (B7), we have (ϕ % ψ) ∈ F and hence a % b = [ϕ % ψ]_∼ is the top element. In the latter case, using (B8), ¬(a % b) is the top element, which means that a % b is the bottom element. The same reasoning goes for a ≻ b, using (B9) and (B10). Finally, to see that B is finite, note first that it has a finite set of generators, namely [p]_∼ = {[p]_∼ | p ∈ p}.
Since we have just shown that % and ≻ only take elements to either the top element or the bottom element, in generating B from [p]_∼ we can use only the Boolean operations. Hence the Boolean reduct of B is a finitely generated Boolean algebra, which must be finite.

Since (the Boolean reduct of) B is a finite Boolean algebra, it is isomorphic to the powerset algebra of its set of atoms. However, to facilitate the proof of the completeness theorem of the next section, we instead work with the set of all possible truth-assignments to the propositional variables in p.

Definition 4.7. Let W_p = {0, 1}^p and V_p : Prop → ℘(W_p) be the natural valuation function defined by V_p(p) = {f ∈ W_p | f(p) = 1} when p ∈ p and V_p(p) = ∅ when p ∉ p. Finally, let M_p = ⟨W_p, V_p⟩.

In this way, ℘(W_p) is essentially the free Boolean algebra generated by the images of p under V_p. The difference between ℘(W_p) and the Boolean reduct of B is that B might be missing some of the atoms, in the sense that some truth-assignments to p may be inconsistent in B. However, from the probabilistic point of view, it is enough to make them impossible probabilistically by assigning them 0 probability. This gives us the advantage of always using the same M_p when satisfying any consistent subset of L0.

To connect M_p to B, first let π be the natural Boolean quotient map from ℘(W_p) to B such that π(V_p(p)) = [p]_∼ for each p ∈ p. This map is uniquely given since ℘(W_p) is the free Boolean algebra generated by {V_p(p) | p ∈ p} and B is generated by {[p]_∼ | p ∈ p} using Boolean operations. Then, on ℘(W_p), we define two binary relations:

• X %_Γ Y iff π(X) % π(Y) is the top element of B;
• X ≻_Γ Y iff π(X) ≻ π(Y) is the top element of B.

Then it is not hard to show the following using the axioms (B1) to (B6).

Lemma 4.8. ⟨%_Γ, ≻_Γ⟩ satisfies all the conditions required in Theorem 2.7.

Proof. Note that for every a ∈ B, a = [ϕ]_∼ for some ϕ ∈ L0. Hence any quantification over B, and by the quotient map π, any quantification over ℘(W_p) as well, can be simulated by quantification over L0. Since the axioms are schematic, (B1) to (B4) directly translate the first two bullet points of Theorem 2.7. For (GFC) and (SGFC), it is enough to note that for any two finite sequences ⟨A_i⟩_{i=1}^n and ⟨B_i⟩_{i=1}^n of sets in ℘(W_p) such that ∑_{i=1}^n 1_{A_i} = ∑_{i=1}^n 1_{B_i}, we can find two sequences ⟨ϕ_i⟩_{i=1}^n and ⟨ψ_i⟩_{i=1}^n of formulas in L0 such that:

• for all i = 1 . . . n, we have [ϕ_i]_∼ = π(A_i) and [ψ_i]_∼ = π(B_i), which implies that A_i %_Γ B_i iff ϕ_i % ψ_i ∈ Γ and that A_i ≻_Γ B_i iff ϕ_i ≻ ψ_i ∈ Γ;

• [(ϕ_1, . . . , ϕ_n) ≡ (ψ_1, . . . , ψ_n)]_∼ = [⊤]_∼ and hence (ϕ_1, . . . , ϕ_n) ≡ (ψ_1, . . . , ψ_n) ∈ F, which in turn implies that ((ϕ_1, . . . , ϕ_n) ≡ (ψ_1, . . . , ψ_n)) % ⊤ ∈ Γ.

The existence of these formulas means that we can use (B5) and (B6) to show (GFC) and (SGFC), respectively.

Hence, by Theorem 2.7, we obtain a set P_Γ of probability measures on ℘(W_p) such that

• X %_Γ Y iff for all µ ∈ P_Γ, µ(X) ≥ µ(Y), and
• X ≻_Γ Y iff for all µ ∈ P_Γ, µ(X) > µ(Y).

From this, we can show the following truth lemma.

Lemma 4.9. For all ϕ ∈ L0, π(⟦ϕ⟧_{⟨M_p,P_Γ⟩}) = [ϕ]_∼.

Proof. By a simple induction on L0. The only cases of interest are the inductive steps for % and ≻. Note that ⟦ϕ % ψ⟧_{⟨M_p,P_Γ⟩} is either W_p or ∅. Similarly, we have shown that [ϕ % ψ]_∼ is either [⊤]_∼ or [⊥]_∼.
Then the only missing connection is the following:

⟦ϕ % ψ⟧_{⟨M_p,P_Γ⟩} = W_p
⟺ for all µ ∈ P_Γ, µ(⟦ϕ⟧_{⟨M_p,P_Γ⟩}) ≥ µ(⟦ψ⟧_{⟨M_p,P_Γ⟩})
⟺ ⟦ϕ⟧_{⟨M_p,P_Γ⟩} %_Γ ⟦ψ⟧_{⟨M_p,P_Γ⟩}
⟺ (π(⟦ϕ⟧_{⟨M_p,P_Γ⟩}) % π(⟦ψ⟧_{⟨M_p,P_Γ⟩})) = [⊤]_∼
⟺ ([ϕ]_∼ % [ψ]_∼) = [⊤]_∼
⟺ [ϕ % ψ]_∼ = [⊤]_∼.

The proof for the case with ϕ ≻ ψ is almost identical.

Now note that [γ]_∼ is not the bottom element in B, since otherwise [¬γ]_∼ would be the top element, and then □¬γ ∈ Γ, which means ¬γ ∈ Γ too, rendering Γ inconsistent. Hence ⟦γ⟧_{⟨M_p,P_Γ⟩} is nonempty, because π(∅) must be [⊥]_∼, which is not [γ]_∼. Take a w ∈ ⟦γ⟧_{⟨M_p,P_Γ⟩}. Then ⟨M_p, P_Γ⟩, w ⊨ γ, and we are done.

To sum up, we now have the following strengthening of the completeness theorem, noting that there are only finitely many logically inequivalent formulas all using only a finite set p of propositional variables (see Lemma 6.11).

Proposition 4.10. For any finite subset p of Prop with L0 being the set of formulas in L(%, ≻) using only the propositional variables in p, and for any Γ ⊆ L0 that is consistent relative to IP(%, ≻), there is a set P_Γ of probability measures on ℘(W_p) and a w ∈ W_p such that M_p, P_Γ, w ⊨ γ for all γ ∈ Γ.

Before we discuss the expressivity of L(%, ≻), we comment on the logic of precise probabilistic models. While ≻ is not definable in L(%) with respect to all IP models, with respect to precise probabilistic models, ϕ ≻ ψ can be defined simply by ¬(ψ % ϕ). Hence we can define the logic SP(%, ≻) as follows.

Definition 4.11. The logic SP(%, ≻) is the smallest subset of L(%, ≻) that is closed under modus ponens (if ϕ ∈ SP(%, ≻) and ϕ → ψ ∈ SP(%, ≻), then ψ ∈ SP(%, ≻)) and necessitation (if ϕ ∈ SP(%, ≻), then ϕ % ⊤ ∈ SP(%, ≻)), contains all instances of tautologies of propositional logic, all instances of the axiom schemas (A1) to (A6) for SP(%), and all instances of the axiom schema

(A7) (ϕ ≻ ψ) ↔ ¬(ψ % ϕ).

Then the following completeness theorem for SP(%, ≻) can be shown in the same way that we just showed the completeness of IP(%, ≻), using instead the representation result in Theorem 2.1. It will be used in the completeness proof for IP(%, ≻, ♦) in the next section.

Proposition 4.12. For any finite subset p of Prop with L0 being the set of formulas in L(%, ≻) using only the propositional variables in p, and for any Γ ⊆ L0 that is consistent relative to SP(%, ≻), there is a probability measure µ_Γ on ℘(W_p) and a w ∈ W_p such that M_p, {µ_Γ}, w ⊨ γ for all γ ∈ Γ.

4.2 Expressivity

In this subsection we discuss the expressivity of L(%) and L(%, ≻) in distinguishing IP models. Given Example 2.4, it should not be surprising that L(%, ≻) is more expressive than L(≻). But here we precisely characterize the expressivity of these languages.

Definition 4.13. For any probability measure µ defined on a field F of sets, let %_µ and ≻_µ be binary relations on F such that for any X, Y ∈ F, X %_µ Y iff µ(X) ≥ µ(Y), and X ≻_µ Y iff µ(X) > µ(Y). In addition, for any set P of probability measures defined on F, let %_P = ⋂{%_µ | µ ∈ P} and ≻_P = ⋂{≻_µ | µ ∈ P}.

For IP models ⟨W, V, P⟩ and ⟨W′, V′, P′⟩, we say that they are %-order-similar in p ⊆ Prop if for any Boolean formulas α, β using only letters in p,

• ⟦α⟧_{⟨W,V⟩} %_P ⟦β⟧_{⟨W,V⟩} iff ⟦α⟧_{⟨W′,V′⟩} %_{P′} ⟦β⟧_{⟨W′,V′⟩}.

We say that they are order-similar in p ⊆ Prop if in addition to the above biconditional for %, it is also true that for any Boolean formulas α, β using only letters in p,

• ⟦α⟧_{⟨W,V⟩} ≻_P ⟦β⟧_{⟨W,V⟩} iff ⟦α⟧_{⟨W′,V′⟩} ≻_{P′} ⟦β⟧_{⟨W′,V′⟩}.

A special case of (%-)order-similarity is worth mentioning.
Proposition 4.14. Let ⟨W, V, P⟩ and ⟨W, V, P′⟩ be IP models and p a subset of Prop. Let F be the field of sets generated by the image of p under V. Then ⟨W, V, P⟩ and ⟨W, V, P′⟩ are %-order-similar (resp. order-similar) in p iff %_P|_F = %_{P′}|_F (resp. %_P|_F = %_{P′}|_F and ≻_P|_F = ≻_{P′}|_F).

Proposition 4.15. Let ⟨W, V, P⟩ and ⟨W′, V′, P′⟩ be IP models and w, w′ worlds in W and W′, respectively. Then w and w′ satisfy the same formulas in L(%, ≻) (resp. L(%)) using only propositional variables in p ⊆ Prop iff

1. w and w′ satisfy the same propositional variables in p, and
2. ⟨W, V, P⟩ and ⟨W′, V′, P′⟩ are order-similar (resp. %-order-similar) in p.

Proof. The left-to-right direction is trivial, since failure of either 1 or 2 directly translates to a formula in the appropriate language about which w and w′ disagree. For the right-to-left direction, note first that any comparative formula χ of the form ϕ % ψ or ϕ ≻ ψ is true at one world iff it is true at all worlds. This means that a formula ϕ with χ occurring in it is equivalent to (χ ∧ ϕ[χ/⊤]) ∨ (¬χ ∧ ϕ[χ/⊥]), where ϕ[χ/⊤] is the result of replacing χ by ⊤ in ϕ, and similarly for ϕ[χ/⊥]. By repeated use of this method, it is not hard to see that every formula in L(%, ≻) using only letters in p is semantically equivalent to a Boolean combination of formulas of one of the following types:

• a propositional variable in p,
• α % β where α, β are Boolean formulas using only letters in p, and
• α ≻ β where α, β are Boolean formulas using only letters in p.

The case with L(%) is similar (without the last kind of formula in the above list). The proposition then follows easily.

Now we can translate Example 2.4 to a pair of pointed IP models that L(%, ≻) can distinguish but L(%) cannot. Let W = {w, v} and V be the valuation such that V(p) = {w} and V(q) = ∅ for all q ∈ Prop \ {p}. Let µ_{w<v} and µ_{w=v} be defined as in Example 2.4. Then by Propositions 4.15 and 4.14, L(%) cannot distinguish ⟨W, V, {µ_{w<v}, µ_{w=v}}⟩, w from ⟨W, V, {µ_{w<v}}⟩, w, since %_{{µ_{w<v},µ_{w=v}}} and %_{{µ_{w<v}}} are the same on ℘(W). However, ¬p ≻ p distinguishes the pointed models.

5 The Logic IP(%, ≻, ♦)

In this section, we further extend our language with a possibility modal ♦. In the context of natural language semantics, one proposal for the meaning of "possibly ϕ" in precise probabilistic models is that ϕ has non-zero probability (Lassiter 2010, §4.4). In imprecise probabilistic models, we could require either (a) that all measures in P give ϕ non-zero probability or (b) that at least some measure in P gives ϕ non-zero probability. We adopt the weaker interpretation (b) of "possibly ϕ" (not as a proposal in natural language semantics, but because it suits our technical purposes in the next section).

In addition to making claims about the possibility of factual states of affairs, e.g., "It is possible that it is raining," we would like to be able to make claims about the possibility of likelihood relations, e.g., "It is possible that hail is more likely than lightning tonight." According to the formal semantics given below, the latter will be true when there exists a probability measure in P such that according to that measure hail is more likely than lightning.

Definition 5.1. The language L(%, ≻, ♦) is defined by the following grammar:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ) | ♦ϕ

where p ∈ Prop. We define □ϕ := ¬♦¬ϕ.
Definition 5.2. We extend the semantics of Definition 4.2 to L(%, ≻, ♦) as follows:

• M, P, w ⊨ ♦ϕ iff there is a µ ∈ P such that µ(⟦ϕ⟧_{M,{µ}}) ≠ 0.

Note that with ♦ added, we no longer need ≻ as a primitive in the language, since ϕ ≻ ψ is definable as ¬♦(ψ % ϕ), but we keep ≻ as a primitive for convenience. In the following, we first present a sound and complete logic for the valid formulas in L(%, ≻, ♦). Then we briefly comment on the logic's complexity. Finally, we show how L(%, ≻, ♦) is more expressive than L(%, ≻) and characterize the expressivity of L(%, ≻, ♦).

5.1 Logic

An important logical fact about the set of valid formulas of L(%, ≻, ♦) is that it is not closed under uniform substitution of arbitrary formulas for propositional variables.

Example 5.3. The formula (p ≻ ⊥) → ♦(p ≻ ⊥) is valid, but

(¬((p % q) ∨ (q % p)) ≻ ⊥) → ♦(¬((p % q) ∨ (q % p)) ≻ ⊥)

is not valid. The reason is that there is no single probability measure that can make true the non-comparability formula ¬((p % q) ∨ (q % p)).

While the failure of uniform substitution can complicate efforts to axiomatize a set of validities (cf. Holliday et al. 2012, 2013), we will completely axiomatize the validities of L(%, ≻, ♦) with the logic IP(%, ≻, ♦) defined below.

Definition 5.4. The logic SP(%, ≻, ♦) is the smallest subset of L(%, ≻, ♦) that is (i) closed under modus ponens, uniform substitution, and the rule of replacement of provable equivalents, and (ii) contains all theorems of SP(%, ≻) and ♦p ↔ (p ≻ ⊥). The logic IP(%, ≻, ♦) is the smallest subset of L(%, ≻, ♦) that is (i) closed under modus ponens, the rule of replacement of provable equivalents, and the rule that if ϕ ∈ SP(%, ≻, ♦), then □ϕ ∈ IP(%, ≻, ♦), and (ii) contains all substitution instances in L(%, ≻, ♦) of the theorems in IP(%, ≻) and also all instances of the following axiom schemas, where α and β are propositional:

(C1) (□ϕ ∧ □(ϕ → ψ)) → □ψ;
(C2) ♦⊤;
(C3) □ϕ → (□ϕ % ⊤);
(C4) ♦ϕ → (♦ϕ % ⊤);
(C5) □ϕ ↔ (ϕ % ⊤);
(C6) (α % β) ↔ □(α % β);
(C7) (α ≻ β) ↔ □(α ≻ β).

The rest of this section is devoted to the proof of the following theorem.

Theorem 5.5 (Soundness and Completeness). For all ϕ ∈ L(%, ≻, ♦): ϕ is a theorem of IP(%, ≻, ♦) if and only if ϕ is valid with respect to the class of all imprecise probabilistic models.

To prove Theorem 5.5, we first show that (1) there is no need for a ♦ to scope over a ♦ and (2) there is no need for a % or ≻ to scope over a ♦. In other words, we will find a significantly simpler fragment of L(%, ≻, ♦), which we call LSimp, such that every formula in L(%, ≻, ♦) is provably equivalent in IP(%, ≻, ♦) to a formula in LSimp.

Definition 5.6. Define T_{−♦} : L(%, ≻, ♦) → L(%, ≻) by:

• T_{−♦}(p) = p for all p ∈ Prop;
• T_{−♦}(¬ϕ) = ¬T_{−♦}(ϕ);
• T_{−♦}(ϕ ∧ ψ) = T_{−♦}(ϕ) ∧ T_{−♦}(ψ);
• T_{−♦}(ϕ % ψ) = T_{−♦}(ϕ) % T_{−♦}(ψ);
• T_{−♦}(ϕ ≻ ψ) = T_{−♦}(ϕ) ≻ T_{−♦}(ψ);
• T_{−♦}(♦ϕ) = ¬(⊥ % T_{−♦}(ϕ)).

Lemma 5.7. For every ϕ ∈ L(%, ≻, ♦), ϕ ↔ T_{−♦}(ϕ) is in SP(%, ≻, ♦). Moreover, T_{−♦}(ϕ) uses the same propositional variables as ϕ does.

Proof. A simple induction with repeated use of replacement of equivalents suffices.

Lemma 5.8. In IP(%, ≻, ♦), formulas of the form ♦ϕ ↔ ¬□¬ϕ are theorems. In addition, □ is a normal operator: for any ϕ, ψ ∈ L(%, ≻, ♦), (□ϕ ∧ □(ϕ → ψ)) → □ψ is in IP(%, ≻, ♦), and whenever ϕ is in IP(%, ≻, ♦), so is □ϕ.

Proof. To derive ♦ϕ ↔ ¬□¬ϕ, it is enough to derive ♦ϕ ↔ ♦¬¬ϕ. But this is clearly derivable by replacement of equivalents, since ♦ϕ ↔ ♦ϕ and ϕ ↔ ¬¬ϕ are theorems.

Definition 5.9. Let LSimp be the fragment of L(%, ≻, ♦) generated from Prop and {♦ϕ | ϕ ∈ L(%, ≻)} by ¬ and ∧.
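Definition 5.6 is straightforward to implement. Below is a small Python sketch of T_{−♦} (our own illustration, reusing the tuple encoding of formulas from the earlier sketch and treating "bot" as an atomic constant for ⊥):

```python
# Formulas: "p" (atom), ("not", f), ("and", f, g), ("geq", f, g) for %,
# ("gt", f, g) for the strict comparison, ("diam", f) for the possibility modal.
BOT = "bot"  # treated as an atomic constant for the falsum

def t_minus_diamond(phi):
    """T_-diamond from Definition 5.6: over a single measure, diamond-phi
    is equivalent to not(bot % phi), so diamonds can be rewritten away."""
    if isinstance(phi, str):
        return phi
    op = phi[0]
    if op == "not":
        return ("not", t_minus_diamond(phi[1]))
    if op in ("and", "geq", "gt"):
        return (op, t_minus_diamond(phi[1]), t_minus_diamond(phi[2]))
    if op == "diam":
        return ("not", ("geq", BOT, t_minus_diamond(phi[1])))
    raise ValueError(op)

# Nested diamonds are eliminated inside-out:
print(t_minus_diamond(("diam", ("diam", "p"))))
# ('not', ('geq', 'bot', ('not', ('geq', 'bot', 'p'))))
```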
In the following, for any p ⊆ Prop, we append [p] to the name of a language to denote the set of formulas in that language using only variables in p.

Lemma 5.10. For every ϕ ∈ L(%, ≻, ♦), there is a T(ϕ) ∈ LSimp such that ϕ ↔ T(ϕ) ∈ IP(%, ≻, ♦). Moreover, T(ϕ) and ϕ use the same propositional variables.

Proof. By induction on L(%, ≻, ♦). The base case is trivial: we can simply define T(p) = p. The Boolean cases are also trivial: we can define T(¬ϕ) = ¬T(ϕ) and T(ϕ ∧ ψ) = T(ϕ) ∧ T(ψ). For the ♦ case, define T(♦ϕ) = ♦T−♦(ϕ). To see that ♦ϕ is provably equivalent to ♦T−♦(ϕ), first note that by Lemma 5.7, ϕ ↔ T−♦(ϕ) ∈ SP(%, ≻, ♦). But then □(ϕ ↔ T−♦(ϕ)) ∈ IP(%, ≻, ♦). By the normality of □, we have ♦ϕ ↔ ♦T−♦(ϕ) ∈ IP(%, ≻, ♦).

To find the appropriate T(ϕ % ψ), given that the required T(ϕ) and T(ψ) in LSimp have been found, we need to extract all ♦'ed formulas in T(ϕ) % T(ψ) so that they are no longer in the scope of the main connective % in T(ϕ) % T(ψ). Clearly this can be done by iteratively using the following claim:

(*) for any χ ∈ L(%, ≻) and ϕ, ψ ∈ LSimp, (ϕ % ψ) ↔ ((♦χ ∧ (ϕ[♦χ/⊤] % ψ[♦χ/⊤])) ∨ (¬♦χ ∧ (ϕ[♦χ/⊥] % ψ[♦χ/⊥]))) is in IP(%, ≻, ♦).

The claim is easily proved using (C3) and (C4). Note that since ϕ, ψ are in LSimp, they are Boolean combinations of propositional variables and formulas of the form ♦χ where χ ∈ L(%, ≻). List all the ♦'ed formulas appearing in ϕ or ψ as δ1, δ2, . . . , δn. Then for any f ∈ {0, 1}^n, let δf be ¬^{f(1)} δ1 ∧ · · · ∧ ¬^{f(n)} δn, where ¬^0 δi is ¬δi and ¬^1 δi is simply δi. Moreover, let ϕ[f] be ϕ[δ1/⊤^{f(1)}, . . . , δn/⊤^{f(n)}], and similarly for ψ[f], where ⊤^{f(i)} = ⊤ if f(i) = 1 and ⊤^{f(i)} = ⊥ if f(i) = 0. With this notation, it is not hard to see that by repeatedly applying (*), ϕ % ψ is provably equivalent to ⋁_{f∈{0,1}^n} (δf ∧ (ϕ[f] % ψ[f])), and then also to ⋁_{f∈{0,1}^n} (δf ∧ □(ϕ[f] % ψ[f])), since for any f, the formulas ϕ[f] and ψ[f] are propositional (we have replaced all the ♦'ed formulas by either ⊤ or ⊥), so by axiom (C6) we can add a □ there. The formula ⋁_{f∈{0,1}^n} (δf ∧ □(ϕ[f] % ψ[f])) is the desired T(ϕ % ψ), since it is clearly in LSimp. The definition of T(ϕ ≻ ψ) is almost identical: we can simply replace ϕ[f] % ψ[f] by ϕ[f] ≻ ψ[f]. In this case, we use (C7) instead.
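The 2^n case split used in this proof is entirely mechanical. As a sketch (ours, reusing the tuple representation from the previous code block, and assuming at least one ♦'ed formula), the following enumerates the sign patterns f and assembles the disjuncts δf ∧ □(ϕ[f] % ψ[f]):

```python
from itertools import product

def substitute(f, target, value):
    """Replace every occurrence of the subformula `target` in f by `value`."""
    if f == target:
        return value
    if f[0] in ('not', 'diamond'):
        return (f[0], substitute(f[1], target, value))
    if f[0] in ('and', 'geq', 'gt'):
        return (f[0],) + tuple(substitute(g, target, value) for g in f[1:])
    return f  # variables and constants

def box(f):
    return ('not', ('diamond', ('not', f)))  # box as not-diamond-not

def normal_form(phi, psi, deltas):
    """Disjuncts delta_f and box(phi[f] % psi[f]) for all f in {0,1}^n,
    as in the proof of Lemma 5.10; `deltas` lists the diamond-formulas
    occurring in phi or psi (assumed non-empty)."""
    disjuncts = []
    for f in product([1, 0], repeat=len(deltas)):
        conj, p, q = None, phi, psi
        for bit, d in zip(f, deltas):
            literal = d if bit == 1 else ('not', d)
            conj = literal if conj is None else ('and', conj, literal)
            constant = ('top',) if bit == 1 else ('bot',)
            p, q = substitute(p, d, constant), substitute(q, d, constant)
        disjuncts.append(('and', conj, box(('geq', p, q))))
    return disjuncts  # to be read as one big disjunction
```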
Now we are ready to prove the soundness and completeness of IP(%, ≻, ♦). Soundness is clear as usual. For completeness, pick an arbitrary γ that is consistent relative to IP(%, ≻, ♦), and let p be the set of propositional variables used in γ. Then take an arbitrary maximally consistent Γ containing γ. Following the standard strategy, let Σ = {(ϕ % ⊤) | □ϕ ∈ Γ, ϕ ∈ L(%, ≻)[p]}. Note that Σ ⊆ L(%, ≻)[p]. Also, Σ must be consistent relative to SP(%, ≻), since otherwise there are formulas (ϕ1 % ⊤), (ϕ2 % ⊤), . . . , (ϕn % ⊤) in Σ such that ((ϕ1 % ⊤) ∧ · · · ∧ (ϕn % ⊤)) → ⊥ is in SP(%, ≻). But then by the rules of IP(%, ≻, ♦) and the normality of □, we have that (□(ϕ1 % ⊤) ∧ · · · ∧ □(ϕn % ⊤)) → □⊥ is in IP(%, ≻, ♦). Since □ϕ is provably equivalent to □(ϕ % ⊤) by (C5), we have that □⊥ is in Γ according to the maximality of Γ, rendering Γ inconsistent, since we have (C2).

Now let D = {Σ ∪ {¬(ϕ % ⊤)} | ¬□ϕ ∈ Γ, ϕ ∈ L(%, ≻)[p]}. Note that for each ∆ = Σ ∪ {¬(ϕ % ⊤)} ∈ D, ∆ is also a set of formulas in L(%, ≻)[p]. Moreover, ∆ must be consistent relative to SP(%, ≻) as well. If not, then since Σ is consistent, we must have formulas (ϕ1 % ⊤), . . . , (ϕn % ⊤) in Σ such that ((ϕ1 % ⊤) ∧ · · · ∧ (ϕn % ⊤)) → (ϕ % ⊤) ∈ SP(%, ≻). Then by reasoning similar to that above, □(ϕ % ⊤) and hence □ϕ are in Γ using (C5), rendering Γ inconsistent, since ¬□ϕ ∈ Γ.

Thus, for each ∆ ∈ D, according to Proposition 4.12, there is a probability measure µ∆ on ℘(Wp) and a w ∈ Wp such that Mp, {µ∆}, w ⊨ ∆. Note that since all formulas in ∆ are comparison formulas of the form ϕ % ⊤ or negations thereof, it does not matter what w is. Hence we have that Mp, {µ∆} ⊨ ∆. Take P to be the set {µ∆ | ∆ ∈ D}. Then we are left only to show that there is a w ∈ Wp such that Mp, P, w ⊨ ϕ for all ϕ ∈ Γ ∩ L(%, ≻, ♦)[p]. Let w0 be the element of Wp = {0, 1}^p defined by w0(p) = 1 iff p ∈ Γ, for all p ∈ p. Then we are ready to show the following truth lemma.

Lemma 5.11. For all ϕ ∈ L(%, ≻, ♦)[p], Mp, P, w0 ⊨ ϕ iff ϕ ∈ Γ.

Proof. It is enough to show that for all ϕ ∈ LSimp[p], Mp, P, w0 ⊨ ϕ iff ϕ ∈ Γ. This is because for any ϕ ∈ L(%, ≻, ♦)[p], according to Lemma 5.10, ϕ ∈ Γ iff T(ϕ) ∈ Γ, with T(ϕ) ∈ LSimp[p]. But then

T(ϕ) ∈ Γ ⇐⇒ Mp, P, w0 ⊨ T(ϕ) ⇐⇒ Mp, P, w0 ⊨ ϕ.

The first equivalence holds by the fact that T(ϕ) ∈ LSimp[p] together with the truth lemma for this fragment that we show below. The second is by soundness.

We now focus on the fragment LSimp[p]. Since the generating operations of this fragment are Boolean, the inductive cases are trivial. The atomic case for propositional variables in p is also trivial by the definition of w0. Hence we are left to show that for any ϕ ∈ {♦ψ | ψ ∈ L(%, ≻)[p]}, we have ϕ ∈ Γ iff Mp, P, w0 ⊨ ϕ. In other words, we only need to show that for all ϕ ∈ L(%, ≻)[p], we have ♦ϕ ∈ Γ iff Mp, P, w0 ⊨ ♦ϕ.

• Suppose ♦ϕ ∉ Γ, so □¬ϕ ∈ Γ. Then (¬ϕ % ⊤) ∈ Σ since ¬ϕ ∈ L(%, ≻)[p], which means (¬ϕ % ⊤) ∈ ∆ for all ∆ ∈ D. Then, for any µ∆ ∈ P, Mp, {µ∆} ⊨ ¬ϕ % ⊤ since (¬ϕ % ⊤) ∈ ∆, which in turn means that µ∆(⟦ϕ⟧Mp,{µ∆}) = 0. This is precisely the condition for ♦ϕ to be false at Mp, P, w0.

• Suppose ♦ϕ ∈ Γ, so ¬□¬ϕ ∈ Γ. Then there is a ∆ such that ¬(¬ϕ % ⊤) ∈ ∆, again because ¬ϕ ∈ L(%, ≻)[p]. For this µ∆ then, Mp, {µ∆} ⊭ ¬ϕ % ⊤. In other words, µ∆(⟦ϕ⟧Mp,{µ∆}) ≠ 0. The existence of this µ∆ ∈ P shows that ♦ϕ is true at Mp, P, w0.

Given the above truth lemma, Mp, P, w0 ⊨ γ since γ ∈ Γ and γ ∈ L(%, ≻, ♦)[p]. Hence we have successfully found a model for the arbitrarily chosen consistent γ, completing the proof of the completeness of IP(%, ≻, ♦).

5.2 Complexity

In this section, we briefly comment on the complexity of the consistency problem for the logic IP(%, ≻, ♦) or, equivalently, the satisfiability problem for L(%, ≻, ♦). First, adapting the proof of Theorem 9 in Harrison-Trainor et al. 2017, it is not hard to see that the satisfiability problem for a conjunction of literals, where we take formulas in both Prop and {♦ϕ | ϕ ∈ L(%, ≻)} as atomic formulas, is in NP (note that Theorem 2.6 in Fagin et al. 1990, used in the proof of Harrison-Trainor et al. 2017, allows strict inequalities). Hence the satisfiability problem for LSimp is also in NP. Then, to see that the satisfiability problem for L(%, ≻, ♦) is in NP, it is enough to show that every ϕ ∈ L(%, ≻, ♦) is equivalent to a disjunction of formulas in LSimp where each disjunct's length is bounded by O(|ϕ|). In our proof of Lemma 5.10 above, this is done by extracting ♦ from the scope of % and ≻ and eliminating ♦ in the scope of ♦. Note that the elimination of ♦ in the scope of ♦ can be done before the extraction: given an input formula ϕ, replace each subformula ♦χ not in the scope of any ♦ by ♦T−♦(χ). The resulting formula, which we call ϕ′, is clearly at most four times longer than ϕ.
Then we only need to run the process of (1) extracting ♦'ed formulas in the scope of % or ≻ and (2) adding a □ to a % formula or a ≻ formula once both arguments of the % or ≻ no longer contain modal operators. This process, while introducing exponentially many disjuncts, grows the length of each disjunct by at most a constant per extracting operation, and the total number of extracting operations is clearly at most the length of the input formula ϕ′. Thus, we obtain the following.

Theorem 5.12. The satisfiability problem for L(%, ≻, ♦) is NP-complete.

5.3 Expressivity

Reflecting the failure of uniform substitution, for any purely propositional formula α, ♦α is already expressible in L(%).

Lemma 5.13. Let α, β be propositional formulas. Then:

1. ♦α is equivalent to ¬(⊥ % α);
2. ♦(α % β) and ♦¬(β ≻ α) are both equivalent to ¬(β ≻ α);
3. ♦(β ≻ α) and ♦¬(α % β) are both equivalent to ¬(α % β).

However, ♦ϕ is not in general expressible without ♦.

Example 5.14. The formula ♦(p ≈ ¬p) is not equivalent to any formula of L(%, ≻). Consider again the propositional model M = ⟨W, V⟩ where W = {w, v} and V(p) = {w}, while V(q) = ∅ for all q ∈ Prop \ {p}. Then let P be the set of all probability measures on ℘(W) and P′ the set of all probability measures µ on ℘(W) except the ones that give equal probability to {w} and {v}. Then %P and %P′ (resp. ≻P and ≻P′) are the same on ℘(W), and may be pictured as the following Hasse diagram: W at the top; {w} and {v}, incomparable, in the middle; ∅ at the bottom. Thus, using Propositions 4.15 and 4.14, for any ϕ ∈ L(%, ≻), M, P, w ⊨ ϕ iff M, P′, w ⊨ ϕ. Yet M, P, w ⊨ ♦(p ≈ ¬p) while M, P′, w ⊭ ♦(p ≈ ¬p).
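Since ♦(p ≈ ¬p) on this model just asks whether some measure in the set gives {w} and {v} equal probability, the non-equivalence can be illustrated with finite stand-ins for the (infinite) sets P and P′; the numbers below are our own toy samples:

```python
def satisfies_diamond_equal(sample):
    """diamond(p ~ not-p) on the two-world model: true iff some measure in
    the sample gives {w} and {v} equal probability; a measure is coded by
    the single number mu({w})."""
    return any(abs(x - 0.5) < 1e-9 for x in sample)

P_sample = [0.0, 0.25, 0.5, 0.75, 1.0]          # stand-in for P
P_prime_sample = [0.0, 0.25, 0.75, 1.0]         # stand-in for P', no 1/2
print(satisfies_diamond_equal(P_sample),        # True
      satisfies_diamond_equal(P_prime_sample))  # False
```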
Now we characterize the expressivity of L(%, ≻, ♦) precisely.

Proposition 5.15. Let ⟨W, V, P⟩ and ⟨W′, V′, P′⟩ be IP models and w, w′ worlds in W and W′, respectively. Let p be a subset of Prop. Then w and w′ satisfy the same formulas in L(%, ≻, ♦) using only propositional variables in p if

1. w and w′ satisfy the same propositional variables in p,
2. for any µ ∈ P, there is µ′ ∈ P′ such that ⟨W, V, {µ}⟩ and ⟨W′, V′, {µ′}⟩ are order-similar in p, and
3. for any µ′ ∈ P′, there is µ ∈ P such that ⟨W, V, {µ}⟩ and ⟨W′, V′, {µ′}⟩ are order-similar in p.

The converse also holds if in addition p is finite.

Proof. The left-to-right direction is again easy. For the only non-obvious case, suppose for example that the second clause fails: there is a µ ∈ P such that for any µ′ ∈ P′, ⟨W, V, {µ}⟩ and ⟨W′, V′, {µ′}⟩ are not order-similar in p. Then let {αi}1≤i≤n be a finite set of Boolean formulas such that every Boolean formula using only letters in p is logically equivalent to some αi (such a set can be found using disjunctive normal forms). We can now describe µ in full relative to p by the conjunction χ = ⋀_{1≤i,j≤n} si,j (αi % αj), where si,j is empty if µ(⟦αi⟧⟨W,V⟩) ≥ µ(⟦αj⟧⟨W,V⟩) and is ¬ otherwise. Indeed, by the definition of order-similarity, since ⟨W, V, {µ}⟩ is order-similar in p to no ⟨W′, V′, {µ′}⟩ with µ′ ∈ P′, the formula χ is false at every world of ⟨W′, V′, {µ′}⟩ for every µ′ ∈ P′. This means that w′ falsifies ♦χ while w satisfies ♦χ, showing that the two worlds disagree on a formula in L(%, ≻, ♦).

The right-to-left direction follows from the normal form lemma, Lemma 5.10. If the last two clauses hold, then for any formula of the form ♦ϕ where ϕ ∈ L(%, ≻)[p], ♦ϕ is true at ⟨W, V, P⟩, w iff it is true at ⟨W′, V′, P′⟩, w′. By the first clause, the two pointed IP models also satisfy the same propositional variables in p. Then by a simple induction, they satisfy the same formulas in LSimp[p]. But by Lemma 5.10, this is enough for them to satisfy the same formulas in L(%, ≻, ♦)[p].

The special case where the two IP models share the same propositional model is again worth spelling out.

Proposition 5.16. Let M = ⟨W, V⟩ be a propositional model, w and w′ two worlds in W, and P and P′ nonempty sets of probability measures defined on fields of sets extending V[Prop]. Let p be a subset of Prop and F the field of sets on W generated by V[p]. Then M, P, w and M, P′, w′ satisfy the same formulas in L(%, ≻, ♦)[p] if

• w and w′ satisfy the same propositional variables in p,
• for any µ ∈ P, there is µ′ ∈ P′ such that %µ|F = %µ′|F, and
• for any µ′ ∈ P′, there is µ ∈ P such that %µ|F = %µ′|F.

The converse also holds if in addition p is finite.

6 Dynamics

In this section, we consider two kinds of information dynamics in the context of imprecise probability. The first is a standard notion of updating a set of probability measures on new evidence (see, e.g., Halpern 2003, p. 81), where we can eliminate both possible worlds (keeping only the worlds compatible with the evidence) and probability measures (keeping only the probability measures that give the evidence positive probability). Usually, especially in a Bayesian framework, such updates are all we need for information dynamics, since we can always model agents with a universal and all-inclusive state space, anticipating all distinctions that could be made among states. However, there are numerous examples where an agent is not initially aware of a distinction. In Example 1.1, the agent is not initially aware of the gland and hence of the distinction between a swollen and a normal gland. When the doctor tells the agent about the gland, we can model the agent as first learning the mere existence of a new proposition—the swollen gland proposition—and then learning how this proposition relates probabilistically to her having the disease. Without imprecise probability, we face the perennial question of how to assign a probability to such a new proposition. Given imprecise probability, however, we can simply choose the set of all probability measures that are compatible with one of the old probability measures. This models how an agent can "initialize" her uncertainty toward a newly introduced proposition.

In the next two subsections, we discuss the two dynamic operators in more detail. For the update operator, we show how it does not add expressivity to the language L(%, ≻, ♦), and we present a sound and complete logic following the standard "reduction axiom" strategy in dynamic epistemic logic. For the operators modeling the introduction of new propositions, however, we show that they significantly increase expressivity, and we leave the axiomatization of the valid formulas as an open question.

6.1 Updating Probabilities and the Logic IP(%, ≻, ♦, ⟨ ⟩)

In this subsection, we introduce the update operator ⟨ ⟩ that models learning the truth of a proposition. Given an initial set P of probability measures, after learning some proposition U ⊆ W with certainty, we update the set P to the set PU = {µ(· | U) : µ ∈ P, µ(U) > 0}, where µ(· | U) is defined by conditionalization as usual: for any V ⊆ W, µ(V | U) = µ(V ∩ U)/µ(U).
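On a finite space this update is straightforward to compute. The sketch below (ours; measures are dictionaries from worlds to probabilities) keeps the measures that give the evidence positive probability and conditionalizes each of them:

```python
def update(measures, U):
    """P_U = { mu(. | U) : mu in P, mu(U) > 0 }: discard measures giving
    the evidence U zero probability and conditionalize the rest on U."""
    updated = []
    for mu in measures:
        mu_U = sum(p for w, p in mu.items() if w in U)
        if mu_U > 0:
            updated.append({w: (p / mu_U if w in U else 0.0)
                            for w, p in mu.items()})
    return updated

# Evidence {'w'} eliminates the measure concentrated on 'v':
P = [{'w': 0.5, 'v': 0.5}, {'w': 0.0, 'v': 1.0}]
print(update(P, {'w'}))  # [{'w': 1.0, 'v': 0.0}]
```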
Since we have a formal language with comparative probability operators, we can model updating on sentences containing not only factual formulas but also comparative probability formulas (cf. Weatherson 2007; Yalcin 2011; Moss 2018), as in "it is raining, and it is more likely that there will be hail than it is that there will be lightning" (r ∧ (h ≻ ℓ)). Intuitively, if Ann tells Bob that "hail is more likely than lightning," she is not telling Bob something about his own epistemic state (which he already knows, in the models of this paper) but is rather recommending that he update his epistemic state to one according to which hail is more likely than lightning—which he can do by discarding from his set of measures any measure according to which hail is not more likely than lightning.5 Our semantics below, developed in the style of dynamic epistemic logic (see, e.g., van Ditmarsch et al. 2008; van Benthem 2011), will allow such updates in response to comparative probability claims.

5 Another possible interpretation is that there is some objectively correct probability measure, and Ann is telling Bob a fact about that measure, which he wants his probabilities to ultimately match.

Definition 6.1. The language L(%, ≻, ♦, ⟨ ⟩) is defined by the following grammar:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ) | ♦ϕ | ⟨ϕ⟩ϕ

where p ∈ Prop. We read ⟨α⟩ϕ as "(update with α is possible and) after update with α, ϕ is the case." As usual, [α]ϕ abbreviates ¬⟨α⟩¬ϕ.

Definition 6.2. We extend the semantics of Definition 5.2 to L(%, ≻, ♦, ⟨ ⟩) as follows:

• M, P, w ⊨ ⟨ϕ⟩ψ iff there is a µ ∈ P such that µ(⟦ϕ⟧M,{µ}) ≠ 0 and M, Pϕ, w ⊨ ψ, where Pϕ = {ν(· | ⟦ϕ⟧M,{ν}) : ν ∈ P and ν(⟦ϕ⟧M,{ν}) ≠ 0}.

Lemma 6.3. The semantics for [ϕ]ψ is as follows:

• M, P, w ⊨ [ϕ]ψ iff: if there is a µ ∈ P such that µ(⟦ϕ⟧M,{µ}) ≠ 0, then M, Pϕ, w ⊨ ψ.

The following lemma states how updating with a formula ϕ % ψ, if possible, results in restricting one's set of measures to just those that individually satisfy ϕ % ψ.

Lemma 6.4. For any IP model ⟨M, P⟩ and ϕ, ψ ∈ L(%, ≻, ♦), either Pϕ%ψ = ∅ or Pϕ%ψ = {ν ∈ P : M, {ν} ⊨ ϕ % ψ}.
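For the special case of a propositional comparison α % β on a finite space, Lemma 6.4 amounts to a simple filter on the set of measures, as in the following sketch (ours; A and B are the sets of worlds where α and β hold):

```python
def update_with_comparison(measures, A, B):
    """Lemma 6.4, propositional case: updating with  alpha % beta  keeps
    exactly the measures that individually satisfy the comparison."""
    def mass(mu, X):
        return sum(p for w, p in mu.items() if w in X)
    kept = [mu for mu in measures if mass(mu, A) >= mass(mu, B)]
    return kept  # empty list: the update is impossible

# "Hail is at least as likely as lightning" discards the second measure:
P = [{'hail': 0.5, 'lightning': 0.3, 'neither': 0.2},
     {'hail': 0.1, 'lightning': 0.7, 'neither': 0.2}]
print(update_with_comparison(P, {'hail'}, {'lightning'}))
```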
Let us see how this framework can be used to formalize the three prisoners scenario from Example 1.2.

Example 6.5. Let ei and si stand for 'prisoner i will be executed' and 'the jailer says that prisoner i will be executed', respectively. Define a propositional model M = ⟨W, V⟩ with W = {wab, wac, wbc, wcb}, where at wij, prisoner i is the only prisoner who lives and prisoner j is the prisoner whom the jailer says will be executed, so

V(ea) = {wbc, wcb}, V(eb) = {wab, wac, wcb}, V(ec) = {wab, wac, wbc}, V(sb) = {wab, wcb}, V(sc) = {wac, wbc}.

Since prisoner a knows that each prisoner is equally likely to be executed but has no idea how the jailer is likely to answer his question about which of b or c will be executed (except that the jailer is certain to give a true answer), prisoner a's epistemic state may be modelled by the following set of probability measures:

P = {µ : µ({wab, wac}) = µ({wbc}) = µ({wcb}) = 1/3}.

Then the following formulas together capture what is distinctive about the puzzle, all coming out true in this model. First, we can state that each prisoner is equally likely to be spared—indeed, that each has a one-third chance:

α := (⊥ % (ea ∧ eb ∧ ec)) ∧ (((ea ∧ eb) ∨ (ea ∧ ec) ∨ (eb ∧ ec)) % ⊤) ∧ (ea ≈ eb) ∧ (eb ≈ ec).

Second, we can state that the jailer truthfully announces exactly one of sb and sc:

β := ((sb → eb) % ⊤) ∧ ((sc → ec) % ⊤) ∧ (⊥ % (sb ∧ sc)).

Given the dynamic operator, we can also express a fact about how a's uncertainty is affected upon learning that b is to be executed. After this announcement, a's credences dilate from a sharp two-thirds probability to including both the possibility that he is sure to be executed and the possibility that he has merely one-half probability of being executed:

⟨sb⟩(♦(ea % ⊤) ∧ ♦(ea ≈ ¬ea)).

If, however, a first updates with the information that the jailer is following a protocol of reporting b or reporting c with equal probability in the case that a is to be spared, then dilation no longer occurs. In fact, the probability of ea remains at two-thirds, and for instance the following formula is true:

⟨(¬ea ∧ sb) ≈ (¬ea ∧ sc)⟩⟨sb⟩((ea ≻ ec) ∧ (ea ≻ ¬ea) ∧ (⊤ ≻ ea)).

Finally, were a to update with the information that the jailer would certainly announce eb in case ea were false, then ea, eb, and ec would all remain equally likely: ⟨⊥ % (¬ea ∧ sc)⟩α. But after then learning that b will be executed, the probability of ea decreases to one-half: ⟨⊥ % (¬ea ∧ sc)⟩⟨sb⟩(ea ≈ ¬ea).

It is important to note that we do not have to resort to the particular model above to model the prisoner case. Indeed, the following formulas are true at any pointed IP model and hence also provable in the complete logic to be presented:

(α ∧ β) → [(¬ea ∧ sb) ≈ (¬ea ∧ sc)]⟨sb⟩((ea ≻ ec) ∧ (ea ≻ ¬ea) ∧ (⊤ ≻ ea)) (2)
(α ∧ β) → [⊥ % (¬ea ∧ sc)](α ∧ ⟨sb⟩(ea ≈ ¬ea)) (3)
(α ∧ β) → ((♦(⊥ % (¬ea ∧ sb)) ∧ ♦(⊥ % (¬ea ∧ sc))) → ⟨sb⟩(♦(ea % ⊤) ∧ ♦(ea ≈ ¬ea))) (4)

In (2) and (3), we have to use [ ] instead of ⟨ ⟩, since there are models that satisfy α ∧ β but contain no probability measure satisfying (¬ea ∧ sb) ≈ (¬ea ∧ sc), or none satisfying ⊥ % (¬ea ∧ sc), unlike the particular model above using the all-inclusive P. To cope with this, we need to use the box version of the update operator. In formula (4), the extra premise ♦(⊥ % (¬ea ∧ sb)) ∧ ♦(⊥ % (¬ea ∧ sc)) is again required, since dilation crucially relies on P containing both a measure assigning 0 to ¬ea ∧ sb and a measure assigning 0 to ¬ea ∧ sc. In our current language, using the ♦ operator is the most straightforward way to express this. An equivalent way is to use ¬((¬ea ∧ sb) ≻ ⊥) ∧ ¬((¬ea ∧ sc) ≻ ⊥). However, the ♦ in ♦(ea ≈ ¬ea) is necessary: there is no formula in L(%, ≻) that is equivalent to ♦(ea ≈ ¬ea).
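The dilation asserted by these formulas can be checked numerically. In the sketch below (ours), the measures in P from Example 6.5 are parameterized by the unknown chance t that the jailer says b when a is to be spared, and we condition on sb:

```python
def prob_ea_given_sb(t):
    """In Example 6.5, mu(w_ab) = t/3, mu(w_ac) = (1-t)/3, and
    mu(w_bc) = mu(w_cb) = 1/3.  Conditioning on s_b = {w_ab, w_cb}:
    e_a holds at w_cb but not at w_ab."""
    mu_sb = t / 3 + 1 / 3
    mu_ea_and_sb = 1 / 3
    return mu_ea_and_sb / mu_sb

print([round(prob_ea_given_sb(t), 3) for t in (0.0, 0.5, 1.0)])
# [1.0, 0.667, 0.5]: the prior of e_a is a sharp 2/3 for every measure,
# but after the announcement it spreads over the interval [1/2, 1].
```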
To obtain a complete logic for reasoning about updating sets of probability measures, we follow the standard "reduction axiom" strategy used in dynamic epistemic logic: identify a set of valid biconditionals that allow us to reduce any formula containing the dynamic operators ⟨ϕ⟩ to an equivalent formula of L(%, ≻, ♦) without dynamic operators, which can then be handled by the complete logic for L(%, ≻, ♦).

Definition 6.6. The logic IP(%, ≻, ♦, ⟨ ⟩) is the smallest set of L(%, ≻, ♦, ⟨ ⟩) formulas that is (i) closed under modus ponens and the rule of replacement of equivalents, and (ii) contains all theorems of IP(%, ≻, ♦) as well as all instances of the following axiom schemas, where p ∈ Prop and α and β are propositional:

(R0) ⟨ϕ⟩p ↔ (♦ϕ ∧ p);
(R1) ⟨ϕ⟩♦ψ ↔ ♦⟨ϕ⟩ψ;
(R2) ⟨ϕ⟩¬ψ ↔ (♦ϕ ∧ ¬⟨ϕ⟩ψ);
(R3) ⟨ϕ⟩(ψ ∧ χ) ↔ (⟨ϕ⟩ψ ∧ ⟨ϕ⟩χ);
(R4) ⟨ϕ⟩(α % β) ↔ (♦ϕ ∧ □((ϕ ∧ α) % (ϕ ∧ β)));
(R5) ⟨ϕ⟩(α ≻ β) ↔ (♦ϕ ∧ □((ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β)))).

Example 6.7. In a given model, we may ask whether, after the agent updates with the information that it is raining and that hail is more likely than lightning tonight, the agent judges that it is at least as likely that a window will break as it is that the power will go out: ⟨r ∧ (h ≻ l)⟩(w % p). This is equivalent, in light of the reduction axiom (R4), to

♦(r ∧ (h ≻ l)) ∧ □(((r ∧ (h ≻ l)) ∧ w) % ((r ∧ (h ≻ l)) ∧ p)),

which is in turn equivalent to

♦(r ∧ (h ≻ l)) ∧ □((h ≻ l) → ((r ∧ w) % (r ∧ p))),

i.e., there is some measure that gives r non-zero probability and gives h greater probability than l, and every measure that gives h greater probability than l also makes the probability of w conditional on r at least as great as the probability of p conditional on r.

The rest of this section is devoted to the proof of the following theorem.

Theorem 6.8 (Soundness and Completeness). For all ϕ ∈ L(%, ≻, ♦, ⟨ ⟩): ϕ is a theorem of IP(%, ≻, ♦, ⟨ ⟩) if and only if ϕ is valid with respect to the class of all imprecise probabilistic models.

The soundness of IP(%, ≻, ♦, ⟨ ⟩) is less trivial than the soundness of the previous systems. More importantly, we will use its soundness to prove its completeness, similarly to the completeness proofs of other dynamic epistemic logics axiomatized by reduction axioms.

Proposition 6.9. For all ϕ ∈ L(%, ≻, ♦, ⟨ ⟩): if ϕ is a theorem of IP(%, ≻, ♦, ⟨ ⟩), then ϕ is valid with respect to the class of all imprecise probabilistic models.

Proof. Clearly it is enough to check the validity of (R0) to (R5).

• For (R0), note that the valuation of p is invariant under the updating.

• For (R1), the key is to treat ⟨ϕ⟩♦ as a whole, whence the semantics of ⟨ϕ⟩♦ψ at M, P, w is that there is a µ ∈ Pϕ such that µ(⟦ψ⟧M,{µ}) > 0. But given the construction of Pϕ, this is precisely to say that there is a ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0 and that, letting µ = ν(· | ⟦ϕ⟧M,{ν}), we have µ(⟦ψ⟧M,{µ}) > 0. Now note that for any ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0, letting µ = ν(· | ⟦ϕ⟧M,{ν}), we have ⟦⟨ϕ⟩ψ⟧M,{ν} = ⟦ψ⟧M,{µ}, since {ν}ϕ = {µ}. Hence the truth condition of ⟨ϕ⟩♦ψ is transformed into the existence of a ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0 and ν(⟦⟨ϕ⟩ψ⟧M,{ν}) > 0. But this is precisely the truth condition of ♦⟨ϕ⟩ψ.

• For (R2), the key insight is that at M, P, w, assuming that there is a ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0, we have:

M, P, w ⊨ ⟨ϕ⟩¬ψ ⇐⇒ M, Pϕ, w ⊨ ¬ψ ⇐⇒ M, Pϕ, w ⊭ ψ ⇐⇒ M, P, w ⊨ ¬⟨ϕ⟩ψ.

• For (R3), the idea is similar to the above.

• For (R4), it is enough to observe the following chain of equivalences, assuming that there is a ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0:

M, P, w ⊨ ⟨ϕ⟩(α % β)
⇐⇒ M, Pϕ, w ⊨ α % β
⇐⇒ ∀µ ∈ Pϕ, µ(⟦α⟧M,Pϕ) ≥ µ(⟦β⟧M,Pϕ)
⇐⇒ ∀µ ∈ Pϕ, µ(V(α)) ≥ µ(V(β))
⇐⇒ ∀ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0, ν(V(α) | ⟦ϕ⟧M,{ν}) ≥ ν(V(β) | ⟦ϕ⟧M,{ν})
⇐⇒ ∀ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0, ν(V(α) ∩ ⟦ϕ⟧M,{ν}) ≥ ν(V(β) ∩ ⟦ϕ⟧M,{ν})
⇐⇒ ∀ν ∈ P, ν(V(α) ∩ ⟦ϕ⟧M,{ν}) ≥ ν(V(β) ∩ ⟦ϕ⟧M,{ν})
⇐⇒ ∀ν ∈ P, M, {ν} ⊨ (ϕ ∧ α) % (ϕ ∧ β)
⇐⇒ ∀ν ∈ P, ν(⟦(ϕ ∧ α) % (ϕ ∧ β)⟧M,{ν}) = 1
⇐⇒ M, P, w ⊨ □((ϕ ∧ α) % (ϕ ∧ β)).

Note that the last three equivalences extensively use the fact that a Boolean combination of comparison formulas is true at a world if and only if it is true at all worlds. The sixth equivalence holds because when ν(⟦ϕ⟧M,{ν}) = 0, it trivially holds that ν(V(α) ∩ ⟦ϕ⟧M,{ν}) ≥ ν(V(β) ∩ ⟦ϕ⟧M,{ν}).
• For (R5), the strategy is the same—it is enough to observe the following chain of equivalences, assuming that there is a ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0:

M, P, w ⊨ ⟨ϕ⟩(α ≻ β)
⇐⇒ M, Pϕ, w ⊨ α ≻ β
⇐⇒ ∀µ ∈ Pϕ, µ(⟦α⟧M,Pϕ) > µ(⟦β⟧M,Pϕ)
⇐⇒ ∀µ ∈ Pϕ, µ(V(α)) > µ(V(β))
⇐⇒ ∀ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0, ν(V(α) | ⟦ϕ⟧M,{ν}) > ν(V(β) | ⟦ϕ⟧M,{ν})
⇐⇒ ∀ν ∈ P such that ν(⟦ϕ⟧M,{ν}) > 0, ν(V(α) ∩ ⟦ϕ⟧M,{ν}) > ν(V(β) ∩ ⟦ϕ⟧M,{ν})
⇐⇒ ∀ν ∈ P, if M, {ν} ⊨ ϕ ≻ ⊥ then M, {ν} ⊨ (ϕ ∧ α) ≻ (ϕ ∧ β)
⇐⇒ ∀ν ∈ P, M, {ν} ⊨ (ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β))
⇐⇒ ∀ν ∈ P, ν(⟦(ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β))⟧M,{ν}) = 1
⇐⇒ M, P, w ⊨ □((ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β))).

Again, the last four equivalences extensively use the fact that a Boolean combination of comparison formulas is true at a world if and only if it is true at all worlds.

For completeness, we first show that the axioms allow us to provably-equivalently reduce any formula in L(%, ≻, ♦, ⟨ ⟩) to a fragment LSimpd1 that is even simpler than the fragment LSimp: the comparison formulas in the scope of any ♦ must not have nested comparisons.

Definition 6.10. Let LBool be the set of propositional formulas. In other words, this is the fragment generated from Prop by ¬ and ∧. Let LCompd1 be the fragment of L(%, ≻) with no nesting of % and ≻. In other words, this is the fragment generated from Prop and {(α % β), (α ≻ β) | α, β ∈ LBool} by ¬ and ∧. Finally, let LSimpd1 be the fragment of L(%, ≻, ♦) generated from Prop and {♦ϕ | ϕ ∈ LCompd1} by ¬ and ∧.

Lemma 6.11. For every ϕ ∈ L(%, ≻), there is a TCompd1(ϕ) ∈ LCompd1 such that ϕ ↔ TCompd1(ϕ) ∈ IP(%, ≻). Moreover, ϕ and TCompd1(ϕ) use the same propositional variables.

Proof. We use a standard argument for extracting comparisons embedded in comparisons. Formally, an induction over L(%, ≻) is needed. The base case and the inductive cases for ¬ and ∧ are trivial, as we can simply define TCompd1(p) = p, TCompd1(¬ϕ) = ¬TCompd1(ϕ), and TCompd1(ϕ ∧ ψ) = TCompd1(ϕ) ∧ TCompd1(ψ). For the non-trivial cases for % and ≻, we only need the following: for any α, β ∈ LBool and ϕ, ψ ∈ LCompd1, the following are in IP(%, ≻):

(ϕ % ψ) ↔ (((α % β) ∧ (ϕ[α % β/⊤] % ψ[α % β/⊤])) ∨ (¬(α % β) ∧ (ϕ[α % β/⊥] % ψ[α % β/⊥])));
(ϕ % ψ) ↔ (((α ≻ β) ∧ (ϕ[α ≻ β/⊤] % ψ[α ≻ β/⊤])) ∨ (¬(α ≻ β) ∧ (ϕ[α ≻ β/⊥] % ψ[α ≻ β/⊥])));
(ϕ ≻ ψ) ↔ (((α % β) ∧ (ϕ[α % β/⊤] ≻ ψ[α % β/⊤])) ∨ (¬(α % β) ∧ (ϕ[α % β/⊥] ≻ ψ[α % β/⊥])));
(ϕ ≻ ψ) ↔ (((α ≻ β) ∧ (ϕ[α ≻ β/⊤] ≻ ψ[α ≻ β/⊤])) ∨ (¬(α ≻ β) ∧ (ϕ[α ≻ β/⊥] ≻ ψ[α ≻ β/⊥]))).

They are proven mainly by (B7) to (B10). The key idea is to first derive the following:

(α % β) → ((ϕ ↔ ϕ[α % β/⊤]) % ⊤);
¬(α % β) → ((ϕ ↔ ϕ[α % β/⊥]) % ⊤);
(α ≻ β) → ((ϕ ↔ ϕ[α ≻ β/⊤]) % ⊤);
¬(α ≻ β) → ((ϕ ↔ ϕ[α ≻ β/⊥]) % ⊤).

Together with ((ϕ ↔ ψ) % ⊤) → ((ϕ % χ) ↔ (ψ % χ)) and ((ϕ ↔ ψ) % ⊤) → ((ϕ ≻ χ) ↔ (ψ ≻ χ)), the required equivalences can easily be derived.

Proposition 6.12. For every ϕ ∈ L(%, ≻, ♦), there is a TSimpd1(ϕ) ∈ LSimpd1 such that ϕ ↔ TSimpd1(ϕ) is in IP(%, ≻, ♦).

Proof. The result of replacing each ♦χ in TSimp(ϕ), the translation of Lemma 5.10, by ♦TCompd1(χ) is the desired TSimpd1(ϕ).

Proposition 6.13. For every ϕ ∈ L(%, ≻, ♦, ⟨ ⟩), there is a TSimpd1(ϕ) ∈ LSimpd1 such that ϕ ↔ TSimpd1(ϕ) is in IP(%, ≻, ♦, ⟨ ⟩).

Proof. We proceed by induction. Given Proposition 6.12 and the rule of replacement of equivalents, the only non-trivial case is to show that there is a TSimpd1(⟨ϕ⟩ψ) that is provably equivalent to ⟨ϕ⟩ψ in IP(%, ≻, ♦, ⟨ ⟩), where ϕ, ψ are in LSimpd1.
By repeated use of (R1) to (R3) and the rule of replacement of equivalents, we can evidently push the ⟨ϕ⟩ into ψ over the Boolean connectives and ♦ and obtain a Boolean combination of formulas of the form ⟨ϕ⟩p, ⟨ϕ⟩(α % β), or ⟨ϕ⟩(α ≻ β), since in LSimpd1, % and ≻ only scope over propositional formulas. All three kinds of formulas can be replaced, provably equivalently, by formulas in L(%, ≻, ♦), using (R0), (R4), and (R5). Then we apply TSimpd1 again to finish off (to eliminate any ♦'s appearing inside ♦'s).

With the above reduction method, the completeness of IP(%, ≻, ♦, ⟨ ⟩) follows.

Proposition 6.14. For all ϕ ∈ L(%, ≻, ♦, ⟨ ⟩): if ϕ is valid with respect to the class of all imprecise probabilistic models, then ϕ is a theorem of IP(%, ≻, ♦, ⟨ ⟩).

Proof. Let ϕ be any valid formula in L(%, ≻, ♦, ⟨ ⟩). Then by the soundness of IP(%, ≻, ♦, ⟨ ⟩) and the fact that ϕ ↔ TSimpd1(ϕ) ∈ IP(%, ≻, ♦, ⟨ ⟩), TSimpd1(ϕ) is also valid. But TSimpd1(ϕ) ∈ LSimpd1 ⊆ L(%, ≻, ♦). By the completeness of IP(%, ≻, ♦), TSimpd1(ϕ) ∈ IP(%, ≻, ♦). By the definition of IP(%, ≻, ♦, ⟨ ⟩), it contains all theorems of IP(%, ≻, ♦). Hence TSimpd1(ϕ) is in IP(%, ≻, ♦, ⟨ ⟩). Then by Boolean reasoning, ϕ is in IP(%, ≻, ♦, ⟨ ⟩).

Although the reduction axioms for L(%, ≻, ♦, ⟨ ⟩) allow us to reduce the satisfiability problem for L(%, ≻, ♦, ⟨ ⟩) to that for L(%, ≻, ♦), which is in NP (Theorem 5.12), it does not immediately follow that the satisfiability problem for L(%, ≻, ♦, ⟨ ⟩) is in NP, due to the blowup in the length of formulas during the reduction process. A similar obstacle occurs in the case of the simplest dynamic epistemic logic (public announcement logic), in which case a solution is to use a satisfiability-preserving reduction with only polynomial blowup instead of the standard validity-preserving reduction with exponential blowup (Lutz 2006). Whether this or other techniques apply to L(%, ≻, ♦, ⟨ ⟩) we leave as an open problem.

Problem 6.15. Determine the complexity of the satisfiability problem for L(%, ≻, ♦, ⟨ ⟩).

6.2 Introducing a New Proposition

In the previous subsection, we considered the dynamic update operator that concerns learning the truth of a proposition. In this subsection, we consider the complementary dynamics of learning the mere existence of a proposition and then being maximally uncertain about it in the way of imprecise probability (cf. Joyce 2005). Our goal is to show that this kind of information dynamics is expressively helpful, especially in formalizing examples in a natural way; we leave the complete axiomatization of its logic as an open question.

Definition 6.16. The language L(%, ≻, ♦, ⟨ ⟩, I) is defined by the following grammar:

ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ) | ♦ϕ | ⟨ϕ⟩ϕ | Ip+ϕ | Ip−ϕ

where p ∈ Prop. We read Ip+ϕ as "letting p be a true proposition that is newly introduced to the agent, ϕ"; similarly, Ip−ϕ reads "letting p be a false proposition that is newly introduced to the agent, ϕ". We also take Ipϕ as an abbreviation of (Ip+ϕ ∧ Ip−ϕ). We treat both Ip+ and Ip− as a kind of propositional quantifier, since they change the meaning (denotation) of p, and we define free and bound propositional variables in the obvious way. For any ϕ ∈ L(%, ≻, ♦, ⟨ ⟩, I), let Prop(ϕ) be the set of freely occurring propositional variables in ϕ.

Now we specify the semantics for I+ and I−. First, we define how a model changes when we introduce a new proposition.
Definition 6.17. Given a non-empty set W, a field of sets F on W, a valuation V such that V(p) ∈ F for all p ∈ Prop, and a set P of finitely additive probability measures on F, we interpret F as the collection of the "old" propositions. Our goal is to define the result of adding a "new" proposition P. Intuitively, we first split each w ∈ W into ⟨w, 1⟩ and ⟨w, 0⟩, corresponding to P being true and false, respectively, while keeping the truth values of the old propositions. For the probability measures, we take all probability measures defined on both the old and new propositions that, when restricted to just the old propositions, coincide with some old probability measure. The following gives the formal details.

• Let F × 2 = {X × {0, 1} | X ∈ F}, which is a field of sets on W × {0, 1}.
• Let Split(F) be the smallest field of sets on W × {0, 1} extending F × 2 ∪ {W × {0}}.
• Let V × 2 be defined such that (V × 2)(p) = V(p) × {0, 1} for all p ∈ Prop; note that (V × 2)(p) ∈ F × 2 for all p ∈ Prop.
• For any p ∈ Prop, let V+p be defined such that V+p(q) = V(q) × {0, 1} if q ≠ p, and V+p(p) = W × {1}; note that V+p(q) ∈ Split(F) for all q ∈ Prop.
• For any finitely additive measure µ on F, define µ × 2, a finitely additive measure on F × 2, by (µ × 2)(X × {0, 1}) = µ(X) for all X ∈ F.
• Let P × 2 = {µ × 2 | µ ∈ P}.
• Let Split(P) be the set of all finitely additive measures µ on Split(F) such that µ|F×2 ∈ P × 2.

Using the above definition, given a propositional variable p ∈ Prop and a propositional model M = ⟨W, V⟩, let M × 2 = ⟨W × {0, 1}, V × 2⟩ and M+p = ⟨W × {0, 1}, V+p⟩. Then if ⟨M, P⟩ is an IP model, so is ⟨M+p, Split(P)⟩, and ⟨M+p, Split(P)⟩ represents the result of adding a new proposition, now denoted by p, to ⟨M, P⟩.

Remark 6.18. In the algebraic theory of Boolean algebras, there is a standard operation of freely adjoining a new element to a Boolean algebra: for any Boolean algebra B and any a ∉ B, there is a Boolean algebra B+a, unique up to isomorphism, such that

• B is a subalgebra of B+a, and every element in B+a is generated from B ∪ {a};
• for any b ∈ B other than the bottom element, b ∧ a and b ∧ ¬a are not the bottom element in B+a.

The operation Split(F) is precisely the dual of this algebraic operation. Hence, if we use an algebraic model ⟨B, V, P⟩, where B is a Boolean algebra (of propositions of which the agent is currently aware), V a valuation function from Prop to B, and P a set of finitely additive functions from B to [0, 1], we can easily define the result of adding a new proposition a ∉ B, to be denoted by p, as ⟨B+a, V′, P′⟩, where V′ coincides with V on Prop except that V′(p) = a, and P′ = {µ : B+a → [0, 1] | µ is finitely additive and µ|B ∈ P}.

Remark 6.19. The model construction from ⟨M, P⟩ to ⟨M+p, Split(P)⟩ can also be viewed as an event-model update from (probabilistic) dynamic epistemic logic (van Benthem et al. 2009). The event model contains two events {1, 0}, corresponding to whether the new proposition is true or not, with no preconditions, and the agent is maximally ignorant about these two events: at any of the old worlds, she cannot distinguish between the two events, is completely ignorant about their relative likelihood, and does not observe which event happens. Using the terminology of van Benthem et al. (2009), the agent is maximally and imprecisely ignorant about the occurrence probability of these two events and makes no observation about them.
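For a finite model, the world-splitting part of this construction is immediate, while Split(P) is an infinite family; the sketch below (ours) builds the split valuation and, from an old measure plus an arbitrary choice of per-world split ratios, one representative member of Split(P):

```python
def split_model(worlds, valuation, new_var):
    """Split each w into (w, 1) and (w, 0); old variables keep their truth
    values, and new_var holds exactly at the (w, 1) copies."""
    new_worlds = [(w, b) for w in worlds for b in (0, 1)]
    new_val = {q: {(w, b) for (w, b) in new_worlds if w in V}
               for q, V in valuation.items()}
    new_val[new_var] = {(w, 1) for w in worlds}
    return new_worlds, new_val

def split_measure(mu, ratio):
    """One member of Split(P): divide each mu(w) between the two copies of w
    according to ratio[w] in [0, 1]; by construction, its marginal on the
    old events is mu itself, as Definition 6.17 requires."""
    out = {}
    for w, p in mu.items():
        out[(w, 1)] = p * ratio[w]
        out[(w, 0)] = p * (1 - ratio[w])
    return out

print(split_measure({'w': 0.6, 'v': 0.4}, {'w': 0.5, 'v': 1.0}))
```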
Definition 6.20. The semantics of Ip+ and Ip− is given by:

M, P, w ⊨ Ip−ϕ iff M+p, Split(P), ⟨w, 0⟩ ⊨ ϕ;
M, P, w ⊨ Ip+ϕ iff M+p, Split(P), ⟨w, 1⟩ ⊨ ϕ.

Now let us put the new operators to work. We first use them to formalize the medical example (Example 1.1).

Example 6.21. The following sentence is valid and represents the medical example if we take p to mean that the agent has the disease (that is, the proposition introduced by Ip is that the agent has the disease) and q to mean that the gland is swollen (that is, the proposition introduced by Iq is that the gland is swollen):

Ip ⟨¬p ≻ p⟩ Iq ⟨(q ∧ p) ≻ (q ∧ ¬p)⟩ ⟨q⟩ (p ≻ ¬p). (5)

We interpret the first update, by ¬p ≻ p, as the result of the agent observing that she is not feeling uncomfortable and hence believing that her not having the disease is more likely than her having it. The second update represents what the agent learns from the doctor, and the third update represents a medical examination revealing that her gland is swollen.

The above simple sentence does not capture more nuanced probabilistic relationships between p and q, such as that, conditional on q, p is twice as likely as ¬p, or that the medical examination does not reveal q itself but only a signal that is probabilistically related to q. But with the new operator I, we can easily say these things. For example, to express that p is twice as likely as ¬p conditional on q, we may introduce two new propositions (like two coin flips) by Ir and Is at the beginning of the formula (note that our syntax forbids embedding I in updates) and later add, after Iq, the update

⟨((q ∧ r ∧ s) ≈ (q ∧ r ∧ ¬s)) ∧ ((q ∧ ¬r ∧ s) ≈ (q ∧ ¬r ∧ ¬s)) ∧ ((q ∧ r ∧ s) ≈ (q ∧ ¬r ∧ s)) ∧ (⊥ % (q ∧ ¬r ∧ ¬s))⟩,

which says that, conditional on q, the two coin flips are fair and independent but the two-tails situation is impossible (perhaps because the two coins will be retossed if they both land tails up). Then, using ⟨(q ∧ p) ≈ (q ∧ s)⟩, we essentially say that p's probability conditional on q is 2/3 and thus that p is twice as likely as ¬p conditional on q. To express that the medical examination only provides an informative signal related to q, we may again introduce a new proposition t representing that signal and then let the agent learn the probabilistic relationship between t and q.

Example 6.22. For the prisoner example, recall that α is the formula

(⊥ % (ea ∧ eb ∧ ec)) ∧ (((ea ∧ eb) ∨ (ea ∧ ec) ∨ (eb ∧ ec)) % ⊤) ∧ (ea ≈ eb) ∧ (eb ≈ ec),

saying that two of the prisoners will be executed and that the probabilities of the three situations are equal. Recall also that β is the formula

((sb → eb) % ⊤) ∧ ((sc → ec) % ⊤) ∧ (⊥ % (sb ∧ sc)),

saying that the jailer will truthfully announce one and only one prisoner to be executed. Then the following formula is valid and represents the dilation that occurs when a hears the jailer announce that b will be executed:

Iea Ieb Iec ⟨α⟩ Isb Isc ⟨β⟩ ⟨sb⟩ (♦(ea % ⊤) ∧ ♦(ea ≈ ¬ea)).

As we have seen in Example 6.21, L(%, ≻, ♦, ⟨ ⟩, I) is capable of expressing numerical relationships. Leveraging this capability, it is easy to observe that L(%, ≻, ♦, ⟨ ⟩, I) is more expressive than L(%, ≻, ♦, ⟨ ⟩).

Example 6.23. Consider a propositional model M = ⟨W, V⟩ where W = {w, u} has two worlds, V(p) = {w}, and V(q) = ∅ for all q ∈ Prop \ {p}. Let µ1 be a probability measure on ℘(W) such that µ1({w}) = 0.6, and let µ2 be a probability measure on ℘(W) such that µ2({w}) = 0.9. Then it is easy to see that M, {µ1}, w and M, {µ2}, w satisfy the same formulas in L(%, ≻, ♦, ⟨ ⟩).
However, the formula

Iq Ir ⟨((q ∧ r) ≈ (q ∧ ¬r)) ∧ ((q ∧ r) ≈ (¬q ∧ r)) ∧ ((q ∧ r) ≈ (¬q ∧ ¬r))⟩ (p ≻ ¬(q ∧ r)),

which intuitively says that p is more likely than not getting two heads from two randomly and independently flipped fair coins, is true at M, {µ2}, w but false at M, {µ1}, w.

Indeed, we will show that L(%, ≻, ♦, ⟨ ⟩, I) can express any linear inequality with integer coefficients about the probabilities of formulas. For this, we first introduce some notation.

Definition 6.24. Let Γ be a finite set of formulas, C(Γ) the set of all clauses over Γ (conjunctions of the form ⋀_{ϕ∈Γ} ±ϕ, where ± is either the empty string or ¬), and p a propositional variable. Then define (p|Γ) to be the formula

⋀_{ψ∈C(Γ)} ((ψ ∧ p) ≈ (ψ ∧ ¬p)).

Intuitively, (p|Γ) says that p represents a fair coin flip independent of all events expressible using formulas in Γ.

Proposition 6.25. For any sequences ⟨ϕi⟩i=1...n and ⟨ψi⟩i=1...m of formulas in L(%, ≻, ♦, ⟨ ⟩, I) and any sequences ⟨ai⟩i=1...n and ⟨bi⟩i=1...m of natural numbers, there is a formula χ ∈ L(%, ≻, ♦, ⟨ ⟩, I) such that for any IP model M, P, w:

M, P, w ⊨ χ iff ∀µ ∈ P, Σ_{i=1}^{n} ai µ(⟦ϕi⟧M,P) ≥ Σ_{i=1}^{m} bi µ(⟦ψi⟧M,P).

Proof. The central idea is already in Kraft et al. (1959) and is also described in Section 2 of Ding et al. Forthcoming: we use I operators to introduce new propositions that evenly partition the logical space spanned by the ϕi's, so that we can take the union of multiple copies of the partitioned ϕi's to simulate addition. Let l be the smallest natural number such that 2^l is larger than the sum of all the ai's and bi's, and pick propositional variables ⟨pi⟩i=1...l not occurring in any of the ϕi's and ψi's. Then let C list all logically inequivalent clauses made from the pi's. Since |C| = 2^l and 2^l is larger than the sum of all the coefficients, we may let f be a function from {1, . . . , n} × {0} ∪ {1, . . . , m} × {1} to ℘(C) such that f(x) ∩ f(y) = ∅ whenever x ≠ y, |f(i, 0)| = ai, and |f(i, 1)| = bi. Let Γ be the set of all the ϕi's and ψi's. Then consider the following formula:

Ip1+ Ip2+ · · · Ipl+ ⟨(p1|Γ) ∧ (p2|Γ ∪ {p1}) ∧ · · · ∧ (pl|Γ ∪ {p1, p2, . . . , pl−1})⟩ ((⋁_{i=1}^{n} ⋁_{c∈f(i,0)} (ϕi ∧ c)) % (⋁_{i=1}^{m} ⋁_{c∈f(i,1)} (ψi ∧ c))). (6)

This is the required formula, since after the introduction of the new propositions and the announcement, the probability of ⋁_{c∈f(i,0)} (ϕi ∧ c) (resp. ⋁_{c∈f(i,1)} (ψi ∧ c)) is precisely ai/2^l (resp. bi/2^l) times the probability of ϕi (resp. ψi). Cancelling out the common denominator 2^l, we see that the inequality expressed by formula (6) is the required one.
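The clause set C(Γ) and the formula (p|Γ) in this construction are purely syntactic and easy to generate mechanically, as in the sketch below (ours; strings stand for formulas, with ~ for ¬ and == for ≈):

```python
from itertools import product

def clauses(gamma):
    """C(Gamma): all conjunctions taking each formula in gamma either
    plain or negated (Definition 6.24)."""
    return [' & '.join(c) for c in product(*[(f, f'~{f}') for f in gamma])]

def fair_independent(p, gamma):
    """(p | Gamma): for every clause psi over gamma, psi & p and psi & ~p
    are equally likely, i.e. p is a fair coin independent of gamma-events."""
    return ' & '.join(f'(({psi} & {p}) == ({psi} & ~{p}))'
                      for psi in clauses(gamma))

print(fair_independent('p1', ['q', 'r']))
```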
Therefore, we see that with the new operators Ip+ and Ip−, L(%, ≻, ♦, ⟨ ⟩, I) is capable of expressing quantitative (and in particular arbitrary additive) information. This also means that we cannot use the same reduction strategy we used for L(%, ≻, ♦, ⟨ ⟩) to axiomatize the logic of L(%, ≻, ♦, ⟨ ⟩, I). However, we conjecture that there is a computable translation from L(%, ≻, ♦, ⟨ ⟩, I) to L(%, ≻, ♦, ⟨ ⟩) that preserves satisfiability. Such a translation could then be coded as rules, instead of axioms, that completely axiomatize the logic.

Problem 6.26. Find an axiomatization of the set of valid formulas in L(%, ≻, ♦, ⟨ ⟩, I).

Problem 6.27. Determine the complexity of the satisfiability problem for L(%, ≻, ♦, ⟨ ⟩, I).

7 Conclusion

In this paper, we have investigated a hierarchy of languages

L(%) ⊆ L(%, ≻) ⊆ L(%, ≻, ♦) ⊆ L(%, ≻, ♦, ⟨ ⟩) ⊆ L(%, ≻, ♦, ⟨ ⟩, I)

and matching complete logics for imprecise comparative probabilistic reasoning in the first four languages:

IP(%) ⊆ IP(%, ≻) ⊆ IP(%, ≻, ♦) ⊆ IP(%, ≻, ♦, ⟨ ⟩).

The first four languages have straightforward extensions to the multi-agent setting, in which each agent i has their own comparative probability relations %i and ≻i, allowing us to formalize statements such as "Ann judges it more likely than not that Bob thinks hail is more likely than lightning": (h ≻b l) ≻a ¬(h ≻b l). A multi-agent version of the language L(%) was already studied in Alon and Heifetz (2014). Generalizing the other languages in this paper to the multi-agent setting presents no major challenges, although the complexity of the resulting multi-agent logics goes beyond that of the single-agent versions, just as the complexity of the basic epistemic logic S5 jumps from NP to PSPACE when moving from the single-agent to the multi-agent setting (see Halpern and Moses 1992). When generalizing the language L(%, ≻, ♦, ⟨ ⟩, I) to the multi-agent setting, there is a distinction between introducing a new proposition to every agent publicly and introducing a new proposition to only one agent, so that she becomes privately aware of it. Our semantics naturally generalizes to model all agents publicly becoming aware of a new proposition, but modeling some agent's privately becoming aware of a new proposition requires a different treatment.

Further extensions of the language are natural to consider, such as adding comparative conditional probability formulas (ϕ | ψ) % (α | β) (resp. (ϕ | ψ) ≻ (α | β)), expressing that the conditional probability of ϕ given ψ is at least as great as (resp. greater than) the conditional probability of α given β for every measure in one's set of measures, which is not expressible in the languages of this paper (see Luce 1968). For precise probabilistic models, such a quaternary operator is investigated in, e.g., Domotor 1969, § 2.6 and Suppes and Zanotti 1982 (and recently in Hawthorne 2016 using so-called Popper functions), but its interpretation in imprecise probabilistic models seems yet to be explored. Allowing inequalities of probabilistic products (ϕ × ψ) % (α × β) would allow even greater expressivity (such an extension in the precise case is also considered in Domotor 1969, § 2.4). More generally, the systems in this paper are part of a much broader hierarchy of probabilistic languages, ranging from the very simple L(%) all the way to highly expressive probabilistic languages encompassing full quantified real number arithmetic (Halpern 1990).

In addition to their inherent theoretical interest, probabilistic logics have emerged as a foundational tool for many central computational tasks, from core knowledge representation (Russell 2015), to reasoning about strategic interaction (Dekel and Siniscalchi 2015; van Benthem and Klein 2019), to causal inference (witness the do-calculus, which is built on top of a probability calculus; see, e.g., Pearl 2009; Bareinboim et al. 2020; Ibeling and Icard 2020). Furthermore, applications in these contexts have motivated some of the very systems presented here (e.g., Alon and Heifetz 2014). Understanding the capacities and limitations of such systems may well be an important step toward further integration of explicit probabilistic tools in these and other domains.
Acknowledgements

We thank the two reviewers for the International Journal of Approximate Reasoning for helpful comments.

References

Shiri Alon and Aviad Heifetz. The logic of Knightian games. Economic Theory Bulletin, 2(2):161–182, 2014.

Shiri Alon and Ehud Lehrer. Subjective multi-prior probability: A representation of a partial likelihood relation. Journal of Economic Theory, 151:476–492, 2014.

Thomas Augustin, Frank P. A. Coolen, Gert De Cooman, and Matthias C. M. Troffaes. Introduction to Imprecise Probabilities. John Wiley & Sons, 2014.

Elias Bareinboim, Juan D. Correa, Duligur Ibeling, and Thomas Icard. On Pearl's hierarchy and the foundations of causal inference. Technical Report R-60, Causal AI Lab, Columbia University, 2020.

Johan van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, New York, 2011.

Johan van Benthem and Dominik Klein. Logics for analyzing games. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. 2019.

Johan van Benthem, Jelle Gerbrandy, and Barteld Kooi. Dynamic update with probabilities. Studia Logica, 93(1):67–96, 2009.

George Boole. An Investigation of the Laws of Thought. Walton & Maberly, 1854.

Seamus Bradley. Imprecise probabilities. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. 2019.

Seamus Bradley and Katie Steele. Uncertainty, learning, and the "problem" of dilation. Erkenntnis, 79:1287–1303, 2014.

Rudolf Carnap. Testability and meaning (part I). Philosophy of Science, 3(4):419–471, 1936.

Inés Couso and Serafín Moral. Sets of desirable gambles: conditioning, representation, and precise probabilities. International Journal of Approximate Reasoning, 52(7):1034–1055, 2011.

Eddie Dekel and Marciano Siniscalchi. Epistemic game theory. In Handbook of Game Theory with Economic Applications, volume 4, pages 619–702. 2015.

Persi Diaconis. Review of "A mathematical theory of evidence" (G. Shafer). Journal of the American Statistical Association, 73(363):677–678, 1978.

Persi Diaconis and Sandy L. Zabell. Some alternatives to Bayes's rule. In B. Grofman and G. Owen, editors, Information Pooling and Group Decision Making, pages 25–38. J.A.I. Press, 1986.

Nicholas DiBella. Qualitative probability and infinitesimal probability. Draft of 9/7/18, 2018.

Yifeng Ding, Matthew Harrison-Trainor, and Wesley H. Holliday. The logic of comparative cardinality. The Journal of Symbolic Logic, Forthcoming. doi: 10.1017/jsl.2019.67.

Hans van Ditmarsch, Wiebe van der Hoek, and Barteld Kooi. Dynamic Epistemic Logic. Springer, Dordrecht, 2008.

Zoltan Domotor. Probabilistic relational structures and their applications. Technical Report No. 144, Psychology Series, Institute for Mathematical Studies in the Social Sciences, Stanford University, Stanford, California, 1969.

Edward Elliott. 'Ramseyfying' probabilistic comparativism. Philosophy of Science, 87(4):727–754, 2020.

Benjamin Eva. Principles of indifference. Journal of Philosophy, 116(7):390–411, 2019.

R. Fagin, J. Y. Halpern, and N. Megiddo. A logic for reasoning about probabilities. Information and Computation, 87:78–128, 1990.

Terrence L. Fine. Theories of Probability. Academic Press, New York, 1973.

Terrence L. Fine. An argument for comparative probability. In R. E. Butts and J. Hintikka, editors, Basic Problems in Methodology and Linguistics, pages 105–119. Springer, 1977.

Bruno de Finetti. La 'logica del plausible' secondo la concezione di Polya. Atti della XLII Riunione, Società Italiana per il Progresso delle Scienze, pages 227–236, 1949.
Peter C. Fishburn. The axioms of subjective probability. Statistical Science, 1(3):335–358, 1986.

Brandon Fitelson and David McCarthy. Toward an epistemic foundation for comparative confidence. Draft of 1/19/14, 2014.

Peter Gärdenfors. Qualitative probability as an intensional logic. Journal of Philosophical Logic, 4(2):171–185, 1975.

Martin Gardner. Mathematical games. Scientific American, pages 180–182, October 1959a.

Martin Gardner. Mathematical games. Scientific American, page 188, November 1959b.

Alfio Giarlotta and Salvatore Greco. Necessary and possible preference structures. Journal of Mathematical Economics, 49:163–172, 2013.

I. J. Good. Subjective probability as the measure of a non-measurable set. In Ernest Nagel, Patrick Suppes, and Alfred Tarski, editors, Logic, Methodology and Philosophy of Science: Proceedings of the 1960 International Congress, pages 319–329, 1962.

J. Y. Halpern. An analysis of first-order logics of probability. Artificial Intelligence, 46:311–350, 1990.

Joseph Y. Halpern. Reasoning about Uncertainty. MIT Press, Cambridge, Mass., 2003.

Joseph Y. Halpern and Yoram Moses. A guide to completeness and complexity for modal logics of knowledge and belief. Artificial Intelligence, 54(3):319–379, 1992.

Matthew Harrison-Trainor, Wesley H. Holliday, and Thomas F. Icard. A note on cancellation axioms for comparative probability. Theory and Decision, 80(1):159–166, 2016.

Matthew Harrison-Trainor, Wesley H. Holliday, and Thomas F. Icard. Preferential structures for comparative probabilistic reasoning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, pages 1135–1141, 2017.

James Hawthorne. A logic of comparative support: Qualitative conditional probability relations representable by Popper functions. In Alan Hájek and Christopher Hitchcock, editors, Oxford Handbook of Probability and Philosophy. Oxford University Press, 2016.

Wesley H. Holliday, Tomohiro Hoshi, and Thomas F. Icard. A uniform logic of information dynamics. In Thomas Bolander, Torben Braüner, Silvio Ghilardi, and Lawrence Moss, editors, Advances in Modal Logic, volume 9, pages 348–367. College Publications, London, 2012.

Wesley H. Holliday, Tomohiro Hoshi, and Thomas F. Icard. Information dynamics and uniform substitution. Synthese, 190(1):31–55, 2013.

Duligur Ibeling and Thomas Icard. Probabilistic reasoning across the causal hierarchy. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020.

Thomas F. Icard. Pragmatic considerations on comparative probability. Philosophy of Science, 83(3):348–370, 2016.

James M. Joyce. How probabilities reflect evidence. Philosophical Perspectives, 19:153–178, 2005.

John Maynard Keynes. A Treatise on Probability. Macmillan, 1921.

Jason Konek. Comparative probabilities. In Richard Pettigrew and Jonathan Weisberg, editors, The Open Handbook of Formal Epistemology, pages 267–348. The PhilPapers Foundation, 2019.

Jason Konek. Epistemic conservativity and imprecise credence. Philosophy and Phenomenological Research, Forthcoming.

Bernard O. Koopman. The axioms and algebra of intuitive probability. Annals of Mathematics, 41(2):269–292, 1940.

Charles H. Kraft, John W. Pratt, and A. Seidenberg. Intuitive probability on finite sets. The Annals of Mathematical Statistics, 30(2):408–419, 1959.

Daniel Lassiter. Gradable epistemic modals, probability, and scale structure. In N. Li and D. Lutz, editors, Semantics and Linguistic Theory (SALT) 20, pages 1–18. CLC (Cornell Linguistics Circle), 2010.

Ehud Lehrer and Roee Teper. Justifiable preferences. Journal of Economic Theory, 146(2):762–774, 2011.
Isaac Levi. On indeterminate probabilities. Journal of Philosophy, 71:391–418, 1974.

R. Duncan Luce. On the numerical representation of qualitative conditional probability. The Annals of Mathematical Statistics, 39(2):481–491, 1968.

Carsten Lutz. Complexity and succinctness of public announcement logic. In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, pages 137–143. ACM, 2006.

Krzysztof Mierzewski. Probabilistic stability: dynamics, nonmonotonic logics, and stable revision. Master's thesis, Universiteit van Amsterdam, 2018.

Sarah Moss. Probabilistic Knowledge. Oxford University Press, Oxford, 2018.

Sarah Moss. Global constraints on imprecise credences: Solving reflection violations, belief inertia, and other puzzles. Philosophy and Phenomenological Research, 2020.

T. S. Motzkin. Two consequences of the transposition theorem on linear inequalities. Econometrica, 19(2):184–185, 1951.

Judea Pearl. Causality. Cambridge University Press, 2009.

Susanna Rinard. Against radical credal imprecision. Thought: A Journal of Philosophy, 2:157–165, 2013.

D. Ríos Insua. On the foundations of decision making under partial information. Theory and Decision, 33(1):83–100, 1992.

Stuart Russell. Unifying logic and probability. Communications of the ACM, 58(7):88–97, 2015.

Miriam Schoenfield. Chilling out on epistemic rationality: A defense of imprecise credences (and other imprecise doxastic attitudes). Philosophical Studies, 158:197–219, 2012.

Dana Scott. Measurement structures and linear inequalities. Journal of Mathematical Psychology, 1:233–247, 1964.

Krister Segerberg. Qualitative probability in a modal setting. In E. Fenstad, editor, Second Scandinavian Logic Symposium, pages 341–352, Amsterdam, 1971. North-Holland.

Teddy Seidenfeld, Mark J. Schervish, and Joseph B. Kadane. Forecasting with imprecise probabilities. International Journal of Approximate Reasoning, 53(8):1248–1261, 2012.

Steve Selvin. On the Monty Hall problem. The American Statistician, 29(3):134, 1975.

Patrick Suppes. The measurement of belief. The Journal of the Royal Statistical Society, Series B, 36(2):160–191, 1974.

Patrick Suppes and Mario Zanotti. Necessary and sufficient conditions for existence of a unique measure strictly agreeing with a qualitative probability ordering. Journal of Philosophical Logic, 5(3):431–438, 1976.

Patrick Suppes and Mario Zanotti. Necessary and sufficient qualitative axioms for conditional probability. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 60:163–169, 1982.

Marilyn vos Savant. Marilyn vos Savant's reply. The American Statistician, 45(4):347, 1991.

Peter Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, 1991.

Peter Walley. Towards a unified theory of imprecise probability. International Journal of Approximate Reasoning, 24(2–3):125–148, 2000.

Brian Weatherson. The Bayesian and the dogmatist. Proceedings of the Aristotelian Society, 107:169–185, 2007.

Seth Yalcin. Context probabilism. In Logic, Language and Meaning: 18th Amsterdam Colloquium, Amsterdam, The Netherlands, December 19–21, 2011, Revised Selected Papers, pages 12–21, 2011. doi: 10.1007/978-3-642-31482-7_2.