UC Berkeley
Faculty Publications

Title: Logics of Imprecise Comparative Probability
Permalink: https://escholarship.org/uc/item/1m3156ps
Authors: Ding, Yifeng; Holliday, Wesley Halcrow; Icard, Thomas Frederick, III
Publication Date: 2021-04-30
License: CC BY-NC-ND 4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer reviewed
Logics of Imprecise Comparative Probability
Yifeng Ding†, Wesley H. Holliday†, and Thomas F. Icard, III‡
†University of California, Berkeley and ‡Stanford University
Preprint of April 2021. Forthcoming in International Journal of Approximate Reasoning.
Abstract
This paper studies connections between two alternatives to the standard probability
calculus for representing and reasoning about uncertainty: imprecise probability and
comparative probability. The goal is to identify complete logics for reasoning about
uncertainty in a comparative probabilistic language whose semantics is given in terms
of imprecise probability. Comparative probability operators are interpreted as quantifying over a set of probability measures. Modal and dynamic operators are added for
reasoning about epistemic possibility and updating sets of probability measures.
Keywords: imprecise probability, comparative probability, logic and probability.
1 Introduction
While the standard probability calculus remains the dominant formal framework for representing uncertainty across numerous disciplines, a small but significant tradition in philosophy, economics, computer science, and statistics has contended that the precision inherent
in assigning “sharp” probabilities to uncertain events is often inappropriate. The reasons
are several. One obvious concern is the psychological reality of arbitrarily precise real-valued
judgments (Boole 1854; Keynes 1921; Koopman 1940; Good 1962; Suppes 1974). As Suppes (1974) expresses the concern, “Almost everyone who has thought about the problems
of measuring beliefs in the tradition of subjective probability or Bayesian statistical procedures concedes some uneasiness with the problem of always asking for the next decimal of
accuracy in the prior estimation of a probability” (p. 160). Another quite distinct concern
is that even for a certain kind of idealized agent free of computational or representational
limitations, in many important cases the available evidence somehow underdetermines the
“right” probability function to have, and it would be epistemically unfitting to opt for any
one of them (Carnap 1936; Levi 1974; Joyce 2005; Konek Forthcoming).
A number of alternative formal frameworks have been advanced (see, e.g., Halpern 2003).
Our focus here is on two especially prominent alternatives. Some authors favor a sort of
generalization of the probability calculus, allowing uncertainty to be measured by sets of
probability functions (Good 1962; Levi 1974; Walley 1991; Seidenfeld et al. 2012; see Bradley
2019 for a philosophical overview). This imprecise probability framework retains many of the
benefits of standard Bayesian representation and reasoning—indeed allowing the standard
picture to emerge as a special case—while also affording a wider range of epistemic attitudes.
Philosophical questions about imprecise probability have generated a great deal of discussion
in recent years (see, e.g., Joyce 2005; Schoenfield 2012; Rinard 2013; Bradley and Steele
2014; Moss 2020). A second line of work renounces the demand for explicit numerical
judgments altogether, arguing that qualitative, especially comparative, judgments should be
the primitive building blocks for the theory of uncertainty (Keynes 1921; Koopman 1940;
Fine 1973; Hawthorne 2016; see Konek 2019 for a philosophical overview). Aside from
being intuitively simpler and arguably closer to “ordinary” expressions of uncertainty, some
authors have argued that this setting of comparative probability is perhaps uniquely suited to
solving notable epistemic puzzles (Fine 1977; DiBella 2018; Eva 2019). Others have sought
more ameliorative reconciliations between the quantitative and qualitative approaches so as
to capitalize on the advantages of each (see, e.g., Suppes and Zanotti 1976 and Elliott 2020).
Our aim in this paper is neither to weigh in on the debate between precise and imprecise
versions of probabilism, nor to adjudicate between the quantitative and the qualitative
alternatives, but rather to shed light on the connections between them. Only quite recently
have even the most basic questions about such connections been clarified (Ríos Insua 1992;
Alon and Lehrer 2014; Alon and Heifetz 2014; Harrison-Trainor et al. 2016). This is of
interest from all perspectives. If one takes sets of probability measures as primitive, it
would nevertheless be desirable to understand some of the core qualitative commitments
implicit in this representation, including how such commitments relate to those of precise
probability and other frameworks. Most conspicuously, the generalization to sets of measures
brings with it a rejection of the infamous comparability principle (also sometimes called
opinionation or totality), according to which every two events ought to be compared in
probability. Indeed, rejection of this principle has served as one of the primary arguments
against precise probabilism. As Keynes (1921) expressed it a century ago:
Is our expectation of rain, when we start out for a walk, always more likely than
not, or less likely than not, or as likely as not? I am prepared to argue that on
some occasions none of these alternatives hold, and that it will be an arbitrary
matter to decide for or against the umbrella. If the barometer is high, but the
clouds are black, it is not always rational that one should prevail over the other
in our minds, or even that we should balance them. (p. 30)
Aside from the rejection of comparability, are there other differences between the precise and
imprecise probabilistic frameworks that surface in this qualitative setting? Likewise, we can
ask about various additional qualitative notions aside from the usual “weak” comparison
‘at least as likely as’. For example, whereas the strict version of this judgment, ‘more likely
than’, is easily definable in the precise setting in terms of weak comparison, this is no longer
the case in the imprecise setting (see Section 2 below), raising new questions about the
qualitative principles characterizing this distinctive kind of unanimity operator.
If, on the other hand, one takes qualitative judgments as primitive, this has the potential
advantage of discarding principles forced upon us by (even imprecise) probabilistic representations. This may be desirable, e.g., if one is solely concerned with certain epistemic virtues
such as maximizing accuracy (Fitelson and McCarthy 2014). At the same time, there are
also arguments that purport to show why an agent who maintains only comparative judgments would not want to violate qualitative probabilistic principles (Fishburn 1986; Fitelson
and McCarthy 2014; Icard 2016). For example, suppose that we operationalize a judgment
of the form ‘A is more likely than B’ in terms of a disposition to opt for a prospect that
pays some positive dividend conditional on A over one that pays the same amount conditional on B. Moreover, suppose that satisfying this preference is worth some cost, while
judgments of the form ‘A and B are equally likely’ engender no such disposition. Then one
can show that an agent will be forced into choosing strictly dominated actions (worse than
some other available option no matter how the world turns out) if and only if the agent’s
judgments fail to comport with any set of probability measures (Icard 2016). Arguments
like these highlight the importance of gaining a better understanding of what compatibility
of comparative judgments with imprecise probability means.
In the present paper we take a logical approach, studying a sequence of increasingly
expressive qualitative formal systems, all interpreted over sets of probability measures. To
illustrate the type of reasoning we would like to systematize, consider the following examples.
Example 1.1. A patient learns from her doctor of the existence of a gland in the human
body and of a disease previously unknown to her.1 The doctor informs her that if her gland
1. This example is inspired by van Benthem's (2011, p. 164, p. 166) example of the hypochondriac.
is swollen, then it is more likely than not that she has the disease. Subsequently the patient’s
gland is examined, and she learns that it is swollen. As a result, she comes to think it is
more likely than not that she has the disease.
How should we model the patient’s evolving uncertainty? A natural approach is to
represent her relevant uncertainty using the following set of four possible states:
{⟨swollen, disease⟩, ⟨swollen, no disease⟩, ⟨not swollen, disease⟩, ⟨not swollen, no disease⟩}.
Initially, the patient knows nothing about the gland or the disease. We represent this
ignorance using the set P of all probability measures on the state space above. Next, when
her doctor informs her that if her gland is swollen, then it is more likely than not that she
has the disease, we eliminate from her set of measures all measures except those for which
the probability of disease conditional on a swollen gland is greater than the probability of
no disease conditional on a swollen gland. This gives us a new set P ′ of measures. Finally,
when she has the gland examined and learns that it is swollen, we condition each measure
in P ′ on the information that the gland is swollen, giving us a final set P ′′ of measures. All
measures in P ′′ give a higher probability to disease than no disease.
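The elimination-then-conditioning recipe above can be checked numerically. The following sketch is only an illustration: it replaces the full set P of all measures with a coarse grid of rational-valued measures (an assumption made so that the sets stay finite), and the state encoding and helper names are ours, not the paper's.

```python
# Finite-grid sketch of the multi-measure update in Example 1.1.
# States are pairs (swollen?, disease?); a "measure" is a dict from
# states to Fractions. The grid of measures is an illustrative
# assumption -- the paper's P is the set of ALL measures.

from itertools import product
from fractions import Fraction

STATES = [(s, d) for s in (True, False) for d in (True, False)]

def grid_measures(step=Fraction(1, 10)):
    """All measures on the 4 states with probabilities on a coarse grid."""
    ticks = [Fraction(i) * step for i in range(int(1 / step) + 1)]
    for p in product(ticks, repeat=4):
        if sum(p) == 1:
            yield dict(zip(STATES, p))

def prob(mu, event):
    return sum(q for w, q in mu.items() if event(w))

swollen = lambda w: w[0]
disease = lambda w: w[1]

# P: total ignorance (every grid measure).
P = list(grid_measures())

# P': keep measures where P(disease & swollen) > P(no disease & swollen),
# i.e. disease is more likely than not, conditional on a swollen gland.
P1 = [mu for mu in P
      if prob(mu, lambda w: swollen(w) and disease(w))
       > prob(mu, lambda w: swollen(w) and not disease(w))]

# P'': condition each surviving measure on "swollen".
def condition(mu, event):
    z = prob(mu, event)
    return {w: (q / z if event(w) else Fraction(0)) for w, q in mu.items()}

P2 = [condition(mu, swollen) for mu in P1 if prob(mu, swollen) > 0]

# Every measure in P'' makes disease more likely than no disease.
assert all(prob(mu, disease) > prob(mu, lambda w: not disease(w)) for mu in P2)
```

Since each measure in P′′ assigns probability 1 to the swollen states, "disease more likely than no disease" amounts to the posterior probability of disease exceeding one-half under every measure in P′′.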
How should one model the example using the standard representation of an agent’s
uncertainty with a single probability measure? First, the standard representation forces the
agent to have sharp probabilities that her gland is swollen and that she has the disease,
even when she just learns of their existence and knows nothing else about them. It also
forces her to have a sharp conditional probability for having the disease conditional on her
gland being swollen, before the doctor tells her anything about the connection between the
two. Suppose she thinks that disease and no disease are equally likely conditional on her
gland being swollen. What do we then do with her probability measure when the doctor
informs her that if her gland is swollen, then it is more likely than not that she has the
disease? One idea would be to replace her probability measure with the “closest” measure
for which the conditional probability of disease given a swollen gland is greater than that
of no disease given a swollen gland; but the existence of a unique closest such measure is
clearly problematic. Another idea is that we must give up the simple state space above.
Instead, we must use a complicated state space involving possibilities for what her doctor
might say to her. On this approach, the patient must start out with a sharp conditional
probability for having the disease conditional on her doctor uttering at time t the words “if
your gland is swollen, then it is more likely than not that you have the disease.” Assuming
this conditional probability is greater than .5, it follows that conditional on the doctor not
uttering those words at time t, the probability she assigns to having the disease will be less
than .5. In order to allow that time t may pass in silence without the patient changing
her probability for disease, we must introduce still further distinctions in the state space,
beyond the distinction that the doctor may or may not utter the indicated words at t.
Though we will not argue that the modeling approach with a single probability measure
is unworkable, in this paper we wish to explore the multi-measure approach sketched above.
We will fully formalize the swollen gland example in Section 6.2. There we will even model
the patient’s becoming aware of the distinction between having a swollen gland and not
having a swollen gland and of the distinction between having the disease and not having
the disease, creating the state space and set P of measures above.
The next example is one in which it is essential to consider the possibilities for what an
informant may say. It was made famous by vos Savant (1991) in the Monty Hall version of
the puzzle posed by Selvin (1975). We will present the earlier but mathematically equivalent
Three Prisoners version of the puzzle from Gardner (1959a; 1959b).
Example 1.2. The following is Diaconis and Zabell’s (1986, p. 30) description of the Three
Prisoners puzzle (also see Diaconis 1978 and Halpern 2003):
Of three prisoners a, b, and c, two are to be executed, but a does not know
which. He therefore says to the jailer, “Since either b or c is certainly going to
be executed, you will give me no information about my own chances if you give
me the name of one man, either b or c, who is going to be executed.” Accepting
this argument, the jailer truthfully replies, “b will be executed.” Thereupon a
feels happier because before the jailer replied, his own chance of execution was
two-thirds, but afterward there are only two people, himself and c, who could
be the one not executed, and so his chance of execution is one-half.
Under what conditions could a’s reasoning possibly be sound? Imagine there are four
relevant ways the world could be: wab, wac, wbc, and wcb, where in wij prisoner i is the one who lives and prisoner j is the one whom the jailer says will be executed. Assuming that each prisoner is equally likely to be spared, we can assume wbc and wcb both have probability one-third, and the disjunction "wab or wac" has probability one-third. Concerning the relative probability of wab and wac, we could apply a principle of indifference and proclaim that the
jailer is equally likely to announce b or announce c, in case a is the one to be spared. It is
then easy to compute that the conditional probability of being spared after learning that b
will be executed (and thus wac and wbc can be eliminated as possibilities) is still one-third.
In this case a learns nothing from the jailer’s announcement.
By contrast, if for whatever reason a thinks the jailer is certain to tell him it is b who will
be executed when a is the one to be spared, then learning b will be executed does rationally
lead a to conclude that he now has a one-half chance of survival.
There is an intuition in this scenario that the right way to respond to the evidence is
to leave the relative likelihood of wab and wac open: to represent a’s uncertainty in terms
of the set of all probability measures that assign one-third to each of wbc , wcb , and the
disjunction “wab or wac .” In this case the probabilities of wab and wac each range from zero
to one-third, under the constraint that their sum is one-third. Updating each such measure
by eliminating wac and wbc results in a range of posterior probability values for a surviving,
from zero to one-half. Thus, the probability that a is spared (the disjunction “wab or wac ”)
has dilated (Walley 1991) from precisely one-third to the entire interval [0, 1/2].
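The dilation just described can be verified concretely. The sketch below approximates the continuum of measures by a finite grid of split points between wab and wac (the grid, and all names, are illustrative assumptions); updating on "b will be executed" sends the point-valued prior 1/3 to a range of posteriors spanning [0, 1/2].

```python
# Numerical check of the dilation in Example 1.2, on a grid of measures.
# Each measure gives 1/3 to w_bc and 1/3 to w_cb, and splits the
# remaining 1/3 between w_ab and w_ac. (The grid of split points stands
# in for the paper's continuum of measures.)

from fractions import Fraction

third = Fraction(1, 3)
splits = [Fraction(i, 30) for i in range(11)]          # 0, 1/30, ..., 1/3

P = [{"w_ab": t, "w_ac": third - t, "w_bc": third, "w_cb": third}
     for t in splits]

# Prior probability that a is spared: P(w_ab or w_ac) = 1/3 for every mu.
assert all(mu["w_ab"] + mu["w_ac"] == third for mu in P)

# The jailer says "b will be executed": eliminate w_ac and w_bc, then
# renormalize each surviving measure.
def update(mu):
    keep = {w: p for w, p in mu.items() if w in ("w_ab", "w_cb")}
    z = sum(keep.values())
    return {w: p / z for w, p in keep.items()}

posteriors = [update(mu)["w_ab"] for mu in P]

# The posterior probability that a is spared dilates from the point 1/3
# to (grid points of) the entire interval [0, 1/2].
assert min(posteriors) == 0 and max(posteriors) == Fraction(1, 2)
```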
Examples 1.1 and 1.2 illustrate some important aspects of imprecise probabilistic reasoning, which surface already in a purely qualitative setting. By the end of this paper, we
will be able to formalize Examples 1.1 and 1.2 in a dynamic logic of updating imprecise
comparative probability (Examples 6.5 and 6.21).
The outline of the paper is as follows. In Section 2, we consider the pure order-theoretic
setting of comparative probability and prove a representation theorem extending previous
results in the literature. The theorem concerns both a weak and a strict comparative
relation together represented by a set of probability measures (Theorem 2.7). In Section 3,
we turn to the logical setting and review some completeness theorems for logics of precise
and imprecise probability with a single weak comparative relation (Theorems 3.7, 3.9). In
Section 4, we consider a logical language that includes both weak and strict comparative
relations and, using the representation in Theorem 2.7, prove a corresponding completeness
theorem (Theorem 4.4). Section 5 explores the addition of a primitive “possibility” operator
asserting the existence of a probability measure with a given property, culminating again in
a complete axiomatization (Theorem 5.5), plus an analysis of complexity (Theorem 5.12).
In Section 6, we turn to modeling the dynamics of learning. In Section 6.1, we add to our
language an update operator whose semantics is given by a process of discarding from one’s
set of measures any measure assigning zero probability to the learned proposition and then
conditioning the remaining measures on the learned proposition. With this we can model
updating on pure comparative probability formulas (through the discarding part), as well as
non-probabilistic (ontic) formulas (through the conditioning part) and mixed probabilistic-ontic formulas. The language also allows the formalization of basic comparative conditional
probabilities. Yet we prove that the extended language is in fact no more expressive than
the previous system from Section 5: the extended language can be completely axiomatized
by a set of “reduction axioms” (Theorem 6.8). Finally, in Section 6.2, we add a second
dynamic operator for becoming aware of a new proposition (recall how the patient becomes
aware of the existence of the gland and disease in Example 1.1). When an agent becomes
aware of a new proposition, we form a new state space by splitting each state in her old state
space in two, one where the new proposition is true and the other where it is false, and we
form a new set of probability measures by taking all measures on the new set of propositions
that when restricted to just the old propositions coincide with some old measure. We show
that this language is more expressive than our previous languages, allowing us to express
any linear inequality with integer coefficients about the probability of formulas.
What emerges is a landscape of increasingly expressive logical systems, consistent with
both precise and imprecise probabilistic representations, simple but sufficiently powerful to
model sophisticated reasoning about uncertainty. Perhaps surprisingly, the computational
complexity of reasoning (e.g., determining validity or consistency) in each of the “static”
systems is no worse than for the classical propositional calculus. The complexity of reasoning
in the dynamic logic of updating sets of probability measures is an open problem, as is the
complexity and axiomatization of the dynamic logic of becoming aware.
2 Representation
Before introducing any explicit logical calculus, in this section we consider the pure order-theoretic setting of comparative probability. A comparative notion of probability is most
naturally formalized as a binary relation on an algebra of events. However, not all binary
relations can be intuitively interpreted as comparing how likely events are, just as not
all functions from events to [0, 1] can be interpreted as assigning quantitative probabilities.
Taking the usual axiomatization of quantitative probability for granted, a natural question—
posed early on by de Finetti (1949)—is what would be a set of axioms that are intuitive
and in harmony with those quantitative axioms.
This question was first solved for finite event algebras by Kraft et al. (1959). Given a
binary relation ≽ on ℘(W), where W is a finite set, and a probability measure µ on ℘(W), we say that ≽ is precisely represented by µ if for all X, Y ⊆ W, X ≽ Y iff µ(X) ≥ µ(Y).
Theorem 2.1 (Kraft et al. 1959). Let W be a nonempty finite set and ≽ a binary relation on ℘(W). Then ≽ is precisely represented by some probability measure on ℘(W) if and only if:
• ∅ ⋡ W, {w} ≽ ∅ for all w ∈ W, and for all A, B ∈ ℘(W), A ≽ B or B ≽ A, and
• ≽ satisfies the finite cancellation condition (FC): letting 1_X denote the characteristic function of X, for any two finite sequences ⟨A_i⟩_{i=1}^n, ⟨B_i⟩_{i=1}^n of events in ℘(W) such that ∑_{i=1}^n 1_{A_i} = ∑_{i=1}^n 1_{B_i} (additions are done in the vector space R^W), if for all i < n, A_i ≽ B_i, then B_n ≽ A_n.
Following the same paradigm, we can consider a comparative notion of imprecise probability and ask the following question: which binary relations on a finite algebra of events can be naturally interpreted as an imprecise version of the at-least-as-likely-as relation? More precisely, given a binary relation ≽ on ℘(W), where W is a finite set, and a set P of probability measures on ℘(W), we say that ≽ is imprecisely represented as the weak relation by P if for all X, Y ⊆ W, X ≽ Y iff for all µ ∈ P, µ(X) ≥ µ(Y). The following analogue of Theorem 2.1 was proved by Ríos Insua (1992) (also see Alon and Lehrer 2014).
Theorem 2.2 (Ríos Insua 1992). Let W be a nonempty finite set and ≽ a binary relation on ℘(W). Then ≽ is imprecisely represented as the weak relation by some set P of probability measures on ℘(W) if and only if:
• ∅ ⋡ W, {w} ≽ ∅ for all w ∈ W, and
• ≽ satisfies the generalized finite cancellation condition (GFC): for any two finite sequences ⟨A_i⟩_{i=1}^n, ⟨B_i⟩_{i=1}^n of events in ℘(W) and k ∈ ℕ \ {0} such that ∑_{i=1}^{n−1} 1_{A_i} + k·1_{A_n} = ∑_{i=1}^{n−1} 1_{B_i} + k·1_{B_n}, if for all i < n, A_i ≽ B_i, then B_n ≽ A_n.²
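To see the shape of a (GFC) instance concretely, here is a small check in Python. The particular sequences and value of k are our own illustrative choice, not from the paper: with W = {w, v}, the characteristic-function identity holds for A = ⟨{w}, {v}⟩, B = ⟨∅, {w, v}⟩, and k = 1, so GFC licenses a monotonicity inference.

```python
# A concrete instance of (GFC), checked with indicator-vector arithmetic.
# With W = {w, v}, take A = <{w}, {v}>, B = <empty, {w, v}>, and k = 1
# (an illustrative choice of ours). The premise identity
#   sum_{i<n} 1_{A_i} + k*1_{A_n} = sum_{i<n} 1_{B_i} + k*1_{B_n}
# holds, so GFC yields: if {w} is at least as likely as the empty set,
# then {w, v} is at least as likely as {v} -- monotonicity from cancellation.

W = ["w", "v"]

def indicator(event):
    return [1 if x in event else 0 for x in W]

def vec_sum(vectors):
    return [sum(col) for col in zip(*vectors)]

def gfc_premise(As, Bs, k):
    """Check the characteristic-function identity in (GFC)."""
    lhs = vec_sum([indicator(a) for a in As[:-1]]
                  + [[k * c for c in indicator(As[-1])]])
    rhs = vec_sum([indicator(b) for b in Bs[:-1]]
                  + [[k * c for c in indicator(Bs[-1])]])
    return lhs == rhs

As = [{"w"}, {"v"}]          # A_1 = {w}, A_2 = {v}
Bs = [set(), {"w", "v"}]     # B_1 = empty, B_2 = {w, v}

assert gfc_premise(As, Bs, k=1)
# Hence, by GFC: from A_1 at-least-as-likely-as B_1 ({w} vs empty set),
# infer B_2 at-least-as-likely-as A_2 ({w, v} vs {v}).
```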
Remark 2.3. Harrison-Trainor et al. (2016) prove that there are relations ≽ satisfying the conditions of Theorem 2.1 except for the comparability principle (that for all A, B ∈ ℘(W), A ≽ B or B ≽ A) and which fail to satisfy the GFC condition in Theorem 2.2. Thus, it is necessary to strengthen FC to GFC when dropping comparability to obtain Theorem 2.2.
A subtlety not covered by Theorem 2.2 is that given a set P of probability measures,
there are two natural ways to generate a strict relation, corresponding to the strict and the
weak dominance relation in game theory:
• X strictly dominates Y in P iff for all µ ∈ P, µ(X) > µ(Y );
• X weakly dominates Y in P iff for all µ ∈ P, µ(X) ≥ µ(Y ), and there is a µ ∈ P such
that µ(X) > µ(Y ).
When ≽ is represented as the weak relation by P, it is easy to see that X weakly dominates Y iff X ≽ Y but Y ⋡ X. However, we cannot pin down the strict dominance relation simply from the weak relation ≽ or vice versa, as shown by the following example.
Example 2.4. Let W = {w, v} and consider the four binary relations ≽₁, ≽₂, ≻₁, ≻₂ pictured below from left to right (for dashed arrows, reflexive and transitive arrows are omitted; for solid arrows, transitive arrows are omitted).

[Figure: four diagrams over the events W, {w}, {v}, ∅, depicting the relations ≽₁, ≽₂, ≻₁, and ≻₂ from left to right.]
If all we know about a set P of probability measures on ℘(W) is that its weak relation is ≽₁, then both ≻₁ and ≻₂ may be P's strict dominance relation. For example, we can define a probability measure µ_{w<v} on ℘(W) that favors v so that µ_{w<v}({w}) = 1/3. Then let µ_{w=v} be the uniform distribution on ℘(W): µ_{w=v}({w}) = µ_{w=v}({v}) = 1/2. Then for both {µ_{w<v}, µ_{w=v}} and {µ_{w<v}}, the weak relation is ≽₁. Yet the strict dominance relation of the former is ≻₁ while the strict dominance relation of the latter is ≻₂.
Similarly, if all we know about P is that its strict dominance relation is ≻₁, then both ≽₁ and ≽₂ may be its weak relation. For this, define a probability measure µ_{w>v} that favors w so that µ_{w>v}({w}) = 2/3. Then we see that the strict dominance relation of both {µ_{w<v}, µ_{w=v}} and {µ_{w<v}, µ_{w>v}} is ≻₁, while the weak relation of the former is ≽₁ and the weak relation of the latter is ≽₂.
In light of these considerations, we introduce the following definition that accounts for both relations; cf. Konek (2019, p. 275, footnote 4), who suggests that the study of comparative probability ought to start with pairs ⟨≽, ≻⟩, because an agent who judges that X is at least as likely as Y but withholds judgment about whether Y is at least as likely as X does not necessarily judge that X is strictly more likely than Y.
Definition 2.5. Given a pair ⟨≽, ≻⟩ of binary relations on ℘(W) and a set P of probability measures on ℘(W), we say that ⟨≽, ≻⟩ is represented by P iff for all X, Y ⊆ W,
• X ≽ Y iff for all µ ∈ P, µ(X) ≥ µ(Y), and
• X ≻ Y iff for all µ ∈ P, µ(X) > µ(Y).
2. Note that n can be 1, in which case the condition simply expresses the reflexivity of ≽.
Remark 2.6. Define X ≿ Y as not Y ≻ X, i.e., there is some µ ∈ P such that µ(X) ≥ µ(Y) (cf. the notion of justifiable preference in Lehrer and Teper 2011). Then the pair ⟨≽, ≿⟩ of weak relations is what Giarlotta and Greco (2013) call a necessary and possible preference.
The following theorem characterizes the representable relation pairs.
Theorem 2.7. Let W be a nonempty finite set and ≽, ≻ two binary relations on ℘(W). Then ⟨≽, ≻⟩ is represented by a set P of probability measures on ℘(W) if and only if:
• ≻ is irreflexive and ≻ ⊆ ≽;
• W ≻ ∅, and {w} ≽ ∅ for all w ∈ W;
• ≽ satisfies (GFC) and ≻ satisfies the strict generalized finite cancellation condition (SGFC): for any two finite sequences ⟨A_i⟩_{i=1}^n, ⟨B_i⟩_{i=1}^n of events in ℘(W) and k ∈ ℕ \ {0} such that ∑_{i=1}^{n−1} 1_{A_i} + k·1_{A_n} = ∑_{i=1}^{n−1} 1_{B_i} + k·1_{B_n}, if for all i < n, A_i ≽ B_i, and there is i < n with A_i ≻ B_i, then B_n ≻ A_n.
The rest of this section is devoted to the proof of Theorem 2.7. The proof is adapted from the proof of Theorem 2.2 above in Alon and Lehrer 2014, which also generalizes the proof in Scott 1964 for Theorem 2.1 (also see Mierzewski 2018, § 3.3 for a representation theorem concerning ⟨≽, ≻⟩ in the setting of precise probability). For this, pick a nonempty finite set W and a pair ⟨≽, ≻⟩ satisfying the conditions (the necessity of the conditions is easy). The main strategy is to reframe the representability of ⟨≽, ≻⟩ in terms of the existence of solutions to some systems of homogeneous linear inequalities in the vector space R^W. Hence we use vectors in ∆(W) = {µ ∈ R^W | µ · 1_W = 1 and for all w ∈ W, µ(w) ≥ 0} as probability measures.
Define D_≽ = {1_A − 1_B | A, B ⊆ W, A ≽ B} and D_≻ = {1_A − 1_B | A, B ⊆ W, A ≻ B}. Intuitively, D_≽ contains the vectors that must receive a non-negative value under every intended measure, and D_≻ the vectors that must receive a positive value. Given the conditions satisfied by ≽ and ≻, we can prove the following lemmas.
Lemma 2.8. If f ∈ {−1, 0, 1}^W is a non-negative linear combination of vectors in D_≽, then f ∈ D_≽.
Proof. Suppose f ∈ {−1, 0, 1}^W is a non-negative linear combination of vectors in D_≽. Since all the vectors are in {−1, 0, 1}^W, we can assume that all coefficients are rational, since a system of linear inequalities with rational coefficients has a solution if and only if it has a rational solution. Then we can clear the denominators and obtain a k ∈ ℕ \ {0} such that kf is simply a sum of vectors in D_≽, possibly with repetitions: kf = ∑_{i=1}^n g_i. Since the g_i's are in D_≽ and f ∈ {−1, 0, 1}^W, we can find subsets A_i, B_i of W for i = 1, …, n + 1 such that
• g_i = 1_{A_i} − 1_{B_i} for i = 1, …, n and f = 1_{B_{n+1}} − 1_{A_{n+1}} (take B_{n+1} = f⁻¹(1) and A_{n+1} = f⁻¹(−1)), and
• A_i ≽ B_i for i = 1, …, n.
Then given that kf = ∑_{i=1}^n g_i, we have ∑_{i=1}^n 1_{A_i} + k·1_{A_{n+1}} = ∑_{i=1}^n 1_{B_i} + k·1_{B_{n+1}}. Hence we can apply (GFC) to ⟨A_i⟩_{i=1}^{n+1} and ⟨B_i⟩_{i=1}^{n+1} and see that B_{n+1} ≽ A_{n+1}. Therefore, f = 1_{B_{n+1}} − 1_{A_{n+1}} ∈ D_≽.
Lemma 2.9. If f ∈ {−1, 0, 1}^W is a non-negative linear combination of vectors in D_≽ ∪ D_≻ in which the coefficient of some vector in D_≻ is positive, then f ∈ D_≻.
Proof. Similar to the proof of the previous lemma. The only change in this case is that when we find k and express kf as a sum of vectors in D_≽ ∪ D_≻, at least one vector in D_≻ must figure in the sum, since initially the non-negative linear combination resulting in f has a positive coefficient for a vector in D_≻. Then we can find sets A_i's and B_i's similarly and apply (SGFC) to see that f must be in D_≻ already.
Now define

P = {µ ∈ ∆(W) | ∀f ∈ D_≽, µ · f ≥ 0 and ∀f ∈ D_≻, µ · f > 0}.

Our goal is to show that ⟨≽, ≻⟩ is represented by this P. Note that one direction is done already: for any A, B ⊆ W,
• if A ≽ B, then by the definition of P, for all µ ∈ P, µ · (1_A − 1_B) ≥ 0, which means that µ · 1_A ≥ µ · 1_B;
• similarly, if A ≻ B, then for all µ ∈ P, µ · (1_A − 1_B) > 0, which means that µ · 1_A > µ · 1_B.
Hence all that is left to prove are the following two claims:
(a) If A ⋡ B, then there is a µ ∈ P such that µ · (1_A − 1_B) < 0;
(b) If A ⊁ B, then there is a µ ∈ P such that µ · (1_A − 1_B) ≤ 0.
For (a), it is enough to prove that for all f ∈ {−1, 0, 1}^W, if f ∉ D_≽, then there is µ ∈ P such that µ · (−f) > 0, since for any A, B ⊆ W, we have 1_A − 1_B ∈ {−1, 0, 1}^W. Hence take such an f ∈ {−1, 0, 1}^W \ D_≽. We need to find a µ such that (i) µ ∈ P and (ii) µ · (−f) > 0. Given the definition of P, this amounts to the existence of a solution to the following system of homogeneous linear inequalities (where we write [D] for the matrix containing as columns the vectors in a set D of vectors):

[D_≽]^⊤ x ≥ 0,  [D_≻ ∪ {−f}]^⊤ x > 0.  (1)
The existence of a µ satisfying (i) and (ii) is equivalent to the existence of a solution to the above system of inequalities because, by assumption, W ≻ ∅ and {w} ≽ ∅ for all w ∈ W, which means that 1_W ∈ D_≻ and 1_{{w}} ∈ D_≽ for all w ∈ W, so any solution can be scaled to be an element of P. The condition for the existence of a solution is given by a special case of Motzkin's Transposition Theorem (see Motzkin 1951).
Theorem 2.10 (Motzkin's Transposition Theorem). The linear inequality system M₁x ≥ 0, M₂x > 0 has a solution if and only if there is no solution to the system M₁^⊤y₁ + M₂^⊤y₂ = 0, y₁ ≥ 0, y₂ ≥ 0, y₂ ≠ 0.
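The two alternatives in Motzkin's theorem can be illustrated with a toy one-dimensional system whose certificates are easy to check by hand. The matrices and helper functions below are our own toy example, not data from the paper.

```python
# Toy illustration of Motzkin's Transposition Theorem.
# Primal system: M1 x >= 0, M2 x > 0 with M1 = [[1]], M2 = [[-1]],
# i.e. x >= 0 and -x > 0 -- clearly infeasible. The theorem then
# guarantees a certificate for the alternative system, which we exhibit.

def primal_holds(M1, M2, x):
    dot = lambda row: sum(a * b for a, b in zip(row, x))
    return all(dot(r) >= 0 for r in M1) and all(dot(r) > 0 for r in M2)

def alternative_holds(M1, M2, y1, y2):
    # Checks: M1^T y1 + M2^T y2 = 0, y1 >= 0, y2 >= 0, y2 != 0.
    n = len(M1[0])
    combo = [sum(M1[i][j] * y1[i] for i in range(len(M1))) +
             sum(M2[i][j] * y2[i] for i in range(len(M2))) for j in range(n)]
    return (combo == [0] * n and all(c >= 0 for c in y1)
            and all(c >= 0 for c in y2) and any(c > 0 for c in y2))

M1, M2 = [[1]], [[-1]]

# The alternative system has the certificate y1 = [1], y2 = [1]:
assert alternative_holds(M1, M2, [1], [1])
# ...so by the theorem the primal has no solution; e.g. x = 0 and x = 1 fail:
assert not primal_holds(M1, M2, [0]) and not primal_holds(M1, M2, [1])
```

In the proof below, the roles are M₁ = [D_≽]^⊤ and M₂ = [D_≻ ∪ {−f}]^⊤: infeasibility of (1) yields exactly such a certificate ⟨y₁, y₂⟩, which is then refuted via Lemmas 2.8 and 2.9.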
Suppose toward a contradiction that there is no solution to (1). Then by Motzkin's Transposition Theorem, there are non-negative y₁, y₂ with y₂ non-trivial such that [D_≽]^⊤y₁ + [D_≻ ∪ {−f}]^⊤y₂ = 0. In other words, 0 is a non-negative linear combination of vectors in D_≽ ∪ D_≻ ∪ {−f} with one of the vectors in D_≻ ∪ {−f} having a positive coefficient. Now there are two possibilities: either −f has a positive coefficient or not. If not, then 0 is a non-negative linear combination of vectors in D_≽ ∪ D_≻ with a vector in D_≻ having a positive coefficient. Then, by Lemma 2.9, 0 ∈ D_≻. This contradicts the assumption that ≻ is irreflexive. If −f has a positive coefficient, then f is a non-negative linear combination of vectors in D_≽ ∪ D_≻ = D_≽ (since ≻ ⊆ ≽). By Lemma 2.8, f ∈ D_≽, but we picked f specifically from outside D_≽. Hence, either way, we have a contradiction. This completes the proof of (a).
The proof of (b) is almost identical. It is enough to show that for any f ∈ {−1, 0, 1}^W \ D_≻, the following has a solution:

[D_≽ ∪ {−f}]^⊤ x ≥ 0,  [D_≻]^⊤ x > 0.

If there is no solution, then by Motzkin's Transposition Theorem, 0 is a non-negative linear combination of vectors in D_≽ ∪ {−f} ∪ D_≻ with at least one vector in D_≻ having a positive coefficient. Again, we consider whether −f has a positive coefficient or not. If not, then 0 would again be in D_≻, which is not the case. If −f does have a positive coefficient, then f is a non-negative linear combination of vectors in D_≽ ∪ D_≻ with at least one vector in D_≻ having a positive coefficient. By Lemma 2.9, f ∈ D_≻, contradicting the way we picked f. Hence (b) is also proved, which completes the proof of Theorem 2.7.
Remark 2.11. The sets D% and D≻ used in the proof above are reminiscent of an alternative, also prominent, way of modelling uncertainty in the imprecise probability literature: sets of desirable gambles (see Walley 2000 and chapters in Augustin et al. 2014 for introductions). Any event A ⊆ W may be interpreted as a gamble that returns a unit of utility for the states in A and returns nothing for the states outside A. In other words, we can understand comparing the likelihoods of two events A and B as comparing the two corresponding gambles 1A and 1B , which in turn reduces to the question of whether the gamble 1A − 1B is acceptable/desirable. However, there are two important differences between our setting and the desirable gambles approach commonly presented in the literature.
First, since we are only comparing propositions, we do not need to appeal to the desirability of gambles not in {−1, 0, 1}W . In the literature on desirable gambles, all gambles in RW are considered, and that is partly the reason for the succinct axioms for coherent sets of desirable gambles, such as closure under positive scaling and pairwise addition. The same cannot be done when restricting to {−1, 0, 1}W , since, for example, 1W + 1W is no longer in {−1, 0, 1}W . Also, it is not hard to see that different coherent sets of desirable gambles can have the same intersection with {−1, 0, 1}W . This means that sets of desirable gambles in RW may encode more information than is needed for comparing propositions.
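To make the reduction concrete, here is a small Python sketch (our own illustration, not code from the paper) of events as indicator gambles and of the comparison gamble 1A − 1B, together with the failure of closure under addition for the restricted set {−1, 0, 1}W :

```python
# Events over a hypothetical three-state world, viewed as indicator gambles.
W = ["w1", "w2", "w3"]

def indicator(event):
    """The gamble paying one unit of utility on states in `event`, 0 elsewhere."""
    return tuple(1 if w in event else 0 for w in W)

def comparison_gamble(A, B):
    """The gamble 1_A - 1_B; comparing A and B reduces to its desirability."""
    return tuple(a - b for a, b in zip(indicator(A), indicator(B)))

g = comparison_gamble({"w1", "w2"}, {"w2", "w3"})
assert g == (1, 0, -1)                         # always lands in {-1, 0, 1}^W

# Unlike arbitrary sets of gambles in R^W, the restricted set {-1,0,1}^W is
# not closed under pairwise addition:
doubled = tuple(x + y for x, y in zip(indicator(W), indicator(W)))  # 1_W + 1_W
assert any(x not in (-1, 0, 1) for x in doubled)
```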
Second, we model an agent’s uncertainty with a pair of binary relations, and hence
when translated to sets of desirable gambles, we use a pair of sets of desirable gambles
instead of a single one. This can be easily seen from the proof above: we constructed a pair ⟨D% , D≻ ⟩ from ⟨%, ≻⟩. If we disregard the previous difference and consider all gambles
in RW , our approach can be understood as generalizing representation by a single set of
almost desirable gambles (using the terminology in Couso and Moral 2011) by pairing it
with another set of gambles that can be interpreted as strictly desirable gambles. However,
the axiomatic requirement for this set is weaker than the requirement for “sets of strictly
desirable gambles” in Couso and Moral 2011. More importantly, our axiomatic requirement
concerns two sets jointly, as can be seen from Lemma 2.9. In this way, we achieve greater
generality (expressivity) than merely using a set of almost desirable gambles. We leave
further comparison between these two approaches to imprecise probability for future work.
3 The Logic IP(%)
In this section and the following sections, we turn to the formalization of imprecise comparative probabilistic reasoning in logical systems. The representation theorems of Section 2
lead to completeness theorems for these logical systems.
The logics we consider form a hierarchy of increasing expressive power of their languages.
The least expressive language we will consider is the following.
Definition 3.1. The language L(%), generated from a nonempty set Prop of propositional
variables, is defined by the following grammar:
ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ)
where p ∈ Prop. A propositional (or Boolean) formula is a formula generated from Prop
using only ¬ and ∧. We define the other propositional connectives ∨, →, ↔, ⊤, and ⊥ as
usual. Finally, we define ϕ ⋗ ψ as (ϕ % ψ) ∧ ¬(ψ % ϕ) and ϕ ≈ ψ as (ϕ % ψ) ∧ (ψ % ϕ).
We will consider several semantics for this language, each of which builds on the standard
possible world models for propositional logics.
Definition 3.2. A propositional model is a pair M = ⟨W, V ⟩ where W is a nonempty set and V : Prop → ℘(W ). We may abuse notation and write ‘w ∈ M’ to mean w ∈ W .
The first semantics for L(%) that we will consider, which may be considered its “intended
semantics,” equips a propositional model with one or more probability measures, as follows.
Definition 3.3. An imprecise probabilistic model (IP model) is a pair ⟨M, P⟩ where M = ⟨W, V ⟩ is a propositional model and P is a set of finitely additive probability measures on a field F of subsets of W such that V (p) ∈ F for each p ∈ Prop. A precise probabilistic model is an imprecise probabilistic model ⟨M, P⟩ such that |P| = 1.
The key part of the truth definition of formulas of L(%) in IP models matches the notion
of imprecise representation from Section 2: ϕ % ψ is true just in case according to all the
probability measures in P, the probability of the set of worlds where ϕ is true is at least as
great as the probability of the set of worlds where ψ is true.
Definition 3.4. Given an IP model ⟨M, P⟩, w ∈ M, and ϕ ∈ L(%), we define M, P, w ⊨ ϕ and JϕKM,P = {w ∈ W | M, P, w ⊨ ϕ} as follows:
1. M, P, w ⊨ p iff w ∈ V (p);
2. M, P, w ⊨ ¬ϕ iff M, P, w ⊭ ϕ;
3. M, P, w ⊨ (ϕ ∧ ψ) iff M, P, w ⊨ ϕ and M, P, w ⊨ ψ;
4. M, P, w ⊨ ϕ % ψ iff for all µ ∈ P, µ(JϕKM,P ) ≥ µ(JψKM,P ).
If α is a propositional formula, we may write ‘V (α)’ for JαKM,P to emphasize that the set of worlds where α is true does not depend on the set P of probability measures.
Finally, given a class K of IP models, ϕ is valid with respect to K iff for all ⟨M, P⟩ ∈ K and w ∈ M, we have M, P, w ⊨ ϕ.
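To make Definition 3.4 concrete, here is a minimal Python sketch using our own tuple encoding of formulas (not notation from the paper): formulas are ("var", p), ("not", f), ("and", f, g), or ("geq", f, g) for ϕ % ψ.

```python
def extension(formula, worlds, V, P):
    """Return the truth set {w | M, P, w |= formula}."""
    kind = formula[0]
    if kind == "var":
        return V[formula[1]] & worlds
    if kind == "not":
        return worlds - extension(formula[1], worlds, V, P)
    if kind == "and":
        return (extension(formula[1], worlds, V, P)
                & extension(formula[2], worlds, V, P))
    if kind == "geq":  # phi % psi: all measures in P agree on the inequality
        ext1 = extension(formula[1], worlds, V, P)
        ext2 = extension(formula[2], worlds, V, P)
        def mass(mu, X):
            return sum(mu[w] for w in X)
        if all(mass(mu, ext1) >= mass(mu, ext2) for mu in P):
            return set(worlds)  # comparison formulas are world-independent
        return set()
    raise ValueError(kind)

worlds = {"w", "v"}
V = {"p": {"w"}}
mu1 = {"w": 0.5, "v": 0.5}
mu2 = {"w": 0.3, "v": 0.7}
p, notp = ("var", "p"), ("not", ("var", "p"))
# mu1 makes p and ¬p equally likely and mu2 favors ¬p, so ¬p % p holds for
# all measures, while p % ¬p fails under mu2:
assert extension(("geq", notp, p), worlds, V, [mu1, mu2]) == worlds
assert extension(("geq", p, notp), worlds, V, [mu1, mu2]) == set()
```

The ≻ clause of Section 4 differs only in using a strict inequality in the "geq" case.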
An easy induction shows that for any formula ϕ, the set of worlds where ϕ is true belongs
to the algebra F of measurable sets.
Lemma 3.5. For every IP model ⟨M, P⟩ and ϕ ∈ L(%), we have JϕKM,P ∈ F .
Below we define logics that are sound and complete with respect to the classes of imprecise probabilistic models and precise probabilistic models, respectively. To do so, we first
need to define a syntactic abbreviation that allows us to express the finite cancellation condition of Theorem 2.1 using formulas of our language. Given formulas ϕ1 , . . . , ϕn , ψ1 , . . . , ψn ∈ L(%) and 0 ≤ k ≤ n, define Ck to be the disjunction of all conjunctions
f1 ϕ1 ∧ · · · ∧ fn ϕn ∧ g1 ψ1 ∧ · · · ∧ gn ψn
where exactly k of the f ’s and k of the g’s are the empty string, and the rest are ¬. Thus, Ck is true at a state w ∈ W iff exactly k of the ϕ’s and exactly k of the ψ’s are true at w. Then let
(ϕ1 , . . . , ϕn ) ≡ (ψ1 , . . . , ψn ) := C0 ∨ C1 ∨ · · · ∨ Cn ,
which is true at a state w ∈ W iff the number of ϕ’s true at w is exactly the same as the number of ψ’s true at w. Using these abbreviations, we can express the finite cancellation condition with the axiom schema (A4) below.
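The semantic content of the abbreviation can be sketched in a few lines of Python (our helper names, not the paper's): at each world, ≡ holds iff the indicator functions of the two families sum to the same value there.

```python
def balanced_at(w, phi_exts, psi_exts):
    """True iff w satisfies as many phi's as psi's (extensions given as sets)."""
    return sum(w in X for X in phi_exts) == sum(w in Y for Y in psi_exts)

# Hypothetical events over worlds {1, 2, 3} whose indicator functions sum to
# the same function pointwise: 1_A1 + 1_A2 = 1_B1 + 1_B2.
A1, A2 = {1, 2}, {2, 3}
B1, B2 = {1, 2, 3}, {2}
assert all(balanced_at(w, [A1, A2], [B1, B2]) for w in (1, 2, 3))
```

This pointwise balance is exactly the premise Σ 1Ai = Σ 1Bi of the finite cancellation condition.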
Definition 3.6. The set of theorems of SP(%) (the logic of sharp probability) is the smallest
subset of L(%) that contains all tautologies of propositional logic, is closed under modus
ponens (if ϕ ∈ SP(%) and ϕ → ψ ∈ SP(%), then ψ ∈ SP(%)) and necessitation (if ϕ ∈ SP(%),
then ϕ % ⊤ ∈ SP(%)), and contains all instances of the following axiom schemas for all
n ∈ N:³
(A0) (ϕ % ψ) ∨ (ψ % ϕ);
(A1) ϕ % ⊥;
³ The labeling of axioms here follows Alon and Heifetz 2014.
(A2) ϕ % ϕ;⁴
(A3) ¬(⊥ % ⊤);
(A4) ((ϕ1 % ψ1 ) ∧ · · · ∧ (ϕn % ψn ) ∧ ((ϕ1 , . . . , ϕn , ϕ′ ) ≡ (ψ1 , . . . , ψn , ψ ′ ) % ⊤)) → (ψ ′ % ϕ′ );
(A5) (ϕ % ψ) → ((ϕ % ψ) % ⊤);
(A6) ¬(ϕ % ψ) → (¬(ϕ % ψ) % ⊤).
The representation result in Theorem 2.1 may be used to prove the following completeness theorem for SP(%).
Theorem 3.7 (Segerberg 1971; Gärdenfors 1975). For all ϕ ∈ L(%): ϕ is a theorem of
SP(%) if and only if ϕ is valid with respect to the class of all precise probabilistic models.
To obtain a complete logic for imprecise probabilistic models, we express the generalized
finite cancellation conditions of Theorem 2.2 using formulas of our language as follows.
Definition 3.8. The logic IP(%) (the logic of imprecise probability) is defined in the same
way as SP(%) except without axiom (A0) and with (A4) replaced by:
(A4′ ) ((ϕ1 % ψ1 ) ∧ · · · ∧ (ϕn % ψn ) ∧ ((ϕ1 , . . . , ϕn , ϕ′ , . . . , ϕ′ ) ≡ (ψ1 , . . . , ψn , ψ ′ , . . . , ψ ′ ) % ⊤)) → (ψ ′ % ϕ′ ), where ϕ′ and ψ ′ each occur k times (for all k ≥ 1).
The representation result in Theorem 2.2 may be used to prove the following completeness theorem for IP(%).
Theorem 3.9 (Alon and Heifetz 2014). For all ϕ ∈ L(%): ϕ is a theorem of IP(%) if and
only if ϕ is valid with respect to the class of all imprecise probabilistic models.
In Section 4 we will give a completeness proof that also shows how the proof of Theorem 3.9 goes.
4 The Logic IP(%, ≻)
Our first step beyond the existing literature on logics of imprecise comparative probability
is to add to our formal language the primitive strict operator ≻ from Section 2.
Definition 4.1. The language L(%, ≻) is defined by the following grammar:
ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ)
where p ∈ Prop. As before, we define ϕ ⋗ ψ as (ϕ % ψ) ∧ ¬(ψ % ϕ). Let L(≻) be the fragment of L(%, ≻) in which % does not occur.
Definition 4.2. We extend the semantics of Definition 3.4 to L(%, ≻) as follows:
• M, P, w ⊨ ϕ ≻ ψ iff for all µ ∈ P, µ(JϕKM,P ) > µ(JψKM,P ).
It follows from Example 2.4 that the formula p ≻ q is not equivalent to any formula of L(%), including p ⋗ q, while the formula p % q is not equivalent to any formula of L(≻).
In the following, we first present a sound and complete logic for L(%, ≻) whose axioms
match the conditions of the representation result in Theorem 2.7. Then we discuss the
expressivity of this language, including how it is more expressive than L(%).
⁴ Axiom (A2) is redundant given (A0), but below we consider a logic that drops (A0). In fact, (A2) is also derivable from the n = 0 case of (A4) and (A4′ ), but we include (A2) to match Alon and Heifetz 2014.
4.1 Logic
Definition 4.3. The logic IP(%, ≻) is the smallest subset of L(%, ≻) that contains all
tautologies of propositional logic, is closed under modus ponens (if ϕ ∈ IP(%, ≻) and ϕ →
ψ ∈ IP(%, ≻), then ψ ∈ IP(%, ≻)) and necessitation (if ϕ ∈ IP(%, ≻), then ϕ % ⊤ ∈
IP(%, ≻)), and contains all instances of the following axiom schemas for all n ∈ N:
(B1) ϕ % ⊥;
(B2) ⊤ ≻ ⊥;
(B3) (ϕ ≻ ψ) → (ϕ % ψ);
(B4) ¬(ϕ ≻ ϕ);
(B5) ((ϕ1 , . . . , ϕn , ϕ′ , . . . , ϕ′ ) ≡ (ψ1 , . . . , ψn , ψ ′ , . . . , ψ ′ ) % ⊤) → (((ϕ1 % ψ1 ) ∧ · · · ∧ (ϕn % ψn )) → (ψ ′ % ϕ′ )), where ϕ′ and ψ ′ each occur k times;
(B6) ((ϕ1 , . . . , ϕn , ϕ′ , . . . , ϕ′ ) ≡ (ψ1 , . . . , ψn , ψ ′ , . . . , ψ ′ ) % ⊤) → ((((ϕ1 % ψ1 ) ∧ · · · ∧ (ϕn % ψn )) ∧ ((ϕ1 ≻ ψ1 ) ∨ · · · ∨ (ϕn ≻ ψn ))) → (ψ ′ ≻ ϕ′ )), where ϕ′ and ψ ′ each occur k times;
(B7) (ϕ % ψ) → ((ϕ % ψ) % ⊤);
(B8) ¬(ϕ % ψ) → (¬(ϕ % ψ) % ⊤);
(B9) (ϕ ≻ ψ) → ((ϕ ≻ ψ) % ⊤);
(B10) ¬(ϕ ≻ ψ) → (¬(ϕ ≻ ψ) % ⊤).
The rest of this section is devoted to the proof of the following theorem.
Theorem 4.4 (Soundness and Completeness). For all ϕ ∈ L(%, ≻): ϕ is a theorem of
IP(%, ≻) if and only if ϕ is valid with respect to the class of all imprecise probabilistic
models.
The soundness direction is easy to check. For completeness, as usual, pick an arbitrary
formula γ consistent in IP(%, ≻) and let p be the set of propositional variables appearing
in γ and L0 the restriction of L(%, ≻) to p. Then extend {γ} to a set Γ that is maximally
consistent in IP(%, ≻) with respect to L0 . Now our goal is to build an IP model of γ by
extracting information from Γ. To this end, we view L0 as a term algebra of the type of
Boolean algebras expanded with two binary operations. Then let F = {ϕ ∈ L0 | ϕ ∧ (ϕ % ⊤) ∈ Γ}, and define a binary relation ∼ on L0 by ϕ ∼ ψ iff (ϕ ↔ ψ) ∈ F .
Lemma 4.5. F contains ⊤ and is closed under deduction in L0 : whenever ϕ → ψ ∈ L0
is a theorem of IP(%, ≻) and ϕ ∈ F , then ψ ∈ F too. Also, ∼ is an equivalence relation
extending the provable equivalence relation on L0 and is congruential over ¬, ∧, %, and ≻:
for all ϕ, ψ, χ ∈ L0 , if ϕ ∼ ψ, then ¬ϕ ∼ ¬ψ, (ϕ ∧ χ) ∼ (ψ ∧ χ), (χ ∧ ϕ) ∼ (χ ∧ ψ),
(ϕ % χ) ∼ (ψ % χ), (χ % ϕ) ∼ (χ % ψ), (ϕ ≻ χ) ∼ (ψ ≻ χ), and (χ ≻ ϕ) ∼ (χ ≻ ψ).
Proof. When n = 0, (B5) together with necessitation shows that for every ϕ, ϕ % ϕ is a
theorem. Then clearly ⊤ ∈ F . To show that F is closed under deduction in L0 , noting that
Γ is clearly closed under deduction in L0 due to its being a maximally consistent set, it is
enough to show that whenever ϕ → ψ ∈ IP(%, ≻), we have (ϕ % ⊤) → (ψ % ⊤) ∈ IP(%, ≻)
too. For this, apply (B5) to hϕ, ψ ∧ ¬ϕ, ⊤i and h⊤, ⊥, ψi.
Since F is closed under deduction in L0 and contains ⊤, F also contains all theorems
of IP(%, ≻) in L0 . Hence it is easy to show that ∼ is an equivalence relation extending the
provable equivalence relation on L0 that is congruential over ¬ and ∧. To show that ∼ is
congruential over % and ≻, using again that Γ is closed under deduction in L0 , we only need
to show that the following are derivable:
• ((ϕ ↔ ψ) % ⊤) → ((ϕ % χ) ↔ (ψ % χ));
• ((ϕ ↔ ψ) % ⊤) → (((ϕ % χ) ↔ (ψ % χ)) % ⊤);
• ((ϕ ↔ ψ) % ⊤) → ((ϕ ≻ χ) ↔ (ψ ≻ χ));
• ((ϕ ↔ ψ) % ⊤) → (((ϕ ≻ χ) ↔ (ψ ≻ χ)) % ⊤).
In fact, the second and the fourth follow from the first and the third using (B7) to (B10),
the closure of (· % ⊤) under deduction, and Boolean reasoning. The first and the third are
again simple exercises using (B5) and (B6), respectively.
Lemma 4.6. B = L0 /∼ is a Boolean algebra expanded with two binary operations which
we denote again by % and ≻. Moreover, by axioms (B7) to (B10), for any a, b ∈ B, a % b
is either the top element or the bottom element, and so is a ≻ b. In addition, B is finite.
Proof. Since ∼ is a congruence extending the provable equivalence relation and IP(%, ≻)
has all Boolean reasoning principles, B is a Boolean algebra. To see that a % b is either the
top element or the bottom element, pick any ϕ, ψ ∈ L0 such that [ϕ]∼ = a and [ψ]∼ = b.
Then note that either ϕ % ψ ∈ Γ or ¬(ϕ % ψ) ∈ Γ. In the former case, given (B7), we have
(ϕ % ψ) ∈ F and hence a % b = [ϕ % ψ]∼ is the top element. In the latter case, using (B8),
¬(a % b) is the top element, which means that a % b is the bottom element. The same
reasoning goes for a ≻ b, using (B9) and (B10). Finally, to see that B is finite, note first
that it has a finite set of generators: [p]∼ = {[p]∼ | p ∈ p}. Since we have just shown that %
and ≻ only bring elements to either the top element or the bottom element, in generating
B from [p]∼ we can use only the Boolean operations. Hence the Boolean reduct of B is a
finitely generated Boolean algebra, which must be finite.
Since (the Boolean reduct of) B is a finite Boolean algebra, it is isomorphic to the
powerset algebra of its set of atoms. However, to facilitate the proof of the completeness
theorem of the next section, we take the set that includes all possible truth-assignments of
propositional variables in p.
Definition 4.7. Let Wp = {0, 1}p and Vp : Prop → ℘(Wp ) be the natural valuation function defined by Vp (p) = {f ∈ Wp | f (p) = 1} when p ∈ p and Vp (p) = ∅ when p ∉ p. Finally, let Mp = ⟨Wp , Vp ⟩.
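As a quick illustration (variable names ours; the paper gives no code), Wp and Vp can be built directly as all truth assignments to a finite variable set:

```python
from itertools import product

p = ("p1", "p2")  # a hypothetical finite set of propositional variables
# One world per truth assignment to the variables in p:
W_p = [dict(zip(p, bits)) for bits in product((0, 1), repeat=len(p))]

def V_p(var):
    """Natural valuation: worlds assigning 1 to var; empty outside p."""
    return [f for f in W_p if f.get(var, 0) == 1]

assert len(W_p) == 2 ** len(p)
assert len(V_p("p1")) == len(W_p) // 2
assert V_p("q") == []  # variables outside p get the empty set
```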
In this way, ℘(Wp ) is essentially the free Boolean algebra generated by the images of p
under Vp . The difference between ℘(Wp ) and the Boolean reduct of B is that B might be
missing some of the atoms in the sense that some truth-assignments to p may be inconsistent
in B. However, from the probabilistic point of view, it is enough to make them impossible
probabilistically by assigning them 0 probability. This gives us the advantage of always
using the same Mp when satisfying any consistent subset of L0 .
To connect Mp to B, first let π be the natural Boolean quotient map from ℘(Wp ) to B such that π(Vp (p)) = [p]∼ . This map is uniquely determined, since ℘(Wp ) is the free Boolean algebra generated by {Vp (p) | p ∈ p} and B is generated by {[p]∼ | p ∈ p} using Boolean operations. Then, on ℘(Wp ), we define two binary relations:
• X %Γ Y iff π(X) % π(Y ) is the top element of B;
• X ≻Γ Y iff π(X) ≻ π(Y ) is the top element of B.
Then it is not hard to show the following using the axioms (B1) to (B6).
Lemma 4.8. ⟨%Γ , ≻Γ ⟩ satisfies all the conditions required in Theorem 2.7.
Proof. Note that for every a ∈ B, a = [ϕ]∼ for some ϕ ∈ L0 . Hence any quantification over
B, and by the quotient map π, any quantification over ℘(Wp ) as well, can be simulated by
quantification over L0 . Since the axioms are schematic, (B1) to (B4) directly translate the
first two bullet points of Theorem 2.7.
For (GFC) and (SGFC), it is enough to note that for any two finite sequences ⟨A1 , . . . , An ⟩ and ⟨B1 , . . . , Bn ⟩ of sets in ℘(Wp ) such that 1A1 + · · · + 1An = 1B1 + · · · + 1Bn , we can find two sequences ⟨ϕ1 , . . . , ϕn ⟩ and ⟨ψ1 , . . . , ψn ⟩ of formulas in L0 such that:
• for all i = 1, . . . , n, we have [ϕi ]∼ = π(Ai ) and [ψi ]∼ = π(Bi ), which implies that Ai %Γ Bi iff ϕi % ψi ∈ Γ and that Ai ≻Γ Bi iff ϕi ≻ ψi ∈ Γ;
• [(ϕ1 , . . . , ϕn ) ≡ (ψ1 , . . . , ψn )]∼ = [⊤]∼ and hence (ϕ1 , . . . , ϕn ) ≡ (ψ1 , . . . , ψn ) ∈ F , which in turn implies that ((ϕ1 , . . . , ϕn ) ≡ (ψ1 , . . . , ψn ) % ⊤) ∈ Γ.
The existence of these formulas means that we can use (B5) and (B6) to show (GFC) and (SGFC), respectively.
Hence, by Theorem 2.7, we obtain a set PΓ of probability measures on ℘(Wp ) such that
• X %Γ Y iff for all µ ∈ PΓ , µ(X) ≥ µ(Y ), and
• X ≻Γ Y iff for all µ ∈ PΓ , µ(X) > µ(Y ).
From this, we can show the following truth lemma.
Lemma 4.9. For all ϕ ∈ L0 , π(JϕK⟨Mp ,PΓ ⟩ ) = [ϕ]∼ .
Proof. By a simple induction on L0 . The only cases of interest are the inductive steps for % and ≻. Note that Jϕ % ψK⟨Mp ,PΓ ⟩ is either Wp or ∅. Similarly, we have shown that [ϕ % ψ]∼ is either [⊤]∼ or [⊥]∼ . Then the only missing connection is the following:
Jϕ % ψK⟨Mp ,PΓ ⟩ = Wp ⇐⇒ ∀µ ∈ PΓ , µ(JϕK⟨Mp ,PΓ ⟩ ) ≥ µ(JψK⟨Mp ,PΓ ⟩ )
⇐⇒ JϕK⟨Mp ,PΓ ⟩ %Γ JψK⟨Mp ,PΓ ⟩
⇐⇒ (π(JϕK⟨Mp ,PΓ ⟩ ) % π(JψK⟨Mp ,PΓ ⟩ )) = [⊤]∼
⇐⇒ ([ϕ]∼ % [ψ]∼ ) = [⊤]∼
⇐⇒ [ϕ % ψ]∼ = [⊤]∼ .
The proof for the case of ϕ ≻ ψ is almost identical.
Now note that [γ]∼ is not the bottom element in B, since otherwise [¬γ]∼ would be the top element, and then ¬γ ∈ F , i.e., ¬γ ∧ (¬γ % ⊤) ∈ Γ, which means ¬γ ∈ Γ too, rendering Γ inconsistent. Hence JγK⟨Mp ,PΓ ⟩ is nonempty, because π(∅) must be [⊥]∼ , which is not [γ]∼ . Take a w ∈ JγK⟨Mp ,PΓ ⟩ . Then ⟨Mp , PΓ ⟩, w ⊨ γ, and we are done.
To sum up, we now have the following strengthening of the completeness theorem, noting
that there are only finitely many logically inequivalent formulas all using only a finite set p
of propositional variables (see Lemma 6.11).
Proposition 4.10. For any finite subset p of Prop with L0 being the set of formulas in
L(%, ≻) using only the propositional variables in p, and for any Γ ⊆ L0 that is consistent
relative to IP(%, ≻), there is a set PΓ of probability measures on ℘(Wp ) and a w ∈ Wp such
that Mp , PΓ , w ⊨ γ for all γ ∈ Γ.
Before we discuss the expressivity of L(%, ≻), we comment on the logic of precise probabilistic models. While ≻ is not definable in L(%, ≻) with respect to all IP models, with respect to precise probabilistic models, ϕ ≻ ψ can be defined simply as ¬(ψ % ϕ). Hence we can define the logic SP(%, ≻) as follows.
Definition 4.11. The logic SP(%, ≻) is the smallest subset of L(%, ≻) that is closed under
modus ponens (if ϕ ∈ SP(%, ≻) and ϕ → ψ ∈ SP(%, ≻), then ψ ∈ SP(%, ≻)) and necessitation (if ϕ ∈ SP(%, ≻) then ϕ % ⊤ ∈ SP(%, ≻)), contains all instances of tautologies
of propositional logic, all instances of the axiom schemas (A1) to (A6) for SP(%), and all
instances of the axiom schema (A7) (ϕ ≻ ψ) ↔ ¬(ψ % ϕ).
Then the following completeness theorem for SP(%, ≻) can be shown in the same way
that we just showed the completeness of IP(%, ≻) using instead the representation result in
Theorem 2.1. It will be used in the completeness proof for IP(%, ≻, ♦) in the next section.
Proposition 4.12. For any finite subset p of Prop with L0 being the set of formulas in
L(%, ≻) using only the propositional variables in p, and for any Γ ⊆ L0 that is consistent
relative to SP(%, ≻), there is a probability measure µΓ on ℘(Wp ) and a w ∈ Wp such that
Mp , {µΓ }, w ⊨ γ for all γ ∈ Γ.
4.2 Expressivity
In this subsection we discuss the expressivity of L(%) and L(%, ≻) in distinguishing IP models. Given Example 2.4, it should not be surprising that L(%, ≻) is more expressive than L(%). But here we precisely characterize the expressivity of these languages.
Definition 4.13. For any probability measure µ defined on a field F of sets, let %µ and ≻µ be binary relations on F such that for any X, Y ∈ F , X %µ Y iff µ(X) ≥ µ(Y ), and X ≻µ Y iff µ(X) > µ(Y ). In addition, for any set P of probability measures defined on F , let %P = ⋂{%µ | µ ∈ P} and ≻P = ⋂{≻µ | µ ∈ P}.
For IP models ⟨W, V, P⟩ and ⟨W ′ , V ′ , P ′ ⟩, we say that they are %-order-similar in p ⊆ Prop if for any Boolean formulas α, β using only letters in p,
• JαK⟨W,V ⟩ %P JβK⟨W,V ⟩ iff JαK⟨W ′ ,V ′ ⟩ %P ′ JβK⟨W ′ ,V ′ ⟩ .
We say that they are order-similar in p ⊆ Prop if in addition to the above biconditional for %, it is also true that for any Boolean formulas α, β using only letters in p,
• JαK⟨W,V ⟩ ≻P JβK⟨W,V ⟩ iff JαK⟨W ′ ,V ′ ⟩ ≻P ′ JβK⟨W ′ ,V ′ ⟩ .
A special case for (%-)order-similarity is worth mentioning.
Proposition 4.14. Let ⟨W, V, P⟩ and ⟨W, V, P ′ ⟩ be IP models and p a subset of Prop. Let F be the field of sets generated by the image of p under V . Then ⟨W, V, P⟩ and ⟨W, V, P ′ ⟩ are %-order-similar (resp. order-similar) in p iff %P |F = %P ′ |F (resp. %P |F = %P ′ |F and ≻P |F = ≻P ′ |F ).
Proposition 4.15. Let ⟨W, V, P⟩ and ⟨W ′ , V ′ , P ′ ⟩ be IP models and w, w′ worlds in W and W ′ , respectively. Then w and w′ satisfy the same formulas in L(%, ≻) (resp. L(%)) using only propositional variables in p ⊆ Prop iff
1. w and w′ satisfy the same propositional variables in p, and
2. ⟨W, V, P⟩ and ⟨W ′ , V ′ , P ′ ⟩ are order-similar (resp. %-order-similar) in p.
Proof. The left-to-right direction is trivial, since failure of either 1 or 2 directly translates into a formula in the appropriate language on which w and w′ disagree. For the right-to-left direction, note first that any comparative formula χ of the form ϕ % ψ or ϕ ≻ ψ is true at one world iff it is true at all worlds. This means that a formula ϕ in which χ occurs is equivalent to (χ ∧ ϕ[χ/⊤]) ∨ (¬χ ∧ ϕ[χ/⊥]), where ϕ[χ/⊤] is the result of replacing χ by ⊤ in ϕ, and similarly for ϕ[χ/⊥]. By repeated use of this method, it is not hard to see that every formula in L(%, ≻) using only letters in p is semantically equivalent to a Boolean combination of formulas of one of the following types:
• a propositional variable in p,
• α % β where α, β are Boolean formulas using only letters in p, and
• α ≻ β where α, β are Boolean formulas using only letters in p.
The case with L(%) is similar (without the last kind of formula in the above list). The
proposition then follows easily.
Now we can translate Example 2.4 into a pair of pointed IP models that L(%, ≻) can distinguish but L(%) cannot. Let W = {w, v} and V be the valuation such that V (p) = {w} and V (q) = ∅ for all q ∈ Prop \ {p}. Let µw<v and µw=v be defined as in Example 2.4. Then by Propositions 4.15 and 4.14, L(%) cannot distinguish ⟨W, V, {µw<v , µw=v }⟩, w from ⟨W, V, {µw<v }⟩, w, since %{µw<v ,µw=v } and %{µw<v } are the same on ℘(W ). However, ¬p ≻ p distinguishes the pointed models.
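The example can be checked numerically. Since Example 2.4's measures are not reproduced in this section, the sketch below assumes representative values: µw<v weights w less than v, and µw=v weights them equally.

```python
from itertools import chain, combinations

W = ("w", "v")
mu_lt = {"w": 0.25, "v": 0.75}   # assumed stand-in for mu_{w<v}
mu_eq = {"w": 0.5, "v": 0.5}     # assumed stand-in for mu_{w=v}

def mass(mu, X):
    return sum(mu[x] for x in X)

def weak(P, X, Y):    # X % Y under the set of measures P
    return all(mass(mu, X) >= mass(mu, Y) for mu in P)

def strict(P, X, Y):  # X ≻ Y under P
    return all(mass(mu, X) > mass(mu, Y) for mu in P)

subsets = [set(s) for s in chain.from_iterable(combinations(W, r)
                                               for r in range(len(W) + 1))]
# The weak relations coincide on all of ℘(W), so L(%) cannot distinguish
# the two models:
assert all(weak([mu_lt], X, Y) == weak([mu_lt, mu_eq], X, Y)
           for X in subsets for Y in subsets)
# But ¬p ≻ p, i.e. {v} ≻ {w}, holds only in the single-measure model:
assert strict([mu_lt], {"v"}, {"w"})
assert not strict([mu_lt, mu_eq], {"v"}, {"w"})
```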
5 The Logic IP(%, ≻, ♦)
In this section, we further extend our language with a possibility modal ♦. In the context
of natural language semantics, one proposal for the meaning of “possibly ϕ” in precise
probabilistic models is that ϕ has non-zero probability (Lassiter, 2010, §4.4). In imprecise
probabilistic models, we could require either (a) that all measures in P give ϕ non-zero
probability or (b) that at least some measure in P gives ϕ non-zero probability. We adopt the
weaker interpretation (b) of “possibly ϕ” (not as a proposal in natural language semantics,
but because it suits our technical purposes in the next section). In addition to making claims
about the possibility of factual states of affairs, e.g., “It is possible that it is raining,” we
would like to be able to make claims about the possibility of likelihood relations, e.g., “It is
possible that hail is more likely than lightning tonight.” According to the formal semantics
given below, the latter will be true when there exists a probability measure in P such that
according to that measure hail is more likely than lightning.
Definition 5.1. The language L(%, ≻, ♦) is defined by the following grammar:
ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ) | ♦ϕ
where p ∈ Prop. We define □ϕ := ¬♦¬ϕ.
Definition 5.2. We extend the semantics of Definition 4.2 to L(%, ≻, ♦) as follows:
• M, P, w ⊨ ♦ϕ iff there is a µ ∈ P such that µ(JϕKM,{µ} ) ≠ 0.
Note that with ♦ added, we no longer need ≻ as a primitive in the language, since ϕ ≻ ψ
is definable as ¬♦(ψ % ϕ), but we keep ≻ as a primitive for convenience.
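The ♦ clause can be sketched in a self-contained Python fragment (our own tuple encoding of formulas, not the paper's notation). The subtlety worth displaying is that under ♦, the extension of ϕ is recomputed relative to the single measure {µ}, which matters when ϕ itself contains comparison formulas.

```python
def ext(f, worlds, V, P):
    k = f[0]
    if k == "var":
        return V[f[1]] & worlds
    if k == "not":
        return worlds - ext(f[1], worlds, V, P)
    if k == "or":
        return ext(f[1], worlds, V, P) | ext(f[2], worlds, V, P)
    if k == "geq":  # phi % psi: every measure in P agrees
        def mass(mu, X):
            return sum(mu[w] for w in X)
        ok = all(mass(mu, ext(f[1], worlds, V, P)) >=
                 mass(mu, ext(f[2], worlds, V, P)) for mu in P)
        return set(worlds) if ok else set()
    if k == "poss":  # ♦phi: some mu in P gives positive probability to
        # phi's extension computed relative to {mu} alone
        ok = any(sum(mu[w] for w in ext(f[1], worlds, V, [mu])) > 0
                 for mu in P)
        return set(worlds) if ok else set()
    raise ValueError(k)

worlds = {"w", "v"}
V = {"p": {"w"}, "q": {"v"}}
P = [{"w": 0.3, "v": 0.7}, {"w": 0.7, "v": 0.3}]
# Non-comparability of p and q holds relative to the imprecise set P ...
incomp = ("not", ("or", ("geq", ("var", "p"), ("var", "q")),
                        ("geq", ("var", "q"), ("var", "p"))))
assert ext(incomp, worlds, V, P) == worlds
# ... yet ♦ of it fails: no single measure makes it true (cf. Example 5.3).
assert ext(("poss", incomp), worlds, V, P) == set()
```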
In the following, we first present a sound and complete logic for the valid formulas in
L(%, ≻, ♦). Then we briefly comment on the logic’s complexity. Finally, we show how
L(%, ≻, ♦) is more expressive than L(%, ≻) and characterize the expressivity of L(%, ≻, ♦).
5.1 Logic
An important logical fact about the set of valid formulas of L(%, ≻, ♦) is that it is not closed
under uniform substitution of arbitrary formulas for propositional variables.
Example 5.3. The formula (p ≻ ⊥) → ♦(p ≻ ⊥) is valid but
(¬((p % q) ∨ (q % p)) ≻ ⊥) → ♦(¬((p % q) ∨ (q % p)) ≻ ⊥)
is not valid. The reason is that there is no single probability measure that can make true
the non-comparability formula ¬((p % q) ∨ (q % p)).
While the failure of uniform substitution can complicate efforts to axiomatize a set of
validities (cf. Holliday et al. 2012, 2013), we will completely axiomatize the validities of
L(%, ≻, ♦) with the logic IP(%, ≻, ♦) defined below.
Definition 5.4. The logic SP(%, ≻, ♦) is the smallest subset of L(%, ≻, ♦) that is (i) closed under modus ponens, uniform substitution, and the rule of replacement of provable equivalents, and (ii) contains all theorems of SP(%, ≻) and ♦p ↔ (p ≻ ⊥).
The logic IP(%, ≻, ♦) is the smallest subset of L(%, ≻, ♦) that is (i) closed under modus ponens, the rule of replacement of provable equivalents, and the rule that if ϕ ∈ SP(%, ≻, ♦), then □ϕ ∈ IP(%, ≻, ♦), and (ii) contains all substitution instances in L(%, ≻, ♦) of the theorems of IP(%, ≻) and also all instances of the following axiom schemas, where α and β are propositional:
(C1) (□ϕ ∧ □(ϕ → ψ)) → □ψ;
(C2) ♦⊤;
(C3) □ϕ → (□ϕ % ⊤);
(C4) ♦ϕ → (♦ϕ % ⊤);
(C5) □ϕ ↔ (ϕ % ⊤);
(C6) (α % β) ↔ □(α % β);
(C7) (α ≻ β) ↔ □(α ≻ β).
The rest of this section is devoted to the proof of the following theorem.
Theorem 5.5 (Soundness and Completeness). For all ϕ ∈ L(%, ≻, ♦): ϕ is a theorem of
IP(%, ≻, ♦) if and only if ϕ is valid with respect to the class of all imprecise probabilistic
models.
To prove Theorem 5.5, we first show that (1) there is no need for a ♦ to scope over a
♦ and (2) there is no need for a % or ≻ to scope over a ♦. In other words, we will find a
significantly simpler fragment of L(%, ≻, ♦), which we call LSimp , such that every formula
in L(%, ≻, ♦) is provably equivalent to a formula in LSimp in IP(%, ≻, ♦).
Definition 5.6. Define T−♦ : L(%, ≻, ♦) → L(%, ≻) by:
• T−♦ (p) = p for all p ∈ Prop;
• T−♦ (¬ϕ) = ¬T−♦ (ϕ);
• T−♦ (ϕ ∧ ψ) = T−♦ (ϕ) ∧ T−♦ (ψ);
• T−♦ (ϕ % ψ) = T−♦ (ϕ) % T−♦ (ψ);
• T−♦ (ϕ ≻ ψ) = T−♦ (ϕ) ≻ T−♦ (ψ);
• T−♦ (♦ϕ) = ¬(⊥ % T−♦ (ϕ)).
Lemma 5.7. For every ϕ ∈ L(%, ≻, ♦), ϕ ↔ T−♦ (ϕ) is in SP(%, ≻, ♦). Moreover, T−♦ (ϕ)
uses the same propositional variables as ϕ does.
Proof. A simple induction with repeated use of replacement of equivalents suffices.
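The clauses of T−♦ can be transcribed directly; the following Python sketch uses a hypothetical tuple encoding of formulas ("poss" for ♦, "geq" for %, "gt" for ≻, and a stand-in atom for ⊥; the encoding is ours, not the paper's).

```python
BOT = ("bot",)  # stand-in for ⊥

def t_no_diamond(f):
    """Eliminate ♦ via T(♦phi) = ¬(⊥ % T(phi)); other clauses are homomorphic."""
    kind = f[0]
    if kind in ("var", "bot"):
        return f
    if kind == "not":
        return ("not", t_no_diamond(f[1]))
    if kind in ("and", "geq", "gt"):
        return (kind, t_no_diamond(f[1]), t_no_diamond(f[2]))
    if kind == "poss":
        return ("not", ("geq", BOT, t_no_diamond(f[1])))
    raise ValueError(kind)

# Nested ♦'s are eliminated inside-out, leaving a ♦-free formula that uses
# the same propositional variables:
phi = ("poss", ("and", ("var", "p"), ("poss", ("var", "q"))))
assert t_no_diamond(phi) == ("not", ("geq", BOT,
    ("and", ("var", "p"), ("not", ("geq", BOT, ("var", "q"))))))
```

Over a single measure, ¬(⊥ % ϕ) says ϕ has nonzero probability, which is why the translation is faithful in SP(%, ≻, ♦).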
Lemma 5.8. In IP(%, ≻, ♦), formulas of the form ♦ϕ ↔ ¬□¬ϕ are theorems. In addition, □ is a normal operator: for any ϕ, ψ ∈ L(%, ≻, ♦), (□ϕ ∧ □(ϕ → ψ)) → □ψ is in IP(%, ≻, ♦), and whenever ϕ is in IP(%, ≻, ♦), so is □ϕ.
Proof. To derive ♦ϕ ↔ ¬□¬ϕ, it is enough to derive ♦ϕ ↔ ♦¬¬ϕ. But this is clearly derivable by replacement of equivalents, since ♦ϕ ↔ ♦ϕ and ϕ ↔ ¬¬ϕ are theorems.
Definition 5.9. Let LSimp be the fragment of L(%, ≻, ♦) generated from Prop and {♦ϕ |
ϕ ∈ L(%, ≻)} by ¬ and ∧.
In the following, for any p ⊆ Prop, we append [p] to the name of a language to denote
the set of formulas in that language using only variables in p.
Lemma 5.10. For every ϕ ∈ L(%, ≻, ♦), there is a T (ϕ) ∈ LSimp such that ϕ ↔ T (ϕ) ∈
IP(%, ≻, ♦). Moreover, T (ϕ) and ϕ use the same propositional variables.
Proof. By induction on L(%, ≻, ♦). The base case is trivial: we can simply define T (p) = p.
The Boolean cases are also trivial: we can define T (¬ϕ) = ¬T (ϕ) and T (ϕ ∧ ψ) = T (ϕ) ∧
T (ψ). For the ♦ case, define T (♦ϕ) = ♦T−♦ (ϕ). To see that ♦ϕ is provably equivalent
to ♦T−♦ (ϕ), first note that by Lemma 5.7, ϕ ↔ T−♦ (ϕ) ∈ SP(%, ≻, ♦). But then □(ϕ ↔ T−♦ (ϕ)) ∈ IP(%, ≻, ♦). By the normality of □, we have ♦ϕ ↔ ♦T−♦ (ϕ) ∈ IP(%, ≻, ♦).
To find the appropriate T (ϕ % ψ), given that the required T (ϕ) and T (ψ) in LSimp
have been found, we need to extract all ♦’ed formulas in T (ϕ) % T (ψ) so that they are no
longer in the scope of the main connective % in T (ϕ) % T (ψ). Clearly this can be done by
iteratively using the following claim:
(*) for any χ ∈ L(%, ≻) and ϕ, ψ ∈ LSimp ,
(ϕ % ψ) ↔ ((♦χ ∧ (ϕ[♦χ/⊤] % ψ[♦χ/⊤])) ∨ (¬♦χ ∧ (ϕ[♦χ/⊥] % ψ[♦χ/⊥])))
is in IP(%, ≻, ♦).
The claim is easily proved using (C3) and (C4). Note that since ϕ, ψ are in LSimp , they are Boolean combinations of propositional variables and formulas of the form ♦χ where χ ∈ L(%, ≻). List all the ♦’ed formulas appearing in ϕ or ψ as δ1 , δ2 , . . . , δn . Then for any f ∈ {0, 1}n , let δf be ¬f (1) δ1 ∧ · · · ∧ ¬f (n) δn , where ¬0 δi is ¬δi and ¬1 δi is simply δi . Moreover, let ϕ[f ] be ϕ[δ1 /⊤f (1) , . . . , δn /⊤f (n) ] and similarly for ψ[f ], where ⊤f (i) = ⊤ if f (i) = 1 and ⊤f (i) = ⊥ if f (i) = 0. With this notation, it is not hard to see that by repeatedly applying (*), ϕ % ψ is provably equivalent to the disjunction of δf ∧ (ϕ[f ] % ψ[f ]) over all f ∈ {0, 1}n , and then also to the disjunction of δf ∧ □(ϕ[f ] % ψ[f ]) over all f ∈ {0, 1}n : for any f , the formulas ϕ[f ] and ψ[f ] are propositional, since we have replaced all the ♦’ed formulas by either ⊤ or ⊥, so by axiom (C6) we can add a □ there. The disjunction of δf ∧ □(ϕ[f ] % ψ[f ]) over all f ∈ {0, 1}n is the desired T (ϕ % ψ), since it is clearly in LSimp . The definition of T (ϕ ≻ ψ) is almost identical: we simply replace □(ϕ[f ] % ψ[f ]) by □(ϕ[f ] ≻ ψ[f ]), using (C7) instead.
Now we are ready to prove the soundness and completeness of IP(%, ≻, ♦). Soundness is
clear as usual. For completeness, pick an arbitrary γ that is consistent relative to IP(%, ≻, ♦),
and let p be the set of propositional variables used in γ. Then take an arbitrary Γ that is
maximally consistent containing γ. Following the standard strategy, let Σ = {(ϕ % ⊤) | □ϕ ∈ Γ, ϕ ∈ L(%, ≻)[p]}. Note that Σ ⊆ L(%, ≻)[p]. Also, Σ must be consistent relative to SP(%, ≻), since otherwise there are formulas (ϕ1 % ⊤), (ϕ2 % ⊤), . . . , (ϕn % ⊤) in Σ such that ((ϕ1 % ⊤) ∧ · · · ∧ (ϕn % ⊤)) → ⊥ is in SP(%, ≻). But then by the rules of IP(%, ≻, ♦) and the normality of □, we have that (□(ϕ1 % ⊤) ∧ · · · ∧ □(ϕn % ⊤)) → □⊥ is in IP(%, ≻, ♦). Since □ϕ is provably equivalent to (ϕ % ⊤) by (C5), we have that □⊥ is in Γ according to the maximality of Γ, rendering Γ inconsistent, since we have (C2).
Now let D = {Σ ∪ {¬(ϕ % ⊤)} | ¬□ϕ ∈ Γ, ϕ ∈ L(%, ≻)[p]}. Note that each ∆ = Σ ∪ {¬(ϕ % ⊤)} ∈ D is also a set of formulas in L(%, ≻)[p]. Moreover, ∆ must be consistent relative to SP(%, ≻) as well. If not, then since Σ is consistent, we must have formulas (ϕ1 % ⊤), . . . , (ϕn % ⊤) in Σ such that ((ϕ1 % ⊤) ∧ · · · ∧ (ϕn % ⊤)) → (ϕ % ⊤) ∈ SP(%, ≻). Then by reasoning similar to that above, □(ϕ % ⊤) and hence □ϕ are in Γ using (C5), contradicting ¬□ϕ ∈ Γ and rendering Γ inconsistent.
Thus, for each ∆ ∈ D, according to Proposition 4.12, there is a probability measure µ∆ on ℘(Wp ) and a w ∈ Wp such that Mp , {µ∆ }, w ⊨ ∆. Note that since all formulas in ∆ are comparison formulas of the form ϕ % ⊤ or its negation, it does not matter what w is. Hence we have that Mp , {µ∆ } ⊨ ∆. Take P to be the set {µ∆ | ∆ ∈ D}. Then we are left only to show that there is a w ∈ Wp such that Mp , P, w ⊨ ϕ for all ϕ ∈ Γ ∩ L(%, ≻, ♦)[p].
Let w0 be the element of Wp = {0, 1}p defined by w0 (p) = 1 iff p ∈ Γ, for all p ∈ p. Then we are ready to show the following truth lemma.
Lemma 5.11. For all ϕ ∈ L(%, ≻, ♦)[p], Mp , P, w0 ⊨ ϕ iff ϕ ∈ Γ.
Proof. It is enough to show that for all ϕ ∈ LSimp [p], Mp , P, w0 ⊨ ϕ iff ϕ ∈ Γ. This is because for any ϕ ∈ L(%, ≻, ♦)[p], according to Lemma 5.10, ϕ ∈ Γ iff T (ϕ) ∈ Γ with T (ϕ) ∈ LSimp [p]. But then
T (ϕ) ∈ Γ ⇐⇒ Mp , P, w0 ⊨ T (ϕ) ⇐⇒ Mp , P, w0 ⊨ ϕ.
The first equivalence holds by the fact that T (ϕ) ∈ LSimp [p] and the truth lemma we will show below for this fragment. The second is by soundness.
We now focus on the fragment LSimp [p]. Since the generating operations of this fragment
are Boolean, the inductive cases are trivial. The atomic case for propositional variables in
p is also trivial by the definition of w0 . Hence we are left to show that for any ϕ ∈ {♦ψ |
ψ ∈ L(%, ≻)[p]}, we have ϕ ∈ Γ iff Mp , P, w0 ϕ. In other words, we only need to show
that for all ϕ ∈ L(%, ≻)[p], we have ♦ϕ ∈ Γ iff Mp , P, w0 ♦ϕ.
• Suppose ♦ϕ ∉ Γ, so ¬ϕ ∈ Γ. Then (¬ϕ % ⊤) ∈ Σ since ¬ϕ ∈ L(%, ≻)[p], which
means (¬ϕ % ⊤) ∈ ∆ for all ∆ ∈ D. Then, for any µ∆ ∈ P, Mp , {µ∆ } |= ¬ϕ % ⊤
since (¬ϕ % ⊤) ∈ ∆, which in turn means that µ∆ (JϕKMp ,{µ∆ } ) = 0. This is precisely
the condition for ♦ϕ to be false at Mp , P, w0 .
• Suppose ♦ϕ ∈ Γ, so ¬¬ϕ ∈ Γ. Then there is a ∆ such that ¬(¬ϕ % ⊤) ∈ ∆,
again because ¬ϕ ∈ L(%, ≻)[p]. For this µ∆ then, Mp , {µ∆ } ⊭ ¬ϕ % ⊤. In other
words, µ∆ (JϕKMp ,{µ∆ } ) ≠ 0. The existence of this µ∆ ∈ P shows that ♦ϕ is true at
Mp , P, w0 .
Given the above truth lemma, Mp , P, w0 |= γ since γ ∈ Γ and γ ∈ L(%, ≻, ♦)[p]. Hence
we have successfully found a model for the arbitrarily chosen consistent γ, completing the
proof of the completeness of IP(%, ≻, ♦).
5.2 Complexity
In this section, we briefly comment on the complexity of the consistency problem of the
logic IP(%, ≻, ♦) or equivalently the satisfiability problem of L(%, ≻, ♦). First, adapting
the proof of Theorem 9 in Harrison-Trainor et al. 2017, it is not hard to see that the
satisfiability problem for a conjunction of literals where we take formulas in both Prop and
{♦ϕ | ϕ ∈ L(%, ≻)} as atomic formulas is in NP (note that Theorem 2.6 in Fagin et al.
1990, used in the proof of Harrison-Trainor et al. 2017, allows strict inequalities). Hence
the satisfiability problem for LSimp is also in NP. Then to see that the satisfiability problem
for L(%, ≻, ♦) is in NP, it is enough to show that every ϕ ∈ L(%, ≻, ♦) is equivalent to
a disjunction of formulas in LSimp where each disjunct’s length is bounded by O(|ϕ|). In
our proof of Lemma 5.10 above, this is done by extracting ♦ from the scope of % and ≻
and eliminating ♦ in the scope of ♦. Note that the elimination of ♦ in the scope of ♦ can
be done before the extraction: given an input formula ϕ, replace each subformula ♦χ not
in the scope of any ♦ by ♦T−♦ (χ). The resulting formula, which we call ϕ′ , is clearly at
most four times longer than ϕ. Then we only need to run the process of (1) extracting
♦’ed formulas in the scope of % or ≻ and (2) adding a ♦ to a % formula or a ≻ formula
when both arguments to the % or ≻ no longer contain modal operators. This process, while
introducing disjunctions exponentially, only grows the length of the disjuncts by at most a
constant for each extracting operation. The number of total extracting operations is clearly
at most the length of the input formula ϕ′ . Thus, we obtain the following.
Theorem 5.12. The complexity of the satisfiability problem for L(%, ≻, ♦) is NP-complete.
5.3 Expressivity
Reflecting the failure of uniform substitution, for any purely propositional formula α, ♦α is
already expressible in L(%).
Lemma 5.13. Let α, β be propositional formulas. Then:
1. ♦α is equivalent to ¬(⊥ % α);
2. ♦(α % β) and ♦¬(β ≻ α) are both equivalent to ¬(β ≻ α);
3. ♦(β ≻ α) and ♦¬(α % β) are both equivalent to ¬(α % β).
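As a sanity check on item 1 of the lemma, the equivalence of ♦α and ¬(⊥ % α) can be verified semantically on small finite models. The following Python sketch is our own illustration (the dictionary representation of measures and the function names are not from the paper): ♦α holds iff some measure in P gives the set expressed by α positive probability, while ⊥ % α holds iff every measure gives it probability 0.

```python
from fractions import Fraction

def dia(P, A):
    """Truth of ♦α: some measure in the credal set P gives the event A positive probability."""
    return any(sum(mu[w] for w in A) > 0 for mu in P)

def bot_geq(P, A):
    """Truth of ⊥ % α: every µ in P has µ(∅) ≥ µ(A), i.e. µ(A) = 0."""
    return all(sum(mu[w] for w in A) == 0 for mu in P)

P1 = [{"w": Fraction(1, 2), "v": Fraction(1, 2)}, {"w": Fraction(0), "v": Fraction(1)}]
P2 = [{"w": Fraction(0), "v": Fraction(1)}]
A = {"w"}  # the event expressed by some propositional formula α

# Lemma 5.13(1): ♦α is equivalent to ¬(⊥ % α), here on both credal sets.
check1 = dia(P1, A) == (not bot_geq(P1, A))
check2 = dia(P2, A) == (not bot_geq(P2, A))
```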
However, ♦ϕ is not in general expressible without ♦.
Example 5.14. The formula ♦(p ≈ ¬p) is not equivalent to any formula of L(%, ≻).
Consider again the propositional model M = hW, V i where W = {w, v} and V (p) = {w}
while V (q) = ∅ for all q ∈ Prop \ {p}. Then let P be the set of all probability measures on
℘(W ) and P ′ the set of all probability measures µ on ℘(W ) except the ones that give equal
probability to {w} and {v}. Then %P and %P ′ (resp. ≻P and ≻P ′ ) are the same on ℘(W )
and are pictured below:
[Hasse diagram: W at the top, {w} and {v} incomparable in the middle, ∅ at the bottom.]
Thus, using Propositions 4.15 and 4.14, for any ϕ ∈ L(%, ≻), M, P, w |= ϕ iff M, P ′ , w |= ϕ.
Yet M, P, w |= ♦(p ≈ ¬p) while M, P ′ , w ⊭ ♦(p ≈ ¬p).
Now we characterize the expressivity of L(%, ≻, ♦) precisely.
Proposition 5.15. Let hW, V, Pi and hW ′ , V ′ , P ′ i be IP models and w, w′ worlds in W
and W ′ , respectively. Let p be a subset of Prop. Then w and w′ satisfy the same formulas
in L(%, ≻, ♦) using only propositional variables in p if
1. w and w′ satisfy the same propositional variables in p,
2. for any µ ∈ P, there is µ′ ∈ P ′ such that hW, V, {µ}i and hW ′ , V ′ , {µ′ }i are order-similar in p, and
3. for any µ′ ∈ P ′ , there is µ ∈ P such that hW, V, {µ}i and hW ′ , V ′ , {µ′ }i are order-similar in p.
The converse also holds if in addition p is finite.
Proof. The left-to-right direction is again easy. For the only non-obvious case, suppose
for example that the second clause fails: there is a µ ∈ P such that for any µ′ ∈ P ′ ,
hW, V, {µ}i and hW ′ , V ′ , {µ′ }i are not order-similar in p. Then let {αi }1≤i≤n be a finite
set of Boolean formulas such that every Boolean formula using only letters in p is logically
equivalent to some αi (such a set can be found using disjunctive normal forms). We can
now describe µ in full relative to p by the conjunction χ = ⋀1≤i,j≤n sij (αi % αj ), where sij
is empty if µ(Jαi KhW,V i ) ≥ µ(Jαj KhW,V i ) and is ¬ otherwise. Indeed, by the definition of
order-similarity, whenever hW, V, {µ}i and hW ′ , V ′ , {µ′ }i are not order-similar in p, at any
world in W ′ , χ is false. This means that w′ would falsify ♦χ, but w satisfies ♦χ, showing
that the two worlds disagree on a formula in L(%, ≻, ♦).
The right-to-left direction follows from the normal form lemma, Lemma 5.10. If the last
two clauses hold, then for any formula of the form ♦ϕ where ϕ ∈ L(%, ≻)[p], ♦ϕ is true
at hW, V i, P, w iff it is true at hW ′ , V ′ i, P ′ , w′ . By the first clause, the two pointed IP models also
satisfy the same propositional variables in p. Then by a simple induction, they satisfy the
same formulas in LSimp [p]. But by Lemma 5.10, this is enough for them to satisfy the same
formulas in L(%, ≻, ♦)[p].
The special case where the two IP models share the same propositional model is again
worth spelling out.
Proposition 5.16. Let M = hW, V i be a propositional model, w and w′ two worlds in
W , and P and P ′ nonempty sets of probability measures defined on fields of sets extending
V [Prop]. Let p be a subset of Prop and F the field of sets on W generated by V [p]. Then
M, P, w and M, P ′ , w′ satisfy the same formulas in L(%, ≻, ♦)[p] if
• w and w′ satisfy the same propositional variables in p,
• for any µ ∈ P, there is µ′ ∈ P ′ such that %µ |F = %µ′ |F , and
• for any µ′ ∈ P ′ , there is µ ∈ P such that %µ |F = %µ′ |F .
The converse also holds if in addition p is finite.
6 Dynamics
In this section, we consider two kinds of information dynamics in the context of imprecise
probability. The first is a standard notion of updating a set of probability measures on new
evidence (see, e.g., Halpern 2003, p. 81), where we can eliminate both possible worlds (keeping only the worlds compatible with the evidence) and probability measures (keeping only
the probability measures that give the evidence positive probability). Usually,
especially in a Bayesian framework, such updates are all we need for information dynamics,
since we can always model agents with a universal and all-inclusive state space, anticipating
all distinctions that could be made among states. However, there are numerous examples
where an agent is not initially aware of a distinction. In Example 1.1, the agent is not
initially aware of the gland and hence the distinction between a swollen and normal gland.
When the doctor tells the agent about the gland, we can model the agent as first learning
the mere existence of a new proposition—the swollen gland proposition—and then learning
how this proposition relates probabilistically to her having the disease. Without imprecise
probability, we face the perennial question of how to assign a probability for such a new
proposition. Given imprecise probability, however, we can simply choose the set of all probability measures that are compatible with one of the old probability measures. This models
how an agent can “initialize” her uncertainty toward a newly introduced proposition.
In the next two subsections, we discuss the two dynamic operators in more detail. For the
update operator, we show how it does not add expressivity to the language L(%, ≻, ♦), and
we present a sound and complete logic following the standard “reduction axiom” strategy
in dynamic epistemic logic. For the operators modeling the introduction of new propositions, however, we show that they significantly increase expressivity, and we leave the
axiomatization of the valid formulas as an open question.
6.1 Updating Probabilities and the Logic IP(%, ≻, ♦, h i)
In this subsection, we introduce the update operator h i that models learning the truth of a
proposition. Given an initial set P of probability measures, after learning some proposition
U ⊆ W with certainty, we update the set P to the set
PU = {µ(· | U ) : µ ∈ P, µ(U ) > 0},
where µ(· | U ) is defined by conditionalization as usual: for any V ⊆ W , µ(V | U ) = µ(V ∩ U )/µ(U ).
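For finite W this update is easy to compute. The following Python sketch is our own illustration (the function names `conditionalize` and `update` are not from the paper): measures are dictionaries from worlds to exact rational probabilities, and P_U keeps only the measures giving U positive probability, conditionalizing each.

```python
from fractions import Fraction

def conditionalize(mu, U):
    """Bayes-update a single measure mu (dict world -> prob) on event U (set of worlds)."""
    z = sum(mu[w] for w in U)
    assert z > 0  # only called for measures giving U positive probability
    return {w: (mu[w] / z if w in U else Fraction(0)) for w in mu}

def update(P, U):
    """P_U: discard measures giving U probability 0, then conditionalize the rest."""
    return [conditionalize(mu, U) for mu in P if sum(mu[w] for w in U) > 0]

# Two measures on W = {a, b, c}; the second gives the evidence {a, b}... wait,
# rather: the second gives world a probability 0, but {a, b} probability 1/2, so it survives;
# to make elimination visible we update on U = {a} instead.
P = [
    {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)},
    {"a": Fraction(0), "b": Fraction(1, 2), "c": Fraction(1, 2)},
]
P_U = update(P, {"a", "b"})
# Only measures with positive mass on {a, b} are kept (here both), each renormalized;
# updating on {"a"} instead would eliminate the second measure entirely.
P_a = update(P, {"a"})
```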
Since we have a formal language with comparative probability operators, we can model
updating on sentences containing not only factual formulas but also comparative probability
formulas (cf. Weatherson 2007; Yalcin 2011; Moss 2018), as in “it is raining, and it is more
likely that there will be hail than it is that there will be lightning” (r ∧ (h ≻ ℓ)). Intuitively,
if Ann tells Bob that “hail is more likely than lightning,” she is not telling Bob something
about his own epistemic state (which he already knows, in the models of this paper) but
is rather recommending that he update his epistemic state to one according to which hail
is more likely than lightning—which he can do by discarding from his set of measures any
measure according to which hail is not more likely than lightning.5 Our semantics below,
developed in the style of dynamic epistemic logic (see, e.g., van Ditmarsch et al. 2008; van
Benthem 2011), will allow such updates in response to comparative probability claims.
Definition 6.1. The language L(%, ≻, ♦, h i) is defined by the following grammar:
ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ) | ♦ϕ | hϕiϕ
where p ∈ Prop. We read hαiϕ as “(update with α is possible and) after update with α, ϕ
is the case.” As usual, [α]ϕ abbreviates ¬hαi¬ϕ.
Definition 6.2. We extend the semantics of Definition 5.2 to L(%, ≻, ♦, h i) as follows:
• M, P, w |= hϕiψ iff there is a µ ∈ P such that µ(JϕKM,{µ} ) ≠ 0 and M, Pϕ , w |= ψ,
where
Pϕ = {ν(· | JϕKM,{ν} ) : ν ∈ P and ν(JϕKM,{ν} ) ≠ 0}.
Lemma 6.3. The semantics for [ϕ]ψ is as follows:
• M, P, w |= [ϕ]ψ iff: if there is a µ ∈ P such that µ(JϕKM,{µ} ) ≠ 0, then M, Pϕ , w |= ψ.
The following lemma states how updating with a formula ϕ % ψ, if possible, results in
restricting one’s set of measures to just those that individually satisfy ϕ % ψ.
Lemma 6.4. For any IP model hM, Pi and ϕ, ψ ∈ L(%, ≻, ♦), either Pϕ%ψ = ∅ or
Pϕ%ψ = {ν ∈ P : M, {ν} |= ϕ % ψ}.
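The content of the lemma can be seen concretely for finite models: under a single measure ν, a comparison formula denotes either all of W or ∅, so conditionalization is trivial and the update simply filters the credal set. A Python sketch, with our own representation of measures as dictionaries (not from the paper):

```python
from fractions import Fraction

def update_with_comparison(P, A, B):
    """P_(α % β) in the spirit of Lemma 6.4: the denotation of α % β under a single
    measure ν is all of W or ∅, so conditionalizing does nothing, and the update just
    keeps the measures ν with ν(A) ≥ ν(B)."""
    return [nu for nu in P if sum(nu[w] for w in A) >= sum(nu[w] for w in B)]

P = [
    {"h": Fraction(1, 2), "l": Fraction(1, 4), "n": Fraction(1, 4)},  # hail more likely
    {"h": Fraction(1, 5), "l": Fraction(3, 5), "n": Fraction(1, 5)},  # lightning more likely
]
# Updating with "hail is at least as likely as lightning" discards the second measure.
P2 = update_with_comparison(P, {"h"}, {"l"})
```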
Let us see how this framework can be used to formalize the three prisoners scenario from
Example 1.2.
Example 6.5. Let ei and si stand for ‘prisoner i will be executed ’ and ‘the jailer says that
prisoner i will be executed’, respectively. Define a propositional model M = hW, V i with
W = {wab , wac , wbc , wcb }
where at wij , prisoner i is the only prisoner who lives and prisoner j is the prisoner who the
jailer says will be executed, so
V (ea ) = {wbc , wcb }, V (eb ) = {wab , wac , wcb }, V (ec ) = {wab , wac , wbc },
V (sb ) = {wab , wcb }, V (sc ) = {wac , wbc }.
Since prisoner a knows that each prisoner is equally likely to be executed but has no idea
about how the jailer is likely to answer his question about which of b or c will be executed
(except that the jailer is certain to give a true answer), prisoner a’s epistemic state may be
modelled by the following set of probability measures:
P = {µ : µ({wab , wac }) = µ({wbc }) = µ({wcb }) = 1/3}.
5 Another possible interpretation is that there is some objectively correct probability measure, and Ann is telling Bob a fact about that measure, which he wants his probabilities to ultimately match.
Then the following formulas together capture what is distinctive about the puzzle, all coming
out true in this model. First, we can state that each prisoner is equally likely to be spared—
indeed that each has one-third chance:
α := (⊥ % (ea ∧ eb ∧ ec )) ∧ (((ea ∧ eb ) ∨ (ea ∧ ec ) ∨ (eb ∧ ec )) % ⊤) ∧ (ea ≈ eb ) ∧ (eb ≈ ec ).
Second, we can state that the jailer only announces truthfully one of sb and sc :
β := ((sb → eb ) % ⊤) ∧ ((sc → ec ) % ⊤) ∧ (⊥ % (sb ∧ sc )).
Given the dynamic operator, we can also express a fact about how a’s uncertainty is affected
upon learning that b is to be executed. After this announcement, a’s credences dilate from
a sharp two-thirds probability to including the possibilities that he is sure to be executed
and that he has merely one-half probability of being executed:
hsb i(♦(ea % ⊤) ∧ ♦(ea ≈ ¬ea )).
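The dilation asserted by this formula can be checked by direct computation. In the Python sketch below (the parametrization by x is ours, not from the paper), a measure in prisoner a's credal set is determined by x = µ({wab }), the probability that a lives and the jailer names b, which ranges over [0, 1/3]:

```python
from fractions import Fraction

# Prisoner a's credal set from Example 6.5: every measure fixes
# mu(a lives) = mu(b lives) = mu(c lives) = 1/3, leaving free only
# x = mu({w_ab}) in [0, 1/3], with mu({w_ac}) = 1/3 - x.
def posterior_ea_given_sb(x):
    # e_a = {w_bc, w_cb} ("a will be executed"), s_b = {w_ab, w_cb}.
    mu_sb = x + Fraction(1, 3)      # mu(s_b) = mu(w_ab) + mu(w_cb)
    return Fraction(1, 3) / mu_sb   # mu(e_a | s_b) = mu(w_cb) / mu(s_b)

# Before the update every measure gives e_a the sharp probability 2/3.
# After updating on s_b the posteriors sweep out the interval [1/2, 1]:
lo = posterior_ea_given_sb(Fraction(1, 3))  # jailer always names b when a lives
hi = posterior_ea_given_sb(Fraction(0))     # jailer never names b when a lives
```

The endpoints correspond to the two conjuncts of the formula: hi = 1 witnesses ♦(ea % ⊤) and lo = 1/2 witnesses ♦(ea ≈ ¬ea ).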
If, however, a first updates with the information that the jailer is following a protocol of
reporting b or reporting c with equal probability in the case that a is to be spared, then
dilation no longer occurs. In fact, the probability of ea remains at two-thirds, and for
instance the following formula is true:
h(¬ea ∧ sb ) ≈ (¬ea ∧ sc )ihsb i((ea ≻ ec ) ∧ (ea ≻ ¬ea ) ∧ (⊤ ≻ ea )).
Finally, were a to update with the information that the jailer would certainly announce eb
in case ea were false, then the probabilities of ea , eb , and ec would all remain equally likely:
h⊥ % (¬ea ∧ sc )iα.
But after learning that b will be executed, the probability of ea decreases to one-half:
h⊥ % (¬ea ∧ sc )ihsb i(ea ≈ ¬ea ).
It is important to note that we do not have to resort to the particular model above to
model the prisoner case. Indeed, the following formulas are true at any pointed IP model
and hence also provable in the complete logic to be presented:
(α ∧ β) → [(¬ea ∧ sb ) ≈ (¬ea ∧ sc )]hsb i((ea ≻ ec ) ∧ (ea ≻ ¬ea ) ∧ (⊤ ≻ ea )) (2)
(α ∧ β) → [⊥ % (¬ea ∧ sc )](α ∧ hsb i(ea ≈ ¬ea )) (3)
(α ∧ β) → ((♦(⊥ % (¬ea ∧ sb )) ∧ ♦(⊥ % (¬ea ∧ sc ))) → hsb i(♦(ea % ⊤) ∧ ♦(ea ≈ ¬ea ))) (4)
In (2) and (3), we have to use [ ] instead of h i since there are models that satisfy α ∧ β
but do not contain probability measures satisfying either (¬ea ∧ sb ) ≈ (¬ea ∧ sc ) or ⊥ %
(¬ea ∧ sc ), unlike the particular model above using the all-inclusive P. To cope with this,
we need to use the box version of the update operator. In formula (4), the extra premise
♦(⊥ % (¬ea ∧ sb )) ∧ ♦(⊥ % (¬ea ∧ sc )) is again required since dilation crucially relies on P
containing both a measure assigning 0 to ¬ea ∧ sb and a measure assigning 0 to ¬ea ∧ sc .
In our current language, using the ♦ operator is the most straightforward way to express
this. An equivalent way is to use ¬((¬ea ∧ sb ) ≻ ⊥) ∧ ¬((¬ea ∧ sc ) ≻ ⊥). However, the ♦ in
♦(ea ≈ ¬ea ) is necessary: there is no formula in L(%, ≻) that is equivalent to ♦(ea ≈ ¬ea ).
To obtain a complete logic for reasoning about updating sets of probability measures, we
follow the standard “reduction axiom” strategy used in dynamic epistemic logic: identify
a set of valid biconditionals that allow us to reduce any formula containing the dynamic
operators hϕi to an equivalent formula of L(%, ≻, ♦) without dynamic operators, which can
then be handled by the complete logic for L(%, ≻, ♦).
Definition 6.6. The logic IP(%, ≻, ♦, h i) is the smallest set of L(%, ≻, ♦, h i) formulas that
is (i) closed under modus ponens and the rule of replacement of equivalents, and (ii) contains
all theorems of IP(%, ≻, ♦) as well as all instances of the following axiom schemas where
p ∈ Prop and α and β are propositional:
(R0) hϕip ↔ (♦ϕ ∧ p);
(R1) hϕi♦ψ ↔ ♦hϕiψ;
(R2) hϕi¬ψ ↔ (♦ϕ ∧ ¬hϕiψ);
(R3) hϕi(ψ ∧ χ) ↔ (hϕiψ ∧ hϕiχ);
(R4) hϕi(α % β) ↔ (♦ϕ ∧ ((ϕ ∧ α) % (ϕ ∧ β)));
(R5) hϕi(α ≻ β) ↔ (♦ϕ ∧ ((ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β)))).
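The reduction axioms can be read directly as a rewriting procedure. The following Python sketch is our own encoding (formulas as nested tuples, with ⊥ encoded as a dedicated atom, and implication encoded Boolean-ly); it pushes an update through a formula whose comparisons are between propositional formulas, as (R4) and (R5) require:

```python
# Formulas as tuples: ('p', name) for atoms, ('not', x), ('and', x, y),
# ('geq', x, y) for %, ('gt', x, y) for ≻, ('dia', x) for ♦.

def Imp(x, y):
    """x -> y, encoded as ¬(x ∧ ¬y)."""
    return ('not', ('and', x, ('not', y)))

def reduce_upd(a, x):
    """Rewrite <a>x into an update-free formula using (R0)-(R5)."""
    tag = x[0]
    if tag == 'p':                   # (R0): <a>p <-> ♦a ∧ p
        return ('and', ('dia', a), x)
    if tag == 'dia':                 # (R1): <a>♦x <-> ♦<a>x
        return ('dia', reduce_upd(a, x[1]))
    if tag == 'not':                 # (R2): <a>¬x <-> ♦a ∧ ¬<a>x
        return ('and', ('dia', a), ('not', reduce_upd(a, x[1])))
    if tag == 'and':                 # (R3): <a>(x ∧ y) <-> <a>x ∧ <a>y
        return ('and', reduce_upd(a, x[1]), reduce_upd(a, x[2]))
    if tag == 'geq':                 # (R4): <a>(α % β) <-> ♦a ∧ ((a ∧ α) % (a ∧ β))
        al, be = x[1], x[2]
        return ('and', ('dia', a), ('geq', ('and', a, al), ('and', a, be)))
    if tag == 'gt':                  # (R5): <a>(α ≻ β) <-> ♦a ∧ ((a ≻ ⊥) -> ((a ∧ α) ≻ (a ∧ β)))
        al, be = x[1], x[2]
        body = Imp(('gt', a, ('p', 'bot')), ('gt', ('and', a, al), ('and', a, be)))
        return ('and', ('dia', a), body)
    raise ValueError(tag)

# <r>(w % p) reduces to ♦r ∧ ((r ∧ w) % (r ∧ p)):
example = reduce_upd(('p', 'r'), ('geq', ('p', 'w'), ('p', 'p')))
```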
Example 6.7. In a given model, we may ask if after the agent updates with the information
that it is raining and that hail is more likely than lightning tonight the agent judges that it
is at least as likely that a window will break as it is that the power will go out:
hr ∧ (h ≻ l)i(w % p).
This is equivalent, in light of the reduction axiom (R4), to
♦(r ∧ (h ≻ l)) ∧ (((r ∧ (h ≻ l)) ∧ w) % ((r ∧ (h ≻ l)) ∧ p)),
which is in turn equivalent to
♦(r ∧ (h ≻ l)) ∧ ((h ≻ l) → ((r ∧ w) % (r ∧ p))),
i.e., there is some measure that gives r non-zero probability and gives h greater probability
than l, and every measure that gives h greater probability than l also makes the probability
of w conditional on r at least as great as the probability of p conditional on r.
The rest of this section is devoted to the proof of the following theorem.
Theorem 6.8 (Soundness and Completeness). For all ϕ ∈ L(%, ≻, ♦, h i): ϕ is a theorem of
IP(%, ≻, ♦, h i) if and only if ϕ is valid with respect to the class of all imprecise probabilistic
models.
The soundness of IP(%, ≻, ♦, h i) is less trivial than the soundness of the previous systems.
More importantly, we will use its soundness to prove its completeness, similar to the proof
of completeness of other dynamic epistemic logics axiomatized by reduction axioms.
Proposition 6.9. For all ϕ ∈ L(%, ≻, ♦, h i): if ϕ is a theorem of IP(%, ≻, ♦, h i), then ϕ is
valid with respect to the class of all imprecise probabilistic models.
Proof. Clearly it is enough to check the validity of (R0) to (R5).
• For (R0), note that the valuation of p is invariant under the updating.
• For (R1), the key is to treat hϕi♦ as a whole, whence the semantics of hϕi♦ψ at
M, P, w is that there is a µ ∈ Pϕ such that µ(JψKM,{µ} ) > 0. But given the construction of Pϕ , this is precisely saying that there is a ν ∈ P such that ν(JϕKM,{ν} ) > 0
and that, letting µ = ν(· | JϕKM,{ν} ), we have µ(JψKM,{µ} ) > 0. Now note that
for any ν ∈ P such that ν(JϕKM,{ν} ) > 0, letting µ = ν(· | JϕKM,{ν} ), we have
JhϕiψKM,{ν} = JψKM,{µ} since {ν}ϕ = {µ}. Hence the truth condition of hϕi♦ψ
is transformed into the existence of ν ∈ P such that ν(JϕKM,{ν} ) > 0 and
ν(JhϕiψKM,{ν} ) > 0. But this is precisely the truth condition of ♦hϕiψ.
• For (R2), the key insight is that at M, P, w, assuming that there is a ν ∈ P such that
ν(JϕKM,{ν} ) > 0, we have:
M, P, w |= hϕi¬ψ ⇐⇒ M, Pϕ , w |= ¬ψ
⇐⇒ M, Pϕ , w ⊭ ψ
⇐⇒ M, P, w |= ¬hϕiψ.
• For (R3), the idea is similar to the above.
• For (R4), it is enough to observe the following chain of equivalences assuming that
there is a ν ∈ P such that ν(JϕKM,{ν} ) > 0:
M, P, w |= hϕi(α % β) ⇐⇒ M, Pϕ , w |= α % β
⇐⇒ ∀µ ∈ Pϕ , µ(JαKM,Pϕ ) ≥ µ(JβKM,Pϕ )
⇐⇒ ∀µ ∈ Pϕ , µ(V (α)) ≥ µ(V (β))
⇐⇒ ∀ν ∈ P such that ν(JϕKM,{ν} ) > 0,
ν(V (α) | JϕKM,{ν} ) ≥ ν(V (β) | JϕKM,{ν} )
⇐⇒ ∀ν ∈ P such that ν(JϕKM,{ν} ) > 0,
ν(V (α) ∩ JϕKM,{ν} ) ≥ ν(V (β) ∩ JϕKM,{ν} )
⇐⇒ ∀ν ∈ P, ν(V (α) ∩ JϕKM,{ν} ) ≥ ν(V (β) ∩ JϕKM,{ν} )
⇐⇒ ∀ν ∈ P, M, {ν} |= (ϕ ∧ α) % (ϕ ∧ β)
⇐⇒ ∀ν ∈ P, ν(J(ϕ ∧ α) % (ϕ ∧ β)KM,{ν} ) = 1
⇐⇒ M, P, w |= ((ϕ ∧ α) % (ϕ ∧ β)).
Note that the last three equivalences extensively use the fact that a Boolean combination of comparison formulas is true at a world if and only if it is true at all worlds.
The sixth equivalence is true because when ν(JϕKM,{ν} ) = 0, it trivially holds that
ν(V (α) ∩ JϕKM,{ν} ) ≥ ν(V (β) ∩ JϕKM,{ν} ).
• For (R5), the strategy is the same—it is enough to observe the following chain of
equivalences assuming that there is a ν ∈ P such that ν(JϕKM,{ν} ) > 0:
M, P, w |= hϕi(α ≻ β) ⇐⇒ M, Pϕ , w |= α ≻ β
⇐⇒ ∀µ ∈ Pϕ , µ(JαKM,Pϕ ) > µ(JβKM,Pϕ )
⇐⇒ ∀µ ∈ Pϕ , µ(V (α)) > µ(V (β))
⇐⇒ ∀ν ∈ P such that ν(JϕKM,{ν} ) > 0,
ν(V (α) | JϕKM,{ν} ) > ν(V (β) | JϕKM,{ν} )
⇐⇒ ∀ν ∈ P such that ν(JϕKM,{ν} ) > 0,
ν(V (α) ∩ JϕKM,{ν} ) > ν(V (β) ∩ JϕKM,{ν} )
⇐⇒ ∀ν ∈ P, if M, {ν} |= ϕ ≻ ⊥ then M, {ν} |= (ϕ ∧ α) ≻ (ϕ ∧ β)
⇐⇒ ∀ν ∈ P, M, {ν} |= (ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β))
⇐⇒ ∀ν ∈ P, ν(J(ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β))KM,{ν} ) = 1
⇐⇒ M, P, w |= ((ϕ ≻ ⊥) → ((ϕ ∧ α) ≻ (ϕ ∧ β))).
Again, the last four equivalences extensively use the fact that a Boolean combination
of comparison formulas is true at a world if and only if it is true at all worlds.
For completeness, we first show that the axioms allow us to reduce, provably equivalently,
any formula in L(%, ≻, ♦, h i) to a fragment LSimpd1 that is even simpler than the fragment
LSimp : the comparison formulas in the scope of any ♦ must not contain nested comparisons.
Definition 6.10. Let LBool be the set of propositional formulas. In other words, this is the
fragment generated from Prop by ¬ and ∧.
Let LCompd1 be the fragment of L(%, ≻) with no nesting of % and ≻. In other words,
this is the fragment generated from Prop and {(α % β), (α ≻ β) | α, β ∈ LBool } by ¬ and ∧.
Finally, let LSimpd1 be the fragment of L(%, ≻, ♦) generated from Prop and {♦ϕ | ϕ ∈
LCompd1 } by ¬ and ∧.
Lemma 6.11. For every ϕ ∈ L(%, ≻), there is a TCompd1 (ϕ) ∈ LCompd1 such that ϕ ↔
TCompd1 (ϕ) ∈ IP(%, ≻). Moreover, ϕ and TCompd1 (ϕ) use the same propositional variables.
Proof. We use a standard argument for extracting comparisons embedded in comparisons.
Formally, an induction over L(%, ≻) is needed. The base case and the inductive cases for
¬ and ∧ are trivial as we can simply define TCompd1 (p) = p, TCompd1 (¬ϕ) = ¬TCompd1 (ϕ),
and TCompd1 (ϕ ∧ ψ) = TCompd1 (ϕ) ∧ TCompd1 (ψ).
For the non-trivial cases for % and ≻, we only need the following: for any α, β ∈ LBool
and ϕ, ψ ∈ LCompd1 , the following are in IP(%, ≻):
(ϕ % ψ) ↔ (((α % β) ∧ (ϕ[α % β/⊤] % ψ[α % β/⊤])) ∨ (¬(α % β) ∧ (ϕ[α % β/⊥] % ψ[α % β/⊥])));
(ϕ % ψ) ↔ (((α ≻ β) ∧ (ϕ[α ≻ β/⊤] % ψ[α ≻ β/⊤])) ∨ (¬(α ≻ β) ∧ (ϕ[α ≻ β/⊥] % ψ[α ≻ β/⊥])));
(ϕ ≻ ψ) ↔ (((α % β) ∧ (ϕ[α % β/⊤] ≻ ψ[α % β/⊤])) ∨ (¬(α % β) ∧ (ϕ[α % β/⊥] ≻ ψ[α % β/⊥])));
(ϕ ≻ ψ) ↔ (((α ≻ β) ∧ (ϕ[α ≻ β/⊤] ≻ ψ[α ≻ β/⊤])) ∨ (¬(α ≻ β) ∧ (ϕ[α ≻ β/⊥] ≻ ψ[α ≻ β/⊥]))).
They are proven mainly by (B7) to (B10). The key idea is to first derive the following:
(α % β) → ((ϕ ↔ ϕ[α % β/⊤]) % ⊤);
¬(α % β) → ((ϕ ↔ ϕ[α % β/⊥]) % ⊤);
(α ≻ β) → ((ϕ ↔ ϕ[α ≻ β/⊤]) % ⊤);
¬(α ≻ β) → ((ϕ ↔ ϕ[α ≻ β/⊥]) % ⊤).
Together with ((ϕ ↔ ψ) % ⊤) → ((ϕ % χ) ↔ (ψ % χ)) and ((ϕ ↔ ψ) % ⊤) → ((ϕ ≻ χ) ↔
(ψ ≻ χ)), the required equivalences can easily be derived.
Proposition 6.12. For every ϕ ∈ L(%, ≻, ♦) there is a TSimpd1 (ϕ) ∈ LSimpd1 such that
ϕ ↔ TSimpd1 (ϕ) is in IP(%, ≻, ♦).
Proof. The result of replacing all ♦χ in TSimp (ϕ) by ♦TCompd1 (χ) is the desired TSimpd1 (ϕ).
Proposition 6.13. For every ϕ ∈ L(%, ≻, ♦, h i) there is a TSimpd1 (ϕ) ∈ LSimpd1 such that
ϕ ↔ TSimpd1 (ϕ) is in IP(%, ≻, ♦, h i).
Proof. We proceed by induction. Given Proposition 6.12 and the rule of replacement of
equivalents, the only non-trivial case is to show that there is a TSimpd1 (hϕiψ) that is provably
equivalent to hϕiψ in IP(%, ≻, ♦, h i) where ϕ, ψ are in LSimpd1 . By repeated use of (R1)
to (R3) and the rule of replacement of equivalents, obviously we can push the hϕi into
ψ over Boolean connectives and ♦ and obtain a Boolean combination of formulas of the
form hϕip or of the form hϕi(α % β) or hϕi(α ≻ β) since in LSimpd1 , % and ≻ only scope
over propositional formulas. All three kinds of formulas can be replaced by formulas in
L(%, ≻, ♦) provably equivalently. Then we apply TSimpd1 again to finish off (to eliminate
any ♦’s appearing inside ♦’s).
With the above reduction method, the completeness of IP(%, ≻, ♦, h i) follows.
Proposition 6.14. For all ϕ ∈ L(%, ≻, ♦, h i): if ϕ is valid with respect to the class of all
imprecise probabilistic models, then ϕ is a theorem of IP(%, ≻, ♦, h i).
Proof. Let ϕ be any valid formula in L(%, ≻, ♦, h i). Then by the soundness of IP(%, ≻, ♦, h i)
and the fact that ϕ ↔ TSimpd1 (ϕ) ∈ IP(%, ≻, ♦, h i), TSimpd1 (ϕ) is also valid. But
TSimpd1 (ϕ) ∈ LSimpd1 ⊆ L(%, ≻, ♦). By the completeness of IP(%, ≻, ♦), TSimpd1 (ϕ) ∈ IP(%, ≻, ♦).
By the definition of IP(%, ≻, ♦, h i), it contains all theorems of IP(%, ≻, ♦). Hence
TSimpd1 (ϕ) is in IP(%, ≻, ♦, h i). Then by Boolean reasoning, ϕ is in IP(%, ≻, ♦, h i).
Although the reduction axioms for L(%, ≻, ♦, h i) allow us to reduce the satisfiability
problem for L(%, ≻, ♦, h i) to that for L(%, ≻, ♦), which is in NP (Theorem 5.12), it does
not immediately follow that the satisfiability problem for L(%, ≻, ♦, h i) is in NP, due to the
blowup in the length of formulas during the reduction process. A similar obstacle occurs
in the case of the simplest dynamic epistemic logic (public announcement logic), in which
case a solution is to use a satisfiability-preserving reduction with only polynomial blowup
instead of the standard validity-preserving reduction with exponential blowup (Lutz 2006).
Whether this or other techniques apply to L(%, ≻, ♦, h i) we leave as an open problem.
Problem 6.15. Determine the complexity of the satisfiability problem for L(%, ≻, ♦, h i).
6.2 Introducing a New Proposition
In the previous subsection, we considered the dynamic update operator that concerns learning the truth of a proposition. In this subsection, we consider the complementary dynamics
of learning the mere existence of a proposition and then being maximally uncertain about
it in the way of imprecise probability (cf. Joyce 2005). Our goal is to show that this kind of
information dynamics is expressively helpful, especially in formalizing examples in a natural
way, and we leave the complete axiomatization of its logic as an open question.
Definition 6.16. The language L(%, ≻, ♦, h i, I) is defined by the following grammar:
ϕ ::= p | ¬ϕ | (ϕ ∧ ϕ) | (ϕ % ϕ) | (ϕ ≻ ϕ) | ♦ϕ | hϕiϕ | Ip+ ϕ | Ip− ϕ
where p ∈ Prop. We read Ip+ ϕ as “letting p be a true proposition that is newly introduced to
the agent, ϕ”; similarly, Ip− ϕ reads “letting p be a false proposition that is newly introduced
to the agent, ϕ”. We also take Ip ϕ as an abbreviation of (Ip+ ϕ ∧ Ip− ϕ).
We treat both Ip+ and Ip− as a kind of propositional quantifier, since they change the
meaning (denotation) of p, and we define free and bound propositional variables in the
obvious way. For any ϕ ∈ L(%, ≻, ♦, h i, I), let Prop(ϕ) be the set of freely occurring
propositional variables in ϕ.
Now we specify the semantics for I + and I − . First, we define how a model changes when
we introduce a new proposition.
Definition 6.17. Given a non-empty set W , a field of sets F on W , a valuation V such
that V (p) ∈ F for all p ∈ Prop, and a set P of finitely additive probability measures on F,
we interpret F as the collection of the “old” propositions. Our goal is to define the result of
adding a “new” proposition P . Intuitively, we first split each w ∈ W into hw, 1i and hw, 0i
corresponding to P being true and false, respectively, while keeping the truth value of the
old propositions. For the probability measures, we take all probability measures defined
on both the old and new propositions that, when restricted to just the old propositions,
coincide with some old probability measure. The following gives the formal details.
• Let F × 2 = {X × {0, 1} | X ∈ F}, which is a field of sets on W × {0, 1}.
• Let Split(F) be the smallest field of sets on W × {0, 1} extending (F × 2) ∪ {W × {0}}.
• Let V × 2 be defined such that V × 2(p) = V (p) × {0, 1} for all p ∈ Prop; note that
V × 2(p) ∈ F × 2 for all p ∈ Prop.
• For any p ∈ Prop, let V +p be defined such that V +p (q) = V (q) × {0, 1} if q ≠ p, and V +p (p) = W × {1};
note that V +p (q) ∈ Split(F) for all q ∈ Prop.
• For any finitely additive measure µ on F, define µ × 2, a finitely additive measure on
F × 2, by µ × 2(X × {0, 1}) = µ(X) for all X ∈ F.
• Let P × 2 = {µ × 2 | µ ∈ P}.
• Let Split(P) be the set of all finitely additive measures µ on Split(F) such that µ|F ×2 ∈
P × 2.
Using the above definition, given a propositional variable p ∈ Prop and a propositional
model M = hW, V i, let M × 2 = hW × {0, 1}, V × 2i and M+p = hW × {0, 1}, V +p i. Then
if hM, Pi is an IP model, so is hM+p , Split(P)i, and hM+p , Split(P)i represents the result
of adding a new proposition, now denoted by p, to hM, Pi.
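For finite W with F = ℘(W ), the measures in Split(P) are exactly the extensions obtained by splitting each world's mass between its two copies. The Python sketch below is our own illustration (the parametrization by t, assigning each world the share of its mass sent to the p-true copy, is not from the paper):

```python
from fractions import Fraction

def split_measure(mu, t):
    """One extension of mu (dict world -> prob) to the doubled space W x {0, 1}:
    t[w] in [0, 1] is the share of mu(w) sent to the copy where the new p is true.
    For finite W with the full powerset algebra, Split({mu}) consists of all such
    extensions, one for every choice of t."""
    nu = {}
    for w, pr in mu.items():
        nu[(w, 1)] = t[w] * pr
        nu[(w, 0)] = (1 - t[w]) * pr
    return nu

def marginal(nu):
    """Restricting an extension back to the old propositions recovers mu."""
    worlds = {w for (w, _) in nu}
    return {w: nu[(w, 1)] + nu[(w, 0)] for w in worlds}

mu = {"u": Fraction(1, 3), "v": Fraction(2, 3)}
# Two of the infinitely many members of Split({mu}):
nu1 = split_measure(mu, {"u": Fraction(1), "v": Fraction(0)})        # p true exactly at u-copies
nu2 = split_measure(mu, {"u": Fraction(1, 2), "v": Fraction(1, 2)})  # p like a fair coin
# Both have the same marginal on the old algebra, namely mu itself.
```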
Remark 6.18. In the algebraic theory of Boolean algebras, there is a standard operation
of freely adjoining a new element to a Boolean algebra: for any Boolean algebra B and any
a ∉ B, there is a Boolean algebra B +a , unique up to isomorphism, such that
• B is a subalgebra of B +a , and every element in B +a is generated from B ∪ {a};
• for any nonzero b ∈ B, b ∧ a and b ∧ ¬a are not the bottom element in B +a .
The operation of Split(F) is precisely the dual of this algebraic operation.
Hence, if we use an algebraic model hB, V, Pi where B is a Boolean algebra (of propositions of which the agent is currently aware), V a valuation function from Prop to B, and P
a set of finitely additive functions from B to [0, 1], we can easily define the result of adding
a new proposition a 6∈ B to be denoted by p as hB +a , V ′ , P ′ i where V ′ coincides with V on
Prop except that V ′ (p) = a and P ′ = {µ : B +a → [0, 1] | µ is finitely additive and µ|B ∈ P}.
Remark 6.19. The model construction from hM, Pi to hM+p , Split(P)i can also be viewed
as an event-model update from (probabilistic) dynamic epistemic logic (van Benthem et al.
2009). The event model contains two events {1, 0} corresponding to whether the new proposition is true or not with no preconditions, and the agent is maximally ignorant about these
two events: at any of the old worlds, she cannot distinguish between these two events, is
completely ignorant about the relative likelihood of these two events, and does not observe
which event happens. Using the terminology from van Benthem et al. (2009), the agent is
maximally and imprecisely ignorant about the occurrence probability of these two events
and makes no observation about these two events.
Definition 6.20. The semantics of Ip+ and Ip− are given by
M, P, w |= Ip− ϕ iff M+p , Split(P), hw, 0i |= ϕ ,
M, P, w |= Ip+ ϕ iff M+p , Split(P), hw, 1i |= ϕ .
Now let us put the new operators to work. We first use them to formalize the medical
example (Example 1.1).
Example 6.21. The following sentence is valid and represents the medical example if we
take p to mean that the agent has the disease (that is, the proposition introduced by Ip
is that the agent has the disease) and q to mean that the gland is swollen (that is, the
proposition introduced by Iq is that the gland is swollen).
Ip h¬p ≻ piIq h(q ∧ p) ≻ (q ∧ ¬p)ihqi(p ≻ ¬p). (5)
We interpret the first update by ¬p ≻ p as the result of the agent observing that she is
not feeling uncomfortable and hence believing that her not having the disease is more likely
than her having it. The second update represents what the agent learns from the doctor,
and the third update represents a medical examination revealing that her gland is swollen.
The above simple sentence does not capture more nuanced probabilistic relationships
between p and q such as that conditioning on q, p is twice as likely as ¬p or that the medical
examination does not reveal q but only a signal that is probabilistically related to q. But
with the new operator I, we can easily say these things. For example, to express that p is
twice as likely as ¬p conditioning on q, we may introduce two new propositions (like two coin
flips) by Ir and Is at the beginning of the formula (note that our syntax forbids embedding
I in updates) and later add after Iq the update h((q ∧ r ∧ s) ≈ (q ∧ r ∧ ¬s)) ∧ ((q ∧ ¬r ∧ s) ≈
(q ∧ ¬r ∧ ¬s)) ∧ ((q ∧ r ∧ s) ≈ (q ∧ ¬r ∧ s)) ∧ (⊥ % (q ∧ ¬r ∧ ¬s))i, which says that
conditioning on q the two coin flips are fair and independent but the two-tails situation is
impossible (perhaps because the two coins will be re-tossed if they both land tails). Then,
using h(q ∧ p) ≈ (q ∧ s)i, we essentially say that p’s probability conditioning on q is 2/3
and thus twice as likely as ¬p. To express that the medical examination only provides an
informative signal related to q, we may again introduce a new proposition t representing
that signal and then let the agent learn the probabilistic relationship between t and q.
Example 6.22. For the prisoner example, recall that α is the formula
(⊥ % (ea ∧ eb ∧ ec )) ∧ (((ea ∧ eb ) ∨ (ea ∧ ec ) ∨ (eb ∧ ec )) % ⊤) ∧ (ea ≈ eb ) ∧ (eb ≈ ec ),
saying that two of the prisoners will be executed and the probabilities for the three situations
are equal. Recall also that β is the following formula
((sb → eb ) % ⊤) ∧ ((sc → ec ) % ⊤) ∧ (⊥ % (sb ∧ sc )),
saying that the jailer will announce one and only one prisoner to be executed truthfully.
Then the following formula is valid and represents the dilation that occurs when a hears the jailer announce that b will be executed:
Iea Ieb Iec ⟨α⟩ Isb Isc ⟨β⟩⟨sb⟩(♦(ea % ⊤) ∧ ♦(ea ≈ ¬ea)).
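The dilation asserted by this formula can be verified by direct computation. In the sketch below (illustrative code; `lam` is a hypothetical parameter for the jailer's otherwise unconstrained choice of whom to name when both b and c will be executed), the prior probability of ea is the sharp value 2/3, but the posterior after the announcement sb sweeps the whole interval [1/2, 1]:

```python
from fractions import Fraction

def posterior_ea_given_sb(lam):
    """P(e_a | jailer announces b), where lam is the chance the jailer
    names b when both b and c will be executed."""
    third = Fraction(1, 3)          # each pair {a,b}, {a,c}, {b,c} has prior 1/3
    p_sb = third * 1 + third * lam  # jailer says b: surely under {a,b}, with prob lam under {b,c}
    return (third * 1) / p_sb       # e_a ∧ s_b happens only under {a,b}

# Sweeping the jailer's strategy dilates the sharp prior P(e_a) = 2/3:
posteriors = [posterior_ea_given_sb(Fraction(k, 4)) for k in range(5)]
print(min(posteriors), max(posteriors))  # 1/2 1
```

The extremes 1/2 and 1 correspond exactly to the conjuncts ♦(ea ≈ ¬ea) and ♦(ea % ⊤) of the formula above.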
As we have seen in Example 6.21, L(%, ≻, ♦, ⟨ ⟩, I) is capable of expressing numerical relationships. Leveraging this capability, it is easy to observe that L(%, ≻, ♦, ⟨ ⟩, I) is more expressive than L(%, ≻, ♦, ⟨ ⟩).
Example 6.23. Consider a propositional model M = hW, V i where W = {w, u} has two
worlds, V (p) = {w}, and V (q) = ∅ for all q ∈ Prop \ {p}. Then let µ1 be a probability
measure on ℘(W ) such that µ1 ({w}) = 0.6, and let µ2 also be a probability measure on
℘(W ) such that µ2 ({w}) = 0.9. Then it is easy to see that M, {µ1 }, w and M, {µ2 }, w
satisfy the same formulas in L(%, ≻, ♦, ⟨ ⟩). However, the following formula
Iq Ir ⟨((q ∧ r) ≈ (q ∧ ¬r)) ∧ ((q ∧ r) ≈ (¬q ∧ r)) ∧ ((q ∧ r) ≈ (¬q ∧ ¬r))⟩(p ≻ ¬(q ∧ r)),
which intuitively says that p is more likely than not obtaining two heads from two independently flipped fair coins, is true at M, {µ2}, w but false at M, {µ1}, w.
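The separation can be checked by direct computation. In the sketch below (illustrative code; `check` is a hypothetical helper), the two fresh fair coins are modeled as a product measure independent of p, so that the update reduces the formula to comparing µ(p) with P(¬(q ∧ r)) = 3/4:

```python
from fractions import Fraction
from itertools import product

def check(p_prob):
    """Extend a measure with P(p) = p_prob by two independent fair coins q, r
    (the effect of Iq Ir plus the fairness/independence announcement),
    then evaluate p ≻ ¬(q ∧ r)."""
    quarter = Fraction(1, 4)
    # joint measure over (p-value, q, r): the coins are independent of p
    mu = {(pv, q, r): (p_prob if pv else 1 - p_prob) * quarter
          for pv, q, r in product((True, False), repeat=3)}
    lhs = sum(v for (pv, q, r), v in mu.items() if pv)             # P(p) = p_prob
    rhs = sum(v for (pv, q, r), v in mu.items() if not (q and r))  # P(¬(q ∧ r)) = 3/4
    return lhs > rhs

print(check(Fraction(6, 10)), check(Fraction(9, 10)))  # False True
```

Since 0.6 < 3/4 < 0.9, the formula distinguishes {µ1} from {µ2} even though the two imprecise-probability models agree on every I-free formula.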
Indeed, we will show that L(%, ≻, ♦, ⟨ ⟩, I) can express any linear inequality with integer coefficients about the probabilities of formulas. For this, we first introduce some notation.
Definition 6.24. Let Γ be a finite set of formulas, C(Γ) the set of all clauses over Γ (conjunctions of the form ⋀ϕ∈Γ ±ϕ, where ± is either the empty string or ¬), and p a propositional variable. Then define (p|Γ) to be the formula

⋀ψ∈C(Γ) ((ψ ∧ p) ≈ (ψ ∧ ¬p)).
Intuitively, (p|Γ) says that p represents a fair coin flip independent of all events expressible
using formulas in Γ.
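This gloss can be checked concretely: if a measure splits every clause-cell over Γ evenly between p and ¬p, then p has probability 1/2 and is independent of every event generated by Γ. A minimal sketch (with arbitrary, hypothetical cell weights) for a two-formula Γ:

```python
from fractions import Fraction
from itertools import product

# Truth-value cells of Γ = {φ1, φ2}, with arbitrary test weights summing to 1.
cells = list(product((True, False), repeat=2))
weight = dict(zip(cells, [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]))

# (p|Γ) requires each cell to be split evenly between p and ¬p:
mu = {(c, pv): weight[c] / 2 for c in cells for pv in (True, False)}

p_p = sum(v for (c, pv), v in mu.items() if pv)
assert p_p == Fraction(1, 2)  # p is a fair coin

# independence from a Boolean combination of φ1, φ2, e.g. the event φ1 ∨ ¬φ2
event = {c for c in cells if c[0] or not c[1]}
p_e = sum(weight[c] for c in event)
p_e_and_p = sum(v for (c, pv), v in mu.items() if c in event and pv)
assert p_e_and_p == p_e * p_p  # µ(E ∧ p) = µ(E) · µ(p)
```

The same computation goes through for any choice of cell weights and any event built from Γ, since every such event is a union of cells.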
Proposition 6.25. For any sequences ⟨ϕi⟩i=1...n and ⟨ψi⟩i=1...m of formulas in L(%, ≻, ♦, ⟨ ⟩, I) and any sequences ⟨ai⟩i=1...n and ⟨bi⟩i=1...m of natural numbers, there is a formula χ ∈ L(%, ≻, ♦, ⟨ ⟩, I) such that for any IP model M, P, w,

M, P, w |= χ iff ∀µ ∈ P, a1 µ(⟦ϕ1⟧M,P) + · · · + an µ(⟦ϕn⟧M,P) ≥ b1 µ(⟦ψ1⟧M,P) + · · · + bm µ(⟦ψm⟧M,P).
Proof. The central idea is already in Kraft et al. (1959) and is also described in Section 2 of Ding et al. (Forthcoming): we use I operators to introduce new propositions that evenly partition the logical space spanned by the ϕi's, so that we can take the union of multiple copies of the partitioned ϕi's to simulate addition.
Let l be the smallest natural number such that 2ˡ is larger than the sum of all the ai's and bi's, and pick propositional variables ⟨pi⟩i=1...l not occurring in any of the ϕi's or ψi's. Then let C list all logically inequivalent clauses made from the pi's. Since |C| = 2ˡ and 2ˡ is larger than the sum of all coefficients, we may choose a function f from {1, . . . , n} × {0} ∪ {1, . . . , m} × {1} to ℘(C) such that f(x) ∩ f(y) = ∅ whenever x ≠ y, |f(i, 0)| = ai, and |f(i, 1)| = bi. Let Γ be the set of all the ϕi's and ψi's. Then consider the following formula:
Ip1⁺ Ip2⁺ · · · Ipl⁺ ⟨(p1|Γ) ∧ (p2|Γ ∪ {p1}) ∧ · · · ∧ (pl|Γ ∪ {p1, p2, . . . , pl−1})⟩
((⋁i=1...n ⋁c∈f(i,0) (ϕi ∧ c)) % (⋁i=1...m ⋁c∈f(i,1) (ψi ∧ c))).   (6)

This is the required formula since, after the introduction of the new propositions and the announcement, the probability of ⋁c∈f(i,0) (ϕi ∧ c) (resp. ⋁c∈f(i,1) (ψi ∧ c)) is precisely ai/2ˡ (resp. bi/2ˡ) times the probability of ϕi (resp. ψi). Cancelling out the common denominator 2ˡ, we see that the inequality expressed by formula (6) is the required one.
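The counting at the heart of this proof can be illustrated concretely. The sketch below (hypothetical code; `f_phi` and `f_psi` play the role of f(1, 0) and f(1, 1), and the test probabilities are arbitrary) instantiates the construction for the single target inequality 2µ(ϕ) ≥ 3µ(ψ):

```python
from fractions import Fraction
from itertools import product

a, b = 2, 3
l = 3                                            # smallest l with 2**l > a + b
cells = list(product((True, False), repeat=l))   # the 2**l coin clauses
f_phi, f_psi = cells[:a], cells[a:a + b]         # disjoint clause sets of sizes a and b

p_phi, p_psi = Fraction(1, 5), Fraction(1, 7)    # arbitrary test probabilities

# Coins fair and independent of ϕ and ψ, so each clause c carries weight 2**-l:
w = Fraction(1, 2**l)
lhs = sum(w * p_phi for c in f_phi)  # µ(⋁_{c∈f(1,0)} (ϕ ∧ c)) = (a/2**l)·µ(ϕ)
rhs = sum(w * p_psi for c in f_psi)  # µ(⋁_{c∈f(1,1)} (ψ ∧ c)) = (b/2**l)·µ(ψ)

# Comparing the two disjunctions is exactly comparing a·µ(ϕ) with b·µ(ψ):
assert (lhs >= rhs) == (a * p_phi >= b * p_psi)
print(lhs >= rhs, a * p_phi >= b * p_psi)  # False False, since 2/5 < 3/7
```

Scaling both sides by the common factor 1/2ˡ leaves the comparison unchanged, which is exactly the cancellation step in the proof.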
Therefore, we see that with the new operators Ip⁺ and Ip⁻, L(%, ≻, ♦, ⟨ ⟩, I) is capable of expressing quantitative (and in particular arbitrary additive) information. This also means that we cannot use the same reduction strategy we used for L(%, ≻, ♦, ⟨ ⟩) to axiomatize the logic in L(%, ≻, ♦, ⟨ ⟩, I). However, we conjecture that there is a computable translation from L(%, ≻, ♦, ⟨ ⟩, I) to L(%, ≻, ♦, ⟨ ⟩) that preserves satisfiability. Such a translation could then be encoded as rules, rather than axioms, that completely axiomatize the logic.
Problem 6.26. Find an axiomatization of the set of valid formulas in L(%, ≻, ♦, ⟨ ⟩, I).
Problem 6.27. Determine the complexity of the satisfiability problem for L(%, ≻, ♦, ⟨ ⟩, I).
7
Conclusion
In this paper, we have investigated a hierarchy of languages
L(%) ⊆ L(%, ≻) ⊆ L(%, ≻, ♦) ⊆ L(%, ≻, ♦, ⟨ ⟩) ⊆ L(%, ≻, ♦, ⟨ ⟩, I)
and matching complete logics for imprecise comparative probabilistic reasoning in the first
four languages:
IP(%) ⊆ IP(%, ≻) ⊆ IP(%, ≻, ♦) ⊆ IP(%, ≻, ♦, ⟨ ⟩).
The first four languages have straightforward extensions to the multi-agent setting, in which
each agent i has their own comparative probability relations %i and ≻i , allowing us to formalize statements such as “Ann judges it more likely than not that Bob thinks hail is more
likely than lightning”: (h ≻b l) ≻a ¬(h ≻b l). A multi-agent version of the language L(%)
was already studied in Alon and Heifetz (2014). Generalizing the other languages in this
paper to the multi-agent setting presents no major challenges, although the complexity of
the resulting multi-agent logics goes beyond that of the single-agent versions, just as the
complexity of the basic epistemic logic S5 jumps from NP to PSPACE when moving from the
single-agent to multi-agent setting (see Halpern and Moses 1992). When generalizing the
language L(%, ≻, ♦, ⟨ ⟩, I) to the multi-agent setting, there is a distinction between introducing a new proposition to every agent publicly and introducing a new proposition for only
one agent so that she becomes privately aware of it. Our semantics naturally generalizes to
model all agents publicly becoming aware of a new proposition, but the modeling of some
agent’s privately becoming aware of a new proposition requires a different treatment.
Further extensions to the language are natural to consider, such as adding comparative
conditional probability formulas (ϕ | ψ) % (α | β) (resp. (ϕ | ψ) ≻ (α | β)) expressing
that the conditional probability of ϕ given ψ is at least as great as (resp. greater than)
the conditional probability of α given β for every measure in one’s set of measures, which
is not expressible in the languages of this paper (see Luce 1968). For precise probabilistic
models, such a quaternary operator is investigated in, e.g., Domotor 1969, §2.6 and Suppes
and Zanotti 1982 (and recently in Hawthorne 2016 using so-called Popper functions), but
the interpretation in imprecise probabilistic models seems yet to be explored. Allowing
inequalities of probabilistic products (ϕ×ψ) % (α×β) would allow even greater expressivity
(such an extension in the precise case is also considered in Domotor 1969, §2.4).
More generally, the systems in this paper are part of a much broader hierarchy of probabilistic languages, ranging from the very simple L(%) all the way to highly expressive probabilistic languages encompassing full quantified real number arithmetic (Halpern, 1990). In
addition to their inherent theoretical interest, probabilistic logics have emerged as a foundational tool for many central computational tasks, from core knowledge representation
(Russell, 2015), to reasoning about strategic interaction (Dekel and Siniscalchi, 2015; van
Benthem and Klein, 2019), to causal inference (witness do-calculus, which is built on top of
a probability calculus; see, e.g., Pearl 2009; Bareinboim et al. 2020; Ibeling and Icard 2020).
Furthermore, applications in these contexts have motivated some of the very systems presented here (e.g., Alon and Heifetz 2014). Understanding the capacities and limitations of
such systems may well be an important step toward further integration of explicit probabilistic tools in these and other domains.
Acknowledgements
We thank the two reviewers for the International Journal of Approximate Reasoning for
helpful comments.
References
Shiri Alon and Aviad Heifetz. The logic of Knightian games. Economic Theory Bulletin, 2
(2):161–182, 2014.
Shiri Alon and Ehud Lehrer. Subjective multi-prior probability: A representation of a partial
likelihood relation. Journal of Economic Theory, 151:476–492, 2014.
Thomas Augustin, Frank P. A. Coolen, Gert De Cooman, and Matthias C. M. Troffaes.
Introduction to imprecise probabilities. John Wiley & Sons, 2014.
Elias Bareinboim, Juan D. Correa, Duligur Ibeling, and Thomas Icard. On Pearl’s hierarchy
and the foundations of causal inference. Technical Report R-60, Causal AI Lab, Columbia
University, 2020.
Johan van Benthem. Logical Dynamics of Information and Interaction. Cambridge University Press, New York, 2011.
Johan van Benthem and Dominik Klein. Logics for analyzing games. In Edward N. Zalta,
editor, The Stanford Encyclopedia of Philosophy. 2019.
Johan van Benthem, Jelle Gerbrandy, and Barteld Kooi. Dynamic update with probabilities.
Studia Logica, 93(1):67–96, 2009.
George Boole. An Investigation of the Laws of Thought. Walton & Maberly, 1854.
Seamus Bradley. Imprecise probabilities. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. 2019.
Seamus Bradley and Katie Steele. Uncertainty, learning, and the “problem” of dilation.
Erkenntnis, 79:1287–1303, 2014.
Rudolf Carnap. Testability and meaning (part I). Philosophy of Science, 3(4):419–471, 1936.
Inés Couso and Serafín Moral. Sets of desirable gambles: conditioning, representation, and
precise probabilities. International Journal of Approximate Reasoning, 52(7):1034–1055,
2011.
Eddie Dekel and Marciano Siniscalchi. Epistemic game theory. In Handbook of Game Theory
with Economic Applications, volume 4, pages 619–702. 2015.
Persi Diaconis. Review of “A mathematical theory of evidence” (G. Shafer). Journal of the
American Statistical Association, 73(363):677–678, 1978.
Persi Diaconis and Sandy L. Zabell. Some alternatives to Bayes’s rule. In B. Grofman and
G. Owen, editors, Information Pooling and Group Decision Making, pages 25–38. J.A.I.
Press, 1986.
Nicholas DiBella. Qualitative probability and infinitesimal probability. Draft of 9/7/18,
2018.
Yifeng Ding, Matthew Harrison-Trainor, and Wesley H. Holliday. The logic of comparative
cardinality. The Journal of Symbolic Logic, Forthcoming. doi: 10.1017/jsl.2019.67.
Hans van Ditmarsch, Wiebe van der Hoek, and Barteld Kooi. Dynamic Epistemic Logic.
Springer, Dordrecht, 2008.
Zoltan Domotor. Probabilistic relational structures and their applications. Technical Report
No. 144 Psychology Series, Stanford University, California Institute for Mathematical
Studies in the Social Sciences, 1969.
Edward Elliott. ‘Ramseyfying’ probabilistic comparativism. Philosophy of Science, 87(4):
727–754, 2020.
Benjamin Eva. Principles of indifference. Journal of Philosophy, 116(7):390–411, 2019.
R. Fagin, J. Y. Halpern, and N. Megiddo. A logic for reasoning about probabilities. Information and Computation, 87:78–128, 1990.
Terrence L. Fine. Theories of Probability. Academic Press, New York, 1973.
Terrence L. Fine. An argument for comparative probability. In R.E. Butts and J. Hintikka,
editors, Basic Problems in Methodology and Linguistics, pages 105–119. Springer, 1977.
Bruno de Finetti. La ‘logica del plausible’ secondo la concezione di Polya. Atti della XLII
Riunione, Societa Italiana per il Progresso delle Scienze, pages 227–236, 1949.
Peter C. Fishburn. The axioms of subjective probability. Statistical Science, 1(3):335–358,
1986.
Brandon Fitelson and David McCarthy. Toward an epistemic foundation for comparative
confidence. Draft of 1/19/14, 2014.
Peter Gärdenfors. Qualitative probability as an intensional logic. Journal of Philosophical
Logic, 4(2):171–185, 1975.
Martin Gardner. Mathematical games. Scientific American, pages 180–182, 1959a. October issue.
Martin Gardner. Mathematical games. Scientific American, page 188, 1959b. November issue.
Alfio Giarlotta and Salvatore Greco. Necessary and possible preference structures. Journal
of Mathematical Economics, 49:163–172, 2013.
I.J. Good. Subjective probability as the measure of a non-measurable set. In Ernest Nagel,
Patrick Suppes, and Alfred Tarski, editors, Logic, Methodology and Philosophy of Science:
Proceedings of the 1960 International Congress, pages 319–329, 1962.
J. Y. Halpern. An analysis of first-order logics of probability. Artificial Intelligence, 46:
311–350, 1990.
Joseph Y. Halpern. Reasoning about Uncertainty. MIT Press, Cambridge, Mass., 2003.
Joseph Y. Halpern and Yoram Moses. A guide to completeness and complexity for modal
logics of knowledge and belief. Artificial Intelligence, 54(3):319–379, 1992.
Matthew Harrison-Trainor, Wesley H. Holliday, and Thomas F. Icard. A note on cancellation
axioms for comparative probability. Theory and Decision, 80(1):159–166, 2016.
Matthew Harrison-Trainor, Wesley H. Holliday, and Thomas F. Icard. Preferential structures
for comparative probabilistic reasoning. Proceedings of the Thirty-First AAAI Conference
on Artificial Intelligence, pages 1135–1141, 2017.
James Hawthorne. A logic of comparative support: Qualitative conditional probability
relations representable by Popper functions. In Alan Hájek and Christopher Hitchcock,
editors, Oxford Handbook of Probability and Philosophy. Oxford University Press, 2016.
Wesley H. Holliday, Tomohiro Hoshi, and Thomas F. Icard. A uniform logic of information
dynamics. In Thomas Bolander, Torben Braüner, Silvio Ghilardi, and Lawrence Moss,
editors, Advances in Modal Logic, volume 9, pages 348–367. College Publications, London,
2012.
Wesley H. Holliday, Tomohiro Hoshi, and Thomas F. Icard. Information dynamics and
uniform substitution. Synthese, 190(1):31–55, 2013.
Duligur Ibeling and Thomas Icard. Probabilistic reasoning across the causal hierarchy. In
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020.
Thomas F. Icard. Pragmatic considerations on comparative probability. Philosophy of
Science, 83(3):348–370, 2016.
James M. Joyce. How probabilities reflect evidence. Philosophical Perspectives, 19:153–178,
2005.
John Maynard Keynes. A Treatise on Probability. Macmillan, 1921.
Jason Konek. Comparative probabilities. In Richard Pettigrew and Jonathan Weisberg,
editors, The Open Handbook of Formal Epistemology, pages 267–348. The PhilPapers
Foundation, 2019.
Jason Konek. Epistemic conservativity and imprecise credence. Philosophy and Phenomenological Research, Forthcoming.
Bernard O. Koopman. The axioms and algebra of intuitive probability. Annals of Mathematics, 41(2):269–292, 1940.
Charles H. Kraft, John W. Pratt, and A. Seidenberg. Intuitive probability on finite sets.
The Annals of Mathematical Statistics, 30(2):408–419, 1959.
Daniel Lassiter. Gradable epistemic modals, probability, and scale structure. In N. Li and
D. Lutz, editors, Semantics and Linguistic Theory (SALT) 20, pages 1–18. CLC (Cornell
Linguistics Circle), 2010.
Ehud Lehrer and Roee Teper. Justifiable preferences. Journal of Economic Theory, 146(2):
762–774, 2011.
Isaac Levi. On indeterminate probabilities. Journal of Philosophy, 71:391–418, 1974.
R. Duncan Luce. On the numerical representation of qualitative conditional probability.
The Annals of Mathematical Statistics, 39(2):481–491, 1968.
Carsten Lutz. Complexity and succinctness of public announcement logic. In Proceedings
of the fifth international joint conference on Autonomous agents and multiagent systems,
pages 137–143. ACM, 2006.
Krzysztof Mierzewski. Probabilistic stability: dynamics, nonmonotonic logics, and stable
revision. Master’s thesis, Universiteit van Amsterdam, 2018.
Sarah Moss. Probabilistic Knowledge. Oxford University Press, Oxford, 2018.
Sarah Moss. Global constraints on imprecise credences: Solving reflection violations, belief
inertia, and other puzzles. Philosophy and Phenomenological Research, 2020.
T. S. Motzkin. Two consequences of the transposition theorem on linear inequalities. Econometrica, 19(2):184–185, 1951.
Judea Pearl. Causality. Cambridge University Press, 2009.
Susanna Rinard. Against radical credal imprecision. Thought: A Journal of Philosophy, 2:
157–165, 2013.
D. Rı́os Insua. On the foundations of decision making under partial information. Theory
and Decision, 33(1):83–100, 1992.
Stuart Russell. Unifying logic and probability. Communications of the ACM, 58(7):88–97,
2015.
Miriam Schoenfield. Chilling out on epistemic rationality: A defense of imprecise credences
(and other imprecise doxastic attitudes). Philosophical Studies, 158:197–219, 2012.
Dana Scott. Measurement structures and linear inequalities. Journal of Mathematical Psychology, 1:233–247, 1964.
Krister Segerberg. Qualitative probability in a modal setting. In E. Fenstad, editor, Second
Scandinavian Logic Symposium, pages 341–352, Amsterdam, 1971. North-Holland.
Teddy Seidenfeld, Mark J. Schervish, and Joseph B. Kadane. Forecasting with imprecise
probabilities. International Journal of Approximate Reasoning, 53(8):1248–1261, 2012.
Steve Selvin. On the Monty Hall problem. The American Statistician, 29(3):134, 1975.
Patrick Suppes. The measurement of belief. The Journal of the Royal Statistical Society,
Series B, 36(2):160–191, 1974.
Patrick Suppes and Mario Zanotti. Necessary and sufficient conditions for existence of
a unique measure strictly agreeing with a qualitative probability ordering. Journal of
Philosophical Logic, 5(3):431–438, 1976.
Patrick Suppes and Mario Zanotti. Necessary and sufficient qualitative axioms for conditional probability. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 60:
163–169, 1982.
Marilyn vos Savant. Marilyn vos Savant’s reply. The American Statistician, 45(4):347, 1991.
Peter Walley. Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, 1991.
Peter Walley. Towards a unified theory of imprecise probability. International Journal of
Approximate Reasoning, 24(2-3):125–148, 2000.
Brian Weatherson. The Bayesian and the dogmatist. Proceedings of the Aristotelian Society,
107:169–185, 2007.
Seth Yalcin. Context probabilism. In Logic, Language and Meaning - 18th Amsterdam Colloquium, Amsterdam, The Netherlands, December 19-21, 2011, Revised Selected Papers,
pages 12–21, 2011. doi: 10.1007/978-3-642-31482-7_2.