It is sensible in the philosophy of the quantitative sciences to distinguish between three kinds of hypothesis. The main goal of this chapter is to explain why the distinction is philosophically useful. The distinction itself is best explained as follows. At the empirical level (at the bottom), there are curves, or functions, or laws, such as PV = constant in Boyle's example, or a = M/r² in Newton's example. The first point is that such formulae are actually ambiguous as to the hypotheses they represent. They can be understood in two ways. In order to make this point clear, let me first introduce a terminological distinction between variables and parameters. Acceleration and distance (a and r) are variables in Newton's formula because they represent quantities that are more or less directly measured. The distinction between what is directly measured and what is not is to be understood relative to the context. All I mean is that values of acceleration and distance are d...
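A minimal sketch (my own illustration, with made-up numbers, not an example from the chapter) of the variable/parameter distinction in Boyle's case: pressure and volume are variables because each data point supplies measured values for them, while the constant c in PV = c is a parameter whose value is estimated by fitting the formula to the data set as a whole.

# Hypothetical measurements: P and V are variables (read off each data point).
pressure = [1.0, 2.0, 4.0, 8.0]        # atm
volume   = [24.1, 11.9, 6.05, 2.98]    # L

# The constant c in P*V = c is a parameter: its value is fitted, not measured.
products = [p * v for p, v in zip(pressure, volume)]
c_hat = sum(products) / len(products)  # least-squares estimate of the one parameter
print(f"fitted constant c = {c_hat:.2f} L*atm")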
The debate between William Whewell and John Stuart Mill is hard not only in the sense that both sides are difficult to understand; the issue itself is also unresolved. Whewell's idea of predictive tests is similar to the method of cross-validation in statistics and machine learning, except that Whewell applies it in a hierarchical way at multiple levels. Or at least, that is how Whewell argues that hypothesis testing works in science. In contrast, the received view of theory testing is that the confirmation of rival hypotheses is measured by their degree of fit with the total evidence, provided that the rival hypotheses are equally simple. However, there is a growing realization that predictive tests are stronger in many ways. What this suggests is that the history of science could be used as a source of examples against which theories of learning may be tested. The purpose of this paper is to explain and highlight some of the features of Whewell's theory of hypothesis testin...
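To make the comparison with cross-validation concrete, here is a minimal Python sketch (my own illustration with hypothetical data, not Whewell's own procedure): a hypothesis is fitted to one half of the data and then judged by how well it predicts the half that played no role in the fitting.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = 2.0 * x + 0.5 + rng.normal(0, 0.1, size=x.size)      # hypothetical data

train, test = np.arange(0, 20, 2), np.arange(1, 20, 2)   # split into two halves
for degree in (1, 5):
    coeffs = np.polyfit(x[train], y[train], degree)       # accommodate the training half
    mse = np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2)  # predictive test
    print(f"degree {degree}: held-out mean squared error = {mse:.4f}")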
We find ourselves agreeing with much of what Kruse says in his reply to “Why Likelihood?”, and we appreciate his clarification on other points. Boik, on the other hand, believes there is a satisfactory Bayesian solution to the problem of model selection. He also believes that the Akaike solution is flawed. We discuss these two points in sections 2 and 3. In addition, we disagree with Boik about several points of interpretation. We discuss these in section 1.
Recent approaches to causal modelling rely upon the causal Markov condition, which specifies which probability distributions are compatible with a directed acyclic graph (DAG). Further principles are required in order to choose among the large number of DAGs compatible with a given probability distribution. Here we present a principle that we call frugality. This principle tells one to choose the DAG with the fewest causal arrows. We argue that frugality has several desirable properties compared to the other principles that have been suggested, including the well-known causal faithfulness condition. Contents: 1 Introduction; 2 The Causal Markov Condition; 3 Faithfulness; 4 Frugality (4.1 Basic independences and frugality; 4.2 General properties of directed acyclic graphs satisfying frugality; 4.3 Connection to minimality assumptions); 5 Frugality as a Parsimony Principle; 6 Conclusion; Appendix.
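A toy sketch (my own illustration, not the paper's formulation) of the frugality rule as stated above: among candidate DAGs judged compatible with the observed distribution, prefer the one with the fewest causal arrows. The compatibility test itself (via the Causal Markov Condition and d-separation) is assumed to be supplied externally by the hypothetical predicate is_compatible.

def frugal_choice(candidate_dags, is_compatible):
    """candidate_dags: list of DAGs, each given as a set of directed edges (X, Y)."""
    compatible = [g for g in candidate_dags if is_compatible(g)]
    return min(compatible, key=len) if compatible else None   # fewest arrows wins

# Hypothetical example: suppose both DAGs over {A, B, C} pass the compatibility test.
dag_chain = {("A", "B"), ("B", "C")}                # 2 arrows
dag_extra = {("A", "B"), ("B", "C"), ("A", "C")}    # 3 arrows
print(frugal_choice([dag_extra, dag_chain], is_compatible=lambda g: True))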
According to one definition, a general philosophy of science seeks to describe and understand how science works within a wide range of sciences. This does not have to include every kind of science. But it had better not be confined to a single branch of a single science, for such an understanding would add little to what scientists working in that area already know. Deductive logic is about the validity of arguments. An argument is valid when its conclusion follows deductively from its premises. Here’s an example: If Alice is guilty then Bob is guilty, and Alice is guilty. Therefore, Bob is guilty. The validity of the argument has nothing to do with what the argument is about. It has nothing to do with the meaning, or content, of the argument beyond the meaning of logical phrases such as if…then. Thus, any argument of the following form (called modus ponens) is valid: If P then Q, and P, therefore Q. Any claims substituted for P and Q lead to an argument that is valid. Probability t...
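A tiny illustration (mine, not from the text) of the point that modus ponens is valid in virtue of its form alone: for every assignment of truth values to P and Q, whenever "if P then Q" and "P" are both true, "Q" is true as well.

from itertools import product

def implies(p, q):
    return (not p) or q   # truth-functional "if P then Q"

for p, q in product([True, False], repeat=2):
    if implies(p, q) and p:
        assert q          # never fails, whatever claims P and Q stand for
print("modus ponens holds under every truth-value assignment")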
This chapter examines four solutions to the problem of many models, and finds some fault or limitation with all of them except the last. The first is the naïve empiricist view that the best model is the one that best fits the data. The second is based on Popper’s falsificationism. The third approach is to compare models on the basis of some kind of trade-off between fit and simplicity. The fourth is the most powerful: cross-validation testing.
Recent approaches to causal modeling rely upon the Causal Markov Condition, which specifies which probability distributions are compatible with a Directed Acyclic Graph (DAG). Further principles are required in order to choose among the large number of DAGs compatible with a given probability distribution. Here we present a principle that we call frugality. This principle tells one to choose the DAG with the fewest causal arrows. We argue that frugality has several desirable properties compared to the other principles that have been suggested, including the well-known Causal Faithfulness Condition.
Statisticians and philosophers of science have many common interests but restricted communication with each other. This volume aims to remedy these shortcomings. It provides state-of-the-art research in the area of Philosophy of Statistics by encouraging numerous experts to communicate with one another without feeling 'restricted' by their disciplines or thinking 'piecemeal' in their treatment of issues. A second goal of this book is to present work in the field without bias toward any particular statistical paradigm. Broadly speaking, ...
Colbeck and Renner published a theorem that appears to be stronger than the Bell (1964) theorem in a way that is more significant than the other variations of Bell's theorem that have been published in the 50 years since. Colbeck and Renner themselves do not relate their theorem directly to Bell's theorem, so here I present a version of the Colbeck-Renner theorem that makes the relationship explicit.
William Whewell and J. S. Mill disagreed about the role of conceptual innovation in the evolution of science. For Mill, concepts, correctly constructed, are determined by the empirical facts. They are born fully warranted, free from prejudice and bias. For Whewell, concepts are introduced conjecturally, and the process of testing in science is largely devoted to justifying their objective credentials. The key arguments concern the role of concepts in extracting information, and exposing the unifying connections amongst the data. We believe that Whewell provides some important new insights about what counts as empirical evidence, and the relationship between theory and evidence. The resulting theory of scientific inference is importantly different from hypothetico-deductivism, and its recent variants.
William Whewell is famous for his philosophy of scientific discovery. Too few recognize that Whewell has important things to say on the normative question of theory testing, perhaps because J. S. Mill minimized Whewell's contribution at every point. We aim to show how the many facets of their debate arise systematically from a single key issue—the normative function of concepts in science. The resulting theory of scientific inference could provide new insights into the nature of inference in general. 1 The Central Argument
Statisticians and philosophers of science have many common interests but restricted communication with each other. This volume aims to remedy these shortcomings. It provides state-of-the-art research in the area of Philosophy of Statistics by encouraging numerous experts to communicate with one another without feeling 'restricted' by their disciplines or thinking 'piecemeal' in their treatment of issues.
William Whewell's philosophy of scientific discovery is applied to the problem of understanding the nature of unification and explanation by the composition of causes in Newtonian mechanics. The essay attempts to demonstrate: (1) The sense in which 'approximate' laws (e.g. Kepler's laws of planetary motion) successfully refer to real physical systems rather than to (fictitious) idealizations of them;
ABSTRACT The simple question "What is empirical success?" turns out to have a surprisingly intricate answer. The paper begins with the point that empirical success cannot be equated with goodness-of-fit without making some kind of distinction between meritorious fit and fudged fit. The proposal that empirical success is adequately defined by Akaike's Information Criterion (AIC) is analyzed in this light. What is called cross-validated fit is proposed as a further improvement. But it still leaves something out. The final proposal is that empirical success has a hierarchical structure that commonly emerges from the agreement of independent measurements of theoretically postulated quantities.
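A hedged sketch (my own illustration, not the paper's analysis) of the AIC proposal mentioned above: AIC scores a model by its maximized log-likelihood penalized for the number of adjustable parameters k, so a "fudged" fit bought with extra parameters does not automatically count as greater empirical success. The data, model family, and parameter counts below are hypothetical.

import numpy as np

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

def gaussian_log_likelihood(residuals):
    n = residuals.size
    sigma2 = np.mean(residuals ** 2)                 # maximum-likelihood error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 30)
y = 1.0 + 3.0 * x + rng.normal(0, 0.2, x.size)       # hypothetical data from a straight line

for degree in (1, 6):
    resid = y - np.polyval(np.polyfit(x, y, degree), x)
    k = degree + 2                                   # polynomial coefficients plus error variance
    print(f"degree {degree}: AIC = {aic(gaussian_log_likelihood(resid), k):.1f}")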
This paper began life as a serious attempt to understand the classical foundations of thermodynamics, but ended up doing crazy things like defining the temperature and entropy of single molecules. Nevertheless, there is method behind the madness, because in terms of these generalized definitions I am able to prove that the generalized notion of temperature reduces to the standard notion under certain quite general conditions, in which case the standard entropy is equal to the sum of the molecular entropies. The generalized viewpoint is intended to provide a deeper understanding of how statistical mechanics works, …and why it doesn't work, sometimes, as in the case of a single particle in a box. The purpose of such examples is to make a conceptual point, not to advance physics. The sense in which things work 'well' is defined in terms of the Kullback-Leibler discrepancy, which has also been used to define predictive accuracy (Forster and Sober 1994). From this vantage point, t...
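A minimal sketch (mine, with made-up distributions) of the Kullback-Leibler discrepancy invoked above: D(p||q) measures how badly a model distribution q would do at predicting data generated by p, and it is zero only when q matches p.

import math

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]   # hypothetical true distribution
q = [0.4, 0.4, 0.2]   # hypothetical model distribution
print(f"D(p||q) = {kl_divergence(p, q):.4f}")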
David Deutsch has a forthcoming article called "Quantum Theory of Probability and Decisions" in which he claims to derive the standard probabilistic interpretation of the wavefunction from non-probabilistic considerations. In this note, I try to develop his argument in a classical context, to see whether the same ideas can generate probabilistic conclusions from non-probabilistic assumptions in a more general way. While my answer is negative, the attempt pinpoints the parts of Deutsch's quantum mechanical argument that must rely on special features of the formalism (assuming that his proof is correct).
The Likelihood Principle has been defended on Bayesian grounds, on the grounds that it coincides with and systematizes intuitive judgments about example problems, and by appeal to the fact that it generalizes what is true when hypotheses have deductive consequences about observations. Here we divide the Principle into two parts -- one qualitative, the other quantitative -- and evaluate each in the light of the Akaike information criterion. Both turn out to be correct in a special case (when the competing hypotheses have the same number of adjustable parameters), but not otherwise. Mark Antony said that he came to bury Caesar, not to praise him. In contrast, our goal in connection with the likelihood concept is neither to bury likelihood nor to praise it. Instead of praising the concept, we will present what we think is an important criticism. However, the upshot of this criticism is not the conclusion that likelihood should be buried, but paradoxically, a justification of likelihood,...
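A short sketch (my illustration, with hypothetical numbers) of the special case noted above: when two hypotheses have the same number of adjustable parameters k, their AIC ordering is fixed entirely by their maximized likelihoods, so the likelihood ranking and the AIC ranking agree.

def aic(log_likelihood, k):
    return 2 * k - 2 * log_likelihood

log_l1, log_l2, k = -42.0, -45.5, 3      # hypothetical maximized log-likelihoods, equal k
print(aic(log_l1, k) < aic(log_l2, k))   # True exactly when log_l1 > log_l2
print(log_l1 > log_l2)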
Causal inference is commonly viewed in two steps: (1) Represent the empirical data in terms of a probability distribution. (2) Draw causal conclusions from the conditional independencies exhibited in that distribution. I challenge this reconstruction by arguing that the empirical data are often better partitioned into different domains and represented by a separate probability distribution within each domain. For then their similarities and differences provide a wealth of relevant causal information. Computer simulations confirm this hunch, and the results are explained in terms of a distinction between prediction and accommodation, and William Whewell's consilience of inductions. If the diagnosis is correct, then the standard notion of the empirical distinguishability, or equivalence, of causal models needs revision, and the idea that cause can be defined in terms of probability is far more plausible than before. Someone knocks on your door selling subscriptions to th...
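The following is only my own toy reconstruction of the idea of fitting a separate distribution in each domain, not the paper's simulations: if X causes Y, then P(Y|X) tends to remain stable across domains in which only the distribution of X changes, while P(X|Y) does not, so comparing the per-domain distributions carries causal information. The mechanism, domains, and rates below are all hypothetical.

import numpy as np

rng = np.random.default_rng(2)

def sample_domain(p_x, n=100_000):
    x = rng.random(n) < p_x                      # cause X: different rate in each domain
    y = np.where(x, rng.random(n) < 0.9,         # same mechanism P(Y=1|X) in every domain
                    rng.random(n) < 0.2)
    return x, y

for p_x in (0.3, 0.7):                           # two domains differing only in P(X=1)
    x, y = sample_domain(p_x)
    print(f"P(X=1)={p_x}: P(Y=1|X=1)={y[x].mean():.3f}, P(X=1|Y=1)={x[y].mean():.3f}")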
We prove that a probabilistic average over possible, but not actual, hidden variable distributions maximizes predictive accuracy (defined in terms of the Kullback-Leibler discrepancy) within a context in which only the relative frequencies of hidden variables are known. Our detailed analysis of the Bernoulli model (e.g., coin flipping) reveals striking similarities with the derivation of thermodynamics from microphysics. In both cases, the macroscopic description is derived from a probabilistic average over possible microstates, or counterfactual hidden variables. In neither case is the macroscopic description deducible from a description of the microstate.
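A Bernoulli illustration (mine, not the paper's proof) of predictive accuracy measured by the Kullback-Leibler discrepancy: the expected loss of predicting with Bernoulli(q) when the data-generating frequency is p is smallest when q equals p. The value of p below is hypothetical.

import math

def bernoulli_kl(p, q):
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

p = 0.6                               # hypothetical frequency of heads
for q in (0.4, 0.5, 0.6, 0.7):
    print(f"q = {q}: D(p||q) = {bernoulli_kl(p, q):.4f}")   # minimized (zero) at q = p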
ABSTRACT Although in every inductive inference, an act of invention is requisite, the act soon slips out of notice. Although we bind together facts by superinducing upon them a new Conception, this Conception, once introduced and applied, is looked upon as inseparably connected with the facts, and necessarily implied in them. Having once had the phenomena bound together in their minds in virtue of the Conception men can no longer easily restore them back to the detached and incoherent condition in which they were before they were thus combined. The pearls once strung, they seem to form a chain by their nature. Induction has given them unity which it is so far from costing us an effort to preserve, that it requires an effort to imagine it dissolved — William Whewell, 1858. (Quoted from Butts (ed.), 1989, p. 143)
ABSTRACT The central problem with Bayesian philosophy of science is that it cannot take account of the relevance of simplicity and unification to confirmation, induction, and scientific inference. The standard Bayesian folklore about factoring simplicity into the priors, and about convergence theorems as a way of grounding their objectivity, are some of the myths that Earman's book does not address adequately.
ABSTRACT What has science actually achieved? A theory of achievement should (1) define what has been achieved, (2) describe the means or methods used in science, and (3) explain how such methods lead to such achievements. Predictive accuracy is one truth-related achievement of science, and there is an explanation of why common scientific practices (of trading off simplicity and fit) tend to increase predictive accuracy. Akaike's explanation for the success of AIC is limited to interpolative predictive accuracy. But therein lies the strength of the general framework, for it also provides a clear formulation of many open problems of research.


Notes on Thomas Kuhn's The Structure of Scientific Revolutions, 1970 edition, 21 single-spaced pages.
Notes on Thomas Kuhn's famous book The Structure of Scientific Revolutions, 1970 edition.  This version is 41 single-spaced pages.
The Principle of Common Cause is generalized to the idea that correlated phenomena (not just correlated events) are indicators of a common cause.  Examples in which correlations between sets of variables do not reduce to pairwise correlations (Bernstein’s paradox) prove that it is strictly more general, in an interesting way.  It is also explained how the generalized principle (like the original corrected version of Reichenbach’s principle) follows from the Causal Markov Condition, which is the main axiom of the structural theory of causation (aka Bayes causal nets).
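A sketch of a Bernstein-style example (my own illustration, not one from the paper): X and Y are independent fair bits and Z = X XOR Y. Every pair of the three variables is independent, yet they are not jointly independent, since any two of them determine the third.

from itertools import product

outcomes = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]   # each outcome has probability 1/4

def prob(pred):
    return sum(1 for o in outcomes if pred(o)) / len(outcomes)

# Pairwise independence: P(Z=1, X=1) equals P(Z=1) * P(X=1) (and likewise for the other pairs).
print(prob(lambda o: o[2] == 1 and o[0] == 1),
      prob(lambda o: o[2] == 1) * prob(lambda o: o[0] == 1))
# Joint dependence: P(X=1, Y=1, Z=1) is 0, but the product of the three marginals is 1/8.
print(prob(lambda o: o == (1, 1, 1)),
      prob(lambda o: o[0] == 1) * prob(lambda o: o[1] == 1) * prob(lambda o: o[2] == 1))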