Elementary Decision Theory
By Herman Chernoff and Lincoln E. Moses
About this ebook
This volume is a well-known, well-respected introduction to a lively area of statistics. Professors Chernoff and Moses bring years of professional expertise as classroom teachers to this straightforward approach to statistical problems. And happily, for beginning students, they have by-passed involved computational reasonings which would only confuse the mathematical novice.
Developed from nine years of teaching statistics at Stanford, the book furnishes a simple and clear-cut method of exhibiting the fundamental aspects of a statistical problem. Beginners will find this book a motivating introduction to important mathematical notions such as set, function and convexity. Examples and exercises throughout introduce new topics and ideas.
The first seven chapters are recommended for beginning courses in the basic ideas of statistics and require only a knowledge of high school math. These sections include material on data processing, probability and random variables, utility and descriptive statistics, uncertainty due to ignorance of the state of nature, computing Bayes strategies and an introduction to classical statistics. The last three chapters review mathematical models and summarize terminology and methods of testing hypotheses. Tables and appendixes provide information on notation, shortcut computational formulas, axioms of probability, properties of expectations, likelihood ratio test, game theory, and utility functions.
Authoritative, yet elementary in its approach to statistics and statistical theory, this work is also concise, well-indexed and abundantly equipped with exercise material. Ideal for a beginning course, this modestly priced edition will be especially valuable to those interested in the principles of statistics and scientific method.
CHAPTER 1
Introduction
1. INTRODUCTION
Beginning students are generally interested in what constitutes the subject matter of the theory of statistics. Years ago a statistician might have claimed that statistics deals with the processing of data. As a result of relatively recent formulations of statistical theory, today’s statistician will be more likely to say that statistics is concerned with decision making in the face of uncertainty. Its applicability ranges from almost all inductive sciences to many situations that people face in everyday life when it is not perfectly obvious what they should do.
What constitutes uncertainty? There are two kinds of uncertainty. One is that due to randomness. When someone tosses an ordinary coin, the outcome is random and not at all certain. It is as likely to be heads as tails. This type of uncertainty is in principle relatively simple to treat. For example, if someone were offered two dollars if the coin falls heads, on the condition that he pay one dollar otherwise, he would be inclined to accept the offer since he knows that heads is as likely to fall as tails. His knowledge concerns the laws of randomness involved in this particular problem.
The other type of uncertainty arises when it is not known which laws of randomness apply. For example, suppose that the above offer were made in connection with a coin that was obviously bent. Then one could assume that heads and tails were not equally likely but that one face was probably favored. In statistical terminology we shall equate the laws of randomness which apply with the state of nature.
What can be done in the case where the state of nature is unknown? The statistician can perform relevant experiments and take observations. In the above problem, a statistician would (if he were permitted) toss the coin many times to estimate what is the state of nature. The decision on whether or not to accept the offer would be based on his estimate of the state of nature.
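The coin wager can be put in numbers. The following is a minimal sketch, not a method from the text: the function names `expected_gain` and `accept_offer` and the toss counts in the usage lines are assumptions introduced here for illustration. It computes the average gain per play for a given chance of heads and accepts the offer when the estimated average gain is positive.

```python
from fractions import Fraction

def expected_gain(p_heads, win=2, lose=1):
    """Average gain per play: receive `win` on heads, pay `lose` on tails."""
    return win * p_heads - lose * (1 - p_heads)

def accept_offer(heads, tosses, win=2, lose=1):
    """Estimate the chance of heads from observed tosses and accept
    the wager only if the estimated average gain is positive."""
    p_hat = Fraction(heads, tosses)
    return expected_gain(p_hat, win, lose) > 0

# Ordinary coin: heads is as likely as tails.
print(expected_gain(Fraction(1, 2)))        # 1/2

# Hypothetical toss counts for a bent coin (illustrative only).
print(accept_offer(heads=20, tosses=100))   # False
print(accept_offer(heads=45, tosses=100))   # True
```

With these stakes the offer is favorable whenever the chance of heads exceeds 1/3. The sketch treats the estimate as if it were exact; how far an estimate can be trusted is taken up in the paragraphs that follow.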
One may ask what constitutes enough observations. That is, how many times should one toss the coin before deciding? A precise answer would be difficult to give at this point. For the time being it suffices to say that the answer would depend on (1) the cost of tossing the coin, and (2) the cost of making the wrong decision. For example, if one were charged a nickel per toss, one would be inclined to take very few observations compared with the case in which one was charged one cent per toss. On the other hand, if the wager were changed to $2000 against $1000, then it would pay to take many observations so that one could be quite sure that the estimate of the state of nature was good enough to make it almost certain that the right action is taken.
It is important to realize that no matter how many times the coin is tossed, one may never know for sure what the state of nature is. For example, it is possible, although very unlikely, that an ordinary coin will give 100 heads in a row. It is also possible that a coin which in the long run favors heads will give more tails than heads in 100 tosses. To evaluate the chances of being led astray by such phenomena, the statistician must apply the theory of probability.
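Both of these unlikely events can be given rough numbers with the binomial distribution. The sketch below is illustrative only: the helper `prob_at_most` is a name introduced here, and the 0.6 chance of heads for the coin that "favors heads" is an assumption, since the text gives no figure.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(number of heads <= k) in n independent tosses with P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# An ordinary coin giving 100 heads in a row.
p_100_heads = 0.5 ** 100

# A coin assumed to fall heads 60% of the time (0.6 is illustrative)
# giving more tails than heads in 100 tosses, i.e. at most 49 heads.
p_more_tails = prob_at_most(49, 100, 0.6)

print(p_100_heads)
print(p_more_tails)
```

The first probability is astronomically small; the second is small but far from negligible, which is why the chance of being misled has to be weighed rather than ignored.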
Originally we stated that statistics is the theory of decision making in the face of uncertainty. One may argue that, in the above example, the statistician merely estimated the state of nature and made his decision accordingly, and hence, decision making is an overly pretentious name for merely estimating the state of nature. But even in this example, the statistician does more than estimate the state of nature and act accordingly. In the $2000 to $1000 bet he should decide, among other things, whether his estimate is good enough to warrant accepting or rejecting the wager or whether he should take more observations to get a better estimate. An estimate which would be satisfactory for the $2 to $1 bet may be unsatisfactory for deciding the $2000 to $1000 bet.
2. AN EXAMPLE
To illustrate statistical theory and the main factors that enter into decision making, we shall treat a simplified problem in some detail. It is characteristic of many statistical applications that, although real problems are too complex, they can be simplified without changing their essential characteristics. However, the applied statistician must try to keep in mind all assumptions which are not strictly realistic but are introduced for the sake of simplicity. He must do so to avoid assumptions that lead to unrealistic answers.
Example 1.1. The Contractor Example. Suppose that an electrical contractor for a house knows from previous experience in many communities that houses are occupied by only three types of families: those whose peak loads of current used are 15 amperes (amp) at one time in a circuit, those whose peak loads are 20 amp, and those whose peak loads are 30 amp. He can install 15-amp wire, or 20-amp wire, or 30-amp wire. He could save on the cost of his materials in wiring a house if he knew the actual needs of the occupants of that house. However, this is not known to him.
One very easy solution to the problem would be to install 30-amp wire in all houses, but in this case he would be spending more to wire a house than would actually be necessary if it were occupied by a family who used no more than 15 amp or by one that used no more than 20 amp. On the other hand, he could install 15-amp wire in every house. This solution also would not be very good because families who used 20 or 30 amp would frequently burn out the fuses, and not only would he have to replace the wire with more suitable wire but he might also suffer damage to his reputation as a wiring contractor.
TABLE 1.1
LOSSES INCURRED BY CONTRACTOR
Table 1.1 presents a tabulation of the losses which he sustains from taking various actions for the various types of users.
The thetas (θ) are the possible categories that the occupants of a particular house fall into; or they are the possible states of nature. These are: θ1—the family has peak loads of 15 amp; θ2—the family has peak loads of 20 amp; and θ3—the family has peak loads of 30 amp.¹
The α's across the top are the actions or the different types of installations he could make. The numbers appearing in the table are his own estimates of the loss that he would incur if he took a particular action in the presence of a particular state.
For example, the 1 in the first row represents the cost of the 15-amp wire. The 2 in the first row represents the cost of the 20-amp wire, which is more expensive since it is thicker.²
In the second row we find a 5 opposite state θ2, under action α1. This reflects the loss to the contractor of installing 15-amp wire in a home with 20-amp peak loads; cost of reinstallation, and damage to his reputation, all enter into this number. It is the result of a subjective determination on his part; for one of his competitors this number might be, instead, a 6. Other entries in the table have similar interpretations.
Since he could cut down the losses incurred in wiring a house if he knew the value of θ for the house (i.e., what the electricity requirements of the occupant were), he tries to learn this by performing an experiment. His experiment consists of going to the future occupant and asking how many amperes he uses. The response is always one of four numbers: 10, 12, 15, or 20. From previous experience it is known that families of type θ1 (15-amp users) answer z1 (10 amp) half of the time and z2 (12 amp) half of the time; families of type θ2 (20-amp users) answer z2 (12 amp) half of the time and z3 (15 amp) half of the time; and families of type θ3 (30-amp users) answer z3 (15 amp) one-third of the time and z4 (20 amp) two-thirds of the time. These values are shown in Table 1.2. In fact, the entries represent the probabilities of observing the z values for the given states of nature.
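TABLE 1.2
FREQUENCY OF RESPONSES FOR VARIOUS STATES OF NATURE IN THE CONTRACTOR EXAMPLE
State of nature    z1 (10 amp)    z2 (12 amp)    z3 (15 amp)    z4 (20 amp)
θ1                 1/2            1/2            0              0
θ2                 0              1/2            1/2            0
θ3                 0              0              1/3            2/3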
The contractor now formulates a strategy (rule for decision making) which will tell him what action to take for each kind of observation. For instance, one possible rule would be to install 20-amp wire if he observes z1; 15-amp wire if he observes z2; 20-amp wire if he observes z3; and 30-amp wire if he observes z4. This we symbolize by s = (α2, α1, α2, α3), where the first α2 is the action taken if our survey yields z1; α1 is the action taken if z2 is observed; the second α2 corresponds to z3; and α3 corresponds to z4.
Table 1.3 shows five of the 81 possible strategies that might be employed, using the above notation.
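The count of 81 follows because a strategy picks one of three actions for each of four possible observations, giving 3 × 3 × 3 × 3 = 81. A short sketch that enumerates them is given here; the labels `a1`, `a2`, `a3` are plain-text stand-ins for α1, α2, α3, chosen for this illustration.

```python
from itertools import product

actions = ["a1", "a2", "a3"]              # 15-, 20-, 30-amp wire
observations = ["z1", "z2", "z3", "z4"]   # answers of 10, 12, 15, 20 amp

# A strategy assigns one action to each possible observation.
strategies = list(product(actions, repeat=len(observations)))

print(len(strategies))                            # 81
print(("a2", "a1", "a2", "a3") in strategies)     # True: the rule s described above
```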
TABLE 1.3
STRATEGIES (RULES FOR DECISION MAKING)
Note that s2 is somewhat more conservative than s1. Both s3 and s4 completely ignore the data. The strategy s5 seems to be one which only a contractor hopelessly in love could select.
How shall we decide which of the various strategies to apply?
First, we compute the average loss that the contractor would incur for each of the three states and each strategy. For the five strategies, these losses are listed in Table 1.4.
TABLE 1.4
AVERAGE LOSS IN CONTRACTOR EXAMPLE
They are computed in the following fashion:
First we compute the action probabilities for s1 = (α1, α1, α2, α3). If θ1 is the state of nature, we observe z1 half the time and z2 half the time (see Table 1.2). If s1 is applied, action α1 is taken in either case, and actions α2 and α3 are not taken. If θ2 is the state of nature, we observe z2 half the time and z3 half the time. Under strategy s1, this leads to action α1 with probability 1/2, action α2 with probability 1/2, and action α3 never. Similarly, under θ3, we shall take action α1 never, α2 with probability 1/3, and α3 with probability 2/3. These results are summarized in the action probabilities for s1 (Table 1.5), which are placed next to the losses (copied from Table 1.1).
If θ1 is the state of nature, action α1 is taken all of the time, giving a loss of 1 all of the time. If θ2 is the state of nature, action α1 yielding a loss of 5 is taken half the time and action α2 yielding a loss of 2 is taken half the time. This leads to an average loss of
5 × 1/2 + 2 × 1/2 = 3.5.
Similarly the average loss under θ3 is
6 × 1/3 + 3 × 2/3 = 4.
Thus the column of average losses corresponding to s1 has been computed. The corresponding tables for strategy s2 are indicated in Table 1.5. The other strategies are evaluated similarly.
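The computation for s1 can be retraced mechanically. The sketch below is illustrative: it uses the response probabilities described for Table 1.2 and only those entries of Table 1.1 that are quoted in the text, so losses not mentioned there are deliberately left out. The labels `theta1`, `a1`, and so on, and the function names, are stand-ins introduced here.

```python
from fractions import Fraction as F

# Response probabilities P(z | state), as described for Table 1.2.
freq = {
    "theta1": {"z1": F(1, 2), "z2": F(1, 2), "z3": F(0), "z4": F(0)},
    "theta2": {"z1": F(0), "z2": F(1, 2), "z3": F(1, 2), "z4": F(0)},
    "theta3": {"z1": F(0), "z2": F(0), "z3": F(1, 3), "z4": F(2, 3)},
}

# Only the Table 1.1 losses quoted in the text; other entries are omitted.
loss = {
    ("theta1", "a1"): 1, ("theta1", "a2"): 2,
    ("theta2", "a1"): 5, ("theta2", "a2"): 2,
    ("theta3", "a2"): 6, ("theta3", "a3"): 3,
}

# Strategy s1: the action taken for each observation.
s1 = {"z1": "a1", "z2": "a1", "z3": "a2", "z4": "a3"}

def action_probabilities(strategy, state):
    """How often each action is taken when `state` is the state of nature."""
    probs = {}
    for z, p in freq[state].items():
        if p > 0:
            a = strategy[z]
            probs[a] = probs.get(a, 0) + p
    return probs

def average_loss(strategy, state):
    """Loss of each action weighted by how often the strategy takes it."""
    probs = action_probabilities(strategy, state)
    return sum(p * loss[(state, a)] for a, p in probs.items())

for state in freq:
    print(state, float(average_loss(s1, state)))
```

Run as written, this gives 1, 3.5, and 4 for θ1, θ2, and θ3, matching the column worked out above.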
In relatively simple problems such as this one, it is possible to compute the average losses with less writing by juggling Tables 1.1, 1.2, and 1.3 simultaneously.
Is it clear now which of these strategies should be used? If we look at the chart of average losses (Table 1.4), we see that some of the strategies give greater losses than others. For example, if we compare s5 with s2, we see that in each of the three states the average loss associated with s5 is equal to or greater than that corresponding to s2. The contractor would therefore do better to use strategy s2 than strategy s5 since his average losses would be less for states θ1 and θ3 and no more for θ2. In this case, we say "s2 dominates s5." Likewise, if we compare s4 and s1, we see that except for state θ1, where they are equal, the average losses incurred by using s4 are larger than those incurred by using s1. Again we would say that s4 is dominated by strategy s1. It would be senseless to keep any strategy which is dominated by some other strategy. We can thus discard strategies s4 and s5. We can also discard s3, for we find that it is dominated by s2.
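The dominance comparison amounts to a state-by-state check of the average-loss columns. The sketch below is a hedged illustration: `dominates` is a name introduced here, the columns for s1 and s2 are the values quoted in this section's worked computations, and the vector `hypothetical` is invented purely to show a dominated case.

```python
def dominates(a, b):
    """True if average-loss column a is never larger than column b
    and is strictly smaller in at least one state."""
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)

# Average losses under theta1, theta2, theta3.
s1 = (1, 3.5, 4)
s2 = (1.5, 2.5, 3)

print(dominates(s1, s2))   # False: s1 is worse under theta2 and theta3
print(dominates(s2, s1))   # False: s2 is worse under theta1

# A made-up column, not from the text, that s1 does dominate.
hypothetical = (2, 4, 5)
print(dominates(s1, hypothetical))   # True
```

Neither s1 nor s2 dominates the other, which is why both survive the discarding step.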
TABLE 1.5
LOSSES, ACTION PROBABILITIES, AVERAGE LOSS
If we were to confine ourselves to selecting one of the five listed strategies, we would need now only choose between s1 and s2. How can we choose between them? The contractor could make this choice if he had a knowledge of the percentages of families in the community corresponding to states θ1, θ2, and θ3. For instance, if all three states are equally likely, i.e., in the community one-third of the families are in state θ1, one-third in state θ2, and one-third in state θ3, then he would use s2, because for s2 his average loss, averaged over the three states, would be
1.5 × 1/3 + 2.5 × 1/3 + 3 × 1/3 = 2.33
whereas for s1 it would be
1 × 1/3 + 3.5 × 1/3 + 4 × 1/3 = 2.83.
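The same weighting can be written once and reused for any set of proportions. This is a minimal sketch under the assumption that the proportions of families in the three states are known; `weighted_loss` is a name introduced here, and the average-loss columns are the ones quoted above.

```python
from fractions import Fraction as F

# Average losses under theta1, theta2, theta3 for the two surviving strategies.
avg_loss = {
    "s1": (F(1), F(7, 2), F(4)),
    "s2": (F(3, 2), F(5, 2), F(3)),
}

def weighted_loss(losses, proportions):
    """Average the per-state average losses using the state proportions."""
    return sum(l * w for l, w in zip(losses, proportions))

equal = (F(1, 3), F(1, 3), F(1, 3))
for name, losses in avg_loss.items():
    print(name, round(float(weighted_loss(losses, equal)), 2))
```

Any other proportions can be passed in place of `equal`, provided they sum to 1.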
However, if one knew that in this community 90% of the families were in state θ1 and 10% in θ2, one would have the average losses of
1 × 0.9 + 3.5 × 0.1 = 1.25 for s1 and 1.5 × 0.9 + 2.5 × 0.1 = 1.60 for s2,
and s1 would be selected. Therefore, the strategy that should be picked depends on the relative frequencies of families in the three states. Thus, when the actual proportions of the families in the three classes are known, a good strategy is easily selected. In the absence of such knowledge, choice is inherently difficult. One principle which has been suggested for choosing a strategy is called the "minimax average loss rule."
This says, "Pick that strategy for which the largest average loss is as small as possible, i.e., minimize the maximum average loss."
Referring to Table 1.4, we see that, for s1, the maximum average loss is 4 and for s2 it is 3. The minimax rule would select s2. This is clearly a pessimistic approach since the choice is based entirely on consideration of the worst that can happen.
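The minimax comparison of s1 and s2 can be sketched in a few lines. Only these two strategies are included, using the average losses quoted in this section; the variable names are chosen for this illustration.

```python
# Average losses under theta1, theta2, theta3.
avg_loss = {"s1": (1, 3.5, 4), "s2": (1.5, 2.5, 3)}

# Worst case (largest average loss) for each strategy.
worst = {name: max(losses) for name, losses in avg_loss.items()}

# The minimax rule picks the strategy whose worst case is smallest.
minimax_choice = min(worst, key=worst.get)

print(worst)            # {'s1': 4, 's2': 3}
print(minimax_choice)   # s2
```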
In considering our average loss table, we discarded some strategies as being "dominated" by other procedures. Those we rejected are called inadmissible strategies. Strategies which are not dominated are called admissible.
In our example it might turn out that s1 or s2 would be dominated by one of the 76 strategies which have not been examined; on the other hand, other strategies not dominated by s1 or s2 might be found. An interesting problem in the theory of decision making is that of finding all the admissible strategies.
Certain questions suggest themselves. For example, one may ask why we put so much dependence on the "average losses." This question will be discussed in detail in Chapter 4 on utility. Another question that could be raised would be concerned with the reality of our assumptions. One would actually expect that peak loads of families could vary continuously from less than 15 amp to more than 30 amp. Does our simplification (which was presumably based on previous experience) lead to the adoption of strategies which are liable to have very poor consequences (large losses)? Do you believe that the assumption that the only possible observations are z1, z2, z3, and z4 is a serious one? Finally, suppose that several observations were available, i.e., the contractor could interview all the members of the family separately. What would be the effect of such data? First, it is apparent that, with the resulting increase in the number of possible combinations of data, the number of strategies available would increase considerably. Second, in statistical problems, the intelligent use of more data generally tends to decrease the average losses.
In this example we ignored the possibility that the strategy could suggest (1) compiling more data before acting, or (2) the use of altogether different data such as examining the number of electric devices in the family’s kitchen.
3. PRINCIPLES USED IN DECISION MAKING
Certain points have been illustrated in the example. One is that the main gap in present-day statistical philosophy involves the question of what constitutes a good criterion for selecting strategies. The awareness of this gap permits us to see that in many cases it really is not serious. First, in many industrial applications, the frequencies with which the state of nature is θ1, θ2, etc., are approximately known, and one can average the average losses as suggested in the example. In many other applications, the minimum of the maximum average loss is so low that the use of the minimax rule cannot be much improved upon.
Another point is that the statistician must consider the consequences of his actions and strategies in order to select a good rule for decision making (strategy). This will be illustrated further in Exercise 1.1 where it is seen that two reasonable people with different loss tables may react differently to the same data even though they apply the same criterion.
Finally, the example illustrates the relation between statistical theory and the scientific method. Essentially every scientist who designs experiments and presents conclusions is engaging in decision making where the costs involved are the time and money for his experiments, on one hand, and damage to society and his reputation if his conclusions are seriously in error, on the other hand. It is not uncommon to hear of nonsense about conclusions being "scientifically proved." In real life very little can be certain. If we tossed a coin a million times, we would not know the exact probability of its falling heads, and we could (although it is unlikely) have a large error in our estimate of this probability. It is true that, generally, scientists attach a large loss to the act of presenting a conclusion which is in error and, hence, they tend to use strategies which decide to present conclusions only when the evidence in favor is very great.
4. SUMMARY
The essential components in decision-making problems are the following:
1. The available actions α1, α2, …. A problem exists when there is a choice of alternative actions. The consequence of taking one of these actions must depend on the state of nature. Usually the difficulty in deciding which action to take is due to the fact that it is not known which of
2. the possible states of nature θ1, θ2, … is the true one.
3. The loss table (consequence of actions) measures the cost of taking actions α1, α2, …, respectively, when the states of nature are θ1, θ2, …, respectively.
Given the loss table, it would be easy to select the best action if the state of nature were known. In other words, the state of nature represents the underlying "facts of life," knowledge of which we would like in order to choose a proper action. A state of nature can be made to reveal itself partially through an
4. experiment, which leads to one of
4(a). the possible observations z1, z2, …. The probabilities of these various observations depend upon what the state of nature actually is.
4(b). The table of frequency responses shows this dependence. An informative experiment is one where the frequencies of responses depend heavily on the state of nature. Each of
5. the available strategies s1, s2, … is a recipe which tells us how to react (which action to take) to any possible data.
For example, Paul Revere’s assistant was to hang lamps according to the strategy "One if by land, two if by sea" (and implicitly, none if the British did not come).
6. Finally, the average loss table gives us the consequence of the strategies. It is in terms of this table that we must determine what constitutes a good strategy. With a well-designed or informative experiment, there will be strategies for which the average loss will tend to be small.
An intermediate step in the computation of the average loss table consists of evaluating a table of action probabilities for each strategy. This table of action probabilities tells how often the strategy will lead to a given action when a certain state of nature applies.
³Exercise 1.1. Suppose that our contractor is a "fly-by-night operator," and his loss table is not given by Table 1.1 but by the following table (Table 1.6):
TABLE 1.6
LOSSES FOR FLY-BY-NIGHT OPERATOR
What would constitute a good strategy for him?
Exercise 1.2. Suppose that the contractor (the original respectable business man and not the fly-by-night operator) discovers a new type of wire available on the market. This is a 25-amp wire. Even though this wire would not be appropriate for any state of nature if the state of nature were known, a reasonable strategy may call for its use occasionally. Extend Table 1.1 so that, for α4 (installing 25-amp wire), the losses are 2.4, 2.4, and 4 for θ1, θ2, and θ3, respectively. Introduce at least two new "reasonable" strategies which occasionally call for α4. Evaluate the associated average losses. Are these strategies admissible? If one of them is admissible, describe the circumstances, if any, in which you would use it.
*Exercise 1.3. Construct and carry through the details of a problem which exhibits all the characteristics of decision making summarized in Section 4. Present and evaluate some reasonable and unreasonable strategies. Indicate which, if any, of the simplifications introduced into this problem are liable to be serious in that they may strongly affect the nature of a good strategy.
Remarks. Among other interesting problems are those which deal with diagnosis of illness, the question of wearing rainclothes, and the decision between not shaving and arriving late on a date.
In the contractor problem the experiment yields only one out of four possible observations. If the contractor had supplemented his question with a count of the number of electric appliances in the house, his data would have consisted of two observations out of very many possible pairs of observations. Then the number of available strategies would have been multiplied considerably. For the sake of simplicity, it is suggested that you construct a problem where the experiment will yield only one of a few possible observations; for example, one temperature reading on a thermometer with possible readings "normal," "warm," and "hot." Please note that if there is only one possible observation, you have no experiment. Thus the executive who asks his "yes man" for criticism obtains no relevant information.
Exercise 1.4. The minimax average regret principle is a modification of the minimax average loss principle. In Table 1.1, we note that, if θ1 were the state of nature, the least loss of all the actions considered would be 1. For θ2 and θ3, these minimum losses would be 2 and 3. These losses may be considered unavoidable and due to the state of nature. Each loss then represents this unavoidable loss plus a regret (loss due to ignorance of θ). Subtracting these unavoidable losses, we obtain the regret table, Table 1.7, and the average regret table, Table 1.8.
Table 1.8 could be obtained in either of two ways. First, we could construct it from Table 1.7 just as Table 1.4 was constructed from Table 1.1. Alternatively, we can subtract the minimum possible losses 1, 2, and 3 from the θ1, θ2, and θ3 rows respectively, of Table 1.4.
TABLE 1.7
REGRET IN CONTRACTOR EXAMPLE
TABLE 1.8
AVERAGE REGRET IN CONTRACTOR EXAMPLE
The strategy which minimizes the maximum regret is s2. Apply these ideas in your example of Exercise 1.3 by evaluating the minimax average loss and minimax average regret strategies. Must these principles always yield the same strategy for all decision-making examples?
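The second route to Table 1.8, subtracting the minimum losses from the average losses, can be sketched as follows. Only s1 and s2 are included, using the average losses quoted in Section 2 and the minimum losses 1, 2, and 3 stated above; the variable names are chosen for this illustration.

```python
# Unavoidable (minimum) loss under theta1, theta2, theta3.
min_loss = (1, 2, 3)

# Average losses under theta1, theta2, theta3.
avg_loss = {"s1": (1, 3.5, 4), "s2": (1.5, 2.5, 3)}

# Average regret = average loss minus the unavoidable loss for that state.
avg_regret = {
    name: tuple(l - m for l, m in zip(losses, min_loss))
    for name, losses in avg_loss.items()
}

worst_regret = {name: max(r) for name, r in avg_regret.items()}
minimax_regret_choice = min(worst_regret, key=worst_regret.get)

print(avg_regret)             # {'s1': (0, 1.5, 1), 's2': (0.5, 0.5, 0)}
print(minimax_regret_choice)  # s2
```

Between these two strategies the largest average regret is 1.5 for s1 and 0.5 for s2, in agreement with the statement that s2 minimizes the maximum regret.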
Exercise 1.5. When Mr. Clark passed through East Phiggins in its pioneer days, he was expected to bring back information to guide the coming settlers on whether or not to take along air conditioners. The available actions for the settlers were α1, to take air conditioners, and α2, to leave them behind. He considered three possible states of nature describing the summer weather in East Phiggins: θ1, 80% of the days are very hot and 20% hot; θ2, 50% of the days are very hot, 30% hot, and 20% mild; and θ3, 20% of the days are very hot, 30% hot, and 50% mild. Since he passed through Phiggins in one day, his data consisted of one of the following possible observations describing that day: z1 very hot, z2 hot, and z3 mild. One of the settlers, after considerable introspection involving the difficulty of carrying air conditioners which would replace other important items and the discomfort without them, represented his losses by Table 1.9.
TABLE 1.9
LOSSES IN AIR-CONDITIONER PROBLEM
The first row of the frequency of response table is given by
(a) Complete the frequency of response table.
(b) List five strategies and evaluate their average losses.
(c) Among these five strategies, point out which is the minimax average loss strategy, and which is the best strategy if θ1, θ2, and θ3 were equally likely.
(d) Mr. Clark passed through on a hot day. What actions do the strategies of (c) call for?
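The steps asked for in an exercise of this kind (list strategies, evaluate their average losses, pick out the minimax average loss strategy and the best strategy under given weights for the states) can be sketched mechanically. The following is an illustration under stated assumptions: the two-state, two-action, two-observation problem and all of its numbers are hypothetical and are not the air-conditioner problem, and only non-randomized strategies are compared.

```python
from itertools import product

# Hypothetical problem, for illustration only.
actions = ["a1", "a2"]
observations = ["z1", "z2"]
loss = {"theta1": {"a1": 0, "a2": 4}, "theta2": {"a1": 2, "a2": 0}}
freq = {
    "theta1": {"z1": 0.75, "z2": 0.25},
    "theta2": {"z1": 0.25, "z2": 0.75},
}


def average_loss(strategy):
    """Average loss of a strategy under each state of nature."""
    return {
        state: sum(freq[state][z] * loss[state][strategy[z]] for z in observations)
        for state in loss
    }


# A strategy picks one action for each observation, so there are
# len(actions) ** len(observations) strategies in all.
strategies = [
    dict(zip(observations, choice))
    for choice in product(actions, repeat=len(observations))
]

# Minimax average loss: make the largest average loss as small as possible.
minimax = min(strategies, key=lambda s: max(average_loss(s).values()))

# Best strategy when the states are weighted equally (assumed weights).
weights = {"theta1": 0.5, "theta2": 0.5}
best_weighted = min(
    strategies,
    key=lambda s: sum(weights[t] * average_loss(s)[t] for t in loss),
)
```

In this small example the same strategy (a1 on z1, a2 on z2) is selected by both criteria; in general the two criteria need not agree.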
Exercise 1.6. For a decision-making problem with losses, frequency of responses, and strategies given in Tables 1.10 (a), (b), and (c), evaluate the average losses for strategies s1, s2, s3, and s4. How many possible strategies are there?
TABLE 1.10 (a)
LOSSES
TABLE 1.10 (b)
FREQUENCY OF RESPONSE
TABLE 1.10 (c)
ACTIONS REQUIRED BY CERTAIN STRATEGIES
Exercise 1.7. Jane Smith can cook spaghetti, hamburger, or steak for dinner. She has learned from past experience that if her husband is in a good mood she can serve him spaghetti and save money, but if he is in a bad mood, only a juicy steak will calm him down and make him bearable. In short, there are three actions:
α1 prepare spaghetti;
α2 prepare hamburger;
α3 prepare steak.
Three states of nature:
θ1 Mr. Smith is in a good mood;
θ2 Mr. Smith is in a normal mood;
θ3 Mr. Smith is in a bad mood.
The loss table is:
The experiment she performs is to tell him when he returns home that she lost the afternoon paper. She foresees four possible responses. These are:
z1 "Newspapers will get lost";
z2 "I keep telling you 'a place for everything and everything in its place'";
z3 "Why did I ever get married?";
z4 an absent-minded, far-away look.
The frequency of response table is
(a) List four strategies and evaluate their average losses.
(b) Point out the minimax average loss strategy.
(c) Which is the best of these strategies if Mr. Smith is in a good mood 30% of the time and in a normal mood 50% of the time?
Exercise 1.8. Replace Table 1.1 by
Evaluate the average losses for the five strategies of Table 1.3.
SUGGESTED READINGS
Problems in statistics and in game theory are very closely related. For example, statistical problems are sometimes referred to as games against nature. A very brief discussion will be found in Appendix F1.
An elementary exposition of game theory is given in:
Williams, J. D., The Compleat Strategyst, Being a Primer on the Theory of Games of Strategy, McGraw-Hill Book Co., New York, 1954, Dover Publications, Inc., New York, 1986.
A popularized version of decision making applied to statistics will be found in:
Bross, I.D.J., Design for Decision, The Macmillan Co., New York, 1953.
CHAPTER 2
Data Processing
1. INTRODUCTION
In the preceding chapter we remarked that the number of available strategies increases rapidly with the amount of data available. When there are many observations, at least some rough methods of summarizing their content should be considered. Of course, what constitutes a good method of summarizing data is determined by the uses to which the data will be put. For example, if a manufacturer were interested in the probability that an item produced by a certain machine will be defective, this quantity could be estimated by taking the proportion of defectives obtained during the output of one day. If, however, he were interested in knowing whether his machine was "in control," i.e., whether the probability of producing a defective was constant during the day, then he might compare the proportion of defectives obtained during the morning with that for the afternoon.
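The two summaries mentioned for the manufacturer can be sketched in a few lines. The inspection records below are hypothetical (1 marks a defective item, 0 a good one), and the function name is our own; the sketch only illustrates that the same data are summarized differently depending on the question asked.

```python
# Hypothetical inspection records for one day: 1 = defective, 0 = good.
morning = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
afternoon = [0, 1, 1, 0, 1, 0, 0, 1, 0, 0]


def proportion_defective(items):
    """Fraction of the inspected items that were defective."""
    return sum(items) / len(items)


# Summary for estimating the probability of a defective item.
p_day = proportion_defective(morning + afternoon)

# Summary for judging whether that probability stayed constant.
p_morning = proportion_defective(morning)
p_afternoon = proportion_defective(afternoon)
difference = p_afternoon - p_morning
```

Whether a difference of a given size should be taken as evidence that the machine is out of control is a question this summary does not answer by itself.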
In this chapter we shall very briefly present some standard procedures of data processing. This chapter may well be considered a digression (but a necessary one) in the presentation of statistical ideas, especially since a better understanding of the value of these procedures will come after the basic ideas of statistics are more thoroughly examined.
2. DATA REPRESENTATION
Example 2.1. A new process for making automobile tires was developed by the research staff of the Wearwell Rubber Company. They took a sample of 60 of these tires and tested them on a device which simulates the ordinary environment of automobile tires in use. When the tread was worn off, the