Aims Community College: Random Variables and Distributions Discussion Questions

I am having difficulty solving six questions from "Random Variables and Distributions" in my online class during the COVID-19 lockdown. If you need the book, I will attach the book file as a PDF.

Probability and Statistics

Fourth Edition


Morris H. DeGroot

Carnegie Mellon University

Mark J. Schervish

Carnegie Mellon University

Addison-Wesley

Boston Columbus Indianapolis New York San Francisco Upper Saddle River

Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montréal Toronto

Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

Editor in Chief: Deirdre Lynch

Acquisitions Editor: Christopher Cummings

Associate Content Editors: Leah Goldberg, Dana Jones Bettez

Associate Editor: Christina Lepre

Senior Managing Editor: Karen Wernholm

Production Project Manager: Patty Bergin

Cover Designer: Heather Scott

Design Manager: Andrea Nix

Senior Marketing Manager: Alex Gay

Marketing Assistant: Kathleen DeChavez

Senior Author Support/Technology Specialist: Joe Vetere

Rights and Permissions Advisor: Michael Joyce

Manufacturing Manager: Carol Melville

Project Management, Composition: Windfall Software, using ZzTEX

Cover Photo: Shutterstock/© Marilyn Volan

The programs and applications presented in this book have been included for their instructional value. They have been tested with care, but are not guaranteed for any particular purpose. The publisher does not offer any warranties or representations, nor does it accept any liabilities with respect to the programs or applications.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Pearson Education was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data

DeGroot, Morris H., 1931–1989.

Probability and statistics / Morris H. DeGroot, Mark J. Schervish.—4th ed.

p. cm.

ISBN 978-0-321-50046-5

1. Probabilities—Textbooks. 2. Mathematical statistics—Textbooks.

I. Schervish, Mark J. II. Title.

QA273.D35 2012

519.2—dc22

2010001486

Copyright © 2012, 2002 Pearson Education, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. For information on obtaining permission for use of material in this work, please submit a written request to Pearson Education, Inc., Rights and Contracts Department, 75 Arlington Street, Suite 300, Boston, MA 02116, fax your request to 617-848-7047, or e-mail at http://www.pearsoned.com/legal/permissions.htm.

1 2 3 4 5 6 7 8 9 10—EB—14 13 12 11 10

ISBN 13: 978-0-321-50046-5
ISBN 10: 0-321-50046-6

www.pearsonhighered.com

To the memory of Morrie DeGroot.

MJS


Contents

Preface xi

1 Introduction to Probability 1
1.1 The History of Probability 1
1.2 Interpretations of Probability 2
1.3 Experiments and Events 5
1.4 Set Theory 6
1.5 The Definition of Probability 16
1.6 Finite Sample Spaces 22
1.7 Counting Methods 25
1.8 Combinatorial Methods 32
1.9 Multinomial Coefficients 42
1.10 The Probability of a Union of Events 46
1.11 Statistical Swindles 51
1.12 Supplementary Exercises 53

2 Conditional Probability 55
2.1 The Definition of Conditional Probability 55
2.2 Independent Events 66
2.3 Bayes' Theorem 76
2.4 The Gambler's Ruin Problem 86
2.5 Supplementary Exercises 90

3 Random Variables and Distributions 93
3.1 Random Variables and Discrete Distributions 93
3.2 Continuous Distributions 100
3.3 The Cumulative Distribution Function 107
3.4 Bivariate Distributions 118
3.5 Marginal Distributions 130
3.6 Conditional Distributions 141
3.7 Multivariate Distributions 152
3.8 Functions of a Random Variable 167
3.9 Functions of Two or More Random Variables 175
3.10 Markov Chains 188
3.11 Supplementary Exercises 202

4 Expectation 207
4.1 The Expectation of a Random Variable 207
4.2 Properties of Expectations 217
4.3 Variance 225
4.4 Moments 234
4.5 The Mean and the Median 241
4.6 Covariance and Correlation 248
4.7 Conditional Expectation 256
4.8 Utility 265
4.9 Supplementary Exercises 272

5 Special Distributions 275
5.1 Introduction 275
5.2 The Bernoulli and Binomial Distributions 275
5.3 The Hypergeometric Distributions 281
5.4 The Poisson Distributions 287
5.5 The Negative Binomial Distributions 297
5.6 The Normal Distributions 302
5.7 The Gamma Distributions 316
5.8 The Beta Distributions 327
5.9 The Multinomial Distributions 333
5.10 The Bivariate Normal Distributions 337
5.11 Supplementary Exercises 345

6 Large Random Samples 347
6.1 Introduction 347
6.2 The Law of Large Numbers 348
6.3 The Central Limit Theorem 360
6.4 The Correction for Continuity 371
6.5 Supplementary Exercises 375

7 Estimation 376
7.1 Statistical Inference 376
7.2 Prior and Posterior Distributions 385
7.3 Conjugate Prior Distributions 394
7.4 Bayes Estimators 408
7.5 Maximum Likelihood Estimators 417
7.6 Properties of Maximum Likelihood Estimators 426
7.7 Sufficient Statistics 443
7.8 Jointly Sufficient Statistics 449
7.9 Improving an Estimator 455
7.10 Supplementary Exercises 461

8 Sampling Distributions of Estimators 464
8.1 The Sampling Distribution of a Statistic 464
8.2 The Chi-Square Distributions 469
8.3 Joint Distribution of the Sample Mean and Sample Variance 473
8.4 The t Distributions 480
8.5 Confidence Intervals 485
8.6 Bayesian Analysis of Samples from a Normal Distribution 495
8.7 Unbiased Estimators 506
8.8 Fisher Information 514
8.9 Supplementary Exercises 528

9 Testing Hypotheses 530
9.1 Problems of Testing Hypotheses 530
9.2 Testing Simple Hypotheses 550
9.3 Uniformly Most Powerful Tests 559
9.4 Two-Sided Alternatives 567
9.5 The t Test 576
9.6 Comparing the Means of Two Normal Distributions 587
9.7 The F Distributions 597
9.8 Bayes Test Procedures 605
9.9 Foundational Issues 617
9.10 Supplementary Exercises 621

10 Categorical Data and Nonparametric Methods 624
10.1 Tests of Goodness-of-Fit 624
10.2 Goodness-of-Fit for Composite Hypotheses 633
10.3 Contingency Tables 641
10.4 Tests of Homogeneity 647
10.5 Simpson's Paradox 653
10.6 Kolmogorov-Smirnov Tests 657
10.7 Robust Estimation 666
10.8 Sign and Rank Tests 678
10.9 Supplementary Exercises 686

11 Linear Statistical Models 689
11.1 The Method of Least Squares 689
11.2 Regression 698
11.3 Statistical Inference in Simple Linear Regression 707
11.4 Bayesian Inference in Simple Linear Regression 729
11.5 The General Linear Model and Multiple Regression 736
11.6 Analysis of Variance 754
11.7 The Two-Way Layout 763
11.8 The Two-Way Layout with Replications 772
11.9 Supplementary Exercises 783

12 Simulation 787
12.1 What Is Simulation? 787
12.2 Why Is Simulation Useful? 791
12.3 Simulating Specific Distributions 804
12.4 Importance Sampling 816
12.5 Markov Chain Monte Carlo 823
12.6 The Bootstrap 839
12.7 Supplementary Exercises 850

Tables 853
Answers to Odd-Numbered Exercises 865
References 879
Index 885

Preface

Changes to the Fourth Edition

- I have reorganized many main results that were included in the body of the text by labeling them as theorems, to make it easier for students to find and reference these results.
- I have pulled the important definitions and assumptions out of the body of the text and labeled them as such so that they stand out better.
- When a new topic is introduced, I introduce it with a motivating example before delving into the mathematical formalities. Then I return to the example to illustrate the newly introduced material.
- I moved the material on the law of large numbers and the central limit theorem to a new Chapter 6. It seemed more natural to deal with the main large-sample results together.
- I moved the section on Markov chains into Chapter 3. Every time I cover this material with my own students, I stumble over not being able to refer to random variables, distributions, and conditional distributions. I have actually postponed this material until after introducing distributions, and then gone back to cover Markov chains. I feel that the time has come to place it in a more natural location. I also added some material on stationary distributions of Markov chains.
- I have moved the lengthy proofs of several theorems to the ends of their respective sections in order to improve the flow of the presentation of ideas.
- I rewrote Section 7.1 to make the introduction to inference clearer.
- I rewrote Section 9.1 as a more complete introduction to hypothesis testing, including likelihood ratio tests. For instructors not interested in the more mathematical theory of hypothesis testing, it should now be easier to skip from Section 9.1 directly to Section 9.5.
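A stationary distribution of a Markov chain, as mentioned above, is a distribution pi satisfying pi = pi P. A minimal sketch of the idea, using a made-up two-state transition matrix rather than an example from the text, is to iterate the chain until the distribution settles:

```python
# Approximate the stationary distribution of a small Markov chain by
# repeatedly multiplying a starting distribution by the transition matrix.
# The two-state matrix P below is a made-up illustration, not from the book.

def step(dist, P):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def stationary(P, iters=1000):
    """Iterate from a uniform start; for a nice (ergodic) chain this converges."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iters):
        dist = step(dist, P)
    return dist

P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary(P)
# For this chain the exact stationary distribution is (5/6, 1/6),
# and step(pi, P) returns pi again (up to rounding).
```

Solving pi = pi P directly as a linear system gives the same answer; the iteration above just mirrors how the chain itself forgets its starting state.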

Some other changes that readers will notice:

- I have replaced the notation in which the intersection of two sets A and B had been represented AB with the more popular A ∩ B. The old notation, although mathematically sound, seemed a bit arcane for a text at this level.
- I added the statements of Stirling's formula and Jensen's inequality.
- I moved the law of total probability and the discussion of partitions of a sample space from Section 2.3 to Section 2.1.
- I define the cumulative distribution function (c.d.f.) as the preferred name for what used to be called only the distribution function (d.f.).
- I added some discussion of histograms in Chapters 3 and 6.
- I rearranged the topics in Sections 3.8 and 3.9 so that simple functions of random variables appear first and the general formulations appear at the end, to make it easier for instructors who want to avoid some of the more mathematically challenging parts.
- I emphasized the closeness of a hypergeometric distribution with a large number of available items to a binomial distribution.
- I gave a brief introduction to Chernoff bounds. These are becoming increasingly important in computer science, and their derivation requires only material that is already in the text.
- I changed the definition of confidence interval to refer to the random interval rather than the observed interval. This makes statements less cumbersome, and it corresponds to more modern usage.
- I added a brief discussion of the method of moments in Section 7.6.
- I added brief introductions to Newton's method and the EM algorithm in Chapter 7.
- I introduced the concept of a pivotal quantity to facilitate construction of confidence intervals in general.
- I added the statement of the large-sample distribution of the likelihood ratio test statistic. I then used this as an alternative way to test the null hypothesis that two normal means are equal when it is not assumed that the variances are equal.
- I moved the Bonferroni inequality into the main text (Chapter 1) and later (Chapter 11) used it as a way to construct simultaneous tests and confidence intervals.
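The hypergeometric-to-binomial closeness noted in the list above can be checked numerically. This is a small sketch with illustrative parameters (not taken from the text): sampling n items without replacement from a population of N items, of which K are successes, looks more and more like binomial sampling with p = K/N as N grows.

```python
# Compare the hypergeometric pmf to the binomial pmf as the population
# size N grows with the success proportion p held fixed.
# The parameters n, p, and the values of N are made-up illustration values.
from math import comb

def hypergeom_pmf(x, N, K, n):
    """P(X = x) drawing n items without replacement from N items, K successes."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.3
for N in (20, 200, 2000):
    K = int(p * N)  # keep the success proportion fixed at p
    gap = max(abs(hypergeom_pmf(x, N, K, n) - binom_pmf(x, n, p))
              for x in range(n + 1))
    # gap shrinks as N grows: drawing without replacement barely changes
    # the composition of a large population.
```

The intuition is the design choice here: each draw removes one item, so the conditional success probability drifts by at most roughly n/N over the sample, which vanishes for large N.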

How to Use This Book

The text is somewhat long for complete coverage in a one-year course at the undergraduate level and is designed so that instructors can make choices about which topics are most important to cover and which can be left for more in-depth study. As an example, many instructors wish to deemphasize the classical counting arguments that are detailed in Sections 1.7–1.9. An instructor who only wants enough information to be able to cover the binomial and/or multinomial distributions can safely discuss only the definitions and theorems on permutations, combinations, and possibly multinomial coefficients. Just make sure that the students realize what these values count; otherwise, the associated distributions will make no sense. The various examples in these sections are helpful, but not necessary, for understanding the important distributions. Another example is Section 3.9 on functions of two or more random variables. The use of Jacobians for general multivariate transformations might be more mathematics than the instructors of some undergraduate courses are willing to cover. The entire section could be skipped without causing problems later in the course, but some of the more straightforward cases early in the section (such as convolution) might be worth introducing. The material in Sections 9.2–9.4 on optimal tests in one-parameter families is pretty mathematics, but it is of interest primarily to graduate students who require a very deep understanding of hypothesis testing theory. The rest of Chapter 9 covers everything that an undergraduate course really needs.

In addition to the text, the publisher has an Instructor's Solutions Manual, available for download from the Instructor Resource Center at www.pearsonhighered.com/irc, which includes some specific advice about many of the sections of the text. I have taught a year-long probability and statistics sequence from earlier editions of this text for a group of mathematically well-trained juniors and seniors. In the first semester, I covered what was in the earlier edition but is now in the first five chapters (including the material on Markov chains) and parts of Chapter 6. In the second semester, I covered the rest of the new Chapter 6, Chapters 7–9, Sections 11.1–11.5, and Chapter 12. I have also taught a one-semester probability and random processes course for engineers and computer scientists. I covered what was in the old edition and is now in Chapters 1–6 and 12, including Markov chains, but not Jacobians. This latter course did not emphasize mathematical derivation to the same extent as the course for mathematics students.

A number of sections are designated with an asterisk (*). This indicates that later sections do not rely materially on the material in that section. This designation is not intended to suggest that instructors skip these sections. Skipping one of these sections will not cause the students to miss definitions or results that they will need later. The sections are 2.4, 3.10, 4.8, 7.7, 7.8, 7.9, 8.6, 8.8, 9.2, 9.3, 9.4, 9.8, 9.9, 10.6, 10.7, 10.8, 11.4, 11.7, 11.8, and 12.5. Aside from cross-references between sections within this list, occasional material from elsewhere in the text does refer back to some of the sections in this list. Each of the dependencies is quite minor, however. Most of the dependencies involve references from Chapter 12 back to one of the optional sections. The reason for this is that the optional sections address some of the more difficult material, and simulation is most useful for solving those difficult problems that cannot be solved analytically. Except for passing references that help put material into context, the dependencies are as follows:

- The sample distribution function (Section 10.6) is reintroduced during the discussion of the bootstrap in Section 12.6. The sample distribution function is also a useful tool for displaying simulation results. It could be introduced as early as Example 12.3.7 simply by covering the first subsection of Section 10.6.
- The material on robust estimation (Section 10.7) is revisited in some simulation exercises in Section 12.2 (Exercises 4, 5, 7, and 8).
- Example 12.3.4 makes reference to the material on two-way analysis of variance (Sections 11.7 and 11.8).

Supplements

The text is accompanied by the following supplementary material:

- Instructor's Solutions Manual contains fully worked solutions to all exercises in the text. Available for download from the Instructor Resource Center at www.pearsonhighered.com/irc.
- Student Solutions Manual contains fully worked solutions to all odd exercises in the text. Available for purchase from MyPearsonStore at www.mypearsonstore.com. (ISBN-13: 978-0-321-71598-2; ISBN-10: 0-321-71598-5)

Acknowledgments

There are many people whom I want to thank for their help and encouragement during this revision. First and foremost, I want to thank Marilyn DeGroot and Morrie's children for giving me the chance to revise Morrie's masterpiece.

I am indebted to the many readers, reviewers, colleagues, staff, and people at Addison-Wesley whose help and comments have strengthened this edition. The reviewers were:

Andre Adler, Illinois Institute of Technology; E. N. Barron, Loyola University; Brian Blank, Washington University in St. Louis; Indranil Chakraborty, University of Oklahoma; Daniel Chambers, Boston College; Rita Chattopadhyay, Eastern Michigan University; Stephen A. Chiappari, Santa Clara University; Sheng-Kai Chang, Wayne State University; Justin Corvino, Lafayette College; Michael Evans, University of Toronto; Doug Frank, Indiana University of Pennsylvania; Anda Gadidov, Kennesaw State University; Lyn Geisler, Randolph–Macon College; Prem Goel, Ohio State University; Susan Herring, Sonoma State University; Pawel Hitczenko, Drexel University; Lifang Hsu, Le Moyne College; Wei-Min Huang, Lehigh University; Syed Kirmani, University of Northern Iowa; Michael Lavine, Duke University; Rich Levine, San Diego State University; John Liukkonen, Tulane University; Sergio Loch, Grand View College; Rosa Matzkin, Northwestern University; Terry McConnell, Syracuse University; Hans-Georg Mueller, University of California–Davis; Robert Myers, Bethel College; Mario Peruggia, The Ohio State University; Stefan Ralescu, Queens University; Krishnamurthi Ravishankar, SUNY New Paltz; Diane Saphire, Trinity University; Steven Sepanski, Saginaw Valley State University; Hen-Siong Tan, Pennsylvania University; Kanapathi Thiru, University of Alaska; Kenneth Troske, Johns Hopkins University; John Van Ness, University of Texas at Dallas; Yehuda Vardi, Rutgers University; Yelena Vaynberg, Wayne State University; Joseph Verducci, Ohio State University; Mahbobeh Vezveai, Kent State University; Brani Vidakovic, Duke University; Karin Vorwerk, Westfield State College; Bette Warren, Eastern Michigan University; Calvin L. Williams, Clemson University; Lori Wolff, University of Mississippi.

The person who checked the accuracy of the book was Anda Gadidov, Kennesaw State University. I would also like to thank my colleagues at Carnegie Mellon University, especially Anthony Brockwell, Joel Greenhouse, John Lehoczky, Heidi Sestrich, and Valerie Ventura.

The people at Addison-Wesley and other organizations that helped produce the book were Paul Anagnostopoulos, Patty Bergin, Dana Jones Bettez, Chris Cummings, Kathleen DeChavez, Alex Gay, Leah Goldberg, Karen Hartpence, and Christina Lepre.

If I left anyone out, it was unintentional, and I apologize. Errors inevitably arise in any project like this (meaning a project in which I am involved). For this reason, I shall post information about the book, including a list of corrections, on my Web page, http://www.stat.cmu.edu/~mark/, as soon as the book is published. Readers are encouraged to send me any errors that they discover.

Mark J. Schervish

October 2010

Chapter 1

Introduction to Probability

1.1 The History of Probability
1.2 Interpretations of Probability
1.3 Experiments and Events
1.4 Set Theory
1.5 The Definition of Probability
1.6 Finite Sample Spaces
1.7 Counting Methods
1.8 Combinatorial Methods
1.9 Multinomial Coefficients
1.10 The Probability of a Union of Events
1.11 Statistical Swindles
1.12 Supplementary Exercises

1.1 The History of Probability

The use of probability to measure uncertainty and variability dates back hundreds of years. Probability has found application in areas as diverse as medicine, gambling, weather forecasting, and the law.

The concepts of chance and uncertainty are as old as civilization itself. People have alway…
