Just need help with the remainder of the assignment.
Statistical Literacy and Critical Thinking
1. Frequency Table. What is a frequency table? How does it show categories and frequencies?
A basic frequency table has two columns: one lists the categories of the data, and the other lists
the frequency of each category, which is the number of data values in the category.
2. Relative Frequency. What do we mean by relative frequency?
The relative frequency of a category is its frequency divided by the total number of data values, expressed as a fraction or a percentage.
3. Cumulative Frequency. What do we mean by cumulative frequency?
The cumulative frequency of any category is the number of data values in that category and all preceding categories.
4. Binning. What is the purpose of binning? Give an example for which binning is useful?
Binning groups a large number of more or less continuous numerical values into a smaller number of categories (bins). Example: grouping people’s ages into ranges such as 20 to 29, 30 to 39, and so on.
5. Frequency Table. I made a frequency table with two columns, one labeled State and one labeled State Capital.
Does not make sense. It would not be a frequency table, because one column in a frequency table must consist of frequency counts, but State and State Capital are both column headings for names only.
6. Relative Frequency. The third category in my frequency table has a relative frequency of −25%.
Does not make sense. A relative frequency is a category’s count divided by the total count, so it can never be negative; it must be between 0% and 100%.
7. Cumulative Frequency. The third category in my frequency table has a cumulative frequency of 150.
Makes sense. Although no details are given, it’s certainly possible for the sum of the frequencies for the first three categories in a table to be 150, or any other whole number.
8. Bins. I saw two frequency tables of airline passenger weights, one using bins that spanned 10-pound
ranges (e.g., 101 to 110 pounds) and the second with bins that spanned 20-pound ranges (e.g., 101 to
120 pounds). The first table had twice as many categories as the second.
Makes sense. Because the bin width is doubled in the second frequency table, it needs only about half as many bins to cover the same range of weights.
Concepts and Applications
Pulse Rates of Females. In Exercises 9–12, refer to the following frequency table of pulse rates of a
sample of females.
[Frequency table not reproduced: pulse rate (beats per min) vs. frequency]
9. How many females are represented in the frequency table? Females: 145
How many categories are in the frequency table? 6 categories
10. List the relative frequencies that correspond to the given frequencies
11. List the cumulative frequencies that correspond to the given frequencies.
17, 50, 91, 128, 141, 145
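Because the frequency table itself is missing here, one way to work on Exercise 10 is to recover the frequencies from the cumulative frequencies listed above. The sketch below assumes that cumulative list is correct; the variable names are illustrative only.

```python
# Cumulative frequencies as listed in the answer to Exercise 11 (taken as given).
cumulative = [17, 50, 91, 128, 141, 145]

# Each frequency is the difference between consecutive cumulative frequencies.
frequencies = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

# The last cumulative frequency is the total number of data values.
total = cumulative[-1]

# Relative frequency = frequency / total, shown here as a percentage.
relative_pct = [round(100 * f / total, 1) for f in frequencies]

print("Freq  Rel.%  Cumul.")
for f, r, c in zip(frequencies, relative_pct, cumulative):
    print(f"{f:>4} {r:>6.1f} {c:>7}")
```

Rounded percentages may not add up to exactly 100%, which is normal for relative frequency columns.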
12. What are the lowest and highest possible pulse rates that could be included in the frequency table?
Lowest: 50. Highest: 109.
13. Birth Days. Births at a hospital in New York State occurred on the different days of the week (in the
order Monday through Sunday) with these frequencies: 52, 66, 72, 57, 57, 43, 53. Construct a frequency
table with a column for relative frequencies given as percentages. Do the data seem reasonable?
We would generally expect the number of births to be about the same on any day of the week, but with a relatively small data set like this one, it’s reasonable to see variation of the type shown in the table.
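The frequency table asked for in Exercise 13 can be sketched as follows, using the seven birth counts from the exercise. The variable names and the printed layout are illustrative choices, not part of the exercise.

```python
days = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
births = [52, 66, 72, 57, 57, 43, 53]  # frequencies from the exercise

total = sum(births)

# Each row: (category, frequency, relative frequency as a percentage).
table = [(day, n, 100 * n / total) for day, n in zip(days, births)]

print(f"{'Day':<10}{'Births':>7}{'Relative':>10}")
for day, n, pct in table:
    print(f"{day:<10}{n:>7}{pct:>9.2f}%")
print(f"{'Total':<10}{total:>7}{100:>9.2f}%")
```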
14. Clinical Trial. As part of a clinical trial, the drug tofacitinib citrate was administered in 5-mg doses to
1336 subjects as a rheumatoid arthritis treatment. Here are the numbers of adverse reactions: 57
subjects had headaches, 21 had hypertension, 60 had upper respiratory tract infections, 51 had
nasopharyngitis, and 53 had diarrhea. Construct a frequency table, and include a column for relative
frequencies given as percentages. Which side effect was the most common?
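A possible approach to Exercise 14 is sketched below. The exercise does not say which total the percentages should be based on, so this sketch computes both versions: relative to the number of listed reactions and relative to the 1336 subjects. Which one the textbook intends is an assumption you would need to check.

```python
reactions = {
    "headache": 57,
    "hypertension": 21,
    "upper respiratory tract infection": 60,
    "nasopharyngitis": 51,
    "diarrhea": 53,
}
SUBJECTS = 1336  # number of subjects given in the exercise


def relative_table(counts, denominator):
    """Return each count as a percentage of the chosen denominator."""
    return {name: 100 * n / denominator for name, n in counts.items()}


total_reactions = sum(reactions.values())
by_reactions = relative_table(reactions, total_reactions)  # share of listed reactions
by_subjects = relative_table(reactions, SUBJECTS)          # share of all subjects

most_common = max(reactions, key=reactions.get)

for name, n in reactions.items():
    print(f"{name:<36}{n:>4}{by_reactions[name]:>8.1f}%{by_subjects[name]:>8.1f}%")
print("Most common:", most_common)
```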
15. Train Derailments. An analysis of 50 train derailment incidents identified the main causes listed
below, where T denotes bad track, E denotes faulty equipment, H denotes human error, and O denotes
other causes (based on data from the Federal Railroad Administration). Construct a frequency table,
and include a column for relative frequencies expressed as percentages. What was the most common
cause of derailment?
The most common cause of derailment was bad track.
16. Analysis of Last Digits. Weights of respondents were recorded as part of the California Health
Interview Survey. The last digits of weights from 50 randomly selected respondents are listed below.
Construct a frequency table with 10 classes. Based on the distribution, do the weights appear to be
estimates or actual measurements?
5 0 1 0 2 0 5 0 5 0 3 8 5 0 5 0 5 6 0 0 0 0 0 0 8
5 5 0 4 5 0 0 4 0 0 0 0 0 8 0 9 5 3 0 5 0 0 0 5 8
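For Exercise 16, counting the last digits can be sketched like this. The 50 digits are copied from the two rows above; the names are illustrative. A heavy concentration on 0 and 5 is usually taken as a sign that values were rounded or estimated rather than measured.

```python
from collections import Counter

last_digits = [
    5, 0, 1, 0, 2, 0, 5, 0, 5, 0, 3, 8, 5, 0, 5, 0, 5, 6, 0, 0, 0, 0, 0, 0, 8,
    5, 5, 0, 4, 5, 0, 0, 4, 0, 0, 0, 0, 0, 8, 0, 9, 5, 3, 0, 5, 0, 0, 0, 5, 8,
]

counts = Counter(last_digits)
total = len(last_digits)

# Ten classes, one per digit; a digit that never occurs gets a frequency of 0.
print("Digit  Freq  Rel.%")
for digit in range(10):
    freq = counts[digit]
    print(f"{digit:>5} {freq:>5} {100 * freq / total:>6.1f}")

# Share of values ending in 0 or 5.
share_0_or_5 = 100 * (counts[0] + counts[5]) / total
print(f"Ending in 0 or 5: {share_0_or_5:.0f}%")
```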
17. Academy Award–Winning Male Actors. The following data show the ages of all Academy Award
winning male actors at the time when they won the award (through 2016). Construct a frequency table
for the data, using bins of 20–29, 30–39, and so on. What ages are the most common for these winners?
44 41 62 53 47 35 34 34 49 41 37 38 34 32 40
43 48 41 39 49 57 41 38 39 52 51 35 30 39 36
43 49 36 47 31 47 37 57 42 45 42 45 62 43 42
48 49 56 38 60 30 40 42 37 76 39 53 45 36 62
43 51 32 42 54 52 37 38 32 45 60 46 40 36 47
29 43 37 38 45 50 48 60 50 39 55 44 33 41
[Frequency table not reproduced: age bin vs. number of actors]
The most common ages for these Academy Award–winning actors are in the 40s.
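The binned frequency table for Exercise 17 can be sketched as below, using the ages listed above and the bins 20-29, 30-39, and so on. The helper name `bin_label` is a hypothetical choice for this illustration.

```python
from collections import Counter

ages = [
    44, 41, 62, 53, 47, 35, 34, 34, 49, 41, 37, 38, 34, 32, 40,
    43, 48, 41, 39, 49, 57, 41, 38, 39, 52, 51, 35, 30, 39, 36,
    43, 49, 36, 47, 31, 47, 37, 57, 42, 45, 42, 45, 62, 43, 42,
    48, 49, 56, 38, 60, 30, 40, 42, 37, 76, 39, 53, 45, 36, 62,
    43, 51, 32, 42, 54, 52, 37, 38, 32, 45, 60, 46, 40, 36, 47,
    29, 43, 37, 38, 45, 50, 48, 60, 50, 39, 55, 44, 33, 41,
]


def bin_label(age, width=10):
    """Return the label of the bin containing this age, e.g. 47 -> '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


counts = Counter(bin_label(age) for age in ages)

# All labels start with a two-digit number, so plain sorting gives age order.
for label in sorted(counts):
    print(f"{label}  {counts[label]}")

most_common_bin = counts.most_common(1)[0][0]
print("Most common bin:", most_common_bin)
```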
18. Body Temperatures. The following data show the body temperatures (in °F) of randomly selected
subjects who were not ill at the time. Construct a frequency table with seven bins:
96.9–97.2, 97.3–97.6, 97.7–98.0, and so on.
98.6 98.6 98.0 98.0 99.0 98.4 98.4 98.4
98.4 98.6 98.6 98.8 98.6 97.0 97.0 98.8
97.6 97.7 98.8 98.0 98.0 98.3 98.5 97.3
98.7 97.4 98.9 98.6 99.5 97.5 97.3 97.6
98.2 99.6 98.7 99.4 98.2 98.0 98.6 98.6
97.2 98.4 98.6 98.2 98.0 97.8 98.0 98.4
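For Exercise 18, the seven bins each span four tenths of a degree, starting at 96.9. The sketch below works in whole tenths of a degree to avoid floating-point edge cases at the bin boundaries; that conversion and the constant names are choices made for this illustration.

```python
temps = [
    98.6, 98.6, 98.0, 98.0, 99.0, 98.4, 98.4, 98.4,
    98.4, 98.6, 98.6, 98.8, 98.6, 97.0, 97.0, 98.8,
    97.6, 97.7, 98.8, 98.0, 98.0, 98.3, 98.5, 97.3,
    98.7, 97.4, 98.9, 98.6, 99.5, 97.5, 97.3, 97.6,
    98.2, 99.6, 98.7, 99.4, 98.2, 98.0, 98.6, 98.6,
    97.2, 98.4, 98.6, 98.2, 98.0, 97.8, 98.0, 98.4,
]

FIRST_LOW_TENTHS = 969   # lower limit of the first bin, 96.9, in tenths
BIN_WIDTH_TENTHS = 4     # 96.9, 97.0, 97.1, 97.2 -> four values per bin
NUM_BINS = 7

bin_counts = [0] * NUM_BINS
for t in temps:
    tenths = round(t * 10)  # e.g. 98.6 -> 986
    index = (tenths - FIRST_LOW_TENTHS) // BIN_WIDTH_TENTHS
    bin_counts[index] += 1

for i, count in enumerate(bin_counts):
    low = FIRST_LOW_TENTHS + i * BIN_WIDTH_TENTHS
    high = low + BIN_WIDTH_TENTHS - 1
    print(f"{low / 10:.1f}-{high / 10:.1f}  {count}")
```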
19. Loaded Die. An experiment was conducted in which a hole was drilled in a die and filled in with lead.
The die was then rolled repeatedly, giving the results shown in the following frequency table.
[Frequency table not reproduced: outcome of die roll vs. frequency]
a. How many times was the die rolled? 200
b. How many times was the outcome greater than 2? 142
c. What percentage of outcomes were 6? 16%
d. List the relative frequencies, as percentages, that correspond to the given frequencies. 13.5%, 15.5%, 21.0%, 20.0%, 14.0%, 16.0%
e. List the cumulative frequencies that correspond to the given frequencies. 27, 58, 100, 140, 168, 200
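The answers to parts (a) through (e) can be cross-checked against each other. The sketch below assumes the cumulative frequencies in part (e) are right, recovers the six frequencies from them, and recomputes the other answers; the names are illustrative.

```python
# Cumulative frequencies for outcomes 1 through 6, from the answer to part (e).
cumulative = [27, 58, 100, 140, 168, 200]

# Undo the running total to get the frequency of each outcome.
frequencies = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

total_rolls = sum(frequencies)                 # part (a)
greater_than_2 = sum(frequencies[2:])          # part (b): outcomes 3, 4, 5, 6
percent_six = 100 * frequencies[5] / total_rolls               # part (c)
relative_pct = [100 * f / total_rolls for f in frequencies]    # part (d)

print("Frequencies:", frequencies)
print("Total rolls:", total_rolls)
print("Outcomes greater than 2:", greater_than_2)
print("Percent sixes:", percent_six)
print("Relative frequencies (%):", relative_pct)
```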
20. Interpreting Family Data. Consider the following frequency table for the number of children in American families.
Number of children Number of families (millions)
4 or more 1.97
o a. According to the data, how many families are there in America?
o b. How many families have two or fewer children?
o c. What percentage of American families have no children?
o d. What percentage of American families have three or more children?
21. Computer Keyboards. The traditional keyboard configuration is called a Qwerty keyboard because of the positioning of the letters QWERTY on the left in the top row of letters. Developed in 1872, the Qwerty configuration supposedly forced people to type slower so that the early typewriters would not jam. Developed in 1936, the Dvorak configuration supposedly provides a more efficient arrangement by positioning the most used keys on the middle row (or “home” row), where they are more accessible.
A Discover magazine article suggested that you can measure the ease of typing by using this point rating
system: Count each letter on the home row as 0, count each letter on the top row as 1, and count each
letter on the bottom row as 2. For example, the word statistics would result in a rating of 7 on the Qwerty
keyboard and 1 on the Dvorak keyboard, as shown below.
S T A T I S T I C S
Qwerty keyboard 0 1 0 1 1 0 1 1 2 0 (sum: 7)
Dvorak keyboard 0 0 0 0 0 0 0 0 1 0 (sum: 1)
Using this rating system with each of the 52 words in the Preamble to the U.S. Constitution, we get the following ratings.
Qwerty Keyboard Word Ratings
2 2 5 1 2 6 3 3 4 2 4 0 5
7 7 5 6 6 8 10 7 2 2 10 5 8
2 5 4 2 6 2 6 1 7 2 7 2 3
8 1 5 2 5 2 14 2 2 6 3 1 7
Dvorak Keyboard Word Ratings
2 0 3 1 0 0 0 0 2 0 4 0 3
4 0 3 3 1 3 5 4 2 0 5 1 4
0 3 5 0 2 0 4 1 5 0 4 0 1
3 0 1 0 3 0 1 2 0 0 0 1 4
o a. Create a frequency table for the Qwerty word ratings data. Use bins of 0–2, 3–5, 6–8, 9–11,
and 12–14. Include a column for relative frequency.
o b. Create a frequency table for the Dvorak word ratings data, using the same bins as in part
(a). Include a column for relative frequency.
o c. Based on your results from parts (a) and (b), which keyboard arrangement is easier for typing?
The Dvorak configuration appears to make typing easier because it has more low word ratings and fewer high ones.
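Parts (a) and (b) of Exercise 21 can be sketched as follows, using the two lists of 52 word ratings given above and the bins 0-2, 3-5, 6-8, 9-11, and 12-14. The function name `binned_counts` is a hypothetical helper for this illustration.

```python
qwerty = [
    2, 2, 5, 1, 2, 6, 3, 3, 4, 2, 4, 0, 5,
    7, 7, 5, 6, 6, 8, 10, 7, 2, 2, 10, 5, 8,
    2, 5, 4, 2, 6, 2, 6, 1, 7, 2, 7, 2, 3,
    8, 1, 5, 2, 5, 2, 14, 2, 2, 6, 3, 1, 7,
]
dvorak = [
    2, 0, 3, 1, 0, 0, 0, 0, 2, 0, 4, 0, 3,
    4, 0, 3, 3, 1, 3, 5, 4, 2, 0, 5, 1, 4,
    0, 3, 5, 0, 2, 0, 4, 1, 5, 0, 4, 0, 1,
    3, 0, 1, 0, 3, 0, 1, 2, 0, 0, 0, 1, 4,
]

BIN_WIDTH = 3   # each bin covers three ratings: 0-2, 3-5, ...
NUM_BINS = 5    # last bin is 12-14


def binned_counts(ratings):
    """Count how many ratings fall into each of the five bins."""
    counts = [0] * NUM_BINS
    for rating in ratings:
        counts[rating // BIN_WIDTH] += 1
    return counts


qwerty_counts = binned_counts(qwerty)
dvorak_counts = binned_counts(dvorak)

print("Bin     Qwerty  Rel.%   Dvorak  Rel.%")
for i in range(NUM_BINS):
    low = i * BIN_WIDTH
    label = f"{low}-{low + BIN_WIDTH - 1}"
    q, d = qwerty_counts[i], dvorak_counts[i]
    print(f"{label:<8}{q:>6}{100 * q / len(qwerty):>7.1f}"
          f"{d:>9}{100 * d / len(dvorak):>7.1f}")
```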
22. Double Binning. The students in a statistics class conduct a transportation survey of students in their high school. Among other data, they record the age and mode of transportation between home and school for each student. The following table gives some of the data that were collected.
o a. Classify the two variables, age, and transportation, as qualitative or quantitative, and give
the level of measurement for each.
o b. In order to be analyzed or displayed, the data must be binned with respect to both
variables. Count the number of students in each of the 25 age/transportation categories and fill
in the blank cells in the following table.
Transportation to school
Age of student
23. Energy Table. The U.S. Energy Information Administration (EIA) website offers dozens of tables relating to energy use, energy prices, and pollution. Explore the selection of tables. Find a table of raw data that is of interest to you and convert it to an appropriate frequency table. Briefly discuss what you can learn from the frequency table that is less obvious in the raw data table.
24. Endangered Species. The website for the World Conservation Monitoring Centre in Great Britain provides data on extinct, endangered, and threatened animal species. Explore these data and summarize some of your more interesting findings with frequency tables.
25. Navel Data. The navel ratio is defined to be a person’s height divided by the height (from the floor) of his or her navel. An old theory says that, on average, the navel ratio of humans is the golden ratio: (1 + √5)/2. Measure the navel ratio of each person in your class. What percentage of students have a navel ratio within 5% of the golden ratio? What percentage of students have a navel ratio within 10% of the golden ratio? Does the old theory seem reliable?
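Once the class measurements exist, the two percentages in Exercise 25 could be computed along these lines. The ratios in `sample_ratios` are made-up placeholder values, not real measurements, and `percent_within` is a hypothetical helper name; "within 5%" is read here as differing from the golden ratio by at most 5% of its value.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2  # about 1.618


def percent_within(ratios, target, tolerance):
    """Percentage of ratios that differ from target by at most tolerance * target."""
    hits = sum(1 for r in ratios if abs(r - target) <= tolerance * target)
    return 100 * hits / len(ratios)


# Placeholder values for illustration only; replace with your class data.
sample_ratios = [1.58, 1.62, 1.65, 1.71, 1.55, 1.80]

print(f"Within 5%:  {percent_within(sample_ratios, GOLDEN_RATIO, 0.05):.1f}%")
print(f"Within 10%: {percent_within(sample_ratios, GOLDEN_RATIO, 0.10):.1f}%")
```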
26. Your Own Frequency Table (Unbinned). Collect your own frequency data for some set of categories that will not require binning. (For example, you might collect data by asking friends to do a taste test on some brand of cookie.) State how you collected your data, and make a list of all your raw data. Then summarize the data in a frequency table. Include a column for relative frequency, and also include a column for cumulative frequency if it is meaningful.
27. Your Own Frequency Table (Binned). Collect your own frequency data for some set of categories that will require binning (for example, weights of your friends or scores on a recent exam). State how you collected your data, and make a list of all your raw data. Then summarize the data in a frequency table. Include columns for relative frequency and cumulative frequency.
IN THE NEWS
28. Frequency Tables. Find a recent news article that includes some type of frequency table. Briefly describe the table, and explain how it is useful to the news report. Do you think the table was constructed in the best possible way for the article? If so, why? If not, what would you have done differently?
29. Relative Frequencies. Find a recent news article that gives at least some data in the form of relative frequencies. Briefly describe the data, and discuss why relative frequencies were useful in this case.
30. Cumulative Frequencies. Find a recent news article that gives at least some data in the form of cumulative frequencies. Briefly describe the data, and discuss why cumulative frequencies were useful in this case.
31. Temperature Data. Look for a weather report that lists yesterday’s high temperatures in many American cities. Choosing appropriate bins, make a frequency table for the high temperature data. Include columns for relative frequency and cumulative frequency. Briefly describe how and why you chose your bins.
Section 3.2 Exercises
Statistical Literacy and Critical Thinking
1. Distribution Graph. What is a distribution of data? Describe the important labels we should include when making a graph of a distribution.
The distribution is the way the values of a variable are spread over all possible values. We can summarize a distribution with a table or a graph. When we make a graph of a distribution, we should include a title and/or caption, scales and titles for the axes, and a legend if more than one data set is shown on the graph.
2. Qualitative Data. Which types of graphs described in this section can be used for qualitative data? Give an example of a qualitative data set and describe how you would show it with each type of graph.
3. Yearly Data. Which type of graph described in this section would work best for depicting data consisting of one value from each of the past 50 consecutive years? What is a major advantage of this type of graph?
A time-series graph would work best for these data. A major advantage of this type of graph is that it allows us to see a pattern of the data over time.
4. Histogram and Stem plot. Assume that a data set is used to construct a histogram and a stem plot. Using only the histogram, is it possible to re-create the original list of data values? Using only the stem plot, is it possible to re-create the original list of data values? What is an advantage of a stem plot over a histogram?
Does It Make Sense? For Exercises 5–8, determine whether the statement makes sense (or is clearly true) or does not make sense (or is clearly false). Explain clearly. Not all of these statements have definitive answers, so your explanation is more important than your chosen answer.
5. Bar Graph. You are given a sample of data values along with specific bins. Your bar graph was marked wrong because it shows different frequencies than the ones shown on the teacher’s answer key.
Makes sense. The bars represent frequencies, so your frequencies should agree with the correct ones.
6. Pie Chart. Your pie chart must be wrong because you have the 45% frequency wedge near the upper left and the answer key shows it near the lower right.
7. Pareto Chart. A quality control engineer wants to draw attention to the car parts that require repair most often, so she uses a Pareto chart to illustrate the frequencies of repairs for the various car parts.
Makes sense. The Pareto chart puts the bars in order of frequency and therefore will make it easy to see which repairs occur most often.
8. Histogram. I rearranged the bars on my histogram so that the tallest bar would come first.
Concepts and Applications
9. Histogram. Children living near a smelter in Texas were exposed to lead, and their IQ scores were subsequently measured. The following histogram was constructed from those IQ scores.
o a. Estimate the frequency for each of the six score categories. 2, 13, 41, 18, 3
o b. Estimate the total number of children included in the histogram. 78
o c. What are the lowest and highest possible IQ scores included in the histogram? 40, 160
o d. How does the shape of the histogram change if relative frequencies are used instead of frequencies?
The shape of the histogram does not change.
10. Understanding Data. Suppose you have a list of blood platelet counts from 500 patients in a hospital. Which of the following is most helpful in understanding the distribution of those values: frequency table, pie chart, or histogram?
Most Appropriate Display. Exercises 11–14 describe data sets but do not give actual data. For each data set, describe the data as qualitative or quantitative, and then state the type of graphic that you believe would be most appropriate for displaying the data, if they were available. Explain your choice.
11. Students. The number of full-time students enrolled in colleges in each year since 1990
Data are quantitative and represent changes over a period of time. A time-series graph would be effective in showing any trend in the number of full-time college students since 1990.
12. Colors. The colors of cars involved in fatal crashes last year
13. IQ Scores. IQ scores of 1000 adults randomly selected last year
Data are quantitative. A histogram would work well to show the frequencies of different categories of IQ scores.
14. Airline Choices. The percentage of flights on a single day by each airline (e.g., United, Delta, Southwest)
15. Academy Award–Winning Male Actors. Exercise 17 in Section 3.1 required the construction of a frequency table for the ages of Academy Award–winning male actors at the time when they won the award. Use that frequency table to construct the corresponding histogram.
16. Body Temperatures. Exercise 18 in Section 3.1 required the construction of a frequency table for a list of body temperatures (in °F) of randomly selected subjects. Use that frequency table to construct the corresponding histogram.
17. Job Hunting. A survey was conducted to determine how employees found their jobs. The table below lists the successful methods identified by 400 randomly selected employees. The data are based on results from the National Center for Career Strategies. Construct a Pareto chart that displays the given data. Based on these results, what appears to be the best method for someone seeking employment?
Method used for job hunting Frequency
Help-wanted ads 56
Executive search firms 44
Mass mailing 20
18. Job Hunting. Refer to the data given in Exercise 17 and construct a pie chart. Compare the pie chart to the Pareto chart. Can you determine which graph is more effective in showing the relative usefulness of the job–hunting methods?
19. Job Application Mistakes. Chief financial officers of U.S. companies were surveyed about areas in which job applicants make mistakes. Here are the areas and the frequency of responses: interview (452); resume (297); cover letter (141); reference checks (143); interview follow-up (113); screening call (85). These results are based on data from Robert Half Finance and Accounting. Construct a pie chart representing the given data.
20. Job Application Mistakes. Construct a Pareto chart of the data given in Exercise 19. Compare the Pareto chart to the pie chart. Which graph is more effective in showing the relative importance of the mistakes made by job applicants?
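Before drawing either chart for Exercises 19 and 20, it helps to have the numbers ready: the wedge angle of each category for the pie chart and the descending order of bars for the Pareto chart. The sketch below computes both from the six frequencies in Exercise 19; it does not draw anything, and the names are illustrative.

```python
mistakes = {
    "interview": 452,
    "resume": 297,
    "cover letter": 141,
    "reference checks": 143,
    "interview follow-up": 113,
    "screening call": 85,
}

total = sum(mistakes.values())

# Pareto chart: bars ordered from highest to lowest frequency.
pareto = sorted(mistakes.items(), key=lambda item: item[1], reverse=True)

# Pie chart: each wedge gets the same share of 360 degrees as its share of the total.
pie_angles = {area: 360 * n / total for area, n in mistakes.items()}

print(f"{'Area':<22}{'Freq':>5}{'Rel.%':>8}{'Angle':>8}")
for area, n in pareto:
    print(f"{area:<22}{n:>5}{100 * n / total:>8.1f}{pie_angles[area]:>8.1f}")
```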
21. Dot plot. Refer to the QWERTY data in Exercise 21 in Section 3.1 and construct a dot plot.
22. Dot plot. Refer to the Dvorak data in Exercise 21 in Section 3.1 and construct a dot plot. Compare the result to the dot plot from Exercise 21 above. Based on the results, does either keyboard configuration appear to be better? Explain.
23. Stem plot. Construct a stem plot of these test scores: 67, 72, 85, 75, 89, 89, 88, 90, 99, 100. How does the stem plot show the distribution of these data?
The lengths of the rows are similar to the heights of bars in a histogram, so longer rows of data values correspond to higher frequencies.
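The stem plot for Exercise 23 can be built as in the sketch below, which uses the tens (and hundreds) digits as stems and the ones digit as leaves. The function name `stem_plot` and the "stem | leaves" layout are choices made for this illustration.

```python
from collections import defaultdict

scores = [67, 72, 85, 75, 89, 89, 88, 90, 99, 100]


def stem_plot(values):
    """Return the rows of a stem plot with stem = value // 10 and leaf = value % 10."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)

    rows = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem:>2} | {leaves}".rstrip())
    return rows


for row in stem_plot(scores):
    print(row)
```

The same function should also work for the movie lengths in Exercise 24, since those are whole numbers as well.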
24. Stem plot. Listed below are the lengths (in minutes) of animated children’s movies. Construct a stem plot. Does the stem plot show the distribution of the data? If so, how?
83 88 120 64 69 71 76 74 75 76
75 75 79 80 78 78 83 77 71 83
80 73 72 82 74 84 90 89 81 81
90 79 92 82 89 82 74 86 76 81
75 75 77 70 75 64 73 74 71 94
25. DJIA. Listed below (in order by row) are annual high values of the Dow Jones Industrial Average from 1995 through 2015. Construct a time-series line chart of the data. Comment on the result.
5,216 6,561 8,259 9,374 11,568 11,401
11,350 10,635 10,454 10,855 10,941 12,464
14,198 13,279 10,580 11,625 12,929 13,589
16,577 18,054 18,351
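As a starting point for Exercise 25, the sketch below pairs each DJIA value with its year (1995 through 2015, in row order), prints a rough text version of the time series, and lists the years in which the annual high fell below the previous year's high. The text bars are only a stand-in for a proper line chart, and the scale of one mark per 500 points is an arbitrary choice.

```python
START_YEAR = 1995

djia_highs = [
    5216, 6561, 8259, 9374, 11568, 11401,
    11350, 10635, 10454, 10855, 10941, 12464,
    14198, 13279, 10580, 11625, 12929, 13589,
    16577, 18054, 18351,
]

# Pair each value with its year: 1995, 1996, ..., 2015.
series = list(zip(range(START_YEAR, START_YEAR + len(djia_highs)), djia_highs))

# Years in which the annual high was lower than in the previous year.
declines = [
    year
    for (_, previous), (year, value) in zip(series, series[1:])
    if value < previous
]

for year, value in series:
    bar = "#" * round(value / 500)  # one mark per 500 points
    print(f"{year} {value:>6,} {bar}")

print("Years with a lower high than the year before:", declines)
```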
26. Home Runs. Listed below (in order by row) are the numbers of home runs in Major League Baseball for each year from 1990 through 2016. Construct a time-series line chart of the data. Is there a trend?
3317 3383 3038 4030 3306 4081
4962 4640 5064 5528 5693 5458
5059 5207 5451 5017 5386 4957
4878 5042 4613 4552 4934 4661
4186 4909 5610
PROJECTS FOR THE INTERNET & BEYOND
27. CO2 Emissions. Find recent data on international carbon dioxide emissions. Create a graph of the data and discuss any important features or trends that you notice.
28. Energy Table. Explore the energy tables at the U.S. Energy Information Administration (EIA) website. Choose a table that you find interesting and make a graph of its data, using any of the graph types discussed in this section. Explain how you made your graph, and briefly discuss what can be learned from it.
29. Statistical Abstract. Go to the website for the Statistical Abstract of the United States. Explore the selection of “frequently requested tables.” Choose one table of interest to you and make a graph from its data, using any of the graph types discussed in this section. Explain how you made your graph, and briefly discuss what can be learned from it.
30. Navel Data. Create an appropriate display of the navel data collected in Exercise 25 of Section 3.1. Discuss any special properties of this distribution.
IN THE NEWS
31. Bar Graphs. Find a recent news article that includes a bar graph with qualitative data categories.
o a. Briefly explain what the bar graph shows and discuss whether it helps make the point of the news article. Are the labels clear?
o b. Briefly discuss whether the bar graph could be recast as a dot plot.
o c. Is the bar graph already a Pareto chart? If so, explain why you think it was drawn this way. If not, do you think it would be clearer if the bars were rearranged to make a Pareto chart? Explain.
32. Pie Charts. Find a recent news article that includes a pie chart. Briefly discuss the effectiveness of the pie chart. For example, would it be better if the data were displayed in a bar graph rather than a pie chart? Could the pie chart be improved in other ways?
33. Histograms. Find a recent news article that includes a histogram. Briefly explain what the histogram shows and discuss whether it helps make the point of the news article. Are the labels clear? Is the histogram a time-series graph? Explain.
34. Line Charts. Find a recent news article that includes a line chart. Briefly explain what the line chart shows and discuss whether it helps make the point of the news article. Are the labels clear? Is the line chart a time-series graph? Explain.