Here is the link to Section 2.2 of the textbook. Below is a modified version of the section.

#### Histogram

For most of the work you do in this book, you will use a histogram to display the data. One advantage of a histogram is that it can readily display large data sets. A rule of thumb is to use a histogram when the data set consists of 100 values or more.

A **histogram** consists of contiguous (adjoining) boxes. It has both a horizontal axis and a vertical axis. The horizontal axis is labeled with what the data represents (for instance, distance from your home to school). The vertical axis is labeled either **frequency** or **relative frequency** (or percent frequency or probability). The graph will have the same shape with either label. The histogram can give you the shape of the data, the center, and the spread of the data.

**To construct a histogram**, first decide how many **bars** or **intervals** (also called classes) will represent the data. Many histograms use five to 15 bars or classes for clarity. Then choose a starting point for the first interval.

### Example 2.7

The following data are the heights (in inches to the nearest half inch) of 100 male semiprofessional soccer players. The heights are **continuous** data, since height is measured.

60 | 60.5 | 61 | 61 | 61.5 | 63.5 | 63.5 | 63.5 |

64 | 64 | 64 | 64 | 64 | 64 | 64 | 64.5 |

64.5 | 64.5 | 64.5 | 64.5 | 64.5 | 64.5 | 64.5 | 66 |

66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |

66 | 66.5 | 66.5 | 66.5 | 66.5 | 66.5 | 66.5 | 66.5 |

66.5 | 66.5 | 66.5 | 66.5 | 67 | 67 | 67 | 67 |

67 | 67 | 67 | 67 | 67 | 67 | 67 | 67 |

67.5 | 67.5 | 67.5 | 67.5 | 67.5 | 67.5 | 67.5 | 68 |

68 | 69 | 69 | 69 | 69 | 69 | 69 | 69 |

69 | 69 | 69 | 69.5 | 69.5 | 69.5 | 69.5 | 69.5 |

70 | 70 | 70 | 70 | 70 | 70 | 70.5 | 70.5 |

70.5 | 71 | 71 | 71 | 72 | 72 | 72 | 72.5 |

72.5 | 73 | 73.5 | 74 |

The smallest data value is 60. We can use this as the starting point.

The largest value is 74. Let’s use this as the ending value.

Next, calculate the width of each bar or class interval. To calculate this width, subtract the starting point from the ending value and divide by the number of bars (you must choose the number of bars you desire). Suppose you choose five bars.

(74 − 60) / 5 = 2.8

If you round 2.8 up to the next whole number, it becomes 3. So, let's build five bars, each with a width of 3. The boundaries are:

- 60
- 60 + 3 = 63
- 63 + 3 = 66
- 66 + 3 = 69
- 69 + 3 = 72
- 72 + 3 = 75

The heights 60 through 61.5 inches are in the interval 60–63. The heights 63.5 through 64.5 are in the interval 63–66. The heights 66 through 68 are in the interval 66–69. The heights 69 through 71 are in the interval 69–72. The heights 72 through 74 are in the interval 72–75.
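The width calculation and the resulting boundaries can be checked with a few lines of Python. This is only a minimal sketch: the function name `class_boundaries` is made up for illustration, and it assumes, as in this example, that the width is rounded up to a whole number.

```python
import math

def class_boundaries(start, end, num_bars):
    """Return (width, boundaries) for equal-width classes covering start..end."""
    raw_width = (end - start) / num_bars   # (74 - 60) / 5 = 2.8
    width = math.ceil(raw_width)           # round up to the next whole number
    # One more boundary than bars: the left edge of each bar plus the final right edge.
    boundaries = [start + i * width for i in range(num_bars + 1)]
    return width, boundaries

width, boundaries = class_boundaries(60, 74, 5)
print(width)        # 3
print(boundaries)   # [60, 63, 66, 69, 72, 75]
```

Rounding up means the five bars cover slightly more than the data range (60 to 75 rather than 60 to 74), so the largest height still falls inside the last bar.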

The following histogram displays the heights on the *x*-axis and frequency on the *y*-axis.

- The first bin tells us there are 5 people with heights between 60 and 63 inches.
- The second bin tells us there are 18 people with heights between 63 and 66 inches.
- The third bin tells us there are 42 people with heights between 66 and 69 inches.
- The fourth bin tells us there are 27 people with heights between 69 and 72 inches.
- The last bin tells us there are 8 people with heights between 72 and 75 inches.
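The five frequencies can be reproduced by tallying the 100 heights into bins of width 3 starting at 60. The sketch below stores the data as a value-to-count dictionary tallied from the table in this example. The helper name `bin_frequencies` is hypothetical, and it assumes each bin includes its left boundary but not its right one.

```python
# Heights from Example 2.7 as {height: number of players}, tallied from the table.
height_counts = {
    60: 1, 60.5: 1, 61: 2, 61.5: 1, 63.5: 3, 64: 7, 64.5: 8,
    66: 10, 66.5: 11, 67: 12, 67.5: 7, 68: 2, 69: 10, 69.5: 5,
    70: 6, 70.5: 3, 71: 3, 72: 3, 72.5: 2, 73: 1, 73.5: 1, 74: 1,
}
# Expand the tally back into a flat list of 100 individual heights.
heights = [h for h, n in height_counts.items() for _ in range(n)]

def bin_frequencies(data, boundaries):
    """Count the values in each class [left, right): left boundary included."""
    freqs = [0] * (len(boundaries) - 1)
    for x in data:
        for i in range(len(freqs)):
            if boundaries[i] <= x < boundaries[i + 1]:
                freqs[i] += 1
                break
    return freqs

frequencies = bin_frequencies(heights, [60, 63, 66, 69, 72, 75])
print(len(heights))   # 100
print(frequencies)    # [5, 18, 42, 27, 8]
```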

**Answer:**

- The first bin tells us 5% of the sample have heights between 59.95 and 61.95 inches.
- The second bin tells us 3% of the sample have heights between 61.95 and 63.95 inches.
- The third bin tells us 15% of the sample have heights between 63.95 and 65.95 inches.
- The fourth bin tells us 40% of the sample have heights between 65.95 and 67.95 inches.
- The fifth bin tells us 17% of the sample have heights between 67.95 and 69.95 inches.
- The sixth bin tells us 12% of the sample have heights between 69.95 and 71.95 inches.
- The seventh bin tells us 7% of the sample have heights between 71.95 and 73.95 inches.
- The last bin tells us 1% of the sample have heights between 73.95 and 75.95 inches.
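These percentages are relative frequencies: each bin's count divided by the sample size. The sketch below recomputes them for eight bins of width 2 starting at 59.95, the boundaries used in the list above. The variable names are illustrative only. Because the boundaries carry one more decimal place than the data, no height can land exactly on a boundary.

```python
# Heights from Example 2.7 as {height: number of players}, tallied from the table.
height_counts = {
    60: 1, 60.5: 1, 61: 2, 61.5: 1, 63.5: 3, 64: 7, 64.5: 8,
    66: 10, 66.5: 11, 67: 12, 67.5: 7, 68: 2, 69: 10, 69.5: 5,
    70: 6, 70.5: 3, 71: 3, 72: 3, 72.5: 2, 73: 1, 73.5: 1, 74: 1,
}
total = sum(height_counts.values())   # sample size, 100

start, width, num_bins = 59.95, 2, 8
bin_counts = [0] * num_bins
for height, n in height_counts.items():
    # Index of the bin containing this height (no height sits on a boundary).
    i = int((height - start) // width)
    bin_counts[i] += n

# Relative frequency, expressed as a percent of the sample.
percents = [100 * c / total for c in bin_counts]
print(bin_counts)   # [5, 3, 15, 40, 17, 12, 7, 1]
print(percents)     # [5.0, 3.0, 15.0, 40.0, 17.0, 12.0, 7.0, 1.0]
```

With exactly 100 values, each frequency and its percent happen to be the same number. For any other sample size the two would differ, although the histogram would keep the same shape.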

### TRY IT 2.7

The following data are the shoe sizes of 50 male students. The sizes are discrete data since shoe size is measured in whole and half units only. Construct a histogram and calculate the width of each bar or class interval. Suppose you choose six bars.

9; 9; 9.5; 9.5; 10; 10; 10; 10; 10; 10; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5; 10.5

11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11; 11.5; 11.5; 11.5; 11.5; 11.5; 11.5; 11.5

12; 12; 12; 12; 12; 12; 12; 12.5; 12.5; 12.5; 12.5; 14

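### Example 2.8

The following data are the number of books bought by 50 students. The number of books is discrete data, since books are counted.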

1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1

2; 2; 2; 2; 2; 2; 2; 2; 2; 2

3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3

4; 4; 4; 4; 4; 4

5; 5; 5; 5; 5

6; 6

The following histogram displays the number of books on the *x*-axis and the frequency on the *y*-axis.

- The first bin tells us 11 students bought 1 book.
- The second bin tells us 10 students bought 2 books.
- The third bin tells us 16 students bought 3 books.
- The fourth bin tells us 6 students bought 4 books.
- The fifth bin tells us 5 students bought 5 books.
- The last bin tells us 2 students bought 6 books.
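For discrete data like this, each distinct value gets its own bar, so the bar heights are simply the number of times each value occurs. A minimal sketch using Python's `collections.Counter`, with the data list rebuilt from the rows of this example:

```python
from collections import Counter

# Number of books bought by each student, as listed in the rows of this example.
books = [1] * 11 + [2] * 10 + [3] * 16 + [4] * 6 + [5] * 5 + [6] * 2

freq = Counter(books)   # maps each value to how many times it occurs
for value in sorted(freq):
    print(value, freq[value])
# 1 11
# 2 10
# 3 16
# 4 6
# 5 5
# 6 2
```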

### TRY IT 2.8

The following data are the number of sports played by 50 student athletes. The number of sports is discrete data since sports are counted.

1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1

2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2; 2

3; 3; 3; 3; 3; 3; 3; 3

20 student athletes play one sport. 22 student athletes play two sports. Eight student athletes play three sports.

### Example 2.9

Using this data set, construct a histogram.

Number of Hours My Classmates Spent Playing Video Games on Weekends | | | | |
---|---|---|---|---|
9.95 | 10 | 2.25 | 16.75 | 0 |
19.5 | 22.5 | 7.5 | 15 | 12.75 |
5.5 | 11 | 10 | 20.75 | 17.5 |
23 | 21.9 | 24 | 23.75 | 18 |
20 | 15 | 22.9 | 18.8 | 20.5 |

#### Answer:

Some values in this data set fall on boundaries for the class intervals. A value is counted in a class interval if it falls on the left boundary, but not if it falls on the right boundary. Different researchers may set up histograms for the same data in different ways. There is more than one correct way to set up a histogram.
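The boundary rule can be made concrete with a short sketch. The class boundaries below (width 5, from 0 to 25) are an assumption chosen for illustration, since more than one setup is valid. The function name `left_inclusive_counts` is likewise hypothetical.

```python
from bisect import bisect_right

hours = [
    9.95, 10, 2.25, 16.75, 0,
    19.5, 22.5, 7.5, 15, 12.75,
    5.5, 11, 10, 20.75, 17.5,
    23, 21.9, 24, 23.75, 18,
    20, 15, 22.9, 18.8, 20.5,
]

# Assumed class boundaries (width 5); the text does not fix a particular choice.
boundaries = [0, 5, 10, 15, 20, 25]

def left_inclusive_counts(data, boundaries):
    """Count values per class [left, right). Assumes every value is below the last boundary."""
    counts = [0] * (len(boundaries) - 1)
    for x in data:
        # bisect_right puts a value equal to a boundary into the class that starts there.
        i = bisect_right(boundaries, x) - 1
        counts[i] += 1
    return counts

counts = left_inclusive_counts(hours, boundaries)
print(counts)   # [2, 3, 4, 7, 9]
```

Under these assumed boundaries, the values 10, 15, and 20 each sit on a boundary and are counted in the class that starts at that value, not the one that ends there.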

### TRY IT 2.9

The following data represent the number of employees at various restaurants in New York City. Using this data, create a histogram.

22; 35; 15; 26; 40; 28; 18; 20; 25; 34; 39; 42; 24; 22; 19; 27; 22; 34; 40; 20; 38; and 28

Use 10–19 as the first interval.

#### Dot Plot

A dot plot is another graph for visualizing the distribution of data. In a **dot plot**, we put a dot above each data value.
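To see the idea in miniature, the sketch below prints a text-only dot plot. The data list is hypothetical, the function name `text_dot_plot` is made up for illustration, and the layout assumes single-digit whole-number values so that the columns line up.

```python
from collections import Counter

def text_dot_plot(data):
    """Return a text dot plot: one '*' stacked above a value for each occurrence."""
    counts = Counter(data)
    values = range(min(counts), max(counts) + 1)   # assumes whole-number data
    tallest = max(counts.values())
    rows = []
    # Build the plot from the top row down to the row just above the axis.
    for level in range(tallest, 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in values)
        rows.append(row.rstrip())
    rows.append(" ".join(str(v) for v in values))   # axis labels
    return "\n".join(rows)

sample = [1, 1, 2, 3, 3, 3, 5]   # hypothetical data, not from the text
plot = text_dot_plot(sample)
print(plot)
#     *
# *   *
# * * *   *
# 1 2 3 4 5
```

Counting the marks in a column gives the frequency of that value, and counting all the marks gives the size of the data set.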

**Example**. The dot plot shows the distribution of ages of employees at a company.

- How many employees are in this company?

Answer: There are 25 dots in this graph, so in total there are 25 employees in this company.

- How many employees are 40 years old?

Answer: There are 4 dots above 40 indicating that 4 employees are 40 years old.

- What percentage of the employees are 40 years old?

Answer: 4 of the 25 employees are 40 years old, so 16% of the employees are 40 years old.

- How many employees are 45 years old?

Answer: 2 employees are 45 years old.

- How many employees are 44 or older?

Answer: 6 employees are 44 years old and 2 employees are 45 years old, so in total 8 employees are 44 or older.

- What percentage of the employees are 44 or older?

Answer: 8 of the 25 employees are 44 or older, so 32% of the employees are 44 or older.
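The percentage answers follow from dividing a dot count by the total number of dots. The sketch below uses only the counts quoted in the answers (25 dots in total, 4 above age 40, 6 above age 44, 2 above age 45). It assumes, as the answers do, that 45 is the largest age on the plot.

```python
total_employees = 25
# Only the dot counts quoted in the answers; the other ages are not listed in the text.
dots = {40: 4, 44: 6, 45: 2}

def percent(count, total):
    """Share of the total, expressed as a percent."""
    return 100 * count / total

pct_40 = percent(dots[40], total_employees)
count_44_plus = sum(n for age, n in dots.items() if age >= 44)
pct_44_plus = percent(count_44_plus, total_employees)

print(pct_40)          # 16.0
print(count_44_plus)   # 8
print(pct_44_plus)     # 32.0
```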
