AP Statistics Name:________________________
Chapter 1 Outline Per:______________
Vocabulary:
*Individuals & Variables
Categorical Variable
Quantitative Variable
*Distribution
*Inference
*frequency table
*relative frequency table
*Bar Graph
Segmented Bar Graph
*Pie Chart

*Two-Way Table
Marginal Distribution
Conditional Distribution
*Association
*Shape
Symmetric
Skewed
*Measures of Center
Mean
Median
*Spread
IQR
Outliers

*Four Step Process
*Dotplot
*Stemplot
Splitting stems
Back-to-back stemplot
*Histogram
*5-Number Summary
*Boxplot
*Standard Deviation
Variance

Homework Assignments:
Date        Assignment
Weds 8/19   Intro #1-7 odd, 8
            1.1 #11-25 odd
Fri 8/21    1.2 #37-47 odd, 53-59 odd
Mon 8/24    1.3 #79-97 odd (skip 85), 103, 105, 107-110
Weds 8/26   Review Assignment
Mon 8/31    Test

Introduction: Data Analysis: Making Sense of the Data
Example:
The following is a small section of a data set describing education in the US.
State  Region  Population (1000s)  SAT Verbal  SAT Math  % taking  % No HS
CA     PAC     35,894              499         519       54        18.9
CO     MTN     4,601               551         553       27        11.3
CT     NE      3,504               512         514       84        12.5
Identify the individuals, and then identify the variables. Determine if each variable is categorical or quantitative.

Analyzing Categorical Data
What is the difference between a frequency table and a relative frequency table?
What is roundoff error?
The Bar Graph
Things to remember:

Make sure you label your axes and title your graph

Scale your axes appropriately

Each bar should correspond to the appropriate count.

Leave room between bars.

The Pie Chart:
Things to remember:

Must include all the categories that make up the whole

Counts are converted to percentages of the whole.

Example: Use the data to draw a bar graph AND a pie chart
Enrollment in Darien High School
Freshmen   Sophomores   Juniors   Seniors
340        320          409       389
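As a rough sketch of how the counts above turn into the percentages a pie chart needs, the following Python snippet builds a frequency table and a relative frequency table from the enrollment data. The variable names are illustrative and not part of the original handout.

```python
# Frequency table: counts copied from the enrollment example above.
enrollment = {"Freshmen": 340, "Sophomores": 320, "Juniors": 409, "Seniors": 389}

total = sum(enrollment.values())

# Relative frequency table: each count divided by the total.
relative = {grade: count / total for grade, count in enrollment.items()}

for grade, count in enrollment.items():
    print(f"{grade:<11} {count:>4} {relative[grade]:>7.1%}")

# Rounded percents need not add to exactly 100; any gap is roundoff error.
rounded_sum = sum(round(100 * p, 1) for p in relative.values())
print("Sum of rounded percents:", round(rounded_sum, 1))
```

The counts would label the bars of a bar graph, while the relative frequencies would set the slice sizes of a pie chart.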
When is it useful to use a bar graph?
When is it useful to use a pie chart?
Two-Way Tables and Marginal Distributions
Marginal Distribution:
Conditional Distribution:
Example: The table below gives information on the age and the number of school years completed for Americans (in thousands).


                                         Age Group
Education                    25 to 34    35 to 54    55 and over    Total
Did not Complete HS          5325        9152        16035
Completed High School        14061       24070       18320
1 to 3 years of college      11659       19925       9662
4 or more years of college   10342       19878       8005
Total

1) What is the total number of people described by this table?
2) What percentage of Americans did not finish high school? (marginal)
3) What percentage of Americans between the ages of 35 to 54 completed high school? (marginal)
4) What percentage of Americans who had between 1 and 3 years of college are over 55? (conditional)
5) What percentage of Americans who are between 25 and 34 years old had only completed high school? (conditional)
6) Is there a relationship between age and whether or not one completed 4 years of college? (conditional)
Age Group                         25-34    35-54    55 and over
Percent with 4 years of college

7) Create a bar graph with the information in #6
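One possible way to check this kind of work is sketched below in Python. It computes the row and column totals of the two-way table, the marginal distribution of education, and the conditional distribution of education within each age group. The dictionary layout and variable names are assumptions made for illustration only.

```python
# Counts (in thousands) copied from the two-way table above.
ages = ["25 to 34", "35 to 54", "55 and over"]
table = {
    "Did not complete HS":        [5325, 9152, 16035],
    "Completed high school":      [14061, 24070, 18320],
    "1 to 3 years of college":    [11659, 19925, 9662],
    "4 or more years of college": [10342, 19878, 8005],
}

row_totals = {edu: sum(counts) for edu, counts in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

# Marginal distribution of education: row total / grand total.
marginal_education = {edu: t / grand_total for edu, t in row_totals.items()}

# Conditional distribution of education within each age group:
# cell count / that age group's column total.
conditional_by_age = {
    age: {edu: table[edu][j] / col_totals[j] for edu in table}
    for j, age in enumerate(ages)
}

for edu, share in marginal_education.items():
    print(f"{edu:<27} {share:.1%}")
for age in ages:
    share = conditional_by_age[age]["4 or more years of college"]
    print(f"{age:<12} with 4+ years of college: {share:.1%}")
```

A marginal percentage always divides by the grand total, while a conditional percentage divides by the total of the group named after "who" or "of those".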

Displaying Quantitative Data with Graphs
The Dotplot
Things to remember

You only need a properly labeled horizontal axis

Title the graph

Each dot represents a count of 1

Works well with a small data set
Example: Construct a Dotplot with the given information:
Runs Scored by the American League in the last 21 MLB All-Star Games
0    3    6    9    7    2    13
4    9    4    7    3    5    2
7    4    7    13   5    2    4


When describing the overall pattern of a distribution, you MUST address the following 4 things.

1. The CENTER of the data
2. The SHAPE of the data
   Skewed to the right
   Skewed to the left
3. The SPREAD of the data
4. Any OUTLIERS in the data
Example:
Describe the overall pattern of the distribution of runs scored by the American League in the example above.
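A quick text version of the dotplot can help when describing the pattern. The sketch below, in Python, tallies the runs data and prints one star per game. It is turned sideways (values run down the page instead of along a horizontal axis), and the names are illustrative.

```python
from collections import Counter

# Runs scored by the American League, copied from the example above.
runs = [0, 3, 6, 9, 7, 2, 13, 4, 9, 4, 7, 3, 5, 2, 7, 4, 7, 13, 5, 2, 4]

counts = Counter(runs)

# One row per possible value, one star per game with that many runs.
for value in range(min(runs), max(runs) + 1):
    print(f"{value:>2} | " + "*" * counts[value])
```

Rows with no stars show gaps in the data, which is one place to look when checking for possible outliers.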
The Stemplot
Things to remember

Separate each piece of data into a stem (all but the rightmost digit) and a leaf (the final digit).

Write the stems vertically in increasing order from top to bottom.

Write the leaves in increasing order out from the stem

Be very neat and make sure you leave the same amount of space in between leaves.

Title your graph

Include a key identifying what the stem and leaves represent.

Works well with a small data set
Example:
During the early part of the 2004 baseball season, many sports fans and baseball players noticed that the number of home runs being hit seemed to be unusually large. Here are the data on the number of home runs hit by American League teams.
American League: 35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77
Construct an appropriate graph to display the number of home runs hit in the American League.
American League Home Runs
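The stem-and-leaf rules listed above can be sketched in Python as follows. The stem is taken as the tens digit and the leaf as the ones digit, which suits two-digit data like these. The variable names and the printed key are illustrative choices.

```python
from collections import defaultdict

# Home run counts copied from the example above.
home_runs = [35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77]

# Split each value into a stem (tens digit) and a leaf (ones digit).
stems = defaultdict(list)
for value in sorted(home_runs):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

# Stems increase from top to bottom; leaves increase out from the stem.
lines = []
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    lines.append(f"{stem} | {leaves}")

print("American League Home Runs")
print("\n".join(lines))
print("Key: 3 | 5 means 35 home runs")
```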
When is it advantageous to split stems in a stemplot?
The Histogram
Things to remember:

It is the most common graph of a quantitative variable.

The x-axis is continuous, so there should be no gaps between the bars (unless a class has zero observations)

Title your graph
How to make a histogram:

Divide the range of data into classes of equal width

Find the count (frequency) or percent (relative frequency) of individuals in each class

Label and scale your axes and draw the histogram
Example: NBA Scoring Averages
The following table presents the average points scored per game (PPG) for the 30 NBA teams in the 2009-2010 regular season. Create a frequency histogram and a relative frequency histogram.
Team                    PPG     Team                     PPG     Team                     PPG
Atlanta Hawks           101.7   Indiana Pacers           100.8   Oklahoma City Thunder    101.5
Boston Celtics          99.2    Los Angeles Clippers     95.7    Orlando Magic            102.8
Charlotte Bobcats       95.3    Los Angeles Lakers       101.7   Philadelphia 76ers       97.7
Chicago Bulls           97.5    Memphis Grizzlies        102.5   Phoenix Suns             110.2
Cleveland Cavaliers     102.1   Miami Heat               96.5    Portland Trail Blazers   98.1
Dallas Mavericks        102     Milwaukee Bucks          97.7    Sacramento Kings         100
Denver Nuggets          106.5   Minnesota Timberwolves   98.2    San Antonio Spurs        101.4
Detroit Pistons         94      New Jersey Nets          92.4    Toronto Raptors          104.1
Golden State Warriors   108.8   New Orleans Hornets      100.2   Utah Jazz                104.2
Houston Rockets         102.4   New York Knicks          102.1   Washington Wizards       96.2
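The counting step for a histogram can be sketched in Python as shown below. The classes used here (width 2, starting at 92) are only one reasonable choice and are an assumption, not something the handout specifies. Other equal-width classes would work as well.

```python
# PPG values copied from the table above.
ppg = [101.7, 100.8, 101.5, 99.2, 95.7, 102.8, 95.3, 101.7, 97.7, 97.5,
       102.5, 110.2, 102.1, 96.5, 98.1, 102.0, 97.7, 100.0, 106.5, 98.2,
       101.4, 94.0, 92.4, 104.1, 108.8, 100.2, 104.2, 102.4, 102.1, 96.2]

# Assumed classes: 92 to <94, 94 to <96, ..., 110 to <112.
start, width, n_classes = 92, 2, 10

# Frequency: how many teams fall in each class.
counts = [0] * n_classes
for x in ppg:
    counts[int((x - start) // width)] += 1

# Relative frequency: each count divided by the number of teams.
relative = [c / len(ppg) for c in counts]

for k, (c, r) in enumerate(zip(counts, relative)):
    low = start + k * width
    print(f"{low:>3} to <{low + width:<3} {c:>2} {r:>6.1%}  " + "#" * c)
```

The frequency histogram and the relative frequency histogram have the same shape; only the scale on the vertical axis changes.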

How is the stemplot of a distribution related to its histogram?
What is the difference between a bar graph and a histogram?
When is it better to use a histogram rather than a stemplot or dotplot?
1.3 Describing Quantitative Data with Numbers
Measuring Center with the Mean and Median
The Mean (“x bar”)

The most common measure of center

The mean is the arithmetic average.

To find the mean of a set of observations you use the following formula:
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
Important things to remember:

Although most common, not always most appropriate measure

Very sensitive to outliers

If a distribution is skewed, the mean will not be an accurate measure of center
The Median M

The _______________ of a distribution

The measure of center which is resistant to _________________ and _____________________

To find the median of a distribution:

Arrange observations in order from ______________ to ___________________

If the number of observations is odd, the median is the center observation in the ordered list

If the number of observations is even, the median is the average of the two center values in the ordered list
Comparing the Mean and Median

The mean and median in a symmetric distribution will be very close to each other.

If a distribution is exactly symmetric, the median and mean __________________________________

If a distribution is skewed to the left, the mean will _________________________________________

If the distribution is skewed to the right, the mean will __________________________________
Measuring the spread of a distribution
The Range:

The difference between the largest and smallest observation (max – min).

Useful if there are no outliers.
The Interquartile Range (IQR)

The range of the middle __________________

To calculate the quartiles:

Q1 is the median of the bottom half of the observations. It separates the bottom ___% of observations from the top ____%.

Q3 is the median of the top half of the observations. It separates the top ___% of observations from the bottom ___%.

IQR: _____________
Outliers: 1.5 x IQR Rule
An observation is an outlier if it falls:

Less than __________________________________

Higher than _____________________________________
Five Number Summary:

_________________

_________________

_________________

_________________

_________________
Example:
Here is data for the amount of fat (in grams) for each of McDonald’s different chicken sandwiches:
16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9
Find the mean, median, IQR, 5 number summary, and if there are any outliers.
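One way to check hand calculations for this example is the Python sketch below. It finds the quartiles as the medians of the lower and upper halves of the ordered data, which is the convention described in this outline. Some software uses slightly different quartile conventions, so results from a calculator or library may not match exactly. Variable names are illustrative.

```python
from statistics import mean, median

# Fat (in grams) copied from the example above.
fat = [16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9]

data = sorted(fat)
n = len(data)

# Split the ordered list into halves (the middle value is left out when n is odd).
lower_half = data[: n // 2]
upper_half = data[(n + 1) // 2 :]

q1 = median(lower_half)
med = median(data)
q3 = median(upper_half)
iqr = q3 - q1

# 1.5 x IQR rule: flag values beyond the fences.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

five_number = (data[0], q1, med, q3, data[-1])

print("Mean:", round(mean(fat), 2))
print("Five-number summary:", five_number)
print("IQR:", iqr)
print("Outliers:", outliers)
```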
Boxplots:

From the graph, we can see center, shape, and spread

How to Construct a Boxplot:

A central box is drawn from the first quartile to the third quartile

A line in the box marks the median

Lines (whiskers) extend from the box out to the smallest and largest observations that aren’t outliers.
Example: Create a boxplot with the above data about McDonald’s chicken sandwiches.
How to find the standard deviation of n observations:

Find the distance of each observation from the mean and square each of these distances

Average the squared distances by dividing their sum by n - 1

The standard deviation s is the square root of this average squared distance
Important note: The variance is the standard deviation squared (s²)
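The three steps above can be sketched in Python as follows, using the chicken sandwich data from the earlier example as an illustration. The variable names are assumptions, and dividing by n - 1 follows the steps in this outline.

```python
from math import sqrt

# Fat (in grams) from the chicken sandwich example.
fat = [16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9]

n = len(fat)
x_bar = sum(fat) / n

# Step 1: distance of each observation from the mean, squared.
squared_distances = [(x - x_bar) ** 2 for x in fat]

# Step 2: "average" the squared distances by dividing their sum by n - 1.
variance = sum(squared_distances) / (n - 1)

# Step 3: the standard deviation is the square root of the variance.
s = sqrt(variance)

print("Variance:", round(variance, 2))
print("Standard deviation:", round(s, 2))
```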
Choosing Measures of Center and Spread

The median and IQR are used when describing __________________________________

The mean and standard deviation are used when describing ___________________________