Vocabulary: Individuals & Variables



Date: 20.10.2016
AP Statistics Name:________________________

Chapter 1 Outline Per:______________




Vocabulary:

*Individuals & Variables

Categorical Variable

Quantitative Variable

*Distribution

*Inference

*Frequency Table

*Relative Frequency Table

*Bar Graph

Segmented Bar Graph

*Pie Chart


*Two-Way Table

Marginal Distribution

Conditional Distribution

*Association

*Shape

Symmetric



Skewed

*Measures of Center

Mean

Median


*Spread

IQR


Outliers


*Four Step Process

*Dotplot


*Stemplot

Splitting stems

Back-to-back stemplot

*Histogram

*5-Number Summary

*BoxPlot


*Standard Deviation

Variances



Homework Assignments:

Date         Assignment
Weds 8/19    Intro #1-7 odd, 8
             1.1 #11-25 odd
Fri 8/21     1.2 #37-47 odd, 53-59 odd
Mon 8/24     1.3 #79-97 odd skip 85, 103, 105, 107-110
Weds 8/26    Review Assignment
Mon 8/31     Test



Introduction: Data Analysis: Making Sense of the Data

Example:

The following is a small section of a data set describing education in the US.


State   Region   Population (1000s)   SAT Verbal   SAT Math   % taking   % No HS
CA      PAC      35,894               499          519        54         18.9
CO      MTN      4,601                551          553        27         11.3
CT      NE       3,504                512          514        84         12.5


Identify the individuals, and then identify the variables. Determine if each variable is categorical or quantitative.


1.1 Analyzing Categorical Data

What is the difference between a frequency table and a relative frequency table?

What is roundoff error?

The Bar Graph

Things to remember:



  • Make sure you label your axes and title your graph

  • Scale your axes appropriately

  • Each bar should correspond to the appropriate count.

  • Leave room between bars.




The Pie Chart:

Things to remember:



  • Must include all the categories that make up the whole

  • Counts will be percentages.



Example: Use the data to draw a bar graph AND a pie chart



Enrollment in Darien High School

Freshmen   Sophomores   Juniors   Seniors
340        320          409       389
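As a rough illustration of how a frequency table (counts) becomes a relative frequency table (percents), here is a minimal Python sketch using the enrollment counts above. The variable names are made up for this example, and rounding to one decimal place is an assumption. Because each percent is rounded separately, the rounded column does not always add to exactly 100; that small mismatch is what roundoff error refers to.

```python
# Enrollment counts from the Darien High School example above.
enrollment = {"Freshmen": 340, "Sophomores": 320, "Juniors": 409, "Seniors": 389}

total = sum(enrollment.values())

# Relative frequency = count / total, expressed here as a percent.
relative = {grade: 100 * count / total for grade, count in enrollment.items()}

# Round each percent separately (one decimal place is an arbitrary choice).
rounded = {grade: round(pct, 1) for grade, pct in relative.items()}

print(f"{'Class':<12}{'Count':>6}{'Percent':>9}")
for grade, count in enrollment.items():
    print(f"{grade:<12}{count:>6}{rounded[grade]:>9.1f}")
print(f"{'Total':<12}{total:>6}{sum(rounded.values()):>9.1f}")
```

The percents in the last column are the slice sizes a pie chart would use, while the counts are the bar heights a bar graph would use.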


When is it useful to use a bar graph?


When is it useful to use a pie chart?


Two-Way Tables and Marginal Distribution

Marginal Distribution:

Conditional Distribution:


Example: The table below gives information on the age and the number of school years completed for Americans (in thousands).









                                          Age Group
Education                     25 to 34   35 to 54   55 and over   Total
Did not complete HS               5325       9152         16035
Completed high school            14061      24070         18320
1 to 3 years of college          11659      19925          9662
4 or more years of college       10342      19878          8005
Total













1) What is the total number of people described by this table?

2) What percentage of Americans did not finish high school? (marginal)

3) What percentage of Americans between the ages of 35 to 54 completed high school? (conditional)


4) What percentage of Americans who had between 1 and 3 years of college are over 55? (conditional)


5) What percentage of Americans who are between 25 and 34 years old had only completed high school? (conditional)


6) Is there a relationship between age and whether or not one completed 4 years of college? (conditional)

Age Group                         25-34   35-54   55 and over
Percent with 4 years of college

7) Create a bar graph with the information in #6
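One way to check this kind of work is sketched below in Python. It is only an illustration: the dictionary layout and variable names are assumptions, not part of the handout. A marginal distribution divides a row (or column) total by the grand total, while a conditional distribution divides each cell by the total of the group being conditioned on (here, the age group's column total).

```python
# Counts (in thousands) from the two-way table above.
ages = ["25 to 34", "35 to 54", "55 and over"]
table = {
    "Did not complete HS":        [5325, 9152, 16035],
    "Completed high school":      [14061, 24070, 18320],
    "1 to 3 years of college":    [11659, 19925, 9662],
    "4 or more years of college": [10342, 19878, 8005],
}

row_totals = {edu: sum(counts) for edu, counts in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

# Marginal distribution of education: each row total over the grand total.
marginal_education = {edu: t / grand_total for edu, t in row_totals.items()}

# Conditional distribution of education within each age group:
# each cell divided by its column (age group) total.
conditional_given_age = {
    age: {edu: table[edu][j] / col_totals[j] for edu in table}
    for j, age in enumerate(ages)
}

print("Grand total (thousands):", grand_total)
for edu, p in marginal_education.items():
    print(f"{edu:<28}{100 * p:6.1f}%")
for age in ages:
    p = conditional_given_age[age]["4 or more years of college"]
    print(f"{age:<12} 4 or more years of college: {100 * p:5.1f}%")
```

If the conditional percents differ noticeably from one age group to the next, that suggests an association between age and education.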




1.2 Displaying Quantitative Data with Graphs


The Dotplot

Things to remember



  • You only need a properly labeled horizontal axis

  • Title the graph

  • Each dot represents a count of 1

  • Works well with a small data set


Example: Construct a Dotplot with the given information:

Runs Scored by the American League in the last 21 MLB All-Star Games




0   3   6   9   7   2   13
4   9   4   7   3   5   2
7   4   7   13  5   2   4
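For comparison with a hand-drawn graph, the following Python sketch prints a rough text dotplot of the runs data above. It is an illustration only; the layout (three characters per value, one "*" per game) is an arbitrary choice made for this example.

```python
from collections import Counter

# Runs scored by the American League, from the example above.
runs = [0, 3, 6, 9, 7, 2, 13, 4, 9, 4, 7, 3, 5, 2, 7, 4, 7, 13, 5, 2, 4]

counts = Counter(runs)
tallest = max(counts.values())
values = range(min(runs), max(runs) + 1)

# Build the stacks from the top row down; each "*" represents one game.
lines = []
for level in range(tallest, 0, -1):
    row = "".join(" * " if counts[v] >= level else "   " for v in values)
    lines.append(row.rstrip())

# Horizontal axis with every value from the minimum to the maximum,
# so gaps in the data stay visible.
lines.append("".join(f"{v:^3}" for v in values).rstrip())
lines.append("Runs scored")
print("\n".join(lines))
```

Including the values with no dots (such as 1, 8, 10, 11, and 12) on the axis matters, because gaps are part of the shape of the distribution.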




When describing the overall pattern of a distribution, you MUST address the following 4 things.

1. The CENTER of the data

2. The SHAPE of the data

  • Symmetric

  • Skewed to the right

  • Skewed to the left

3. The SPREAD of the data

  • Range

  • IQR

4. Any OUTLIERS in the data

Example:

Describe the overall pattern of the distribution of runs scored by the American League in the example above.
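A few numbers can support a written description. The Python sketch below is a rough aid, not a substitute for looking at the graph: it computes the median and mean (center) and the range (spread) for the runs data. Comparing the mean with the median is only a hint about shape, and the variable names are invented for this example.

```python
from statistics import mean, median

# Runs scored by the American League in the 21 All-Star Games listed earlier.
runs = [0, 3, 6, 9, 7, 2, 13, 4, 9, 4, 7, 3, 5, 2, 7, 4, 7, 13, 5, 2, 4]

center = median(runs)               # middle value of the ordered data
average = mean(runs)                # arithmetic average
data_range = max(runs) - min(runs)  # max minus min

print("Median:", center)
print("Mean:", round(average, 2))
print("Range:", data_range)

# A mean noticeably above the median hints at a longer right tail;
# confirm the shape from the dotplot itself.
if average > center:
    print("Mean is above the median: check the graph for right skew.")
elif average < center:
    print("Mean is below the median: check the graph for left skew.")
else:
    print("Mean equals the median: check the graph for symmetry.")
```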



The Stemplot

Things to remember



  • Separate each piece of data into a stem (all but the rightmost digit) and a leaf (the final digit).

  • Write the stems vertically in increasing order from top to bottom.

  • Write the leaves in increasing order out from the stem

  • Be very neat and make sure you leave the same amount of space in between leaves.

  • Title your graph

  • Include a key identifying what the stem and leaves represent.

  • Works well with a small data set

Example:


During the early part of the 2004 baseball season, many sports fans and baseball players noticed that the number of home runs being hit seemed to be unusually large. Here are the data on the number of home runs hit by American League teams.
American League: 35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77
Construct an appropriate graph to display the number of home runs hit in the American League.

American League Home Runs
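The stem-and-leaf steps listed above can be sketched in Python as follows. This is a minimal illustration that assumes two-digit whole numbers, so the stem is the tens digit and the leaf is the ones digit; the names are made up for this example.

```python
from collections import defaultdict

# Home runs hit by American League teams, from the example above.
home_runs = [35, 40, 43, 49, 51, 54, 57, 58, 58, 64, 68, 68, 75, 77]

# Split each value into a stem (tens digit) and a leaf (ones digit).
stems = defaultdict(list)
for value in sorted(home_runs):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

# List every stem from smallest to largest, even one with no leaves,
# with the leaves in increasing order out from the stem.
lines = []
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    lines.append(f"{stem} | {leaves}".rstrip())
lines.append("Key: 3 | 5 means 35 home runs")
print("\n".join(lines))
```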

When is it advantageous to split stems in a stemplot?



The Histogram

Things to remember:



  • It is the most common graph of a quantitative variable.

  • The x-axis is continuous, so there should be no gaps between the bars (unless a class has zero observations)

  • Title your graph

How to make a histogram:



  • Divide the range of data into classes of equal width

  • Find the count (frequency) or percent (relative frequency) of individuals in each class

  • Label and scale your axes and draw histogram

Example: NBA Scoring Averages



The following table presents the average points scored per game (PPG) for the 30 NBA teams in the 2009-2010 regular season. Create a frequency histogram and a relative frequency histogram.


Team                     PPG     Team                     PPG     Team                     PPG
Atlanta Hawks            101.7   Indiana Pacers           100.8   Oklahoma City Thunder    101.5
Boston Celtics           99.2    Los Angeles Clippers     95.7    Orlando Magic            102.8
Charlotte Bobcats        95.3    Los Angeles Lakers       101.7   Philadelphia 76ers       97.7
Chicago Bulls            97.5    Memphis Grizzlies        102.5   Phoenix Suns             110.2
Cleveland Cavaliers      102.1   Miami Heat               96.5    Portland Trail Blazers   98.1
Dallas Mavericks         102     Milwaukee Bucks          97.7    Sacramento Kings         100
Denver Nuggets           106.5   Minnesota Timberwolves   98.2    San Antonio Spurs        101.4
Detroit Pistons          94      New Jersey Nets          92.4    Toronto Raptors          104.1
Golden State Warriors    108.8   New Orleans Hornets      100.2   Utah Jazz                104.2
Houston Rockets          102.4   New York Knicks          102.1   Washington Wizards       96.2
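The "how to make a histogram" steps can be sketched in Python as below. The class width of 5 points starting at 90 is an assumption chosen for this illustration; other equal-width classes are just as valid and will change how the graph looks. Each class includes its lower endpoint and excludes its upper endpoint.

```python
# Points per game for the 30 NBA teams in the table above.
ppg = [
    101.7, 100.8, 101.5, 99.2, 95.7, 102.8, 95.3, 101.7, 97.7, 97.5,
    102.5, 110.2, 102.1, 96.5, 98.1, 102, 97.7, 100, 106.5, 98.2,
    101.4, 94, 92.4, 104.1, 108.8, 100.2, 104.2, 102.4, 102.1, 96.2,
]

start = 90  # lower edge of the first class (assumed)
width = 5   # equal class width (assumed)

# Count how many teams fall in each class [low, low + width).
bins = {}
for x in ppg:
    low = start + width * int((x - start) // width)
    bins[low] = bins.get(low, 0) + 1

# Print frequency and relative frequency for each class, with a bar of "#".
for low in sorted(bins):
    count = bins[low]
    percent = 100 * count / len(ppg)
    label = f"[{low}, {low + width})"
    print(f"{label:<12}{'#' * count:<16}{count:>3}{percent:>7.1f}%")
```

The frequency histogram and the relative frequency histogram have the same shape; only the scale on the vertical axis changes.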

How is the stemplot of a distribution related to its histogram?


What is the difference between a bar graph and a histogram?


When is it better to use a histogram rather than a stemplot or dotplot?



1.3 Describing Quantitative Data with Numbers
Measuring Center with the Mean and Median
The Mean (“x bar”)

  • The most common measure of center

  • The mean is the arithmetic average.

  • To find the mean of a set of observations you use the following formula:

      x̄ = (x₁ + x₂ + … + xₙ) / n = (Σ xᵢ) / n

Important things to remember:



  • Although most common, not always most appropriate measure

  • Very sensitive to outliers

  • If a distribution is skewed, the mean will not be an accurate measure of center


The Median M

  • The _______________ of a distribution

  • The measure of center which is resistant to _________________ and _____________________

  • To find the median of a distribution:

    1. Arrange observations in order from ______________ to ___________________

    2. If the number of observations is odd, the median is the center observation in the ordered list

    3. If the number of observations is even, the median is the average of the two center values in the ordered list
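The median steps above, together with the arithmetic average, can be sketched in Python as follows. The quiz scores are hypothetical values invented for this illustration (they are not from the handout), and the function names are likewise just for the example.

```python
def mean(values):
    """Arithmetic average: the sum of the observations divided by n."""
    return sum(values) / len(values)


def median(values):
    """Middle of the ordered list; average the two center values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical quiz scores, made up for this example.
scores = [4, 7, 8, 9, 10]
with_outlier = scores + [40]  # add one extreme value

print("Without outlier: mean =", mean(scores), " median =", median(scores))
print("With outlier:    mean =", mean(with_outlier), " median =", median(with_outlier))
```

Adding a single extreme value moves the mean a great deal while the median barely changes.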


Comparing the Mean and Median

  • The mean and median in a symmetric distribution will be very close to each other.

  • If a distribution is exactly symmetric, the median and mean __________________________________

  • If a distribution is skewed to the left, the mean will _________________________________________

  • If the distribution is skewed to the right, the mean will __________________________________


Measuring the spread of a distribution
The Range:

  • The difference between the largest and smallest observation (max – min).

  • Useful if there are no outliers.


The Interquartile Range (IQR)

  • The range of the middle __________________

  • To calculate the quartiles:

  1. Q1 is the median of the bottom half of the observations. It separates the bottom ___% of observations from the top ____%.

  2. Q3 is the median of the top half of the observations. It separates the top ___% of observations from the bottom ___%.

  3. IQR: _____________



Outliers: 1.5 × IQR Rule

An observation is an outlier if it falls:



  • Less than __________________________________

  • Higher than _____________________________________


Five Number Summary:

  • _________________

  • _________________

  • _________________

  • _________________

  • _________________


Example:

Here are data on the amount of fat (in grams) for each of McDonald’s different chicken sandwiches:


16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9
Find the mean, median, IQR, and 5-number summary, and determine whether there are any outliers.
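A possible way to check hand calculations is sketched below in Python. It assumes the quartile method described in this outline (Q1 and Q3 are the medians of the bottom and top halves, leaving out the overall median when n is odd); software packages often use other quartile conventions, so their results can differ slightly. The function names are invented for this example.

```python
def median(values):
    """Middle of the ordered list; average the two center values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 == 1 else (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Q1 and Q3 as medians of the bottom and top halves of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]  # skips the median itself when n is odd
    return median(lower), median(upper)


# Fat (in grams) for the chicken sandwiches in the example above.
fat = [16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9]

x_bar = sum(fat) / len(fat)
m = median(fat)
q1, q3 = quartiles(fat)
iqr = q3 - q1

# 1.5 x IQR rule: flag values beyond the fences as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in fat if x < low_fence or x > high_fence]

five_number = (min(fat), q1, m, q3, max(fat))

print("Mean:", round(x_bar, 2))
print("Five-number summary (min, Q1, median, Q3, max):", five_number)
print("IQR:", iqr)
print("Fences:", low_fence, "and", high_fence)
print("Outliers:", outliers)
```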

BoxPlots:

  • From the graph, we can see center, shape, and spread

  • How to Construct a BoxPlot:

  • A central box is drawn from the first quartile to the third quartile

  • A line in the box marks the median

  • Lines (whiskers) extend from the box out to the smallest and largest observations that aren’t outliers.



Example: Create a boxplot with the above data about McDonald’s chicken sandwiches.
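Before drawing, it can help to list the pieces a boxplot needs. The Python sketch below does that for the sandwich data, using the same halves-based quartile convention as this outline and the 1.5 × IQR rule; it prints the values to plot rather than drawing the graph, and the variable names are assumptions made for this example.

```python
from statistics import median

# Fat (in grams) for the chicken sandwiches, sorted from smallest to largest.
fat = sorted([16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9])

n = len(fat)
half = n // 2
q1 = median(fat[:half])            # median of the bottom half
q3 = median(fat[half + n % 2:])    # median of the top half
iqr = q3 - q1

# Fences from the 1.5 x IQR rule.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme observations that are not outliers.
inside = [x for x in fat if low_fence <= x <= high_fence]
whiskers = (min(inside), max(inside))
outliers = [x for x in fat if x < low_fence or x > high_fence]

print("Box from Q1 =", q1, "to Q3 =", q3)
print("Line inside the box at the median =", median(fat))
print("Whiskers reach", whiskers[0], "and", whiskers[1])
print("Plot separately as outliers:", outliers)
```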

How to find the standard deviation of n observations:

  1. Find the distance of each observation from the mean and square each of these distances

  2. Average the squared distances by dividing their sum by n − 1

  3. The standard deviation (s) is the square root of this average squared distance


Important note: The variance is the standard deviation squared (s²)
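The three steps can be followed line by line in a short Python sketch. It reuses the chicken sandwich fat data from the earlier example as an illustration and cross-checks the result against the standard library's `statistics.stdev`, which also divides by n − 1.

```python
import math
import statistics

# Fat (in grams) for the chicken sandwiches from the earlier example.
fat = [16, 10, 20, 17, 28, 12, 23, 17, 17, 10, 16, 9, 15, 9]

n = len(fat)
x_bar = sum(fat) / n

# Step 1: distance of each observation from the mean, squared.
squared_distances = [(x - x_bar) ** 2 for x in fat]

# Step 2: "average" the squared distances by dividing their sum by n - 1.
variance = sum(squared_distances) / (n - 1)

# Step 3: the standard deviation is the square root of that average.
s = math.sqrt(variance)

print("Variance:", round(variance, 3))
print("Standard deviation:", round(s, 3))
print("Matches statistics.stdev:", math.isclose(s, statistics.stdev(fat)))
```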
Choosing Measures of Center and Spread

  • The median and IQR are used when describing __________________________________

  • The mean and standard deviation are used when describing ___________________________

