Definitions



Date17.12.2020

Week 2 Day 2

Quan Sam


1/10/2020

Definitions:


An inferential situation arises when we take data from samples and generalize to a population, which allows us to make predictions about that population.

We use 2 indicators for the estimation:






Examples:


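First, we generate 11 different samples of 20 values each from a normal distribution with mean 85 and standard deviation 2, and record the mean of each sample.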

means_simulated = vector(mode = "numeric", length = 11)

# One colour per sample; the third sample uses the default colour (black)
cols = c("darkred", "pink", "black", "green", "purple", "tomato",
         "plum", "gold", "dimgrey", "aquamarine", "firebrick1")

for (i in 1:11) {
  a = rnorm(20, 85, 2)                # sample of 20 values, mean 85, sd 2
  b = rep(1 + 0.025 * (i - 1), 20)    # height of this sample's row on the plot
  means_simulated[i] = mean(a)
  if (i == 1) {
    plot(a, b, cex = .5, col = cols[i])    # first sample opens the plot
  } else {
    points(a, b, cex = .5, col = cols[i])  # later samples are added to it
  }
  # triangle at y = 0.8 marks the mean of this sample
  points(x = mean(a), y = 0.8, pch = 24, cex = 0.5, col = cols[i])
}
mean(means_simulated)

## [1] 85.15239






abline(v = mean(means_simulated), col = "red")
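
The simulation above is not reproducible as written, because no random seed is set. A minimal sketch of a compact, reproducible variant follows. The seed value 1 and the names n_samples, n and sample_means are assumptions made for illustration and are not part of the original notes. It also compares the observed spread of the sample means with the theoretical standard error sd/sqrt(n).

```r
set.seed(1)      # assumption: arbitrary seed, chosen only for reproducibility
n_samples = 11   # number of samples, as in the example above
n = 20           # values per sample, as in the example above

# Draw n_samples samples from N(85, 2) and keep only each sample's mean
sample_means = replicate(n_samples, mean(rnorm(n, mean = 85, sd = 2)))

mean(sample_means)   # estimate of the population mean (true value is 85)
sd(sample_means)     # observed spread of the 11 sample means
2 / sqrt(n)          # theoretical standard error of a single sample mean
```

The sample means vary much less than the individual values do, which is why their average lands close to the population mean.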

We can conclude that, although we do not know the mean of the whole population, we can estimate it from the means of the 11 samples, because they lie close to the population mean. Now, we work with a single sample of 30 values.



weight = c(56.6,54.8,59.0,60.4,61.8,62.6,65.0,58.1,61.4,60.8,59.2,58.1,57.5,55.2,54.6,61.6,56.9,61.3,67.2,53.9,54.1,62.0,63.5,58.1,56.0,51.5,63.8,58.1,58.2,61.3)
mean(weight)

## [1] 59.08667



sd(weight)

## [1] 3.655203



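If we only have one sample of 30 values, we can still estimate the mean of the population: we report the sample mean together with a confidence interval around it, which is expected to contain the population mean.

The confidence interval spans from mean(weight) - 2*sd(weight)/sqrt(30) to mean(weight) + 2*sd(weight)/sqrt(30). The factor 2 approximates the normal quantile 1.96, so this is roughly a 95% confidence interval.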
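
The interval described above can be computed directly. The following is a minimal sketch using the weight data from this example. The names n, se, lower and upper are introduced here for illustration. The t-based interval from t.test() is shown only for comparison and is not part of the original notes.

```r
weight = c(56.6,54.8,59.0,60.4,61.8,62.6,65.0,58.1,61.4,60.8,
           59.2,58.1,57.5,55.2,54.6,61.6,56.9,61.3,67.2,53.9,
           54.1,62.0,63.5,58.1,56.0,51.5,63.8,58.1,58.2,61.3)

n  = length(weight)          # sample size (30)
se = sd(weight) / sqrt(n)    # standard error of the mean

lower = mean(weight) - 2 * se
upper = mean(weight) + 2 * se
c(lower, upper)              # roughly 57.75 to 60.42

# For comparison: interval based on the t distribution with n - 1 degrees of freedom
t.test(weight)$conf.int
```

With only 30 values, the t-based interval is slightly wider than the one using the factor 2, because the t quantile for 29 degrees of freedom is a little larger than 2.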