Path Analyses of ANOVAs

Since ANOVA is simply regression analysis, ANOVA is represented in SEM as a regression analysis. The key is to represent the differences between groups with group-coding variables, just as we did in 513 and in the beginning of 595 . . .


1) Independent Groups t-test

The two groups are represented by a single, dichotomous observed group-coding variable. It is the independent variable in the regression analysis.
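A minimal sketch (with made-up scores, not data from the notes) of why the two-group t-test is just this regression: with 0/1 group coding, the OLS intercept is group 1's mean and the slope is the difference between the group means, which is exactly the quantity the t-test tests.

```python
# A sketch with hypothetical scores: the independent-groups t-test as a
# regression of Y on a 0/1 group-coding variable.
from statistics import mean

g1 = [6, 7, 8, 11, 9]        # group 1, coded x = 0
g2 = [5, 7, 8, 9, 10]        # group 2, coded x = 1
x = [0] * len(g1) + [1] * len(g2)
y = g1 + g2

mx, my = mean(x), mean(y)
# OLS slope and intercept
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) \
    / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

# With 0/1 coding, the intercept equals group 1's mean and the slope
# equals mean(g2) - mean(g1), the mean difference.
print(a, b)
```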

[Path diagram: dichotomous group-coding variable representing the two groups → Dependent Variable, with error term e.]
2) One Way ANOVA

The K groups are represented by K-1 group-coding variables created using one of the coding schemes (although I recommend contrast coding). They are the independent variables in the regression analysis. If contrast codes are used, the correlations between all the group coding variables are 0, so no arrows between them need be shown.
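A quick check of that claim, using a hypothetical three-group design with equal n per group: each contrast-code column sums to zero, and with equal group sizes the cross-products of two contrast columns also sum to zero, so the coding variables are uncorrelated.

```python
# Contrast codes for 3 groups, n = 3 per group (hypothetical design)
cc1 = [2/3] * 3 + [-1/3] * 3 + [-1/3] * 3   # group 1 vs groups 2 & 3
cc2 = [0.0] * 3 + [1/2] * 3 + [-1/2] * 3    # group 2 vs group 3

# Each column sums to 0, so the covariance is just the average
# cross-product; with equal n that cross-product sum is 0.
cross = sum(a * b for a, b in zip(cc1, cc2))
print(cross)   # essentially 0 -> the coding variables are uncorrelated
```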


Note: Contrast codes were used, so the group-coding variables are uncorrelated.

[Path diagram: 1st, 2nd, . . ., (K-1)th group-coding contrast code variables → Dependent Variable, with error term e.]

3) Factorial ANOVA.



Each factor is represented by G-1 group-coding variables created using one of the coding schemes. The interaction(s) is/are represented by products of the group-coding variables representing the factors. Again, no correlations between coding variables need be shown if contrast codes are used.


Note: Contrast codes should be used to ensure that the group-coding variables are uncorrelated (assuming equal sample sizes).

[Path diagram: group-coding variables for the 1st factor, the 2nd factor, and their Interaction products → Dependent Variable, with error term e.]

Path Diagrams representing Exploratory Factor Analysis



1) Exploratory Factor Analysis solution with one factor.
The factor is represented by a latent variable with three or more observed indicators. (Three is the generally recommended minimum no. of indicators for a factor.)


[Path diagram: F → Obs 1, Obs 2, Obs 3, each with its own error term (e1, e2, e3).]

Note that factors are exogenous. Indicators are endogenous. Since the indicators are endogenous, all of their variance must be accounted for by the model. Thus, each indicator must have an error latent variable to account for the variance in it not accounted for by the factor.
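That bookkeeping can be written down directly for a standardized one-factor model (a sketch with made-up loadings, not values from the notes): each indicator's variance splits into the part due to F, the squared loading, and the part the error term must carry; and the model-implied correlation between two indicators is the product of their loadings.

```python
# Hypothetical standardized loadings of the three indicators on F
loadings = {"Obs1": 0.8, "Obs2": 0.7, "Obs3": 0.6}

# Variance each error latent variable must account for (the uniqueness)
uniqueness = {k: 1 - l ** 2 for k, l in loadings.items()}

# Model-implied correlation between two indicators: product of loadings
r12 = loadings["Obs1"] * loadings["Obs2"]

print(uniqueness["Obs1"])   # about .36 -> e1 carries what F does not
print(r12)                  # about .56
```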


2) Exploratory Factor Analysis solution with two orthogonal factors.
Each factor is represented by a latent variable with three or more indicators. The orthogonality of the factors is represented by the fact that there is no arrow connecting the factor symbols.
Let’s assume that Obs1, 2, and 3 are thought to be primary indicators of F1 and 4,5,6 of F2.
For exploratory factor analysis, each variable is required to load on all factors. Of course, the hope is that the loadings will be substantial on only some of the factors and will be close to 0 on the others, but the loadings on all factors are retained, even if they’re close to 0. The loadings that might be close to 0 in the model are shown in red. These are sometimes called cross loadings.


[Path diagram: F1 and F2, with no connecting arrow, each loading on all of Obs 1 through Obs 6; error terms e1 through e6; the cross loadings are shown in red.]

Orthogonal factors represent uncorrelated aspects of behavior.
Note what is assumed here: There are two independent characteristics of people – F1 and F2. Each one influences responses to all six items, although it is hoped that F1 influences primarily the first 3 items and that F2 influences primarily the last 3 items.
If Obs 1 thru Obs 3 are one class of behavior and Obs 4 thru Obs 6 are a second class, then if the loadings “fit” the expected pattern, this would be evidence for the existence of two independent dispositions – that represented by F1 and that represented by F2.

3) Exploratory Factor Analysis solution with two oblique factors.
Each factor is represented by a latent variable with three or more indicators. The obliqueness of the factors is represented by the fact that there IS an arrow connecting the factors.

[Path diagram: F1 and F2, connected by a correlation arrow, each loading on all of Obs 1 through Obs 6; error terms e4 through e9; cross loadings in red.]

Again, in exploratory factor analysis, all indicators load on all factors, even if the loadings are close to zero.


Exploratory factor analysis (EFA) programs, such as that in SPSS, always report estimates of all loadings.
This solution is potentially as important as the orthogonal solution, although in general, I think that researchers are more interested in independent dispositions than they are in correlated dispositions.
But discovering why two dispositions are separate but still correlated is an important and potentially rewarding task.

Path Diagram of EFA model of NEO-FFI Big Five 60 item questionnaire.


(From Biderman, M. (2014). Against all odds: Bifactors in EFAs of Big Five Data. Part of symposium: S. McAbee & M. Biderman, Chairs. Theoretical and Practical Advances in Latent Variable Models of Personality. Conducted at the 29th annual conference of The Society for Industrial and Organizational Psychology; Honolulu, Hawaii, 2014.)

Cross loadings are in red.

Path Diagrams vs the Table of Loadings

(Cross loadings are in red in both representations.)

Pattern Matrix(a)

Item      Factor 1  Factor 2  Factor 3  Factor 4  Factor 5
ne1         -.025     -.001     -.056     -.067      .678
ne2          .086      .002      .021     -.026      .268
ne3          .126     -.023      .133      .229      .260
ne4          .054      .014      .112      .184      .492
ne5         -.115     -.045     -.069     -.269      .624
ne6          .042      .206     -.200      .085      .563
ne7          .194     -.168      .046     -.203      .166
ne8          .342     -.127      .087      .299      .432
ne9          .270      .046      .092      .419      .223
ne10        -.017     -.138     -.105      .135      .327
ne11         .209     -.270      .009      .102      .278
ne12         .140     -.170     -.103     -.070      .483
na1         -.100     -.193      .081      .518      .249
na2          .347     -.153     -.076      .421     -.139
na3          .091     -.123     -.050      .560      .125
na4         -.139      .034     -.011      .504     -.099
na5          .298      .073      .033      .335      .188
na6          .335      .157     -.070      .353      .031
na7          .019     -.177      .067      .231      .346
na8          .053      .029     -.114      .543      .292
na9          .163      .115     -.120      .319     -.211
na10        -.057     -.123      .167      .594      .271
na11         .028      .012      .055      .471     -.227
na12         .027     -.210     -.075      .534     -.022
nc1          .092     -.429     -.090      .183      .008
nc2          .019     -.580     -.049      .055     -.086
nc3         -.030     -.376     -.037      .011     -.140
nc4         -.093     -.406      .052      .156      .028
nc5          .026     -.716     -.025     -.156      .057
nc6          .146     -.476      .100      .241     -.052
nc7         -.154     -.694     -.070     -.121      .109
nc8          .092     -.528      .019     -.017      .110
nc9          .040     -.573      .044     -.050      .094
nc10         .021     -.720      .016     -.103      .044
nc11         .035     -.551     -.067      .148      .005
nc12         .065     -.628      .035     -.011      .018
ns1          .501      .072      .145     -.231     -.031
ns2          .544     -.119      .132     -.044      .027
ns3          .653      .037     -.026     -.069     -.118
ns4          .664      .025     -.141      .006      .189
ns5          .660      .082     -.031      .200      .047
ns6          .677     -.053     -.104      .011      .046
ns7          .658      .035      .027     -.069      .011
ns8          .552      .068     -.078      .265      .012
ns9          .724     -.093      .134     -.021      .027
ns10         .662     -.041     -.068     -.009      .220
ns11         .563     -.266      .035      .030     -.132
ns12         .611     -.082     -.177      .075     -.003
no1         -.146      .334      .267     -.003     -.103
no2          .030      .366      .136      .061      .087
no3         -.014      .044      .661     -.040      .046
no4          .116      .037      .142     -.119     -.008
no5         -.010     -.090      .731      .102     -.166
no6          .029      .168      .217      .124      .207
no7         -.064      .025      .207      .082     -.082
no8          .024      .233      .240     -.101     -.197
no9         -.008     -.012      .822      .049     -.123
no10         .052      .135      .536      .026     -.033
no11        -.026     -.111      .560     -.041      .110
no12         .021      .131      .615     -.174      .090

Extraction Method: Maximum Likelihood.
Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 14 iterations.




Whew – there are tons of crossloadings, most of them near 0. Can’t they just be assumed to be zero?


This kind of thinking leads to Confirmatory Factor Analysis Models.

Confirmatory vs Exploratory Factor Analysis


In Exploratory Factor Analysis, as discussed above, the loading of every item on every factor is estimated. The analyst hopes that some of those loadings will be large and some will be small. An EFA two-orthogonal-factor model is represented by the following diagram.

[Path diagram: F1 and F2 each loading on all of Obs 1 through Obs 6; error terms e4 through e9.]

Exploratory

As mentioned above there are arrows (loadings) connecting each variable to each factor. We have no hypotheses about the loading values – we’re exploring – so we estimate all loadings and let them lead us. Generally, EFA programs do not allow you to specify or fix loadings to pre-determined values.


In contrast to the exploration implicit in EFA, a factor analysis in which some loadings are fixed at specific values is called a Confirmatory Factor Analysis. The analysis confirms one or more hypotheses about loadings, hypotheses represented by fixing them at specific (usually 0) values.
Unfortunately, EFA and CFA often cannot be done using the same computer program. Exceptions that I know of are MPlus and Rcmdr.
Amos and many CFA programs other than MPlus and Rcmdr are unable to do EFAs such as the above models.
So you may have to employ both SPSS (for EFA) and AMOS (for CFA) in exploring the interrelations between variables and factors. Often, analysts will use an EFA program to estimate ALL loadings to all factors, then use an SEM program to perform a confirmatory factor analysis, fixing those loadings that were close to 0 in the EFA to 0 in the CFA.
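That workflow can be sketched as a simple thresholding step. The .30 cutoff here is a hypothetical choice, not a rule from the notes, and the loadings are a few entries from the NEO Pattern Matrix above: loadings the EFA estimated near 0 become loadings the CFA fixes at 0.

```python
# A few (item, factor) loadings from the EFA Pattern Matrix above
efa = {("ne1", 1): -.025, ("ne1", 5): .678,
       ("ns9", 1): .724, ("ns9", 5): .027}
CUTOFF = 0.30   # hypothetical "close to 0" criterion

# In the follow-up CFA, small loadings are fixed at 0; large ones stay free
cfa_spec = {key: ("free" if abs(v) >= CUTOFF else "fixed at 0")
            for key, v in efa.items()}
print(cfa_spec[("ne1", 1)])   # fixed at 0
print(cfa_spec[("ne1", 5)])   # free
```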

[Path diagram: F1 loading only on Obs 1 through Obs 3; F2 loading only on Obs 4 through Obs 6; error terms E1 through E6.]

Confirmatory

Note that in the above confirmatory model, loadings of indicators 4-6 on F1 are fixed at 0, as are loadings of indicators 1-3 on F2. (The arrows are missing, so those loadings are assumed to be zero.)

The Identification Problem


Consider the simple regression model Y = a + b*X . . .

[Path diagram: X (mean, variance) → Y; E (mean, variance) → Y, with Corr(E,Y) on the path from E.]


Quantities which can be computed from the data:

Mean of the X variable. Variance of the X variable.
Mean of the Y variable. Variance of the Y variable.
Correlation of Y with X.

Remember that in SEM path diagrams, all the variance in every endogenous variable must be accounted for. For that reason, the path diagram includes a latent “Other factors” or “Error of measurement” variable, labeled “E”.

Note: The mean and variance of Y are not separately identified in the model because they are assumed to be completely determined by Y’s relationships to X and to E.

Quantities in the diagram:

Mean of X. Mean of E.
Variance of X. Variance of E.
Intercept of X->Y regression.
Slope of X->Y regression.
Correlation of E with Y.

Whoops! There are 5 quantities in the data but 7 in the model. There are too few quantities in the data, so the model is underidentified: there aren't enough quantities from the data to identify each model value.
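The count of data quantities generalizes: with p observed variables (and means estimated), the data supply p means, p variances, and p(p-1)/2 covariances, i.e. p(p+3)/2 numbers in all. A sketch of the bookkeeping for this model:

```python
def data_quantities(p):
    """Means + variances + covariances available from p observed variables."""
    return p + p + p * (p - 1) // 2

# Simple regression: X and Y observed -> 2 means, 2 variances, 1 covariance
print(data_quantities(2))   # 5

# The diagram asks for 7: mean/variance of X, mean/variance of E,
# intercept, slope, and corr(E, Y) -- 2 more than the data supply.
model_quantities = 7
print(model_quantities - data_quantities(2))   # 2
```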
Dealing with underidentification . . .
Solution 1

0) The mean of E is always assumed to be 0.

1) Fix the variance of E to be 1.
So in this regression model, the path diagram will be

[Path diagram: X (mean, variance) → Y; E labeled 0, 1 (mean 0, variance fixed at 1) → Y, with rEY on the path from E. Y = a + b*X.]

In this case, there are 5 quantities in the model that must be estimated – mean of X, variance of X, intercept of equation, slope of equation, and correlation of E with Y. There are also 5 quantities that can be estimated from the observed data. The model is said to be “just identified” or “completely identified”. This means that every estimable quantity in the model corresponds in some way to one quantity obtained from the data.


Or,
Solution 2

0) The mean of E is always assumed to be 0.

1) Fix the path (regression weight) from E to Y at 1 and estimate the variance of E.

[Path diagram: X (mean, variance) → Y; E labeled 0, Variance (mean 0, variance estimated) → Y, with the path from E fixed at 1. Y = a + b*X.]

Underidentified models: Cannot be estimated.
Just identified models: Every model quantity is a function of some data quantity. But no parsimony.
Overidentified models: There are more data quantities than model quantities. It is said that you then have “degrees of freedom” in your model. This is good. Relationships are being explained by fewer model quantities than there are data quantities. This is parsimonious – what science is all about.
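The three cases boil down to a comparison of the two counts (a sketch: counting is necessary but not sufficient, since identification also depends on where the parameters sit in the model):

```python
def identification_status(data_q, model_q):
    """Classify a model by comparing data quantities to model quantities."""
    if model_q > data_q:
        return "underidentified"     # cannot be estimated
    if model_q == data_q:
        return "just identified"     # estimable, but no parsimony
    return "overidentified"          # df = data_q - model_q > 0: parsimony

print(identification_status(5, 7))   # underidentified (the example above)
print(identification_status(5, 5))   # just identified
print(identification_status(9, 7))   # overidentified, 2 df
```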

Identification in CFA models – We’ll have to do this for the reasons outlined above.
Here’s a typical CFA two-factor model.
Ensuring that the “Residuals” part of the model – the “E”s – is identified.
We always assume all “E” means = 0.
Solution 1. Fix all “E” variances to 1.

[Path diagram: two-factor CFA; each E labeled 0,1 (mean 0, variance fixed at 1); factors labeled 0, Variance (means 0, variances estimated).]

or
Solution 2. Fix all E → Obs paths at 1 and estimate the variances of the “E”s.


[Path diagram: two-factor CFA; each E → Obs path fixed at 1, with the E variances estimated.]

I recommend this.

Ensuring that the Factors part of the CFA is identified.
1. Fix one of the loadings for each factor at 1 and estimate all factor variances.

[Path diagram: two-factor CFA; one loading per factor fixed at 1; all factor variances estimated.]

The item whose loading is fixed is called the reference item.

Or
2. Fix the variance of each factor at 1 and estimate all factor loadings.




I recommend this, although some examples below will use the above method.

[Path diagram: two-factor CFA; the variance of each factor fixed at 1; all factor loadings estimated.]

This probably seems quite arcane right now, and it is really the province of the mathematical statisticians and programmers who discovered the algorithms that allow us to apply the models.



But we must use these conventions when actually applying models such as these.
The main point for us – what the factor analysis gives us.
The above models tell us that the variation in the 6 observed variables – Obs1, Obs2, Obs3, Obs4, Obs5, Obs6 – is due to variation in two internal characteristics – F1 and F2.
So we have explained why there is variation in the observed variables – because of variation in F1 and F2. We have also explained why the variation in Obs1 to Obs3 is unrelated to the variation in Obs4 to Obs6 – because F1 and F2 are uncorrelated.
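That explanation can be made concrete (a sketch with hypothetical standardized loadings): under the orthogonal two-factor model, the implied correlation between two indicators is the product of their loadings when they share a factor, and exactly 0 when they do not.

```python
# (factor, standardized loading) for each indicator -- hypothetical values
model = {"Obs1": ("F1", 0.8), "Obs2": ("F1", 0.7), "Obs3": ("F1", 0.6),
         "Obs4": ("F2", 0.8), "Obs5": ("F2", 0.7), "Obs6": ("F2", 0.6)}

def implied_corr(i, j):
    fi, li = model[i]
    fj, lj = model[j]
    return li * lj if fi == fj else 0.0   # orthogonal factors: no shared path

print(implied_corr("Obs1", "Obs2"))   # about .56 (same factor)
print(implied_corr("Obs1", "Obs4"))   # 0.0 (different, uncorrelated factors)
```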

Examples
1. Fixing all variances.
[Path diagram: two-factor CFA; all E variances and both factor variances fixed at 1; all loadings estimated.]

2. Fixing residual loadings and factor variances.

[Path diagram: two-factor CFA; each E → Obs path fixed at 1; each factor variance fixed at 1; factor loadings and E variances estimated.]

My favorite.

3. Fixing residual loadings and factor loadings.

[Path diagram: two-factor CFA; each E → Obs path fixed at 1; one reference loading per factor fixed at 1; factor variances and E variances estimated.]
Programming with path diagrams: Introduction to Amos
Amos is an add-on program to SPSS that performs confirmatory factor analysis and structural equation modeling.
It is designed to emphasize a visual interface and has been written so that virtually all analyses can be performed by drawing path diagrams.
It also contains a text-based programming language for those who wish to write programs in the command language.
The Amos drawing toolkit with functions of the most frequently used tools.

Other programs

LISREL


EQS

Amos


Mplus – the best
lavaan (R package)


Tool to tell Amos to run the diagram.

Tool to move an object

Tool to erase an object

Tool to copy an object

Tool to deselect all objects in diagram

Tool to select all objects in diagram

Tool to select a single object

Tool to draw regression arrows

Tool to put text on the diagram

Tool to draw correlation arrows

Tool to draw latent variables

Observed variable tool
Creating an Amos analysis
1. Open Amos Graphics.

1b. File -> New

2. File -> Data Files . . . (Because you have to connect the path diagram to a data file.)
3. Specify the name of the file that contains the raw or summary data.

a. Click on the [File Name] button.

b. Navigate to the file and double-click on it.

c. Click on the [OK] button.


In this example, I opened a file called IncentiveData080707.sav

4. Draw the desired path diagram using the appropriate drawing tools.



5. Name the variables by right-clicking on each object and choosing “Object Properties . . .”
The example below is a simple correlation analysis.
6. Save the model. File -> Save As...

7. To run Amos, click on the



button.
8. Click on to see output.
Amos Details
For most of the analyses you’ll perform using Amos, you should get in the habit of doing the following . . .
View -> Analysis Properties -> Estimation
Check “Estimate means and intercepts”
View -> Analysis Properties -> Output
Check “Standardized estimates”

Check “Squared multiple correlations”

Doing old things in a new way: Analyses we’ve done before, now performed using Amos

The data used for this example are the valdatnm data in the htlm2\5510\datafiles folder. We’ll simply look at the output here. Later, we’ll focus on the menu sequences needed to get this output.


a. SPSS analysis of the correlation of FORMULA with P511G
Correlations


b. Amos Input Path Diagram - Input Parameter Values


All variables are exogenous.


(Note, I told Amos to estimate means for this analysis.)


The covariance of p511g and Formula.



c. Amos Output Path Diagram - Unstandardized (Raw) coefficients


The mean and variance of p511g.

The mean and variance of Formula.



c. Amos Path Diagram - Standardized coefficients


The correlation of p511g and Formula.
Means and variances of standardized variables are not displayed, since they are 0 and 1 respectively.




Simple Regression Analysis: SPSS and Amos
The data used here are the VALDAT data.
a. SPSS Version 10 output
GET FILE='E:\MdbT\P595\Amos\valdatnm.sav'.

REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
 /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN /DEPENDENT p511g
 /METHOD=ENTER formula .



Regression









b. Amos Input Path Diagram - Input parameter values


The model is underidentified unless you fix the value of one parameter. Fix either the variance of the latent error variable to 1 or the regression weight to 1. Here, the variance has been fixed.



c. Amos Output Path Diagram - Unstandardized (Raw) coefficients; Means not estimated


Note that the fixed parameter values were not changed.

The estimated unstandardized (raw score) relationship of p511g to Formula - the slope, to 2 decimal places.

Variance of formula.

For what it's worth, the estimated unstandardized (raw score) relationship of p511g to the “other factors” latent variable.




Correlation of p511g with formula.

Squared multiple correlation of dependent variable (p511g) with predictor (only formula in this example).

Correlation of p511g with latent “other factors” = √(1 − r²) = √(1 − .48²) = √(1 − .23) = √(.77) = .88
d. Amos Output Path Diagram - Standardized coefficients
(View/Set -> Analysis Properties -> Output to get Amos to print Standardized estimates. What a pain!!)

Note that .48² + .88² = 1. All of the variance of p511g has been accounted for.


We say that the formula and the error partition the total variance of p511g.
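The arithmetic behind that partition, as a sketch using the rounded standardized path (.48) from the output:

```python
r_formula = 0.48                        # standardized path from formula
r_error = (1 - r_formula ** 2) ** 0.5   # standardized path from E

# The two squared paths partition the variance of p511g
total = r_formula ** 2 + r_error ** 2
print(round(r_error, 2))   # 0.88
print(round(total, 2))     # 1.0
```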
Two IV Regression Example - SPSS and Amos
The data here are the VALDATnm data. UGPA and GREQ are predictors of P511G.
a. SPSS output.

GET


FILE='G:\MdbT\P595\P595AL09-Amos\valdatnm.sav'.

DATASET NAME DataSet1 WINDOW=FRONT.

REGRESSION /DESCRIPTIVES MEAN STDDEV CORR SIG N /MISSING LISTWISE

/STATISTICS COEFF OUTS R ANOVA /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN

/DEPENDENT p511g /METHOD=ENTER ugpa greq .
Regression

[DataSet1] G:\MdbT\P595\P595AL09-Amos\valdatnm.sav









b. Amos Input Path Diagram - Input parameters.


The variance of the (unobserved) error latent variable must be specified at 1.



Note that if the IVs are correlated, you must specify that they are correlated. Otherwise, Amos will perform the analysis assuming they're uncorrelated.



c. Amos Output Path Diagram - Unstandardized (Raw) coefficients

(I forgot to check “Estimate means and intercepts”, so no means are printed.)



Variance of ugpa

Raw partial regression coefficient relating p511g to ugpa


Covariance of ugpa and greq.



Raw Regression coefficient relating p511g to residual effects.



Raw partial regression coefficient relating p511g to GREQ to 2 decimal places.




Raw regression coefficient relating p511g to Formula. (Zero to two decimal places.)



d. Amos Output Path Diagram - Standardized coefficients.


Standardized partial regression coefficients, sometimes called betas.

Correlation of ugpa and greq.





√(1 − R²) = √(1 − .21) = √(.79) = .89



Multiple R2.

Note that .33² + .41² + .89² = 1.07 > 1.0. This is because r²s partition variance only when the variables are uncorrelated.
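With the two standardized coefficients (.33, .41) and the predictor correlation (-.262) from the output above, R² can be recovered, showing where the “extra” variance goes when predictors are correlated (a sketch using the rounded values):

```python
beta_ugpa, beta_greq, r12 = 0.33, 0.41, -0.262

# R^2 for two standardized predictors includes a cross term
r_squared = beta_ugpa ** 2 + beta_greq ** 2 + 2 * beta_ugpa * beta_greq * r12

print(round(beta_ugpa ** 2 + beta_greq ** 2, 2))   # 0.28: too big on its own
print(round(r_squared, 2))                         # 0.21, matching the output
```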

e. Amos Text Output - Details of input and minimization

(Early version of Amos without p values)

Chi-square = 0.000

Degrees of freedom = 0

Probability level cannot be computed

Maximum Likelihood Estimates

----------------------------

Regression Weights: Estimate S.E. C.R. Label

------------------- -------- ------- ------- -------
p511g <----- ugpa 0.048 0.015 3.140

p511g <---- error 0.047 0.004 12.329

p511g <----- greq 0.000 0.000 3.869
Standardized Regression Weights: Estimate

-------------------------------- --------
p511g <----- ugpa 0.332

p511g <---- error 0.891

p511g <----- greq 0.410

Covariances: Estimate S.E. C.R. Label

------------ -------- ------- ------- -------
ugpa <-----> greq -8.537 3.861 -2.211
Correlations: Estimate

------------- --------
ugpa <-----> greq -0.262
Variances: Estimate S.E. C.R. Label

---------- -------- ------- ------- -------
error 1.000

ugpa 0.134 0.022 6.164

greq 7897.622 1281.163 6.164

Note – No overall test of significance of R2.

This test is available in the ANOVA box in SPSS.


Squared Multiple Correlations: Estimate

------------------------------ --------
p511g 0.207

Oneway Analysis of Variance Example - SPSS and Amos


The data for this example follow. They're used to introduce the 595 students to contrast coding. The dependent variable is Job Satisfaction (JS). The research factor is Job, with three levels. It is contrast coded by CC1 and CC2.
The data for this example are in ‘MdbT\P595\Amos\ OnewayegData.sav’
ID JS JOB CC1 CC2


The rule for forming a contrast variable comparing two sets of groups is:
1st Value = No. of groups in 2nd set / Total no. of groups.
2nd Value = - No. of groups in 1st set / Total no. of groups.
3rd Value = 0 for all groups to be excluded.
So, 1st Value of CC1 = 2 / 3 = .667.
2nd Value of CC1 = -1 / 3 = -.333.
1st Value of CC2 = 1 / 2 = .5.
2nd Value of CC2 = -1 / 2 = -.5.
3rd Value of CC2 = 0 to exclude Job 1.
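The rule above as a small helper (a sketch; `n_set1` and `n_set2` are the numbers of groups in the two compared sets, and `n_total` is the total count of groups in the comparison):

```python
def contrast_values(n_set1, n_set2, n_total):
    """Return (value for set-1 groups, value for set-2 groups).
    Excluded groups simply get 0."""
    return n_set2 / n_total, -n_set1 / n_total

# CC1: Job 1 (1 group) vs Jobs 2 and 3 (2 groups), 3 groups total
print(contrast_values(1, 2, 3))   # (.667, -.333) to three places

# CC2: Job 2 vs Job 3, 2 groups in the comparison; Job 1 excluded (0)
print(contrast_values(1, 1, 2))   # (0.5, -0.5)
```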

1 6 1 .667 .000

2 7 1 .667 .000

3 8 1 .667 .000

4 11 1 .667 .000

5 9 1 .667 .000

6 7 1 .667 .000

7 7 1 .667 .000

8 5 2 -.333 .500

9 7 2 -.333 .500

10 8 2 -.333 .500

11 9 2 -.333 .500

12 10 2 -.333 .500

13 8 2 -.333 .500

14 9 2 -.333 .500

15 4 3 -.333 -.500

16 3 3 -.333 -.500

17 6 3 -.333 -.500

18 5 3 -.333 -.500

19 7 3 -.333 -.500

20 8 3 -.333 -.500

21 2 3 -.333 -.500


a. SPSS Oneway output.
Oneway



b. SPSS Regression Output.
regression variables = js cc1 cc2

/dependent = js /enter.


Regression









c. Amos Input Path Diagram.


This was prepared using Amos 3.6. I chose the "Estimate means" option. This was not required, but it caused means to be displayed.








Intercept
d. Amos Output Path Diagram - Unstandardized (Raw) Coefficients


Mean and variance.




Multiple R2
e. Amos Output Path Diagram - Standardized Coefficients



Note that the correlation between group coding variables must be estimated. It's zero here because they're contrast codes, but estimate it anyway.

Note that .29² + .56² + .78² = 1.01 ≈ 1.

r²s partition variance since the variables are all independent.
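The same check in code, using the rounded standardized paths (hence the small .01 overshoot): with contrast-coded, mutually uncorrelated predictors, the squared paths partition the variance of JS.

```python
paths = [0.29, 0.56, 0.78]   # cc1, cc2, and error paths to JS (rounded)
partition = sum(p ** 2 for p in paths)
print(round(partition, 2))   # 1.01 -- rounding error only, since all
                             # three sources are mutually uncorrelated
```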





f. Amos Text Output – Results




Results continued . . .


Note that AMOS does not provide a test of the null hypothesis that in the population, the multiple R = 0. This test is provided in the ANOVA box in SPSS.




CFA, Amos - Printed on 1/27/2017