Since ANOVA is simply regression analysis, ANOVA is represented in SEM as a regression analysis. The key is to represent the differences between groups with group-coding variables, just as we did in 513 and in the beginning of 595 . . .
The two groups are represented by a single, dichotomous observed group-coding variable. It is the independent variable in the regression analysis.
The K groups are represented by K − 1 group-coding variables created using one of the coding schemes (although I recommend contrast coding). They are the independent variables in the regression analysis. If contrast codes are used, the correlations between all the group-coding variables are 0, so no arrows between them need be shown.
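A quick check of that claim. The sketch below (mine, not from the notes) builds contrast codes for K = 3 equal-sized groups, following the contrast-coding rule used later in these notes, and verifies that the two group-coding variables correlate 0 across cases. The `pearson` helper is an ordinary Pearson correlation written out by hand.

```python
# Sketch: contrast codes for K = 3 equal-sized groups are uncorrelated.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# CC1 contrasts group 1 with groups 2 and 3; CC2 contrasts group 2 with group 3.
codes = {1: (2/3, 0.0), 2: (-1/3, 0.5), 3: (-1/3, -0.5)}
groups = [g for g in (1, 2, 3) for _ in range(7)]   # 7 cases per group
cc1 = [codes[g][0] for g in groups]
cc2 = [codes[g][1] for g in groups]

r = pearson(cc1, cc2)
print(abs(r) < 1e-9)   # True - the two coding variables are uncorrelated
```

With equal group sizes, any contrast codes built this way are centered and mutually orthogonal, which is why no correlation arrows are needed in the diagram.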
Note: Contrast codes were used, so the group-coding variables are uncorrelated.
3) Factorial ANOVA.
Each factor is represented by G − 1 group-coding variables created using one of the coding schemes. The interaction(s) is/are represented by products of the group-coding variables representing the factors. Again, no correlations between coding variables need be shown if contrast codes are used.
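To make the factorial case concrete, here is a hypothetical sketch (made-up design, not from the notes) of the coding variables for a 2 × 3 factorial with equal cell sizes: Factor A gets 1 contrast code, Factor B gets 2, and the A × B interaction is carried by the 1 × 2 = 2 products of those codes. The `pearson` helper is an ordinary hand-written Pearson correlation.

```python
# Hypothetical 2 x 3 factorial: the interaction is represented by products
# of the A and B contrast codes, and with equal cell sizes all five coding
# variables are mutually uncorrelated.
a_codes = {1: 1/2, 2: -1/2}                                  # 2 levels -> 1 code
b_codes = {1: (2/3, 0.0), 2: (-1/3, 0.5), 3: (-1/3, -0.5)}   # 3 levels -> 2 codes

n_per_cell = 4
rows = []
for a in (1, 2):
    for b in (1, 2, 3):
        for _ in range(n_per_cell):
            a1 = a_codes[a]
            b1, b2 = b_codes[b]
            rows.append((a1, b1, b2, a1 * b1, a1 * b2))  # products = interaction

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / (sxx * syy) ** 0.5

cols = list(zip(*rows))
n_vars = 5                                   # A1, B1, B2, A1*B1, A1*B2
for i in range(n_vars):
    for j in range(i + 1, n_vars):
        assert abs(pearson(cols[i], cols[j])) < 1e-9
print("all", n_vars * (n_vars - 1) // 2, "pairwise correlations are 0")
```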
The factor is represented by a latent variable with three or more observed indicators. (Three is the generally recommended minimum no. of indicators for a factor.)
[Path diagram: latent factor F with indicators Obs 1, Obs 2, and Obs 3, each with its own error latent variable]
Note that factors are exogenous. Indicators are endogenous. Since the indicators are endogenous, all of their variance must be accounted for by the model. Thus, each indicator must have an error latent variable to account for the variance in it not accounted for by the factor.
2) Exploratory Factor Analysis solution with two orthogonal factors.
Each factor is represented by a latent variable with three or more indicators. The orthogonality of the factors is represented by the fact that there is no arrow connecting the factor symbols.
Let’s assume that Obs1, Obs2, and Obs3 are thought to be primary indicators of F1, and Obs4, Obs5, and Obs6 of F2.
For exploratory factor analysis, each variable is required to load on all factors. Of course, the hope is that the loadings will be substantial on only some of the factors and will be close to 0 on the others, but the loadings on all factors are retained, even if they’re close to 0. The loadings that might be close to 0 in the model are shown in red. These are sometimes called cross loadings.
[Path diagram: orthogonal factors F1 and F2, each with arrows to all six indicators Obs 1–Obs 6; errors e1–e6; cross loadings shown in red; no arrow connects F1 and F2]
Orthogonal factors represent uncorrelated aspects of behavior.
Orthogonal factors represent uncorrelated aspects of behavior.
Note what is assumed here: There are two independent characteristics of people – F1 and F2. Each one influences responses to all six items, although it is hoped that F1 influences primarily the first 3 items and that F2 influences primarily the last 3 items.
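What the orthogonal model claims can be stated numerically. In the sketch below (hypothetical loadings of my choosing, not estimates from any data), the model-implied correlation between two items is the sum over factors of the products of their loadings, i.e. the off-diagonal of ΛΛ′. Items driven by different, uncorrelated factors are then implied to be uncorrelated.

```python
# Hypothetical loadings for the orthogonal two-factor model:
# rows are items; columns are (loading on F1, loading on F2).
loadings = {
    "Obs1": (0.7, 0.0), "Obs2": (0.7, 0.0), "Obs3": (0.7, 0.0),
    "Obs4": (0.0, 0.6), "Obs5": (0.0, 0.6), "Obs6": (0.0, 0.6),
}

def implied_corr(item_i, item_j):
    """Model-implied correlation for orthogonal factors: sum of loading products."""
    li, lj = loadings[item_i], loadings[item_j]
    return sum(a * b for a, b in zip(li, lj))

print(round(implied_corr("Obs1", "Obs2"), 2))   # 0.49 - same factor
print(round(implied_corr("Obs1", "Obs4"), 2))   # 0.0  - different, orthogonal factors
```

If the observed correlations show this block pattern (Obs1–Obs3 intercorrelated, Obs4–Obs6 intercorrelated, near-zero across blocks), that is exactly the evidence for two independent dispositions described below.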
If Obs 1 thru Obs 3 are one class of behavior and Obs 4 thru Obs 6 are a second class, then if the loadings “fit” the expected pattern, this would be evidence for the existence of two independent dispositions – that represented by F1 and that represented by F2.
3) Exploratory Factor Analysis solution with two oblique factors.
Each factor is represented by a latent variable with three or more indicators. The obliqueness of the factors is represented by the fact that there IS an arrow connecting the factors.
[Path diagram: oblique factors F1 and F2 connected by a correlation arrow, each with arrows to all six indicators Obs 1–Obs 6; each indicator has its own error latent variable]
Again, in exploratory factor analysis, all indicators load on all factors, even if the loadings are close to zero.
Exploratory factor analysis (EFA) programs, such as that in SPSS, always report estimates of all loadings.
This solution is potentially as important as the orthogonal solution, although in general, I think that researchers are more interested in independent dispositions than they are in correlated dispositions.
But discovering why two dispositions are separate but still correlated is an important and potentially rewarding task.
Path Diagram of EFA model of the NEO-FFI Big Five 60-item questionnaire.
(From Biderman, M. (2014). Against all odds: Bifactors in EFAs of Big Five data. Part of symposium: S. McAbee & M. Biderman, Chairs. Theoretical and practical advances in latent variable models of personality. Conducted at the 29th annual conference of The Society for Industrial and Organizational Psychology, Honolulu, Hawaii, 2014.)
Path Diagrams vs the Table of Loadings
(Cross loadings are in red in both representations.)
Pattern Matrix^{a}

         Factor
         1      2      3      4      5
ne1     .025   .001   .056   .067   .678
ne2     .086   .002   .021   .026   .268
ne3     .126   .023   .133   .229   .260
ne4     .054   .014   .112   .184   .492
ne5     .115   .045   .069   .269   .624
ne6     .042   .206   .200   .085   .563
ne7     .194   .168   .046   .203   .166
ne8     .342   .127   .087   .299   .432
ne9     .270   .046   .092   .419   .223
ne10    .017   .138   .105   .135   .327
ne11    .209   .270   .009   .102   .278
ne12    .140   .170   .103   .070   .483
na1     .100   .193   .081   .518   .249
na2     .347   .153   .076   .421   .139
na3     .091   .123   .050   .560   .125
na4     .139   .034   .011   .504   .099
na5     .298   .073   .033   .335   .188
na6     .335   .157   .070   .353   .031
na7     .019   .177   .067   .231   .346
na8     .053   .029   .114   .543   .292
na9     .163   .115   .120   .319   .211
na10    .057   .123   .167   .594   .271
na11    .028   .012   .055   .471   .227
na12    .027   .210   .075   .534   .022
nc1     .092   .429   .090   .183   .008
nc2     .019   .580   .049   .055   .086
nc3     .030   .376   .037   .011   .140
nc4     .093   .406   .052   .156   .028
nc5     .026   .716   .025   .156   .057
nc6     .146   .476   .100   .241   .052
nc7     .154   .694   .070   .121   .109
nc8     .092   .528   .019   .017   .110
nc9     .040   .573   .044   .050   .094
nc10    .021   .720   .016   .103   .044
nc11    .035   .551   .067   .148   .005
nc12    .065   .628   .035   .011   .018
ns1     .501   .072   .145   .231   .031
ns2     .544   .119   .132   .044   .027
ns3     .653   .037   .026   .069   .118
ns4     .664   .025   .141   .006   .189
ns5     .660   .082   .031   .200   .047
ns6     .677   .053   .104   .011   .046
ns7     .658   .035   .027   .069   .011
ns8     .552   .068   .078   .265   .012
ns9     .724   .093   .134   .021   .027
ns10    .662   .041   .068   .009   .220
ns11    .563   .266   .035   .030   .132
ns12    .611   .082   .177   .075   .003
no1     .146   .334   .267   .003   .103
no2     .030   .366   .136   .061   .087
no3     .014   .044   .661   .040   .046
no4     .116   .037   .142   .119   .008
no5     .010   .090   .731   .102   .166
no6     .029   .168   .217   .124   .207
no7     .064   .025   .207   .082   .082
no8     .024   .233   .240   .101   .197
no9     .008   .012   .822   .049   .123
no10    .052   .135   .536   .026   .033
no11    .026   .111   .560   .041   .110
no12    .021   .131   .615   .174   .090

Extraction Method: Maximum Likelihood.
Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 14 iterations.

Whew – there are tons of cross loadings, most of them near 0. Can’t they just be assumed to be zero?
This kind of thinking leads to Confirmatory Factor Analysis Models.
Confirmatory vs Exploratory Factor Analysis
In Exploratory Factor Analysis, as discussed above, the loading of every item on every factor is estimated. The analyst hopes that some of those loadings will be large and some will be small. An EFA two-orthogonal-factor model is represented by the following diagram.
[Path diagram: F1 and F2 each with arrows to all six indicators Obs 1–Obs 6; each indicator has its own error latent variable]
Exploratory
As mentioned above there are arrows (loadings) connecting each variable to each factor. We have no hypotheses about the loading values – we’re exploring – so we estimate all loadings and let them lead us. Generally, EFA programs do not allow you to specify or fix loadings to predetermined values.
In contrast to the exploration implicit in EFA, a factor analysis in which some loadings are fixed at specific values is called a Confirmatory Factor Analysis. The analysis is confirming one or more hypotheses about loadings, hypotheses represented by our fixing them at specific (usually 0) values.
Unfortunately, EFA and CFA often cannot be done using the same computer program. Exceptions that I know of are MPlus and Rcmdr.
Amos and many CFA programs other than MPlus and Rcmdr are unable to do EFAs such as the above models.
So you may have to employ both SPSS (for EFA) and AMOS (for CFA) in exploring the interrelations between variables and factors. Often, analysts will use an EFA program to estimate ALL loadings to all factors, then use an SEM program to perform a confirmatory factor analysis, fixing those loadings that were close to 0 in the EFA to 0 in the CFA.
[Path diagram: F1 with arrows only to Obs 1–Obs 3, F2 with arrows only to Obs 4–Obs 6; errors E1–E6]
Confirmatory
Note that in the above confirmatory model, loadings of indicators 4–6 on F1 are fixed at 0, as are loadings of indicators 1–3 on F2. (The arrows are missing, therefore the loadings are assumed to be zero.)
The Identification Problem
Consider the simple regression model . . .
Y = a + b*X
[Path diagram: X → Y, with error latent variable E → Y; the means and variances of X and E and Corr(E,Y) appear in the diagram]
Quantities which can be computed from the data . .
Mean of the X variable Variance of the X variable
Mean of the Y variable. Variance of the Y variable.
Correlation of Y with X
Quantities in the diagram .
Remember that in SEM path diagrams, all the variance in every endogenous variable must be accounted for. For that reason, the path diagram includes a latent “Other factors” or “Error of measurement” variable, labeled “E”.
Note: Mean and variance of Y are not separately identified in the model because they are assumed to be completely determined by Y’s relationship to X and to E.
Mean of X Mean of E
Variance of X Variance of E
Intercept of X>Y regression
Slope of X>Y regression
Correlation of E with Y
Whoops! There are 5 quantities in the data but 7 in the model. There are too few quantities in the data. The model is underidentified – there aren’t enough quantities from the data to identify each model value.
Dealing with underidentification . . .
Solution 1
0) The mean of E is always assumed to be 0.
1) Fix the variance of E to be 1.
So in this regression model, the path diagram will be
[Path diagram for Solution 1: Y = a + b*X, with E’s mean fixed at 0 and its variance fixed at 1 (shown as “0, 1”); X’s mean and variance and r_EY are estimated]
In this case, there are 5 quantities in the model that must be estimated – mean of X, variance of X, intercept of equation, slope of equation, and correlation of E with Y. There are also 5 quantities that can be estimated from the observed data. The model is said to be “just identified” or “completely identified”. This means that every estimable quantity in the model corresponds in some way to one quantity obtained from the data.
Or,
Solution 2
0) The mean of E is always assumed to be 0.
1) Fix the regression weight (path) from E to Y at 1.
[Path diagram for Solution 2: Y = a + b*X, with E’s mean fixed at 0 and its variance estimated; the E → Y path is fixed at 1]
Underidentified models: Cannot be estimated.
Just identified models: Every model quantity is a function of some data quantity. But no parsimony.
Overidentified models: There are more data quantities than model quantities. It is said that you then have “degrees of freedom” in your model. This is good. Relationships are being explained by fewer model quantities than there are data quantities. This is parsimonious – what science is all about.
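A back-of-the-envelope version of this counting, for the two-factor, six-indicator CFA discussed below. This sketch tallies the covariance structure only (means ignored) and assumes factor variances are fixed at 1; it is an illustrative count, not output from any program.

```python
# Counting data quantities vs. model quantities for a CFA.
def data_quantities(p):
    """Variances and covariances available from p observed variables."""
    return p * (p + 1) // 2

p = 6
free_params = 6 + 6          # 6 loadings + 6 residual variances (orthogonal factors,
                             # factor variances fixed at 1)
df = data_quantities(p) - free_params
print(data_quantities(p), free_params, df)   # 21 12 9 -> overidentified

free_params_oblique = free_params + 1        # add the factor correlation
print(data_quantities(p) - free_params_oblique)   # 8 -> still overidentified
```

A positive difference means degrees of freedom: the 21 observed variances and covariances are being explained by fewer model quantities, which is the parsimony referred to above.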
Identification in CFA models – We’ll have to do this for the reasons outlined above.
Here’s a typical CFA twofactor model.
Ensuring that the “Residuals” part of the model – the “E”s – is identified.
We always assume all “E” means = 0.
Solution 1. Fix all “E” variances to 1.
[Path diagram: two-factor CFA (F1: Obs 1–Obs 3; F2: Obs 4–Obs 6) with each residual E1–E6 given mean 0 and variance fixed at 1 (“0,1”)]
or
Solution 2. Fix all E → observed-variable regression weights to 1 and estimate the variances of the “E”s.
[Path diagram: two-factor CFA with each E → observed-variable path fixed at 1 and each E’s variance left free]
I recommend this.
Ensuring that the Factors part of the CFA is identified
1. Fix one of the loadings for each factor at 1 and estimate all factor variances
[Path diagram: two-factor CFA with one loading per factor fixed at 1; factor variances estimated]
The item whose loading is fixed is called the reference item.
Or
2. Fix the variance of each factor at 1 and estimate all factor loadings.
I recommend this, although some examples below will use the above method.
[Path diagram: two-factor CFA with each factor’s variance fixed at 1 and all loadings estimated]
This probably seems quite arcane right now, and it is really the province of the mathematical statisticians and programmers who discovered the algorithms that allow us to apply the models.
But we must use these conventions when actually applying models such as these.
The main point for us – what the factor analysis gives us.
The above models tell us that the variation in 6 observed variables – Obs1, Obs2, Obs3, Obs4, Obs5, Obs6 – is due to variation in two internal characteristics – F1 and F2.
So we have explained why there is variation in the observed variables – because of variation in F1 and F2. We have also explained why the variation in Obs1 to Obs3 is unrelated to the variation in Obs4 to Obs6 – because F1 and F2 are uncorrelated.
Examples
1. Fixing all variances.
[Path diagram: all residual variances and both factor variances fixed at 1]
2. Fixing residual loadings and factor variances.
[Path diagram: E → observed paths fixed at 1; both factor variances fixed at 1]
My favorite.
3. Fixing residual loadings and factor loadings.
[Path diagram: E → observed paths fixed at 1; one reference loading per factor fixed at 1]
Programming with path diagrams: Introduction to Amos
Amos is an addon program to SPSS that performs confirmatory factor analysis and structural equation modeling.
It is designed to emphasize a visual interface and has been written so that virtually all analyses can be performed by drawing path diagrams.
It also contains a textbased programming language for those who wish to write programs in the command language.
The Amos drawing toolkit with functions of the most frequently used tools.
Other programs: LISREL, EQS, Amos, Mplus – the best, LAVAAN.
Tool to tell Amos to run the diagram.
Tool to move an object
Tool to erase an object
Tool to copy an object
Tool to deselect all objects in diagram
Tool to select all objects in diagram
Tool to select a single object
Tool to draw regression arrows
Tool to put text on the diagram
Tool to draw correlation arrows
Tool to draw latent variables
Observed variable tool
Creating an Amos analysis
1. Open Amos Graphics.
1b. File > New
2. File > Data Files . . . (Because you have to connect the path diagram to a data file.)
3. Specify the name of the file that contains the raw or summary data.
a. Click on the [File Name] button.
b. Navigate to the file and doubleclick on it.
c. Click on the [OK] button.
In this example, I opened a file called IncentiveData080707.sav
4. Draw the desired path diagram using the appropriate drawing tools.
5. Name the variables by right-clicking on each object and choosing “Object Properties . . .”
The example below is a simple correlation analysis.
6. Save the model. File > Save As...
7. To run Amos, click on the run button.
8. Click on the view-output button to see output.
Amos Details
For most of the analyses you’ll perform using Amos, you should get in the habit of doing the following . . .
View > Analysis Properties > Estimation
Check “Estimate means and intercepts”
View > Analysis Properties > Output
Check “Standardized estimates”
Check “Squared multiple correlations”
Doing old things in a new way: Analyses we’ve done before, now performed using Amos
The data used for this example are the valdatnm data in the htlm2\5510\datafiles folder. We’ll simply look at the output here. Later, we’ll focus on the menu sequences needed to get this output.
a. SPSS analysis of the correlation of FORMULA with P511G
Correlations
b. Amos Input Path Diagram  Input Parameter Values
All variables are exogenous.
(Note, I told Amos to estimate means for this analysis.)
The covariance of p511g and Formula.
c. Amos Output Path Diagram  Unstandardized (Raw) coefficients
The mean and variance of p511g.
The mean and variance of Formula.
c. Amos Path Diagram  Standardized coefficients
The correlation of p511g and Formula.
Means and variances of standardized variables are not displayed, since they are 0 and 1 respectively.
Simple Regression Analysis: SPSS and Amos
The data used here are the VALDAT data.
a. SPSS Version 10 output
GET FILE='E:\MdbT\P595\Amos\valdatnm.sav'.
REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
 /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN /DEPENDENT p511g
 /METHOD=ENTER formula .
Regression
b. Amos Input Path Diagram  Input parameter values
The model is underidentified unless you fix the value of one parameter. Fix either the variance of the latent error variable to 1 or the regression weight to 1. Here, the variance has been fixed.
c. Amos Output Path Diagram  Unstandardized (Raw) coefficients; Means not estimated
Note that the fixed parameter values were not changed.
The estimated unstandardized (raw score) relationship of p511g to Formula – the slope, to 2 decimal places.
Variance of formula.
For what it's worth, the estimated unstandardized (raw score) relationship of p511g to the “other factors” latent variable.
Correlation of p511g with formula.
Squared multiple correlation of dependent variable (p511g) with predictor (only formula in this example).
Correlation of p511g with latent “other factors” = sqrt(1 − r^{2}) = sqrt(1 − .48^{2}) = sqrt(1 − .23) = sqrt(.77) = .88
d. Amos Output Path Diagram  Standardized coefficients
(View/Set > Analysis Properties > Output to get Amos to print standardized estimates – what a pain!!)
Note that .48^{2} + .88^{2} ≈ 1. All of the variance of p511g has been accounted for.
We say that the formula and the error partition the total variance of p511g.
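The arithmetic behind that partition can be spelled out. The sketch below uses the r = .48 reported above; with one predictor, the standardized path from the error latent variable equals sqrt(1 − r²), and the two squared standardized paths together account for all of p511g’s variance.

```python
# Variance partition in simple regression: r^2 (predictor) + e^2 (error) = 1.
r = 0.48                        # correlation of p511g with formula (from above)
e = (1 - r ** 2) ** 0.5         # standardized path from the "other factors" E
print(round(e, 2))              # 0.88
print(round(r ** 2 + e ** 2, 10))   # 1.0 - the variance is fully partitioned
```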
Two IV Regression Example  SPSS and Amos
The data here are the VALDATnm data. UGPA and GREQ are predictors of P511G.
a. SPSS output.
GET
FILE='G:\MdbT\P595\P595AL09Amos\valdatnm.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
REGRESSION /DESCRIPTIVES MEAN STDDEV CORR SIG N /MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN
/DEPENDENT p511g /METHOD=ENTER ugpa greq .
Regression
[DataSet1] G:\MdbT\P595\P595AL09Amos\valdatnm.sav
b. Amos Input Path Diagram  Input parameters.
The variance of the (unobserved) error latent variable must be specified at 1.
Note that if the IVs are correlated, you must specify that they are correlated. Otherwise, Amos will perform the analysis assuming they're uncorrelated.
c. Amos Output Path Diagram  Unstandardized (Raw) coefficients
(I forgot to check “Estimate means and intercepts,” so no means are printed.)
Variance of ugpa
Raw partial regression coefficient relating p511g to ugpa
Covariance of ugpa and greq.
Raw Regression coefficient relating p511g to residual effects.
Raw partial regression coefficient relating p511g to GREQ to 2 decimal places.
Raw regression coefficient relating p511g to Formula. (Zero to two decimal places.)
d. Amos Output Path Diagram  Standardized coefficients.
Standardized partial regression coefficients, sometimes called betas.
Correlation of ugpa and greq.
SQRT(1 − R^{2}) = sqrt(1 − .21) = sqrt(.79) = .89
Multiple R^{2}.
Note that .33^{2} + .41^{2} + .89^{2} = 1.07 > 1.0. This is because r^{2}s partition variance only when variables are uncorrelated.
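The decomposition that does hold with correlated predictors includes a cross term: for two standardized predictors, R² = β1² + β2² + 2·β1·β2·r12. The sketch below illustrates this with made-up correlations (r1y, r2y, r12 are hypothetical values, not the VALDAT estimates); the β formulas are the standard normal-equations solution for two standardized predictors.

```python
# Why squared betas alone don't partition variance when predictors correlate.
r1y, r2y, r12 = 0.40, 0.50, 0.30   # hypothetical correlations

# Standardized partial regression coefficients (two-predictor normal equations)
b1 = (r1y - r2y * r12) / (1 - r12 ** 2)
b2 = (r2y - r1y * r12) / (1 - r12 ** 2)
r_squared = b1 * r1y + b2 * r2y

naive = b1 ** 2 + b2 ** 2                        # ignores the correlation
full = b1 ** 2 + b2 ** 2 + 2 * b1 * b2 * r12     # includes the cross term
print(round(naive, 4), round(full, 4), round(r_squared, 4))   # 0.2498 0.3187 0.3187
```

When r12 = 0 the cross term vanishes, which recovers the clean partition seen in the contrast-coded ANOVA example below.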
e. Amos Text Output  Details of input and minimization
(Early version of Amos without p values)
Chi-square = 0.000
Degrees of freedom = 0
Probability level cannot be computed
Maximum Likelihood Estimates

Regression Weights:            Estimate      S.E.     C.R.   Label
p511g <--- ugpa                   0.048     0.015    3.140
p511g <--- error                  0.047     0.004   12.329
p511g <--- greq                   0.000     0.000    3.869

Standardized Regression Weights:   Estimate
p511g <--- ugpa                       0.332
p511g <--- error                      0.891
p511g <--- greq                       0.410

Covariances:                   Estimate      S.E.     C.R.   Label
ugpa <--> greq                    8.537     3.861    2.211

Correlations:                  Estimate
ugpa <--> greq                    0.262

Variances:                     Estimate      S.E.     C.R.   Label
error                             1.000
ugpa                              0.134     0.022    6.164
greq                           7897.622  1281.163    6.164
Note – No overall test of significance of R^{2}.
This test is available in the ANOVA box in SPSS.
Squared Multiple Correlations:   Estimate
p511g                               0.207
Oneway Analysis of Variance Example  SPSS and Amos
The data for this example follow. They're used to introduce the 595 students to contrast coding. The dependent variable is Job Satisfaction (JS). The research factor is Job, with three levels. It is contrast coded by CC1 and CC2.
The data for this example are in ‘MdbT\P595\Amos\ OnewayegData.sav’
ID JS JOB CC1 CC2
The rule for forming a contrast variable between two sets of groups is
1^{st} Value = No. of groups in 2^{nd} set / Total no. of groups.
2^{nd} Value = −(No. of groups in 1^{st} set) / Total no. of groups.
3^{rd} Value = 0 for all groups to be excluded.
So, 1^{st} Value of CC1 = 2 / 3 = .667.
2^{nd} Value of CC1 = −1 / 3 = −.333.
1^{st} Value of CC2 = 1 / 2 = .5.
2^{nd} Value of CC2 = −1 / 2 = −.5.
3^{rd} Value of CC2 = 0 to exclude Job 1.
1 6 1 .667 .000
2 7 1 .667 .000
3 8 1 .667 .000
4 11 1 .667 .000
5 9 1 .667 .000
6 7 1 .667 .000
7 7 1 .667 .000
8 5 2 -.333 .500
9 7 2 -.333 .500
10 8 2 -.333 .500
11 9 2 -.333 .500
12 10 2 -.333 .500
13 8 2 -.333 .500
14 9 2 -.333 .500
15 4 3 -.333 -.500
16 3 3 -.333 -.500
17 6 3 -.333 -.500
18 5 3 -.333 -.500
19 7 3 -.333 -.500
20 8 3 -.333 -.500
21 2 3 -.333 -.500
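The oneway example can be redone by hand. In the sketch below (my arithmetic, not SPSS or Amos output), the JS scores from the listing above are regressed on the contrast codes; because the codes are centered and mutually orthogonal with equal group sizes, each regression weight can be computed on its own as sum(code·y)/sum(code²), and the intercept is just the grand mean. Exact fractional codes (2/3, −1/3, 1/2) are used in place of the rounded .667/.333/.500 shown in the listing.

```python
# Regression with orthogonal contrast codes reproduces the ANOVA group means.
js = [6, 7, 8, 11, 9, 7, 7,      # Job 1
      5, 7, 8, 9, 10, 8, 9,      # Job 2
      4, 3, 6, 5, 7, 8, 2]       # Job 3
job = [1] * 7 + [2] * 7 + [3] * 7
codes = {1: (2/3, 0.0), 2: (-1/3, 0.5), 3: (-1/3, -0.5)}   # (CC1, CC2) per job
cc1 = [codes[g][0] for g in job]
cc2 = [codes[g][1] for g in job]

intercept = sum(js) / len(js)    # grand mean, because the codes are centered
b1 = sum(c * y for c, y in zip(cc1, js)) / sum(c * c for c in cc1)
b2 = sum(c * y for c, y in zip(cc2, js)) / sum(c * c for c in cc2)

# b1 compares Job 1 with the average of Jobs 2 and 3; b2 is the Job 2 mean
# minus the Job 3 mean (8.0 - 5.0).
print(round(b2, 3))   # 3.0
for g in (1, 2, 3):   # the predicted values reproduce the three group means
    pred = intercept + b1 * codes[g][0] + b2 * codes[g][1]
    print(g, round(pred, 3))
```

The predicted values come out to the group means 7.857, 8.0, and 5.0, which is the sense in which this regression *is* the oneway ANOVA.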
a. SPSS Oneway output.
Oneway
b. SPSS Regression Output.
regression variables = js cc1 cc2
/dependent = js /enter.
Regression
c. Amos Input Path Diagram.
This was prepared using Amos 3.6. I chose the "Estimate means" option. This was not required, but it caused means to be displayed.
Intercept
d. Amos Output Path Diagram  Unstandardized (Raw) Coefficients
Mean and variance.
Multiple R^{2}
e. Amos Output Path Diagram  Standardized Coefficients
Note that the correlation between group coding variables must be estimated. It's zero here because they're contrast codes, but estimate it anyway.
Note that .29^{2} + .56^{2} + .78^{2} = 1.01 ≈ 1.
r^{2}s partition variance since the variables are all independent.
f. Amos Text Output – Results
Results continued . . .
Note that AMOS does not provide a test of the null hypothesis that in the population, the multiple R = 0. This test is provided in the ANOVA box in SPSS.