Date	28.01.2017

Path Diagrams representing Exploratory Factor Analysis

Path Analyses of ANOVAs

Since ANOVA is simply regression analysis, the representation of ANOVA in SEM is merely as a regression analysis. The key is to represent the differences between groups with group coding variables, just as we did in 513 and in the beginning of 595 . . .

1) Independent Groups t-test

The two groups are represented by a single, dichotomous observed group-coding variable. It is the independent variable in the regression analysis.
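
To see that the t-test really is this regression, here is a minimal sketch in plain Python. The scores are made up for illustration, and the function names `pooled_t` and `regression_slope_t` are my own, not part of any package. It computes the usual pooled-variance t and then the t for the slope when the dependent variable is regressed on a 0/1 group-coding variable.

```python
from math import sqrt

def pooled_t(group0, group1):
    """Classic independent-groups t statistic with pooled variance."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((y - m0) ** 2 for y in group0)
    ss1 = sum((y - m1) ** 2 for y in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))

def regression_slope_t(x, y):
    """Slope of the simple regression of y on x, and the t statistic for that slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(sse / (n - 2) / sxx)
    return slope, slope / se_slope

# Hypothetical scores for two groups (made up for illustration).
group0 = [4, 6, 5, 7, 3]
group1 = [8, 7, 9, 6, 10]

# Dichotomous group-coding variable: 0 for the first group, 1 for the second.
x = [0] * len(group0) + [1] * len(group1)
y = group0 + group1

t_classic = pooled_t(group0, group1)
slope, t_regression = regression_slope_t(x, y)
print(t_classic, t_regression, slope)
```

With 0/1 coding the slope is the difference between the two group means, and the two t statistics agree.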

[Path diagram: Dichotomous variable representing the two groups -> Dependent Variable <- e]
2) One Way ANOVA

The K groups are represented by K-1 group-coding variables created using one of the coding schemes (although I recommend contrast coding). They are the independent variables in the regression analysis. If contrast codes are used, the correlations between all the group coding variables are 0, so no arrows between them need be shown.

[Path diagram: 1st group-coding contrast code variable, 2nd group-coding contrast code variable, . . . , (K-1)th group-coding contrast code variable, each with an arrow to the Dependent Variable.]

Note: Contrast codes were used so group-coding variables are uncorrelated.

3) Factorial ANOVA.

Each factor is represented by G-1 group-coding variables created using one of the coding schemes. The interaction(s) is/are represented by products of the group-coding variables representing the factors. Again, no correlations between coding variables need be shown if contrast codes are used.
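
A minimal sketch of the product rule, assuming a hypothetical 2 x 2 design with 3 cases per cell and contrast codes of .5 and -.5 for each factor (all of these choices are mine, for illustration only). Each factor has G = 2 levels, so it needs G-1 = 1 coding variable, and the interaction variable is the product of the two.

```python
def pearson_r(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / (saa * sbb) ** 0.5

# Hypothetical 2 x 2 design, 3 cases per cell (equal cell sizes assumed).
cells = [(a, b) for a in (0.5, -0.5) for b in (0.5, -0.5)]
factor_a = [a for a, b in cells for _ in range(3)]
factor_b = [b for a, b in cells for _ in range(3)]

# The interaction is represented by the product of the two coding variables.
interaction = [a * b for a, b in zip(factor_a, factor_b)]

r_ab = pearson_r(factor_a, factor_b)
r_a_int = pearson_r(factor_a, interaction)
r_b_int = pearson_r(factor_b, interaction)
print(r_ab, r_a_int, r_b_int)
```

With equal cell sizes all three correlations are 0, which is why no arrows between the coding variables need be drawn. With unequal cell sizes they generally would not be 0.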

Note: Contrast codes should be used to make sure the group-coding variables are uncorrelated (assuming equal sample sizes).

[Path diagram: group-coding variables for the 1st Factor, the 2nd Factor, and the Interaction, each with an arrow to the Dependent Variable.]

Path Diagrams representing Exploratory Factor Analysis

1) Exploratory Factor Analysis solution with one factor.
The factor is represented by a latent variable with three or more observed indicators. (Three is the generally recommended minimum no. of indicators for a factor.)

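[Path diagram: the factor with an arrow to each of Obs 1, Obs 2, and Obs 3; error variables e1, e2, and e3, each with an arrow to the corresponding indicator.]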

Note that factors are exogenous. Indicators are endogenous. Since the indicators are endogenous, all of their variance must be accounted for by the model. Thus, each indicator must have an error latent variable to account for the variance in it not accounted for by the factor.

2) Exploratory Factor Analysis solution with two orthogonal factors.
Each factor is represented by a latent variable with three or more indicators. The orthogonality of the factors is represented by the fact that there is no arrow connecting the factor symbols.
Let’s assume that Obs1, 2, and 3 are thought to be primary indicators of F1 and 4,5,6 of F2.
For exploratory factor analysis, each variable is required to load on all factors. Of course, the hope is that the loadings will be substantial on only some of the factors and will be close to 0 on the others, but the loadings on all factors are retained, even if they’re close to 0. The loadings that might be close to 0 in the model are shown in red. These are sometimes called cross loadings.

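[Path diagram: F1 and F2, each with arrows to all of Obs 1 through Obs 6 (cross loadings in red); error variables e1 through e6, one per indicator; no arrow between F1 and F2.]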

Orthogonal factors represent uncorrelated aspects of behavior.
Note what is assumed here: There are two independent characteristics of people – F1 and F2. Each one influences responses to all six items, although it is hoped that F1 influences primarily the first 3 items and that F2 influences primarily the last 3 items.
If Obs 1 thru Obs 3 are one class of behavior and Obs 4 thru Obs 6 are a second class, then if the loadings “fit” the expected pattern, this would be evidence for the existence of two independent dispositions: that represented by F1 and that represented by F2.

3) Exploratory Factor Analysis solution with two oblique factors.
Each factor is represented by a latent variable with three or more indicators. The obliqueness of the factors is represented by the fact that there IS an arrow connecting the factors.

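[Path diagram: F1 and F2 joined by a correlation arrow, each with arrows to all of Obs 1 through Obs 6; one error variable per indicator (labeled e4 through e9 in the original figure).]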

Again, in exploratory factor analysis, all indicators load on all factors, even if the loadings are close to zero.

Exploratory factor analysis (EFA) programs, such as that in SPSS, always report estimates of all loadings.
This solution is potentially as important as the orthogonal solution, although in general, I think that researchers are more interested in independent dispositions than they are in correlated dispositions.
But discovering why two dispositions are separate but still correlated is an important and potentially rewarding task.

Path Diagram of EFA model of NEO-FFI Big Five 60 item questionnaire.

(From Biderman, M. (2014). Against all odds: Bifactors in EFAs of Big Five Data. Part of symposium: S. McAbee & M. Biderman, Chairs. Theoretical and Practical Advances in Latent Variable Models of Personality. Conducted at the 29th annual conference of The Society for Industrial and Organizational Psychology; Honolulu, Hawaii, 2014.)

Cross loadings are in red.

Path Diagrams vs the Table of Loadings

(Cross loadings are in red in both representations.)

Pattern Matrix^a
	Factor
	1	2	3	4	5
ne1	-.025	-.001	-.056	-.067	.678
ne2	.086	.002	.021	-.026	.268
ne3	.126	-.023	.133	.229	.260
ne4	.054	.014	.112	.184	.492
ne5	-.115	-.045	-.069	-.269	.624
ne6	.042	.206	-.200	.085	.563
ne7	.194	-.168	.046	-.203	.166
ne8	.342	-.127	.087	.299	.432
ne9	.270	.046	.092	.419	.223
ne10	-.017	-.138	-.105	.135	.327
ne11	.209	-.270	.009	.102	.278
ne12	.140	-.170	-.103	-.070	.483
na1	-.100	-.193	.081	.518	.249
na2	.347	-.153	-.076	.421	-.139
na3	.091	-.123	-.050	.560	.125
na4	-.139	.034	-.011	.504	-.099
na5	.298	.073	.033	.335	.188
na6	.335	.157	-.070	.353	.031
na7	.019	-.177	.067	.231	.346
na8	.053	.029	-.114	.543	.292
na9	.163	.115	-.120	.319	-.211
na10	-.057	-.123	.167	.594	.271
na11	.028	.012	.055	.471	-.227
na12	.027	-.210	-.075	.534	-.022
nc1	.092	-.429	-.090	.183	.008
nc2	.019	-.580	-.049	.055	-.086
nc3	-.030	-.376	-.037	.011	-.140
nc4	-.093	-.406	.052	.156	.028
nc5	.026	-.716	-.025	-.156	.057
nc6	.146	-.476	.100	.241	-.052
nc7	-.154	-.694	-.070	-.121	.109
nc8	.092	-.528	.019	-.017	.110
nc9	.040	-.573	.044	-.050	.094
nc10	.021	-.720	.016	-.103	.044
nc11	.035	-.551	-.067	.148	.005
nc12	.065	-.628	.035	-.011	.018
ns1	.501	.072	.145	-.231	-.031
ns2	.544	-.119	.132	-.044	.027
ns3	.653	.037	-.026	-.069	-.118
ns4	.664	.025	-.141	.006	.189
ns5	.660	.082	-.031	.200	.047
ns6	.677	-.053	-.104	.011	.046
ns7	.658	.035	.027	-.069	.011
ns8	.552	.068	-.078	.265	.012
ns9	.724	-.093	.134	-.021	.027
ns10	.662	-.041	-.068	-.009	.220
ns11	.563	-.266	.035	.030	-.132
ns12	.611	-.082	-.177	.075	-.003
no1	-.146	.334	.267	-.003	-.103
no2	.030	.366	.136	.061	.087
no3	-.014	.044	.661	-.040	.046
no4	.116	.037	.142	-.119	-.008
no5	-.010	-.090	.731	.102	-.166
no6	.029	.168	.217	.124	.207
no7	-.064	.025	.207	.082	-.082
no8	.024	.233	.240	-.101	-.197
no9	-.008	-.012	.822	.049	-.123
no10	.052	.135	.536	.026	-.033
no11	-.026	-.111	.560	-.041	.110
no12	.021	.131	.615	-.174	.090
Extraction Method: Maximum Likelihood. Rotation Method: Oblimin with Kaiser Normalization.
a. Rotation converged in 14 iterations.

Whew – there are tons of crossloadings, most of them near 0. Can’t they just be assumed to be zero?

This kind of thinking leads to Confirmatory Factor Analysis Models.

Confirmatory vs Exploratory Factor Analysis

In Exploratory Factor Analysis, as discussed above, the loading of every item on every factor is estimated. The analyst hopes that some of those loadings will be large and some will be small. An EFA two-orthogonal-factor model is represented by the following diagram.

[Path diagram (Exploratory): two factors, each with arrows to all of Obs 1 through Obs 6.]

As mentioned above there are arrows (loadings) connecting each variable to each factor. We have no hypotheses about the loading values – we’re exploring – so we estimate all loadings and let them lead us. Generally, EFA programs do not allow you to specify or fix loadings to pre-determined values.

In contrast to the exploration implicit in EFA, a factor analysis in which some loadings are fixed at specific values is called a Confirmatory Factor Analysis. The analysis is confirming one or more hypotheses about loadings, hypotheses represented by our fixing them at specific (usually 0) values.
Unfortunately, EFA and CFA often cannot be done using the same computer program. Exceptions that I know of are Mplus and Rcmdr.
Amos and many CFA programs other than Mplus and Rcmdr are unable to do EFAs such as the above models.
So you may have to employ both SPSS (for EFA) and AMOS (for CFA) in exploring the interrelations between variables and factors. Often, analysts will use an EFA program to estimate ALL loadings to all factors, then use an SEM program to perform a confirmatory factor analysis, fixing those loadings that were close to 0 in the EFA to 0 in the CFA.
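
The EFA-then-CFA step can be sketched as a simple screening of the pattern matrix. The three rows below are copied from the pattern matrix shown earlier. The cutoff of .30 is purely my assumption for illustration (these notes do not say how close to 0 is "close"), and the function name `cfa_pattern` is hypothetical.

```python
# Three rows copied from the pattern matrix above (Factors 1 to 5).
efa_loadings = {
    "ne1": [-0.025, -0.001, -0.056, -0.067, 0.678],
    "nc2": [0.019, -0.580, -0.049, 0.055, -0.086],
    "ns3": [0.653, 0.037, -0.026, -0.069, -0.118],
}

# Hypothetical cutoff: an assumption for this sketch, not a rule from the notes.
CUTOFF = 0.30

def cfa_pattern(loadings, cutoff=CUTOFF):
    """Mark each loading as 'free' (to be estimated) or 0 (fixed at 0 in the CFA)."""
    return {
        item: ["free" if abs(value) >= cutoff else 0 for value in row]
        for item, row in loadings.items()
    }

pattern = cfa_pattern(efa_loadings)
for item, row in pattern.items():
    print(item, row)
```

Each of these items keeps one free loading; the four cross loadings per item are fixed at 0. The resulting pattern is what you would then draw (or omit arrows for) in the SEM program.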

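[Path diagram (Confirmatory): F1 with arrows to Obs 1 through Obs 3 only; F2 with arrows to Obs 4 through Obs 6 only; one error variable per indicator.]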

Note that in the above confirmatory model, loadings of indicators 4-6 on F1 are fixed at 0, as are loadings of indicators 1-3 on F2. (The arrows are missing, therefore assumed to be zero.)

The Identification Problem

Consider the simple regression model . . .

Y = a + b*X

[Path diagram: X -> Y <- E; X and E are each labeled "Mean, Variance"; the E -> Y path is labeled Corr(E,Y).]

Quantities which can be computed from the data . . .
Mean of the X variable
Variance of the X variable
Mean of the Y variable
Variance of the Y variable
Correlation of Y with X

Quantities in the diagram . . .
Mean of X
Variance of X
Mean of E
Variance of E
Intercept of X->Y regression
Slope of X->Y regression
Correlation of E with Y

Remember that in SEM path diagrams, all the variance in every endogenous variable must be accounted for. For that reason, the path diagram includes a latent “Other factors” or “Error of measurement” variable, labeled “E”.

Note: Mean and variance of Y are not separately identified in the model because they are assumed to be completely determined by Y’s relationship to X and to E.

Whoops! There are 5 quantities in the data but 7 in the model. There are too few quantities in the data, so the model is underidentified (not identified enough): there aren't enough quantities from the data to identify each model value.
Dealing with underidentification . . .
Solution 1

0) The mean of E is always assumed to be 0.

1) Fix the variance of E to be 1.
So in this regression model, the path diagram will be

[Path diagram: E (0, 1) -> Y, with the path labeled r_EY; X (Mean, Variance) -> Y; Y = a + b*X]

In this case, there are 5 quantities in the model that must be estimated – mean of X, variance of X, intercept of equation, slope of equation, and correlation of E with Y. There are also 5 quantities that can be estimated from the observed data. The model is said to be “just identified” or “completely identified”. This means that every estimable quantity in the model corresponds in some way to one quantity obtained from the data.

Or,
Solution 2

0) The mean of E is always assumed to be 0.

1) Fix covariance of E with Y at 1.

[Path diagram: X (Mean, Variance) -> Y; E (0, Variance) -> Y, with the path fixed at 1; Y = a + b*X]

Underidentified models: Cannot be estimated.
Just identified models: Every model quantity is a function of some data quantity. But no parsimony.
Overidentified models: There are more data quantities than model quantities. It is said that you then have “degrees of freedom” in your model. This is good. Relationships are being explained by fewer model quantities than there are data quantities. This is parsimonious – what science is all about.
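
The counting argument for the simple regression example can be written out as a small sketch. The function name `identification_status` is my own. Keep in mind that comparing counts is only a necessary check: a model can pass the count and still have a parameter that the data cannot pin down.

```python
def identification_status(n_data_quantities, n_model_quantities):
    """Compare counts; the difference is the model's degrees of freedom."""
    df = n_data_quantities - n_model_quantities
    if df < 0:
        return df, "underidentified"
    if df == 0:
        return df, "just identified"
    return df, "overidentified"

# Quantities from the simple regression example in these notes.
data_quantities = [
    "mean of X", "variance of X",
    "mean of Y", "variance of Y",
    "correlation of Y with X",
]
model_quantities = [
    "mean of X", "variance of X",
    "mean of E", "variance of E",
    "intercept", "slope",
    "correlation of E with Y",
]

# Solution 1: the mean of E is assumed to be 0 and the variance of E is fixed at 1,
# so neither has to be estimated.
fixed = {"mean of E", "variance of E"}
free = [q for q in model_quantities if q not in fixed]

before = identification_status(len(data_quantities), len(model_quantities))
after = identification_status(len(data_quantities), len(free))
print(before)
print(after)
```

Before fixing anything the count is 5 versus 7; after Solution 1 it is 5 versus 5.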

Identification in CFA models – We’ll have to do this for the reasons outlined above.
Here’s a typical CFA two-factor model.
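Ensuring that the “Residuals” part of the model (the “E”s) is identified.
We always assume all “E” means = 0.
Solution 1. Fix all “E” variances to 1.

[Path diagram: Obs 1 through Obs 6, each with an error variable labeled 0,1 (mean 0, variance fixed at 1).]

or
Solution 2. Fix all E -> O covariances (the residual loadings) to 1 and estimate variances of “E”s. I recommend this.

[Path diagram: Obs 1 through Obs 6, each with an error variable labeled 0, Variance (mean 0, variance estimated) and its E -> O path fixed at 1.]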

Ensuring that the Factors part of the CFA is identified
1. Fix one of the loadings for each factor at 1 and estimate all factor variances.

[Path diagram: two factors with indicators Obs 1 through Obs 6; one loading per factor fixed at 1, factor variances estimated.]

The item whose loading is fixed is called the reference item.

Or
2. Fix the variance of each factor at 1 and estimate all factor loadings.

I recommend this, although some examples below will use the above method.

[Path diagram: two factors with indicators Obs 1 through Obs 6; factor variances fixed at 1, all factor loadings estimated.]

This probably seems quite arcane right now, and it is really the province of the mathematical statisticians and programmers who discovered the algorithms that allow us to apply the models.

But we must use these conventions when actually applying models such as these.
The main point for us – what the factor analysis gives us.
The above models tell us that the variation in 6 observed variables (Obs1, Obs2, Obs3, Obs4, Obs5, Obs6) is due to variation in two internal characteristics, F1 and F2.
So we have explained why there is variation in the observed variables – because of variation in F1 and F2. We have also explained why the variation in Obs1 to Obs3 is unrelated to the variation in Obs4 to Obs6 – because F1 and F2 are uncorrelated.
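
Here is a minimal sketch of what the model implies, assuming standardized variables, factor variances fixed at 1, uncorrelated factors, and loadings that I made up for illustration. Under those assumptions the implied correlation between two indicators is the sum of the products of their loadings on each factor.

```python
# Hypothetical standardized loadings (rows = Obs1..Obs6, columns = F1, F2).
# The zeros are the loadings fixed at 0 in the confirmatory model.
loadings = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
]

def implied_correlations(loadings):
    """Model-implied correlation matrix for uncorrelated factors with variance 1.

    Off-diagonal entries are sums of loading products. Each diagonal entry is 1:
    the residual (E) picks up whatever variance the factors leave over.
    """
    n = len(loadings)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                sigma[i][j] = 1.0
            else:
                sigma[i][j] = sum(a * b for a, b in zip(loadings[i], loadings[j]))
    return sigma

sigma = implied_correlations(loadings)
for row in sigma:
    print([round(value, 2) for value in row])
```

Obs1 to Obs3 correlate with each other because they share F1, Obs4 to Obs6 correlate because they share F2, and every correlation between the two sets is 0 because the factors are uncorrelated and there are no cross loadings.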

Examples
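1. Fixing all variances.

[Path diagram: two-factor CFA of Obs 1 through Obs 6 with residual variances and factor variances fixed.]

2. Fixing residual loadings and factor variances. My favorite.

[Path diagram: two-factor CFA of Obs 1 through Obs 6 with residual loadings fixed and factor variances fixed.]

3. Fixing residual loadings and factor loadings.

[Path diagram: two-factor CFA of Obs 1 through Obs 6 with each residual loading fixed at 1 and one loading per factor fixed at 1.]
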
Programming with path diagrams: Introduction to Amos
Amos is an add-on program to SPSS that performs confirmatory factor analysis and structural equation modeling.
It is designed to emphasize a visual interface and has been written so that virtually all analyses can be performed by drawing path diagrams.
It also contains a text-based programming language for those who wish to write programs in the command language.
The Amos drawing toolkit, with the functions of the most frequently used tools:

Tool to tell Amos to run the diagram
Tool to move an object
Tool to erase an object
Tool to copy an object
Tool to deselect all objects in diagram
Tool to select all objects in diagram
Tool to select a single object
Tool to draw regression arrows
Tool to put text on the diagram
Tool to draw correlation arrows
Tool to draw latent variables
Observed variable tool

Other programs: LISREL, EQS, Amos, Mplus (the best), lavaan

Creating an Amos analysis
1. Open Amos Graphics.

1b. File -> New

2. File -> Data Files . . . (Because you have to connect the path diagram to a data file.)
3. Specify the name of the file that contains the raw or summary data.

a. Click on the [File Name] button.

b. Navigate to the file and double-click on it.

c. Click on the [OK] button.

In this example, I opened a file called IncentiveData080707.sav

4. Draw the desired path diagram using the appropriate drawing tools.

5. Name the variables by right-clicking on each object and choosing “Object Properties . . .”

The example below is a simple correlation analysis.

6. Save the model. File -> Save As...

7. To run Amos, click on the [button icon] button.

8. Click on [button icon] to see output.
Amos Details
For most of the analyses you’ll perform using Amos, you should get in the habit of doing the following . . .
View -> Analysis Properties -> Estimation
Check “Estimate means and intercepts”
View -> Analysis Properties -> Output
Check “Standardized estimates”

Check “Squared multiple correlations”

Doing old things in a new way: Analyses we’ve done before, now performed using Amos

The data used for this example are the valdatnm data in the htlm2\5510\datafiles folder. We’ll simply look at the output here. Later, we’ll focus on the menu sequences needed to get this output.

a. SPSS analysis of the correlation of FORMULA with P511G
Correlations

b. Amos Input Path Diagram - Input Parameter Values

All variables are exogenous.

(Note, I told Amos to estimate means for this analysis.)

The covariance of p511g and Formula.

c. Amos Output Path Diagram - Unstandardized (Raw) coefficients

The mean and variance of p511g.

The mean and variance of Formula.

d. Amos Output Path Diagram - Standardized coefficients

The correlation of p511g and Formula.
Means and variances of standardized variables are not displayed, since they are 0 and 1 respectively.

Simple Regression Analysis: SPSS and Amos
The data used here are the VALDAT data.
a. SPSS Version 10 output
GET FILE='E:\MdbT\P595\Amos\valdatnm.sav'.
REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN /DEPENDENT p511g
  /METHOD=ENTER formula .

Regression

b. Amos Input Path Diagram - Input parameter values

The model is underidentified unless you fix the value of one parameter. Fix either the variance of the latent error variable to 1 or the regression weight to 1. Here, the variance has been fixed.

c. Amos Output Path Diagram - Unstandardized (Raw) coefficients; Means not estimated

Note that the fixed parameter values were not changed.

The estimated unstandardized (raw score) relationship of p511g to Formula - the slope, to 2 decimal places.

Variance of formula.

For what it's worth, the estimated unstandardized (raw score) relationship of p511g to the “other factors” latent variable.

Correlation of p511g with formula.

Squared multiple correlation of dependent variable (p511g) with predictor (only formula in this example).

Correlation of p511g with latent “other factors” = sqrt(1-r²) = sqrt(1-.48²) = sqrt(1-.23) = sqrt(.77) = .88
d. Amos Output Path Diagram - Standardized coefficients

(View/Set -> Analysis Properties -> Output to get Amos to print Standardized estimates. What a pain!!)

Note that .48² + .88² = 1. All of variance of p511g has been accounted for.

We say that the formula and the error partition the total variance of p511g.
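
The arithmetic behind that partition, as a short sketch using the standardized value reported above (r = .48 between p511g and formula). The variable names are mine.

```python
from math import sqrt

# Standardized coefficient of p511g on formula, from the Amos output above.
r_formula = 0.48

# Standardized path from the latent "other factors" variable to p511g.
error_path = sqrt(1 - r_formula ** 2)

explained = r_formula ** 2      # proportion of variance due to formula
unexplained = error_path ** 2   # proportion of variance due to other factors

print(round(error_path, 2), round(explained, 2), round(unexplained, 2))
```

The two proportions sum to 1 by construction. The ".48² + .88² = 1" in the note above uses the rounded .88, so it is only approximately 1.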
Two IV Regression Example - SPSS and Amos
The data here are the VALDATnm data. UGPA and GREQ are predictors of P511G.
a. SPSS output.

GET
  FILE='G:\MdbT\P595\P595AL09-Amos\valdatnm.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
REGRESSION /DESCRIPTIVES MEAN STDDEV CORR SIG N /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA /CRITERIA=PIN(.05) POUT(.10) /NOORIGIN
  /DEPENDENT p511g /METHOD=ENTER ugpa greq .
Regression

[DataSet1] G:\MdbT\P595\P595AL09-Amos\valdatnm.sav

b. Amos Input Path Diagram - Input parameters.

The variance of the (unobserved) error latent variable must be specified at 1.

Note that if the IVs are correlated, you must specify that they are correlated. Otherwise, Amos will perform the analysis assuming they're uncorrelated.

c. Amos Output Path Diagram - Unstandardized (Raw) coefficients

(I forgot to check “Estimate means and intercepts”, so no means are printed.)

Variance of ugpa

Raw partial regression coefficient relating p511g to ugpa

Covariance of ugpa and greq.

Raw Regression coefficient relating p511g to residual effects.

Raw partial regression coefficient relating p511g to GREQ. (Zero to two decimal places.)

d. Amos Output Path Diagram - Standardized coefficients.

Standardized partial regression coefficients, sometimes called betas.

Correlation of ugpa and greq.

SQRT(1-R²)=sqrt(1-.21) = sqrt(.79)=.89

Multiple R².

Note that .33²+ .41² + .89² = 1.07 > 1.0. This is because r²s partition variance only when variables are uncorrelated.
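
Where the extra .07 comes from can be checked with a short sketch. The values are the standardized estimates from the Amos text output in part e below; the variable names are mine. With two correlated predictors, R² is the sum of the squared betas plus a cross-product term that involves the correlation between the predictors.

```python
from math import sqrt

# Standardized estimates taken from the Amos text output (part e).
beta_ugpa = 0.332
beta_greq = 0.410
r_ugpa_greq = -0.262

# Sum of squared betas: what you would get if the predictors were uncorrelated.
sum_of_squares = beta_ugpa ** 2 + beta_greq ** 2

# Cross-product term: present only because ugpa and greq are correlated.
cross_term = 2 * beta_ugpa * beta_greq * r_ugpa_greq

r_squared = sum_of_squares + cross_term
error_path = sqrt(1 - r_squared)

print(round(sum_of_squares, 3), round(cross_term, 3), round(r_squared, 3))
```

Because ugpa and greq are negatively correlated here, the cross-product term is negative, and adding it to the squared betas recovers the squared multiple correlation that Amos reports. Leaving it out is what pushes the simple sum of squares above 1.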

e. Amos Text Output - Details of input and minimization

(Early version of Amos without p values)

Chi-square = 0.000
Degrees of freedom = 0
Probability level cannot be computed

Maximum Likelihood Estimates
----------------------------

Regression Weights:                Estimate     S.E.     C.R.    Label
-------------------                --------  -------  -------  -------
p511g <----- ugpa                     0.048    0.015    3.140
p511g <---- error                     0.047    0.004   12.329
p511g <----- greq                     0.000    0.000    3.869

Standardized Regression Weights:   Estimate
--------------------------------   --------
p511g <----- ugpa                     0.332
p511g <---- error                     0.891
p511g <----- greq                     0.410

Covariances:                       Estimate     S.E.     C.R.    Label
------------                       --------  -------  -------  -------
ugpa <-----> greq                    -8.537    3.861   -2.211

Correlations:                      Estimate
-------------                      --------
ugpa <-----> greq                    -0.262

Variances:                         Estimate     S.E.     C.R.    Label
----------                         --------  -------  -------  -------
error                                 1.000
ugpa                                  0.134    0.022    6.164
greq                               7897.622 1281.163    6.164

Squared Multiple Correlations:     Estimate
------------------------------     --------
p511g                                 0.207

Note: No overall test of significance of R². This test is available in the ANOVA box in SPSS.

Oneway Analysis of Variance Example - SPSS and Amos

The data for this example follow. They're used to introduce the 595 students to contrast coding. The dependent variable is Job Satisfaction (JS). The research factor is Job, with three levels. It is contrast coded by CC1 and CC2.
The data for this example are in ‘MdbT\P595\Amos\ OnewayegData.sav’
The rule for forming a contrast variable between two sets of groups is
1st Value = No. of groups in 2nd set / Total no. of groups.
2nd Value = - No. of groups in 1st set / Total no. of groups.
3rd Value = 0 for all groups to be excluded.
So, 1st Value of CC1 = 2 / 3 = .667.
2nd Value of CC1 = -1 / 3 = -.333
1st Value of CC2 = 1 / 2 = .5
2nd Value of CC2 = -1 / 2 = -.5
3rd Value of CC2 = 0 to exclude Job 1.

ID   JS   JOB   CC1     CC2
1    6    1     .667    .000
2    7    1     .667    .000
3    8    1     .667    .000
4    11   1     .667    .000
5    9    1     .667    .000
6    7    1     .667    .000
7    7    1     .667    .000
8    5    2     -.333   .500
9    7    2     -.333   .500
10   8    2     -.333   .500
11   9    2     -.333   .500
12   10   2     -.333   .500
13   8    2     -.333   .500
14   9    2     -.333   .500
15   4    3     -.333   -.500
16   3    3     -.333   -.500
17   6    3     -.333   -.500
18   5    3     -.333   -.500
19   7    3     -.333   -.500
20   8    3     -.333   -.500
21   2    3     -.333   -.500
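
The contrast-coding rule can be sketched in a few lines of plain Python using the JS and JOB values listed above. The function name `contrast_values` is my own, and I read "Total no. of groups" as the number of groups taking part in the contrast (3 for CC1, 2 for CC2), which is what the worked values imply. Exact fractions are used instead of the rounded .667 and -.333.

```python
def contrast_values(n_first_set, n_second_set):
    """Code values for the first and second set of groups in one contrast."""
    total = n_first_set + n_second_set
    return n_second_set / total, -n_first_set / total

cc1_first, cc1_second = contrast_values(1, 2)  # Job 1 vs Jobs 2 and 3
cc2_first, cc2_second = contrast_values(1, 1)  # Job 2 vs Job 3 (Job 1 excluded, coded 0)

codes = {
    1: (cc1_first, 0.0),
    2: (cc1_second, cc2_first),
    3: (cc1_second, cc2_second),
}

# JS and JOB from the data listing above.
js = [6, 7, 8, 11, 9, 7, 7, 5, 7, 8, 9, 10, 8, 9, 4, 3, 6, 5, 7, 8, 2]
job = [1] * 7 + [2] * 7 + [3] * 7
cc1 = [codes[j][0] for j in job]
cc2 = [codes[j][1] for j in job]

def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Simple regression slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Cross-product of the (centered) codes: 0 means the two contrasts are uncorrelated.
cross_product = sum((a - mean(cc1)) * (b - mean(cc2)) for a, b in zip(cc1, cc2))

# Because the codes are uncorrelated, each simple slope equals the partial slope.
b_cc1 = slope(cc1, js)
b_cc2 = slope(cc2, js)
print(cross_product, b_cc1, b_cc2)
```

With equal group sizes, the CC1 slope is the mean of Job 1 minus the average of the Job 2 and Job 3 means, and the CC2 slope is the mean of Job 2 minus the mean of Job 3.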

a. SPSS Oneway output.
Oneway

b. SPSS Regression Output.
regression variables = js cc1 cc2

/dependent = js /enter.

Regression

c. Amos Input Path Diagram.

This was prepared using Amos 3.6. I chose the "Estimate means" option. This was not required, but it caused means to be displayed.

Intercept
d. Amos Output Path Diagram - Unstandardized (Raw) Coefficients

Mean and variance.


Multiple R²
e. Amos Output Path Diagram - Standardized Coefficients


Note that the correlation between group coding variables must be estimated. It's zero here because they're contrast codes, but estimate it anyway.

Note that .29² + .56² + .78² = 1.01 ≈ 1.

r²s partition variance since the variables are all independent.

f. Amos Text Output – Results

Results continued . . .

Note that AMOS does not provide a test of the null hypothesis that in the population, the multiple R = 0. This test is provided in the ANOVA box in SPSS.

CFA, Amos - Printed on 1/27/2017
