Factor Analysis - SPSS
________________________________________
•    First Read Principal Components Analysis.
    The methods we have employed so far attempt to repackage all of the variance in the p variables into principal components.  We may wish to restrict our analysis to variance that is common among variables.  That is, when repackaging the variables’ variance we may wish not to redistribute variance that is unique to any one variable.  This is Common Factor Analysis.  A common factor is an abstraction, a hypothetical dimension that affects at least two of the variables.  We assume that there is also one unique factor for each variable, a factor that affects that variable but does not affect any other variables.  We assume that the p unique factors are uncorrelated with one another and with the common factors.  It is the variance due to these unique factors that we shall exclude from our FA.

Iterated Principal Factors Analysis
    The most common sort of FA is principal axis FA, also known as principal factor analysis.  This analysis proceeds very much like that for a PCA.  We eliminate the variance due to unique factors by replacing the 1’s on the main diagonal of the correlation matrix with estimates of the variables’ communalities.  Recall that a variable’s communality, its sum of squared loadings (SSL) across components or factors, is the amount of the variable’s variance that is accounted for by the components or factors.  Since our factors will be common factors, a variable’s communality will be the amount of its variance that is common rather than unique.  The R² between a variable and all other variables is most often used initially to estimate a variable’s communality.
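    As a rough illustration of that initial estimate, the sketch below computes each variable’s R² with all of the other variables directly from a correlation matrix, using the identity R² = 1 - 1/(diagonal element of the inverse correlation matrix).  The 3 x 3 matrix and the function name initial_communalities are made up for the example; they are not the beer data or anything SPSS exposes.

```python
import numpy as np

def initial_communalities(R):
    """Squared multiple correlation of each variable with all the others."""
    R = np.asarray(R, dtype=float)
    # For a correlation matrix, R^2 for variable j is 1 - 1 / (j-th diagonal of R^-1).
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Hypothetical correlation matrix for three variables (illustration only).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

smc = initial_communalities(R)
print(np.round(smc, 3))    # one initial communality estimate per variable
print(round(smc.sum(), 3)) # total variance retained once unique variance is dropped
```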
    Using the beer data, change the extraction method to principal axis:
 
    Take a look at the initial communalities (for each variable, this is the R² for predicting that variable from an optimally weighted linear combination of the remaining variables).  Recall that they were all 1’s for the principal components analysis we did earlier, but now each is less than 1.  If we sum these communalities we get 5.675.  We started with 7 units of standardized variance and we have now reduced that to 5.675 units of standardized variance (by eliminating unique variance).
 
    For an iterated principal axis solution SPSS first estimates communalities, with R²’s, and then conducts the analysis.  It then takes the communalities from that first analysis and inserts them into the main diagonal of the correlation matrix in place of the R²’s, and does the analysis again.  The variables’ SSL’s from this second solution are then inserted into the main diagonal, replacing the communalities from the previous iteration, and so on, until the change from one iteration to the next is trivial.
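    The loop just described can be sketched in a few lines.  This is a minimal illustration under stated assumptions, not SPSS’s actual implementation: the function name iterated_paf, the tolerance, and the iteration cap are all hypothetical, and the correlation matrix is built from a made-up one-factor model (loadings .8, .7, .6, .5) so that the communalities the loop should approach (.64, .49, .36, .25) are known in advance.

```python
import numpy as np

def iterated_paf(R, n_factors, tol=1e-6, max_iter=1000):
    """Iterated principal axis factoring (unrotated loadings, communalities)."""
    R = np.asarray(R, dtype=float)
    # Start from the R^2 of each variable with all the others.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)            # communalities replace the 1's
        vals, vecs = np.linalg.eigh(reduced)     # eigenvalues in ascending order
        keep = np.argsort(vals)[::-1][:n_factors]
        lam = np.clip(vals[keep], 0.0, None)     # guard against tiny negatives
        L = vecs[:, keep] * np.sqrt(lam)         # loadings for the kept factors
        new_h2 = (L ** 2).sum(axis=1)            # SSL across factors, per variable
        change = np.max(np.abs(new_h2 - h2))
        h2 = new_h2
        if change < tol:                         # change is now trivial: stop
            break
    # Eigenvector signs are arbitrary; make each column sum positive.
    signs = np.sign(L.sum(axis=0))
    signs[signs == 0] = 1.0
    return L * signs, h2

# Hypothetical one-factor correlation matrix (illustration only).
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R, 1.0)

L, h2 = iterated_paf(R, n_factors=1)
print(np.round(L[:, 0], 3))
print(np.round(h2, 3))
```

    Rotation (varimax or otherwise) would be applied to the loadings afterward; it is left out here to keep the sketch short.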
    Look at the communalities after this iterative process and for a two-factor solution.  They now sum to 5.60.  That is, 5.6/7 = 80% of the variance is common variance and 20% is unique.  Here you can see how we have packaged that common variance into two factors, both before and after a varimax rotation:
 
    The final rotated loadings are:
 
    These loadings are very similar to those we obtained previously with a principal components analysis.

Reproduced and Residual Correlation Matrices
    Having extracted common factors, one can turn right around and try to reproduce the correlation matrix from the factor loading matrix.  We assume that the correlations between variables result from their sharing common underlying factors.  Thus, it makes sense to try to estimate the correlations between variables from the correlations between factors and variables.  The reproduced correlation matrix is obtained by multiplying the loading matrix by the transposed loading matrix.  Each reproduced correlation is thus the sum across factors (from 1 to m) of the products (r between factor and the one variable)(r between factor and the other variable).  For example, for our two-factor iterated solution the reproduced correlation between Color and Taste = (r for Color - Factor 1)(r for Taste - Factor 1) + (r for Color - Factor 2)(r for Taste - Factor 2) = (.95)(.94) + (.06)(-.02) = .89.  The original r between Color and Taste was .90, so our two factors did indeed capture the relationship between Color and Taste.
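    The same arithmetic can be checked as a matrix product.  The sketch below uses only the four loadings quoted above for Color and Taste (the remaining five variables are omitted because their loadings are not reproduced in this text), so the result is a 2 x 2 corner of the full reproduced matrix.

```python
import numpy as np

# Loadings quoted in the text: rows are Color and Taste,
# columns are Factor 1 and Factor 2.
loadings = np.array([
    [0.95, 0.06],
    [0.94, -0.02],
])

# Reproduced correlations = loading matrix times its transpose.
reproduced = loadings @ loadings.T
r_color_taste = reproduced[0, 1]

original_r = 0.90  # observed Color-Taste correlation, from the text
print(round(r_color_taste, 2))               # 0.89
print(round(original_r - r_color_taste, 3))  # 0.008
```

    The diagonal of the product holds each variable’s communality, which is why the loading matrix cannot reproduce the 1’s of the original correlation matrix.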
    The residual correlation matrix equals the original correlation matrix minus the reproduced correlation matrix.  We want these residuals to be small.  If you check “Reproduced” under “Descriptives” in the Factor Analysis dialog box, you will get both of these matrices:
 
Nonorthogonal (Oblique) Rotation
    The data may be better fit with axes that are not perpendicular.  This can be done by means of an oblique rotation, but the factors will now be correlated with one another.  Also, the factor loadings (in the pattern matrix) will no longer be equal to the correlations between each factor and each variable.  They will still be standardized regression coefficients (Beta weights), the A’s in the formula presented at the beginning of the handout on principal components analysis.  The correlations between factors and variables are presented in a factor structure matrix.
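    The link between the two matrices is that the structure matrix equals the pattern matrix postmultiplied by the matrix of correlations among the factors.  The sketch below illustrates this with made-up numbers; the pattern weights and the factor correlation of .30 are hypothetical, not output from the beer data.

```python
import numpy as np

# Hypothetical pattern matrix: standardized regression weights of
# three variables (rows) on two oblique factors (columns).
pattern = np.array([
    [0.9, 0.0],
    [0.8, 0.1],
    [0.1, 0.7],
])

# Hypothetical factor correlation matrix (factors correlate .30).
phi = np.array([
    [1.0, 0.3],
    [0.3, 1.0],
])

# Structure matrix: correlations between each variable and each factor.
structure = pattern @ phi
print(np.round(structure, 2))

# With uncorrelated factors (phi = identity) the two matrices coincide,
# which is why an orthogonal rotation reports only one loading matrix.
print(np.allclose(pattern @ np.eye(2), pattern))
```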
    I am not generally comfortable with oblique rotations, but for this lesson I tried a Promax rotation (a varimax rotation is first applied and then the resulting axes rotated to oblique positions):
 
    Notice that this solution is not much different from the previously obtained varimax solution, so little was gained by allowing the factors to be correlated.

Exact Factor Scores
    One may wish to define subscales on a test, with each subscale representing one factor.  Using an "exact" weighting scheme, each subject’s estimated factor score on each factor is a weighted sum of the products of scoring coefficients and the subject’s standardized scores on the original variables.
    The regression coefficients (standardized scoring coefficients) for converting scores on variables to factor scores are obtained by multiplying the inverse of the original simple correlation matrix by the factor loading matrix.  To obtain a subject’s factor scores you multiply the subject’s standardized scores (Z’s) on the variables by these standardized scoring coefficients.  For example, subject #1’s factor scores are:
Factor 1:      (-.294)(.41) + (.955)(.40) + (-.036)(.22) + (1.057)(-.07) + (.712)(.04) +
        (1.219)(.03) + (-1.14)(.01) = 0.23.
Factor 2:      (-.294)(.11) + (.955)(.03) + (-.036)(-.20) + (1.057)(.61) + (.712)(.25) +
        (1.219)(.16) + (-1.14)(-.04) = 1.06.
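    The two sums above are just a row of Z scores multiplied by the matrix of scoring coefficients.  The sketch below redoes that arithmetic with the values printed in the example.  With the two-decimal coefficients as printed, the Factor 2 sum comes to about 1.07 rather than 1.06; the small gap presumably reflects rounding of the coefficients.  The helper scoring_coefficients is a hypothetical name showing the inverse-correlation-matrix step; it is not called on the beer data here because the full correlation and loading matrices are not reproduced in this text.

```python
import numpy as np

def scoring_coefficients(R, loadings):
    """Regression-method coefficients: inverse of R times the loading matrix."""
    # solve(R, A) gives the same result as inv(R) @ A, more stably.
    return np.linalg.solve(np.asarray(R, dtype=float),
                           np.asarray(loadings, dtype=float))

# Subject #1's standardized scores on the seven variables (from the text).
z = np.array([-0.294, 0.955, -0.036, 1.057, 0.712, 1.219, -1.14])

# Scoring coefficients as printed in the example
# (rows: variables, columns: Factor 1, Factor 2).
coef = np.array([
    [0.41, 0.11],
    [0.40, 0.03],
    [0.22, -0.20],
    [-0.07, 0.61],
    [0.04, 0.25],
    [0.03, 0.16],
    [0.01, -0.04],
])

scores = z @ coef  # weighted sums of Z scores, one per factor
print(np.round(scores, 3))
```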
    SPSS will not only compute the scoring coefficients for you, it will also output the factor scores of your subjects into your SPSS data set so that you can input them into other procedures.  In the Factor Analysis window, click Scores and select Save As Variables, Regression, Display Factor Score Coefficient Matrix.
 
    Here are the scoring coefficients:
 
    Look back at your data sheet.  You will find that two columns have been added to the right, one for scores on Factor 1 and another for scores on Factor 2.
    The input data included two variables (SES and Group) not included in the factor analysis.  Just for fun, try conducting a multiple regression predicting subjects’ SES from their factor scores and also try using Student’s t to compare the two groups’ means on the factor scores.  Do note that the scores for factor 1 are not correlated with those for factor 2.  Accordingly, in the multiple regression the squared semipartial correlation coefficients are identical to the squared zero-order correlation coefficients, and the multiple R² is the sum of those squared zero-order correlations.
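    That property of uncorrelated predictors is easy to verify numerically.  The sketch below uses simulated data, not the beer data: it builds two exactly uncorrelated, standardized predictors standing in for the factor scores, regresses a made-up criterion on them, and compares the multiple R² with the sum of the squared zero-order correlations.  The sample size, weights, and seed are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Build two exactly uncorrelated predictors: center random columns,
# orthogonalize them with QR, then rescale to unit standard deviation.
raw = rng.normal(size=(n, 2))
raw -= raw.mean(axis=0)
F, _ = np.linalg.qr(raw)
F *= np.sqrt(n - 1)

# Made-up criterion variable (a stand-in for SES).
y = 0.5 * F[:, 0] - 0.3 * F[:, 1] + rng.normal(size=n)

# Zero-order correlations of each predictor with the criterion.
r = np.array([np.corrcoef(F[:, j], y)[0, 1] for j in range(2)])

# Multiple R^2 from an ordinary least-squares fit with an intercept.
X = np.column_stack([np.ones(n), F])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
dev = y - y.mean()
r_squared = 1.0 - (resid @ resid) / (dev @ dev)

print(np.isclose(r_squared, (r ** 2).sum()))
```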