Cfa order in r set number of threshold

Keywords: SEM, CFA, R, lavaan

This guide outlines how to conduct a confirmatory factor analysis in R using the lavaan package. A basic example with 6 manifest variables measuring two latent factors is presented. The model estimation results can be compared to the same model fitted with Mplus.

Package Installation

If you have not already installed the lavaan package, you will need to do that first. Here the command to install the package is commented out (preceded by “##”) because the package is already installed. The lavaan package needs to be loaded by using the “library(lavaan)” command.

##install.packages("lavaan") library(lavaan)
This is lavaan 0.6-3
lavaan is BETA software! Please report any bugs.
##install.packages("kutils") library(kutils) ##install.packages("semPlot") library(semPlot)

Data Importation and Naming

The data file “job_placement” needs to be read in to the R session.

Because the datafile does not have column (or variable) names, the variable names need to be specified.

colnames(dat) 

In the original data file the missing values are coded as “99999”. These values need to be recoded to NA so that R recognizes them as missing.

dat[dat == 99999] 

Model Building and Fitting

Now the CFA model that is to be tested needs to be specified as an R object. In lavaan the =~ operator is the exact same as the BY operator in Mplus. In the statements below the variables on the right of the =~ are indicators of the variable to the left of the =~ . Also notice that there are + signs that separate the variables on the right side of the equation. The entire model statement needs to be wrapped in quotation marks.

CFAmodel 

Once the model is built, it can be fit to the data using the cfa function. The model object is the CFAmodel object that was specified above. The data object is the dat object that was imported and altered above. The last command is used to set the scale for the latent variable. The default value for std.lv is FALSE , by setting it to TRUE the first factor loading for each latent variable is freely estimated, the mean for each latent variable is set to 0, and the variance for each latent variable is set to 1.

CFAmodel.fitted 
lavaan 0.6-3 ended normally after 50 iterations Optimization method NLMINB Number of free parameters 19 Number of observations 322 Number of missing patterns 4 Estimator ML Model Fit Test Statistic 9.540 Degrees of freedom 8 P-value (Chi-square) 0.299 Parameter Estimates: Information Observed Observed information based on Hessian Standard Errors Standard Latent Variables: Estimate Std.Err z-value P(>|z|) MATH =~ wratcalc 6.041 0.276 21.921 0.000 wjcalc 4.144 0.203 20.370 0.000 waiscalc 2.410 0.165 14.636 0.000 SPELL =~ wratspl 6.532 0.288 22.645 0.000 wjspl 6.809 0.296 23.025 0.000 waisspl 6.354 0.283 22.463 0.000 Covariances: Estimate Std.Err z-value P(>|z|) MATH ~~ SPELL 0.553 0.042 13.191 0.000 Intercepts: Estimate Std.Err z-value P(>|z|) .wratcalc 38.922 0.355 109.514 0.000 .wjcalc 23.812 0.255 93.297 0.000 .waiscalc 11.022 0.186 59.230 0.000 .wratspl 36.484 0.385 94.751 0.000 .wjspl 41.674 0.398 104.808 0.000 .waisspl 37.163 0.376 98.788 0.000 MATH 0.000 SPELL 0.000 Variances: Estimate Std.Err z-value P(>|z|) .wratcalc 4.179 1.014 4.122 0.000 .wjcalc 3.769 0.537 7.017 0.000 .waiscalc 5.304 0.457 11.596 0.000 .wratspl 5.053 0.612 8.256 0.000 .wjspl 4.541 0.616 7.369 0.000 .waisspl 5.156 0.599 8.605 0.000 MATH 1.000 SPELL 1.000 

Examining Output

We can request standardized output columns by adding standardized = TRUE . The Std.lv column is the parameter estimates when just latent variables are standardized (they already were standardized, so this column tells us nothing new), whereas the Std.all column is the parameter estimates when all variables are standardized.

summary(CFAmodel.fitted, standardized = TRUE)
lavaan 0.6-3 ended normally after 50 iterations Optimization method NLMINB Number of free parameters 19 Number of observations 322 Number of missing patterns 4 Estimator ML Model Fit Test Statistic 9.540 Degrees of freedom 8 P-value (Chi-square) 0.299 Parameter Estimates: Information Observed Observed information based on Hessian Standard Errors Standard Latent Variables: Estimate Std.Err z-value P(>|z|) Std.lv Std.all MATH =~ wratcalc 6.041 0.276 21.921 0.000 6.041 0.947 wjcalc 4.144 0.203 20.370 0.000 4.144 0.906 waiscalc 2.410 0.165 14.636 0.000 2.410 0.723 SPELL =~ wratspl 6.532 0.288 22.645 0.000 6.532 0.946 wjspl 6.809 0.296 23.025 0.000 6.809 0.954 waisspl 6.354 0.283 22.463 0.000 6.354 0.942 Covariances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all MATH ~~ SPELL 0.553 0.042 13.191 0.000 0.553 0.553 Intercepts: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 38.922 0.355 109.514 0.000 38.922 6.103 .wjcalc 23.812 0.255 93.297 0.000 23.812 5.203 .waiscalc 11.022 0.186 59.230 0.000 11.022 3.306 .wratspl 36.484 0.385 94.751 0.000 36.484 5.282 .wjspl 41.674 0.398 104.808 0.000 41.674 5.841 .waisspl 37.163 0.376 98.788 0.000 37.163 5.508 MATH 0.000 0.000 0.000 SPELL 0.000 0.000 0.000 Variances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 4.179 1.014 4.122 0.000 4.179 0.103 .wjcalc 3.769 0.537 7.017 0.000 3.769 0.180 .waiscalc 5.304 0.457 11.596 0.000 5.304 0.477 .wratspl 5.053 0.612 8.256 0.000 5.053 0.106 .wjspl 4.541 0.616 7.369 0.000 4.541 0.089 .waisspl 5.156 0.599 8.605 0.000 5.156 0.113 MATH 1.000 1.000 1.000 SPELL 1.000 1.000 1.000

We can request the fit measures along with the parameter estimates. This makes the output resemble what is returned by Mplus.

summary(CFAmodel.fitted, standardized = TRUE, fit = TRUE)
lavaan 0.6-3 ended normally after 50 iterations Optimization method NLMINB Number of free parameters 19 Number of observations 322 Number of missing patterns 4 Estimator ML Model Fit Test Statistic 9.540 Degrees of freedom 8 P-value (Chi-square) 0.299 Model test baseline model: Minimum Function Test Statistic 1882.335 Degrees of freedom 15 P-value 0.000 User model versus baseline model: Comparative Fit Index (CFI) 0.999 Tucker-Lewis Index (TLI) 0.998 Loglikelihood and Information Criteria: Loglikelihood user model (H0) -5127.830 Loglikelihood unrestricted model (H1) -5123.060 Number of free parameters 19 Akaike (AIC) 10293.661 Bayesian (BIC) 10365.377 Sample-size adjusted Bayesian (BIC) 10305.112 Root Mean Square Error of Approximation: RMSEA 0.024 90 Percent Confidence Interval 0.000 0.073 P-value RMSEA |z|) Std.lv Std.all MATH =~ wratcalc 6.041 0.276 21.921 0.000 6.041 0.947 wjcalc 4.144 0.203 20.370 0.000 4.144 0.906 waiscalc 2.410 0.165 14.636 0.000 2.410 0.723 SPELL =~ wratspl 6.532 0.288 22.645 0.000 6.532 0.946 wjspl 6.809 0.296 23.025 0.000 6.809 0.954 waisspl 6.354 0.283 22.463 0.000 6.354 0.942 Covariances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all MATH ~~ SPELL 0.553 0.042 13.191 0.000 0.553 0.553 Intercepts: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 38.922 0.355 109.514 0.000 38.922 6.103 .wjcalc 23.812 0.255 93.297 0.000 23.812 5.203 .waiscalc 11.022 0.186 59.230 0.000 11.022 3.306 .wratspl 36.484 0.385 94.751 0.000 36.484 5.282 .wjspl 41.674 0.398 104.808 0.000 41.674 5.841 .waisspl 37.163 0.376 98.788 0.000 37.163 5.508 MATH 0.000 0.000 0.000 SPELL 0.000 0.000 0.000 Variances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 4.179 1.014 4.122 0.000 4.179 0.103 .wjcalc 3.769 0.537 7.017 0.000 3.769 0.180 .waiscalc 5.304 0.457 11.596 0.000 5.304 0.477 .wratspl 5.053 0.612 8.256 0.000 5.053 0.106 .wjspl 4.541 0.616 7.369 0.000 4.541 0.089 .waisspl 5.156 0.599 8.605 0.000 5.156 0.113 MATH 1.000 1.000 1.000 SPELL 1.000 1.000 1.000

What if we only want certain fit measures? Those can be assigned and viewed using the commands below.

fitStats 
 cfi rmsea 0.999 0.024 

We can pull out the parameter estimates instead of just looking at the summary. This can be useful when putting the parameter estimates in a table.

paramEsts 
 lhs op rhs est se z pvalue ci.lower ci.upper 1 MATH =~ wratcalc 6.041 0.276 21.921 0 5.501 6.581 2 MATH =~ wjcalc 4.144 0.203 20.370 0 3.745 4.543 3 MATH =~ waiscalc 2.410 0.165 14.636 0 2.088 2.733 4 SPELL =~ wratspl 6.532 0.288 22.645 0 5.967 7.097 5 SPELL =~ wjspl 6.809 0.296 23.025 0 6.230 7.389 6 SPELL =~ waisspl 6.354 0.283 22.463 0 5.799 6.908 7 wratcalc ~~ wratcalc 4.179 1.014 4.122 0 2.192 6.166 8 wjcalc ~~ wjcalc 3.769 0.537 7.017 0 2.716 4.821 9 waiscalc ~~ waiscalc 5.304 0.457 11.596 0 4.407 6.200 10 wratspl ~~ wratspl 5.053 0.612 8.256 0 3.853 6.253 11 wjspl ~~ wjspl 4.541 0.616 7.369 0 3.333 5.749 12 waisspl ~~ waisspl 5.156 0.599 8.605 0 3.982 6.330 13 MATH ~~ MATH 1.000 0.000 NA NA 1.000 1.000 14 SPELL ~~ SPELL 1.000 0.000 NA NA 1.000 1.000 15 MATH ~~ SPELL 0.553 0.042 13.191 0 0.470 0.635 16 wratcalc ~1 38.922 0.355 109.514 0 38.226 39.619 17 wjcalc ~1 23.812 0.255 93.297 0 23.311 24.312 18 waiscalc ~1 11.022 0.186 59.230 0 10.657 11.386 19 wratspl ~1 36.484 0.385 94.751 0 35.730 37.239 20 wjspl ~1 41.674 0.398 104.808 0 40.895 42.453 21 waisspl ~1 37.163 0.376 98.788 0 36.426 37.900 22 MATH ~1 0.000 0.000 NA NA 0.000 0.000 23 SPELL ~1 0.000 0.000 NA NA 0.000 0.000

We can also pull out the factor scores by using the lavPredict function on the testOutput object.

fscores 
 MATH SPELL [1,] 1.378216628 1.3727471 [2,] 1.812068411 2.0974157 [3,] -0.006094102 0.1972874 [4,] 1.400963259 -0.2650847 [5,] 1.768911802 -0.7860667 [6,] 0.549146289 0.3652424

The semPaths function from the semPlot package provides an easy way to produce a path diagram.

library(semPlot) semPaths(CFAmodel.fitted)

We can change some of the settings to make the path diagram into something we like a bit more.

semPaths(CFAmodel.fitted, what = "paths", intercepts = FALSE, sizeMan = 12, sizeMan2 = 8, layout = "tree2", sizeLat = 20, sizeLat2 = 10, width = 5, height = 1, label.cex = 1, nCharNodes = 0, curve = 2.5, label.scale = FALSE)

We can ask for the parameter estimates to be placed on the paths directly with the whatLabels = "est" argument. We also use the “edge.label.cex = 1.2” argument to make the values larger than the default.

semPaths(CFAmodel.fitted, what = "paths", whatLabels = "est", intercepts = FALSE, sizeMan = 12, sizeMan2 = 8,layout = "tree2", sizeLat = 20, sizeLat2 = 10, width = 5, height = 3, label.cex = 1, nCharNodes = 0, curve = 2.5, label.scale = FALSE, edge.label.cex = 1.2)

Lastly, the kutils function semTable can be used to produce a table summarizing the model parameter estimates in latex or html code. In this example we produce an html table that is automatically built by Rmarkdown.

semTable(CFAmodel.fitted, type = "html")

In R your output will look like this:

 
Model
EstimateStd. Err.zp
Factor Loadings
MATH
wratcalc6.040.2821.92.000
wjcalc4.140.2020.37.000
waiscalc2.410.1614.64.000
SPELL
wratspl6.530.2922.64.000
wjspl6.810.3023.03.000
waisspl6.350.2822.46.000
Intercepts
wratcalc38.920.36109.51.000
wjcalc23.810.2693.30.000
waiscalc11.020.1959.23.000
wratspl36.480.3994.75.000
wjspl41.670.40104.81.000
waisspl37.160.3898.79.000
Residual Variances
wratcalc4.181.014.12.000
wjcalc3.770.547.02.000
waiscalc5.300.4611.60.000
wratspl5.050.618.26.000
wjspl4.540.627.37.000
waisspl5.160.608.60.000
Latent Intercepts
MATH0.00+
SPELL0.00+
Latent Variances
MATH1.00+
SPELL1.00+
Latent Covariances
MATH w/SPELL0.550.0413.19.000
Fit Indices
χ29.54(8) .299
CFI1.00
TLI1.00
RMSEA0.02
+Fixed parameter

If the file argument is used to specify a file to be saved, opening the resulting html file in a web browser will show a formatted table.

semTable(CFAmodel.fitted, type = "html", file = "CFAmodel.fitted.html")
Model
Estimate Std. Err. z p
Factor Loadings
MATH
wratcalc 6.04 0.28 21.92 .000
wjcalc 4.14 0.20 20.37 .000
waiscalc 2.41 0.16 14.64 .000
SPELL
wratspl 6.53 0.29 22.64 .000
wjspl 6.81 0.30 23.03 .000
waisspl 6.35 0.28 22.46 .000
Intercepts
wratcalc 38.92 0.36 109.51 .000
wjcalc 23.81 0.26 93.30 .000
waiscalc 11.02 0.19 59.23 .000
wratspl 36.48 0.39 94.75 .000
wjspl 41.67 0.40 104.81 .000
waisspl 37.16 0.38 98.79 .000
Residual Variances
wratcalc 4.18 1.01 4.12 .000
wjcalc 3.77 0.54 7.02 .000
waiscalc 5.30 0.46 11.60 .000
wratspl 5.05 0.61 8.26 .000
wjspl 4.54 0.62 7.37 .000
waisspl 5.16 0.60 8.60 .000
Latent Intercepts
MATH 0.00 +
SPELL 0.00 +
Latent Variances
MATH 1.00 +
SPELL 1.00 +
Latent Covariances
MATH w/SPELL 0.55 0.04 13.19 .000
Fit Indices
χ 2 9.54(8) .299
CFI 1.00
TLI 1.00
RMSEA 0.02
+ Fixed parameter

The CFA Example in Mplus

Please click cfa-01.html if the reader would like to see the same CFA model fitted with Mplus.

Session Info

R version 3.5.1 (2018-07-02) Platform: x86_64-redhat-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core) Matrix products: default BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] semPlot_1.1 kutils_1.62 lavaan_0.6-3 [4] stationery_0.98.5.7 loaded via a namespace (and not attached): [1] nlme_3.1-137 RColorBrewer_1.1-2 mi_1.0 [4] tools_3.5.1 backports_1.1.3 R6_2.3.0 [7] rpart_4.1-13 d3Network_0.5.2.1 Hmisc_4.2-0 [10] lazyeval_0.2.1 colorspace_1.4-0 nnet_7.3-12 [13] tidyselect_0.2.5 gridExtra_2.3 mnormt_1.5-5 [16] compiler_3.5.1 qgraph_1.6 fdrtool_1.2.15 [19] htmlTable_1.13.1 scales_1.0.0 checkmate_1.9.1 [22] psych_1.8.12 pbapply_1.4-0 sem_3.1-9 [25] stringr_1.3.1 digest_0.6.18 pbivnorm_0.6.0 [28] foreign_0.8-71 minqa_1.2.4 rmarkdown_1.11 [31] base64enc_0.1-3 jpeg_0.1-8 pkgconfig_2.0.2 [34] htmltools_0.3.6 lme4_1.1-20 lisrelToR_0.1.4 [37] htmlwidgets_1.3 rlang_0.3.1 rstudioapi_0.9.0 [40] huge_1.2.7 bindr_0.1.1 gtools_3.8.1 [43] acepack_1.4.1 dplyr_0.7.8 zip_1.0.0 [46] magrittr_1.5 OpenMx_2.12.1 Formula_1.2-3 [49] Matrix_1.2-15 Rcpp_1.0.0 munsell_0.5.0 [52] abind_1.4-5 rockchalk_1.8.130 stringi_1.2.4 [55] whisker_0.3-2 yaml_2.2.0 carData_3.0-2 [58] MASS_7.3-51.1 plyr_1.8.4 matrixcalc_1.0-3 [61] grid_3.5.1 parallel_3.5.1 crayon_1.3.4 [64] lattice_0.20-38 splines_3.5.1 knitr_1.21 [67] pillar_1.3.1 igraph_1.2.2 boot_1.3-20 [70] rjson_0.2.20 corpcor_1.6.9 BDgraph_2.54 [73] reshape2_1.4.3 stats4_3.5.1 XML_3.98-1.16 [76] glue_1.3.0 evaluate_0.12 latticeExtra_0.6-28 [79] data.table_1.12.0 png_0.1-7 nloptr_1.2.1 [82] gtable_0.2.0 purrr_0.3.0 assertthat_0.2.0 [85] ggplot2_3.1.0 xfun_0.4 openxlsx_4.1.0 [88] xtable_1.8-3 semTools_0.5-1 coda_0.19-2 [91] survival_2.43-3 glasso_1.10 tibble_2.0.1 [94] arm_1.10-1 ggm_2.3 bindrcpp_0.2.2 [97] cluster_2.0.7-1