Cfa order in r set number of threshold

Keywords: SEM, CFA, R, lavaan

This guide outlines how to conduct a confirmatory factor analysis in R using the lavaan package. A basic example with 6 manifest variables measuring two latent factors is presented. The model estimation results can be compared to the same model fitted with Mplus.

Package Installation

If you have not already installed the lavaan package, you will need to do that first. Here the command to install the package is commented out (preceded by “##”) because the package is already installed. The lavaan package needs to be loaded by using the “library(lavaan)” command.

##install.packages("lavaan") library(lavaan)

This is lavaan 0.6-3

lavaan is BETA software! Please report any bugs.

##install.packages("kutils") library(kutils) ##install.packages("semPlot") library(semPlot)

Data Importation and Naming

The data file “job_placement” needs to be read in to the R session.

Because the datafile does not have column (or variable) names, the variable names need to be specified.

colnames(dat)


In the original data file the missing values are coded as “99999”. These values need to be recoded to NA so that R recognizes them as missing.
dat[dat == 99999] 

Model Building and Fitting
Now the CFA model that is to be tested needs to be specified as an R object. In lavaan the =~ operator is the exact same as the BY operator in Mplus. In the statements below the variables on the right of the =~ are indicators of the variable to the left of the =~ . Also notice that there are + signs that separate the variables on the right side of the equation. The entire model statement needs to be wrapped in quotation marks.
CFAmodel 

Once the model is built, it can be fit to the data using the cfa function. The model object is the CFAmodel object that was specified above. The data object is the dat object that was imported and altered above. The last command is used to set the scale for the latent variable. The default value for std.lv is FALSE , by setting it to TRUE the first factor loading for each latent variable is freely estimated, the mean for each latent variable is set to 0, and the variance for each latent variable is set to 1.
CFAmodel.fitted 

lavaan 0.6-3 ended normally after 50 iterations Optimization method NLMINB Number of free parameters 19 Number of observations 322 Number of missing patterns 4 Estimator ML Model Fit Test Statistic 9.540 Degrees of freedom 8 P-value (Chi-square) 0.299 Parameter Estimates: Information Observed Observed information based on Hessian Standard Errors Standard Latent Variables: Estimate Std.Err z-value P(>|z|) MATH =~ wratcalc 6.041 0.276 21.921 0.000 wjcalc 4.144 0.203 20.370 0.000 waiscalc 2.410 0.165 14.636 0.000 SPELL =~ wratspl 6.532 0.288 22.645 0.000 wjspl 6.809 0.296 23.025 0.000 waisspl 6.354 0.283 22.463 0.000 Covariances: Estimate Std.Err z-value P(>|z|) MATH ~~ SPELL 0.553 0.042 13.191 0.000 Intercepts: Estimate Std.Err z-value P(>|z|) .wratcalc 38.922 0.355 109.514 0.000 .wjcalc 23.812 0.255 93.297 0.000 .waiscalc 11.022 0.186 59.230 0.000 .wratspl 36.484 0.385 94.751 0.000 .wjspl 41.674 0.398 104.808 0.000 .waisspl 37.163 0.376 98.788 0.000 MATH 0.000 SPELL 0.000 Variances: Estimate Std.Err z-value P(>|z|) .wratcalc 4.179 1.014 4.122 0.000 .wjcalc 3.769 0.537 7.017 0.000 .waiscalc 5.304 0.457 11.596 0.000 .wratspl 5.053 0.612 8.256 0.000 .wjspl 4.541 0.616 7.369 0.000 .waisspl 5.156 0.599 8.605 0.000 MATH 1.000 SPELL 1.000 
Examining Output
We can request standardized output columns by adding standardized = TRUE . The Std.lv column is the parameter estimates when just latent variables are standardized (they already were standardized, so this column tells us nothing new), whereas the Std.all column is the parameter estimates when all variables are standardized.
summary(CFAmodel.fitted, standardized = TRUE)
lavaan 0.6-3 ended normally after 50 iterations Optimization method NLMINB Number of free parameters 19 Number of observations 322 Number of missing patterns 4 Estimator ML Model Fit Test Statistic 9.540 Degrees of freedom 8 P-value (Chi-square) 0.299 Parameter Estimates: Information Observed Observed information based on Hessian Standard Errors Standard Latent Variables: Estimate Std.Err z-value P(>|z|) Std.lv Std.all MATH =~ wratcalc 6.041 0.276 21.921 0.000 6.041 0.947 wjcalc 4.144 0.203 20.370 0.000 4.144 0.906 waiscalc 2.410 0.165 14.636 0.000 2.410 0.723 SPELL =~ wratspl 6.532 0.288 22.645 0.000 6.532 0.946 wjspl 6.809 0.296 23.025 0.000 6.809 0.954 waisspl 6.354 0.283 22.463 0.000 6.354 0.942 Covariances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all MATH ~~ SPELL 0.553 0.042 13.191 0.000 0.553 0.553 Intercepts: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 38.922 0.355 109.514 0.000 38.922 6.103 .wjcalc 23.812 0.255 93.297 0.000 23.812 5.203 .waiscalc 11.022 0.186 59.230 0.000 11.022 3.306 .wratspl 36.484 0.385 94.751 0.000 36.484 5.282 .wjspl 41.674 0.398 104.808 0.000 41.674 5.841 .waisspl 37.163 0.376 98.788 0.000 37.163 5.508 MATH 0.000 0.000 0.000 SPELL 0.000 0.000 0.000 Variances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 4.179 1.014 4.122 0.000 4.179 0.103 .wjcalc 3.769 0.537 7.017 0.000 3.769 0.180 .waiscalc 5.304 0.457 11.596 0.000 5.304 0.477 .wratspl 5.053 0.612 8.256 0.000 5.053 0.106 .wjspl 4.541 0.616 7.369 0.000 4.541 0.089 .waisspl 5.156 0.599 8.605 0.000 5.156 0.113 MATH 1.000 1.000 1.000 SPELL 1.000 1.000 1.000
We can request the fit measures along with the parameter estimates. This makes the output resemble what is returned by Mplus.
summary(CFAmodel.fitted, standardized = TRUE, fit = TRUE)
lavaan 0.6-3 ended normally after 50 iterations Optimization method NLMINB Number of free parameters 19 Number of observations 322 Number of missing patterns 4 Estimator ML Model Fit Test Statistic 9.540 Degrees of freedom 8 P-value (Chi-square) 0.299 Model test baseline model: Minimum Function Test Statistic 1882.335 Degrees of freedom 15 P-value 0.000 User model versus baseline model: Comparative Fit Index (CFI) 0.999 Tucker-Lewis Index (TLI) 0.998 Loglikelihood and Information Criteria: Loglikelihood user model (H0) -5127.830 Loglikelihood unrestricted model (H1) -5123.060 Number of free parameters 19 Akaike (AIC) 10293.661 Bayesian (BIC) 10365.377 Sample-size adjusted Bayesian (BIC) 10305.112 Root Mean Square Error of Approximation: RMSEA 0.024 90 Percent Confidence Interval 0.000 0.073 P-value RMSEA |z|) Std.lv Std.all MATH =~ wratcalc 6.041 0.276 21.921 0.000 6.041 0.947 wjcalc 4.144 0.203 20.370 0.000 4.144 0.906 waiscalc 2.410 0.165 14.636 0.000 2.410 0.723 SPELL =~ wratspl 6.532 0.288 22.645 0.000 6.532 0.946 wjspl 6.809 0.296 23.025 0.000 6.809 0.954 waisspl 6.354 0.283 22.463 0.000 6.354 0.942 Covariances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all MATH ~~ SPELL 0.553 0.042 13.191 0.000 0.553 0.553 Intercepts: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 38.922 0.355 109.514 0.000 38.922 6.103 .wjcalc 23.812 0.255 93.297 0.000 23.812 5.203 .waiscalc 11.022 0.186 59.230 0.000 11.022 3.306 .wratspl 36.484 0.385 94.751 0.000 36.484 5.282 .wjspl 41.674 0.398 104.808 0.000 41.674 5.841 .waisspl 37.163 0.376 98.788 0.000 37.163 5.508 MATH 0.000 0.000 0.000 SPELL 0.000 0.000 0.000 Variances: Estimate Std.Err z-value P(>|z|) Std.lv Std.all .wratcalc 4.179 1.014 4.122 0.000 4.179 0.103 .wjcalc 3.769 0.537 7.017 0.000 3.769 0.180 .waiscalc 5.304 0.457 11.596 0.000 5.304 0.477 .wratspl 5.053 0.612 8.256 0.000 5.053 0.106 .wjspl 4.541 0.616 7.369 0.000 4.541 0.089 .waisspl 5.156 0.599 8.605 0.000 5.156 0.113 MATH 1.000 1.000 1.000 SPELL 1.000 1.000 1.000
What if we only want certain fit measures? Those can be assigned and viewed using the commands below.
fitStats 

 cfi rmsea 0.999 0.024 
We can pull out the parameter estimates instead of just looking at the summary. This can be useful when putting the parameter estimates in a table.
paramEsts 

 lhs op rhs est se z pvalue ci.lower ci.upper 1 MATH =~ wratcalc 6.041 0.276 21.921 0 5.501 6.581 2 MATH =~ wjcalc 4.144 0.203 20.370 0 3.745 4.543 3 MATH =~ waiscalc 2.410 0.165 14.636 0 2.088 2.733 4 SPELL =~ wratspl 6.532 0.288 22.645 0 5.967 7.097 5 SPELL =~ wjspl 6.809 0.296 23.025 0 6.230 7.389 6 SPELL =~ waisspl 6.354 0.283 22.463 0 5.799 6.908 7 wratcalc ~~ wratcalc 4.179 1.014 4.122 0 2.192 6.166 8 wjcalc ~~ wjcalc 3.769 0.537 7.017 0 2.716 4.821 9 waiscalc ~~ waiscalc 5.304 0.457 11.596 0 4.407 6.200 10 wratspl ~~ wratspl 5.053 0.612 8.256 0 3.853 6.253 11 wjspl ~~ wjspl 4.541 0.616 7.369 0 3.333 5.749 12 waisspl ~~ waisspl 5.156 0.599 8.605 0 3.982 6.330 13 MATH ~~ MATH 1.000 0.000 NA NA 1.000 1.000 14 SPELL ~~ SPELL 1.000 0.000 NA NA 1.000 1.000 15 MATH ~~ SPELL 0.553 0.042 13.191 0 0.470 0.635 16 wratcalc ~1 38.922 0.355 109.514 0 38.226 39.619 17 wjcalc ~1 23.812 0.255 93.297 0 23.311 24.312 18 waiscalc ~1 11.022 0.186 59.230 0 10.657 11.386 19 wratspl ~1 36.484 0.385 94.751 0 35.730 37.239 20 wjspl ~1 41.674 0.398 104.808 0 40.895 42.453 21 waisspl ~1 37.163 0.376 98.788 0 36.426 37.900 22 MATH ~1 0.000 0.000 NA NA 0.000 0.000 23 SPELL ~1 0.000 0.000 NA NA 0.000 0.000
We can also pull out the factor scores by using the lavPredict function on the testOutput object.
fscores 

 MATH SPELL [1,] 1.378216628 1.3727471 [2,] 1.812068411 2.0974157 [3,] -0.006094102 0.1972874 [4,] 1.400963259 -0.2650847 [5,] 1.768911802 -0.7860667 [6,] 0.549146289 0.3652424
The semPaths function from the semPlot package provides an easy way to produce a path diagram.
library(semPlot) semPaths(CFAmodel.fitted)
We can change some of the settings to make the path diagram into something we like a bit more.
semPaths(CFAmodel.fitted, what = "paths", intercepts = FALSE, sizeMan = 12, sizeMan2 = 8, layout = "tree2", sizeLat = 20, sizeLat2 = 10, width = 5, height = 1, label.cex = 1, nCharNodes = 0, curve = 2.5, label.scale = FALSE)
We can ask for the parameter estimates to be placed on the paths directly with the whatLabels = "est" argument. We also use the “edge.label.cex = 1.2” argument to make the values larger than the default.
semPaths(CFAmodel.fitted, what = "paths", whatLabels = "est", intercepts = FALSE, sizeMan = 12, sizeMan2 = 8,layout = "tree2", sizeLat = 20, sizeLat2 = 10, width = 5, height = 3, label.cex = 1, nCharNodes = 0, curve = 2.5, label.scale = FALSE, edge.label.cex = 1.2)
Lastly, the kutils function semTable can be used to produce a table summarizing the model parameter estimates in latex or html code. In this example we produce an html table that is automatically built by Rmarkdown.
semTable(CFAmodel.fitted, type = "html")
In R your output will look like this:
  Model
 Estimate Std. Err. z p
 Factor Loadings
 MATH
 wratcalc 6.04 0.28 21.92 .000
 wjcalc 4.14 0.20 20.37 .000
 waiscalc 2.41 0.16 14.64 .000
 SPELL
 wratspl 6.53 0.29 22.64 .000
 wjspl 6.81 0.30 23.03 .000
 waisspl 6.35 0.28 22.46 .000
 Intercepts
 wratcalc 38.92 0.36 109.51 .000
 wjcalc 23.81 0.26 93.30 .000
 waiscalc 11.02 0.19 59.23 .000
 wratspl 36.48 0.39 94.75 .000
 wjspl 41.67 0.40 104.81 .000
 waisspl 37.16 0.38 98.79 .000
 Residual Variances
 wratcalc 4.18 1.01 4.12 .000
 wjcalc 3.77 0.54 7.02 .000
 waiscalc 5.30 0.46 11.60 .000
 wratspl 5.05 0.61 8.26 .000
 wjspl 4.54 0.62 7.37 .000
 waisspl 5.16 0.60 8.60 .000
 Latent Intercepts
 MATH 0.00⁺  
 SPELL 0.00⁺  
 Latent Variances
 MATH 1.00⁺  
 SPELL 1.00⁺  
 Latent Covariances
 MATH w/SPELL 0.55 0.04 13.19 .000
 Fit Indices
 χ² 9.54(8)  .299
 CFI 1.00  
 TLI 1.00  
 RMSEA 0.02  
 ⁺Fixed parameter
 
 
If the file argument is used to specify a file to be saved, opening the resulting html file in a web browser will show a formatted table.
semTable(CFAmodel.fitted, type = "html", file = "CFAmodel.fitted.html")
  Model  
  Estimate  Std. Err.  z  p  
  Factor Loadings  
  MATH  
  wratcalc  6.04  0.28  21.92  .000  
  wjcalc  4.14  0.20  20.37  .000  
  waiscalc  2.41  0.16  14.64  .000  
  SPELL  
  wratspl  6.53  0.29  22.64  .000  
  wjspl  6.81  0.30  23.03  .000  
  waisspl  6.35  0.28  22.46  .000  
  Intercepts  
  wratcalc  38.92  0.36  109.51  .000  
  wjcalc  23.81  0.26  93.30  .000  
  waiscalc  11.02  0.19  59.23  .000  
  wratspl  36.48  0.39  94.75  .000  
  wjspl  41.67  0.40  104.81  .000  
  waisspl  37.16  0.38  98.79  .000  
  Residual Variances  
  wratcalc  4.18  1.01  4.12  .000  
  wjcalc  3.77  0.54  7.02  .000  
  waiscalc  5.30  0.46  11.60  .000  
  wratspl  5.05  0.61  8.26  .000  
  wjspl  4.54  0.62  7.37  .000  
  waisspl  5.16  0.60  8.60  .000  
  Latent Intercepts  
  MATH  0.00 +  
  SPELL  0.00 +  
  Latent Variances  
  MATH  1.00 +  
  SPELL  1.00 +  
  Latent Covariances  
  MATH w/SPELL  0.55  0.04  13.19  .000  
  Fit Indices  
  χ 2  9.54(8)  .299  
  CFI  1.00  
  TLI  1.00  
  RMSEA  0.02  
  + Fixed parameter  
 
The CFA Example in Mplus
Please click cfa-01.html if the reader would like to see the same CFA model fitted with Mplus.
Session Info
R version 3.5.1 (2018-07-02) Platform: x86_64-redhat-linux-gnu (64-bit) Running under: CentOS Linux 7 (Core) Matrix products: default BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] semPlot_1.1 kutils_1.62 lavaan_0.6-3 [4] stationery_0.98.5.7 loaded via a namespace (and not attached): [1] nlme_3.1-137 RColorBrewer_1.1-2 mi_1.0 [4] tools_3.5.1 backports_1.1.3 R6_2.3.0 [7] rpart_4.1-13 d3Network_0.5.2.1 Hmisc_4.2-0 [10] lazyeval_0.2.1 colorspace_1.4-0 nnet_7.3-12 [13] tidyselect_0.2.5 gridExtra_2.3 mnormt_1.5-5 [16] compiler_3.5.1 qgraph_1.6 fdrtool_1.2.15 [19] htmlTable_1.13.1 scales_1.0.0 checkmate_1.9.1 [22] psych_1.8.12 pbapply_1.4-0 sem_3.1-9 [25] stringr_1.3.1 digest_0.6.18 pbivnorm_0.6.0 [28] foreign_0.8-71 minqa_1.2.4 rmarkdown_1.11 [31] base64enc_0.1-3 jpeg_0.1-8 pkgconfig_2.0.2 [34] htmltools_0.3.6 lme4_1.1-20 lisrelToR_0.1.4 [37] htmlwidgets_1.3 rlang_0.3.1 rstudioapi_0.9.0 [40] huge_1.2.7 bindr_0.1.1 gtools_3.8.1 [43] acepack_1.4.1 dplyr_0.7.8 zip_1.0.0 [46] magrittr_1.5 OpenMx_2.12.1 Formula_1.2-3 [49] Matrix_1.2-15 Rcpp_1.0.0 munsell_0.5.0 [52] abind_1.4-5 rockchalk_1.8.130 stringi_1.2.4 [55] whisker_0.3-2 yaml_2.2.0 carData_3.0-2 [58] MASS_7.3-51.1 plyr_1.8.4 matrixcalc_1.0-3 [61] grid_3.5.1 parallel_3.5.1 crayon_1.3.4 [64] lattice_0.20-38 splines_3.5.1 knitr_1.21 [67] pillar_1.3.1 igraph_1.2.2 boot_1.3-20 [70] rjson_0.2.20 corpcor_1.6.9 BDgraph_2.54 [73] reshape2_1.4.3 stats4_3.5.1 XML_3.98-1.16 [76] glue_1.3.0 evaluate_0.12 latticeExtra_0.6-28 [79] data.table_1.12.0 png_0.1-7 nloptr_1.2.1 [82] gtable_0.2.0 purrr_0.3.0 assertthat_0.2.0 [85] ggplot2_3.1.0 xfun_0.4 openxlsx_4.1.0 [88] xtable_1.8-3 semTools_0.5-1 coda_0.19-2 [91] survival_2.43-3 glasso_1.10 tibble_2.0.1 [94] arm_1.10-1 ggm_2.3 bindrcpp_0.2.2 [97] cluster_2.0.7-1

Model
Estimate	Std. Err.	z	p
Factor Loadings
MATH
wratcalc	6.04	0.28	21.92	.000
wjcalc	4.14	0.20	20.37	.000
waiscalc	2.41	0.16	14.64	.000
SPELL
wratspl	6.53	0.29	22.64	.000
wjspl	6.81	0.30	23.03	.000
waisspl	6.35	0.28	22.46	.000
Intercepts
wratcalc	38.92	0.36	109.51	.000
wjcalc	23.81	0.26	93.30	.000
waiscalc	11.02	0.19	59.23	.000
wratspl	36.48	0.39	94.75	.000
wjspl	41.67	0.40	104.81	.000
waisspl	37.16	0.38	98.79	.000
Residual Variances
wratcalc	4.18	1.01	4.12	.000
wjcalc	3.77	0.54	7.02	.000
waiscalc	5.30	0.46	11.60	.000
wratspl	5.05	0.61	8.26	.000
wjspl	4.54	0.62	7.37	.000
waisspl	5.16	0.60	8.60	.000
Latent Intercepts
MATH	0.00⁺
SPELL	0.00⁺
Latent Variances
MATH	1.00⁺
SPELL	1.00⁺
Latent Covariances
MATH w/SPELL	0.55	0.04	13.19	.000
Fit Indices
χ²	9.54(8)	.299
CFI	1.00
TLI	1.00
RMSEA	0.02
⁺Fixed parameter