This document shows examples for using the sjt.lm()
function of the sjPlot package.
# load package
library(sjPlot)
library(sjmisc)
library(sjlabelled)
# sample data
data(efc)
The sjt.lm()
function prints summaries of linear models (fitted with the lm()
function) as nicely formatted html-tables.
Before starting, sample data is loaded and sample models are fitted:
# fit first model
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data = efc)
# Note that both models share the same predictors and only differ
# in their dependent variable. See examples of stepwise models
# later...
The simplest way of producing the table output is by passing the fitted models as parameter. By default, estimates (B), confidence intervals (CI) and p-values (p) are reported. The models are named Model 1 and Model 2.
sjt.lm(fit1, fit2)
Total score BARTHEL INDEX | Negative impact with 7 items | |||||||
B | CI | p | B | CI | p | |||
(Intercept) | 90.06 | 77.95 – 102.18 | <.001 | 8.46 | 6.67 – 10.24 | <.001 | ||
carer’ age | -0.22 | -0.36 – -0.08 | .002 | 0.01 | -0.01 – 0.03 | .206 | ||
average number of hours of care per week | -0.28 | -0.31 – -0.24 | <.001 | 0.02 | 0.01 – 0.02 | <.001 | ||
carer’s gender | -0.26 | -4.36 – 3.83 | .900 | 0.57 | -0.03 – 1.17 | .061 | ||
carer’s level of education | -0.76 | -3.55 – 2.02 | .592 | 0.44 | 0.03 – 0.86 | .034 | ||
Observations | 821 | 832 | ||||||
R2 / adj. R2 | .270 / .266 | .079 / .075 |
You can specify the ‘model’ label via depvar.labels
parameter:
sjt.lm(fit1, fit2, depvar.labels = c("Barthel-Index", "Negative Impact"))
Barthel-Index | Negative Impact | |||||||
B | CI | p | B | CI | p | |||
(Intercept) | 90.06 | 77.95 – 102.18 | <.001 | 8.46 | 6.67 – 10.24 | <.001 | ||
carer’ age | -0.22 | -0.36 – -0.08 | .002 | 0.01 | -0.01 – 0.03 | .206 | ||
average number of hours of care per week | -0.28 | -0.31 – -0.24 | <.001 | 0.02 | 0.01 – 0.02 | <.001 | ||
carer’s gender | -0.26 | -4.36 – 3.83 | .900 | 0.57 | -0.03 – 1.17 | .061 | ||
carer’s level of education | -0.76 | -3.55 – 2.02 | .592 | 0.44 | 0.03 – 0.86 | .034 | ||
Observations | 821 | 832 | ||||||
R2 / adj. R2 | .270 / .266 | .079 / .075 |
Here is an example how to change the other labels. Note that show.header
makes the two labels on top and top left corner appear in the table.
sjt.lm(
fit1,
fit2,
show.header = TRUE,
string.est = "Estimate",
string.ci = "Conf. Int.",
string.p = "p-value",
string.dv = "Response",
string.pred = "Coefficients",
string.interc = "Konstante",
depvar.labels = c("Barthel-Index", "Negative Impact")
)
Coefficients | Response | Barthel-Index | Negative Impact | |||||
Estimate | Conf. Int. | p-value | Estimate | Conf. Int. | p-value | |||
Konstante | 90.06 | 77.95 – 102.18 | <.001 | 8.46 | 6.67 – 10.24 | <.001 | ||
carer’ age | -0.22 | -0.36 – -0.08 | .002 | 0.01 | -0.01 – 0.03 | .206 | ||
average number of hours of care per week | -0.28 | -0.31 – -0.24 | <.001 | 0.02 | 0.01 – 0.02 | <.001 | ||
carer’s gender | -0.26 | -4.36 – 3.83 | .900 | 0.57 | -0.03 – 1.17 | .061 | ||
carer’s level of education | -0.76 | -3.55 – 2.02 | .592 | 0.44 | 0.03 – 0.86 | .034 | ||
Observations | 821 | 832 | ||||||
R2 / adj. R2 | .270 / .266 | .079 / .075 |
You can change the table style with specific parameters, e.g. to include CI into the same table cell as the estimates, print asterisks instead of numeric p-values etc.
sjt.lm(
fit1,
fit2,
separate.ci.col = FALSE, # ci in same cell as estimates
show.std = TRUE, # also show standardized beta values
p.numeric = FALSE # "*" instead of numeric values
)
Total score BARTHEL INDEX | Negative impact with 7 items | |||||
B (CI) | std. Beta (CI) | B (CI) | std. Beta (CI) | |||
(Intercept) |
90.06 (77.95 – 102.18) *** |
8.46 (6.67 – 10.24) *** |
||||
carer’ age |
-0.22 (-0.36 – -0.08) ** |
-0.10 (-0.16 – -0.04) |
0.01 (-0.01 – 0.03) |
0.05 (-0.03 – 0.12) |
||
average number of hours of care per week |
-0.28 (-0.31 – -0.24) *** |
-0.48 (-0.54 – -0.41) |
0.02 (0.01 – 0.02) *** |
0.25 (0.18 – 0.32) |
||
carer’s gender |
-0.26 (-4.36 – 3.83) |
-0.00 (-0.06 – 0.06) |
0.57 (-0.03 – 1.17) |
0.06 (-0.00 – 0.13) |
||
carer’s level of education |
-0.76 (-3.55 – 2.02) |
-0.02 (-0.08 – 0.04) |
0.44 (0.03 – 0.86) * |
0.07 (0.01 – 0.14) |
||
Observations | 821 | 832 | ||||
R2 / adj. R2 | .270 / .266 | .079 / .075 | ||||
Notes | * p<.05 ** p<.01 *** p<.001 |
In the above example, the original variable labels are long and not so pretty. You can change variable labels either with sjmisc::set_label()
, which will affect all future plots and tables, or pass own labels via pred.labels
.
sjt.lm(fit1, fit2, pred.labels = c("Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status"))
Total score BARTHEL INDEX | Negative impact with 7 items | |||||||
B | CI | p | B | CI | p | |||
(Intercept) | 90.06 | 77.95 – 102.18 | <.001 | 8.46 | 6.67 – 10.24 | <.001 | ||
Carer’s Age | -0.22 | -0.36 – -0.08 | .002 | 0.01 | -0.01 – 0.03 | .206 | ||
Hours of Care | -0.28 | -0.31 – -0.24 | <.001 | 0.02 | 0.01 – 0.02 | <.001 | ||
Carer’s Sex | -0.26 | -4.36 – 3.83 | .900 | 0.57 | -0.03 – 1.17 | .061 | ||
Educational Status | -0.76 | -3.55 – 2.02 | .592 | 0.44 | 0.03 – 0.86 | .034 | ||
Observations | 821 | 832 | ||||||
R2 / adj. R2 | .270 / .266 | .079 / .075 |
In some cases, for instance stepwise regressions, you have different predictors on the same response. The proper grouping of predictors, resp. rows, is done automatically.
First, let’s fit some example models.
# fit first model
fit1 <- lm(neg_c_7 ~ c160age + c172code + c161sex, data = efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c172code + c161sex + c12hour, data = efc)
# fit second model
fit3 <- lm(neg_c_7 ~ c160age + c172code + e42dep + tot_sc_e, data = efc)
Note that printing tables with fitted models, which have different predictors do not automatically detect variable labels (maybe this will be implemented in a future package version).
sjt.lm(fit1, fit2, fit3,
separate.ci.col = FALSE,
show.aic = TRUE,
show.fstat = TRUE)
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | |||||||
B (CI) | p | B (CI) | p | B (CI) | p | ||||
(Intercept) |
7.82 (6.00 – 9.65) |
<.001 |
8.46 (6.67 – 10.24) |
<.001 |
6.23 (4.76 – 7.69) |
<.001 | |||
carer’ age |
0.04 (0.02 – 0.06) |
<.001 |
0.01 (-0.01 – 0.03) |
.206 |
0.01 (-0.01 – 0.03) |
.271 | |||
carer’s level of education |
0.39 (-0.03 – 0.81) |
.071 |
0.44 (0.03 – 0.86) |
.034 |
0.24 (-0.15 – 0.64) |
.230 | |||
carer’s gender |
0.69 (0.07 – 1.31) |
.028 |
0.57 (-0.03 – 1.17) |
.061 | |||||
average number of hours of care per week |
0.02 (0.01 – 0.02) |
<.001 | |||||||
elder’s dependency |
1.50 (1.23 – 1.77) |
<.001 | |||||||
Services for elderly |
0.21 (0.01 – 0.41) |
.038 | |||||||
Observations | 832 | 832 | 833 | ||||||
R2 / adj. R2 | .025 / .022 | .079 / .075 | .153 / .148 | ||||||
F-statistics | 7.107*** | 17.730*** | 37.250*** | ||||||
AIC | 4611.921 | 4566.622 | 4502.333 |
Especially when fitting and summarizing some more models, it might help to increase the distance between the columns that separate the models. This can be done by tweaking the css.separatorcol
style-sheet:
sjt.lm(fit1, fit2, fit3, CSS = list(css.separatorcol = 'padding-right:1.5em; padding-left:1.5em;'))
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 7.82 | 6.00 – 9.65 | <.001 | 8.46 | 6.67 – 10.24 | <.001 | 6.23 | 4.76 – 7.69 | <.001 | |||
carer’ age | 0.04 | 0.02 – 0.06 | <.001 | 0.01 | -0.01 – 0.03 | .206 | 0.01 | -0.01 – 0.03 | .271 | |||
carer’s level of education | 0.39 | -0.03 – 0.81 | .071 | 0.44 | 0.03 – 0.86 | .034 | 0.24 | -0.15 – 0.64 | .230 | |||
carer’s gender | 0.69 | 0.07 – 1.31 | .028 | 0.57 | -0.03 – 1.17 | .061 | ||||||
average number of hours of care per week | 0.02 | 0.01 – 0.02 | <.001 | |||||||||
elder’s dependency | 1.50 | 1.23 – 1.77 | <.001 | |||||||||
Services for elderly | 0.21 | 0.01 – 0.41 | .038 | |||||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .025 / .022 | .079 / .075 | .153 / .148 |
In case you have categorical variables with more than two factor levels, the sjt.lm()
function automatically groups the category levels to give a better overview of predictors in the table.
By default, automatic grouping is activated. To disable this feature, use group.pred = FALSE
as parameter.
To demonstrate this feature, we first convert two predictors to factors (what they actually are, indeed). To do this, we use the sjmisc::to_factor()
function, which converts numerical variables into factors, however, does not remove the variable and value label attributes.
# make education categorical
efc$c172code <- to_factor(efc$c172code)
# make dependency categorical
efc$e42dep <- to_factor(efc$e42dep)
# fit first model again (with c172code as factor)
fit1 <- lm(barthtot ~ c160age + c12hour + c172code + c161sex + e42dep, data = efc)
# fit second model again (with c172code as factor)
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c172code + c161sex + e42dep, data = efc)
Now we can print the table.
sjt.lm(fit1, fit2)
Total score BARTHEL INDEX | Negative impact with 7 items | |||||||
B | CI | p | B | CI | p | |||
(Intercept) | 97.17 | 88.37 – 105.97 | <.001 | 7.76 | 5.97 – 9.55 | <.001 | ||
carer’ age | -0.06 | -0.16 – 0.03 | .203 | 0.00 | -0.02 – 0.02 | .683 | ||
average number of hours of care per week | -0.07 | -0.10 – -0.04 | <.001 | 0.01 | 0.00 – 0.01 | .015 | ||
carer’s level of education | ||||||||
intermediate level of education | 1.50 | -1.60 – 4.60 | .343 | 0.13 | -0.50 – 0.76 | .689 | ||
high level of education | 0.66 | -3.20 – 4.52 | .738 | 0.72 | -0.07 – 1.51 | .074 | ||
carer’s gender | 0.09 | -2.74 – 2.93 | .949 | 0.56 | -0.02 – 1.13 | .058 | ||
elder’s dependency | ||||||||
slightly dependent | -7.85 | -12.86 – -2.83 | .002 | 1.11 | 0.09 – 2.13 | .033 | ||
moderately dependent | -19.49 | -24.42 – -14.57 | <.001 | 2.37 | 1.37 – 3.37 | <.001 | ||
severely dependent | -56.87 | -62.12 – -51.63 | <.001 | 3.92 | 2.86 – 4.99 | <.001 | ||
Observations | 821 | 832 | ||||||
R2 / adj. R2 | .653 / .650 | .160 / .152 |
With remove.estimates
, specific estimates can be removed from the table output. This may make sense in case you have stepwise regression models and only want to compare the varying predictors but not the controls. remove.estimates
either accepts the row indices of the rows of the table output that should be removed, or the coefficient’s names.
When using numeric indices, the estimates’ index number relates to the same order as coef(fit)
.
Note that automatic grouping of categorical predictors (argument group.pred
) does not yet work properly when removing estimates! See also Example 6 for further details.
data(efc)
# make education categorical
efc$c172code <- to_factor(efc$c172code)
# make education categorical
efc$e42dep <- to_factor(efc$e42dep)
# make prettier variable labels
set_label(efc$c172code) <- "Education"
set_label(efc$e42dep) <- "Dependency"
# fit first model
fit1 <- lm(neg_c_7 ~ c160age + c172code + c161sex, data = efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c172code + c161sex + c12hour, data = efc)
# fit third model
fit3 <- lm(neg_c_7 ~ c160age + c172code + e42dep + tot_sc_e, data = efc)
Here you have the complete table output. This helps you identify the row index numbers. Especially when you have multiple models with different predictors, the estimate’s position in the last model may differ from this estimate’s position in the table output.
sjt.lm(fit1, fit2, fit3, group.pred = FALSE)
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 8.40 | 6.72 – 10.08 | <.001 | 9.18 | 7.53 – 10.83 | <.001 | 8.48 | 6.99 – 9.97 | <.001 | |||
carer’ age | 0.04 | 0.02 – 0.06 | <.001 | 0.01 | -0.01 – 0.03 | .306 | 0.01 | -0.01 – 0.03 | .384 | |||
Education (intermediate level of education) | 0.16 | -0.52 – 0.83 | .652 | 0.12 | -0.54 – 0.78 | .728 | 0.08 | -0.56 – 0.72 | .806 | |||
Education (high level of education) | 0.79 | -0.05 – 1.64 | .066 | 0.91 | 0.09 – 1.74 | .030 | 0.52 | -0.28 – 1.32 | .203 | |||
carer’s gender | 0.70 | 0.09 – 1.32 | .025 | 0.59 | -0.01 – 1.19 | .053 | ||||||
average number of hours of care per week | 0.02 | 0.01 – 0.02 | <.001 | |||||||||
Dependency (slightly dependent) | 1.18 | 0.16 – 2.20 | .024 | |||||||||
Dependency (moderately dependent) | 2.53 | 1.53 – 3.52 | <.001 | |||||||||
Dependency (severely dependent) | 4.32 | 3.31 – 5.33 | <.001 | |||||||||
Services for elderly | 0.21 | 0.01 – 0.41 | .042 | |||||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .026 / .021 | .081 / .075 | .154 / .147 |
Note that currently the intercept cannot be removed from the model output. However, you can “fake” a removed intercept by setting the font size of the first row (with intercept) to zero via CSS = list(css.topcontentborder = "+font-size: 0px;")
.
sjt.lm(fit1, fit2, fit3, group.pred = FALSE, CSS = list(css.topcontentborder = "+font-size: 0px;"))
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 8.40 | 6.72 – 10.08 | <.001 | 9.18 | 7.53 – 10.83 | <.001 | 8.48 | 6.99 – 9.97 | <.001 | |||
carer’ age | 0.04 | 0.02 – 0.06 | <.001 | 0.01 | -0.01 – 0.03 | .306 | 0.01 | -0.01 – 0.03 | .384 | |||
Education (intermediate level of education) | 0.16 | -0.52 – 0.83 | .652 | 0.12 | -0.54 – 0.78 | .728 | 0.08 | -0.56 – 0.72 | .806 | |||
Education (high level of education) | 0.79 | -0.05 – 1.64 | .066 | 0.91 | 0.09 – 1.74 | .030 | 0.52 | -0.28 – 1.32 | .203 | |||
carer’s gender | 0.70 | 0.09 – 1.32 | .025 | 0.59 | -0.01 – 1.19 | .053 | ||||||
average number of hours of care per week | 0.02 | 0.01 – 0.02 | <.001 | |||||||||
Dependency (slightly dependent) | 1.18 | 0.16 – 2.20 | .024 | |||||||||
Dependency (moderately dependent) | 2.53 | 1.53 – 3.52 | <.001 | |||||||||
Dependency (severely dependent) | 4.32 | 3.31 – 5.33 | <.001 | |||||||||
Services for elderly | 0.21 | 0.01 – 0.41 | .042 | |||||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .026 / .021 | .081 / .075 | .154 / .147 |
sjt.lm(fit1, fit2, fit3, group.pred = FALSE, remove.estimates = 2)
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 8.40 | 6.72 – 10.08 | <.001 | 9.18 | 7.53 – 10.83 | <.001 | 8.48 | 6.99 – 9.97 | <.001 | |||
Education (intermediate level of education) | 0.16 | -0.52 – 0.83 | .652 | 0.12 | -0.54 – 0.78 | .728 | 0.08 | -0.56 – 0.72 | .806 | |||
Education (high level of education) | 0.79 | -0.05 – 1.64 | .066 | 0.91 | 0.09 – 1.74 | .030 | 0.52 | -0.28 – 1.32 | .203 | |||
carer’s gender | 0.70 | 0.09 – 1.32 | .025 | 0.59 | -0.01 – 1.19 | .053 | – | |||||
average number of hours of care per week | – | 0.02 | 0.01 – 0.02 | <.001 | – | |||||||
Dependency (slightly dependent) | – | – | 1.18 | 0.16 – 2.20 | .024 | |||||||
Dependency (moderately dependent) | – | – | 2.53 | 1.53 – 3.52 | <.001 | |||||||
Dependency (severely dependent) | – | – | 4.32 | 3.31 – 5.33 | <.001 | |||||||
Services for elderly | – | – | 0.21 | 0.01 – 0.41 | .042 | |||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .026 / .021 | .081 / .075 | .154 / .147 |
sjt.lm(fit1, fit2, fit3, group.pred = FALSE, remove.estimates = c("c160age", "c161sex"))
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 8.40 | 6.72 – 10.08 | <.001 | 9.18 | 7.53 – 10.83 | <.001 | 8.48 | 6.99 – 9.97 | <.001 | |||
Education (intermediate level of education) | 0.16 | -0.52 – 0.83 | .652 | 0.12 | -0.54 – 0.78 | .728 | 0.08 | -0.56 – 0.72 | .806 | |||
Education (high level of education) | 0.79 | -0.05 – 1.64 | .066 | 0.91 | 0.09 – 1.74 | .030 | 0.52 | -0.28 – 1.32 | .203 | |||
average number of hours of care per week | – | 0.02 | 0.01 – 0.02 | <.001 | – | |||||||
Dependency (slightly dependent) | – | – | 1.18 | 0.16 – 2.20 | .024 | |||||||
Dependency (moderately dependent) | – | – | 2.53 | 1.53 – 3.52 | <.001 | |||||||
Dependency (severely dependent) | – | – | 4.32 | 3.31 – 5.33 | <.001 | |||||||
Services for elderly | – | – | 0.21 | 0.01 – 0.41 | .042 | |||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .026 / .021 | .081 / .075 | .154 / .147 |
sjt.lm(fit1, fit2, fit3, group.pred = FALSE, remove.estimates = c(2,5,6,10))
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 8.40 | 6.72 – 10.08 | <.001 | 9.18 | 7.53 – 10.83 | <.001 | 8.48 | 6.99 – 9.97 | <.001 | |||
Education (intermediate level of education) | 0.16 | -0.52 – 0.83 | .652 | 0.12 | -0.54 – 0.78 | .728 | 0.08 | -0.56 – 0.72 | .806 | |||
Education (high level of education) | 0.79 | -0.05 – 1.64 | .066 | 0.91 | 0.09 – 1.74 | .030 | 0.52 | -0.28 – 1.32 | .203 | |||
Dependency (slightly dependent) | – | – | 1.18 | 0.16 – 2.20 | .024 | |||||||
Dependency (moderately dependent) | – | – | 2.53 | 1.53 – 3.52 | <.001 | |||||||
Dependency (severely dependent) | – | – | 4.32 | 3.31 – 5.33 | <.001 | |||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .026 / .021 | .081 / .075 | .154 / .147 |
In most cases you need to define your own labels when removing estimates, especially when you have grouped categorical predictors, because automatic label detection is quite tricky in such situations. If you provide own labels, please note that grouped predictors’ headings (the variable name of the grouped, categorical variable) are still automatically set by the sjt.lm()
function (variable labels are used, so use set_label()
for those categorical predictors). All data rows in the table, i.e. for each coefficient appearing in the model, you need to specify a label string.
In the next example, we have seven table rows with data (excluding intercept): mid and hi education (categories of the variable Education), Hours of Care, slight, moderate and severe dependency (categories of the variable Dependency) and Service Usage. These ‘rows’ need to be labelled.
sjt.lm(fit1, fit2, fit3,
pred.labels = c("mid education", "hi education", "Hours of Care",
"slight dependency", "moderate dependency",
"severe dependency", "Service Usage"),
remove.estimates = c("c160age", "c161sex"))
Negative impact with 7 items | Negative impact with 7 items | Negative impact with 7 items | ||||||||||
B | CI | p | B | CI | p | B | CI | p | ||||
(Intercept) | 8.40 | 6.72 – 10.08 | <.001 | 9.18 | 7.53 – 10.83 | <.001 | 8.48 | 6.99 – 9.97 | <.001 | |||
Education | ||||||||||||
mid education | 0.16 | -0.52 – 0.83 | .652 | 0.12 | -0.54 – 0.78 | .728 | 0.08 | -0.56 – 0.72 | .806 | |||
hi education | 0.79 | -0.05 – 1.64 | .066 | 0.91 | 0.09 – 1.74 | .030 | 0.52 | -0.28 – 1.32 | .203 | |||
Hours of Care | – | 0.02 | 0.01 – 0.02 | <.001 | – | |||||||
Dependency | ||||||||||||
slight dependency | – | – | 1.18 | 0.16 – 2.20 | .024 | |||||||
moderate dependency | – | – | 2.53 | 1.53 – 3.52 | <.001 | |||||||
severe dependency | – | – | 4.32 | 3.31 – 5.33 | <.001 | |||||||
Service Usage | – | – | 0.21 | 0.01 – 0.41 | .042 | |||||||
Observations | 832 | 832 | 833 | |||||||||
R2 / adj. R2 | .026 / .021 | .081 / .075 | .154 / .147 |