Summary of Linear Regression Models as HTML Table

Daniel Lüdecke

2018-02-05

This document shows examples for using the sjt.lm() function of the sjPlot package.

# load package
library(sjPlot)
library(sjmisc)
library(sjlabelled)

# sample data
data(efc)

Linear model summaries as HTML table

The sjt.lm() function prints summaries of linear models (fitted with the lm() function) as nicely formatted html-tables.

Before starting, sample data is loaded and sample models are fitted:

# fit first model
fit1 <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
# fit second model 
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c161sex + c172code, data = efc)
# Note that both models share the same predictors and only differ 
# in their dependent variable. See examples of stepwise models 
# later...

The simplest way of producing the table output is by passing the fitted models as parameter. By default, estimates (B), confidence intervals (CI) and p-values (p) are reported. The models are named Model 1 and Model 2.

sjt.lm(fit1, fit2)
    Total score BARTHEL INDEX   Negative impact with 7 items
    B CI p   B CI p
(Intercept)   90.06 77.95 – 102.18 <.001   8.46 6.67 – 10.24 <.001
carer’ age   -0.22 -0.36 – -0.08 .002   0.01 -0.01 – 0.03 .206
average number of hours of care per week   -0.28 -0.31 – -0.24 <.001   0.02 0.01 – 0.02 <.001
carer’s gender   -0.26 -4.36 – 3.83 .900   0.57 -0.03 – 1.17 .061
carer’s level of education   -0.76 -3.55 – 2.02 .592   0.44 0.03 – 0.86 .034
Observations   821   832
R2 / adj. R2   .270 / .266   .079 / .075

Custom labels

You can specify the ‘model’ label via depvar.labels parameter:

sjt.lm(fit1, fit2, depvar.labels = c("Barthel-Index", "Negative Impact"))
    Barthel-Index   Negative Impact
    B CI p   B CI p
(Intercept)   90.06 77.95 – 102.18 <.001   8.46 6.67 – 10.24 <.001
carer’ age   -0.22 -0.36 – -0.08 .002   0.01 -0.01 – 0.03 .206
average number of hours of care per week   -0.28 -0.31 – -0.24 <.001   0.02 0.01 – 0.02 <.001
carer’s gender   -0.26 -4.36 – 3.83 .900   0.57 -0.03 – 1.17 .061
carer’s level of education   -0.76 -3.55 – 2.02 .592   0.44 0.03 – 0.86 .034
Observations   821   832
R2 / adj. R2   .270 / .266   .079 / .075

More custom labels

Here is an example how to change the other labels. Note that show.header makes the two labels on top and top left corner appear in the table.

sjt.lm(
  fit1,
  fit2,
  show.header = TRUE,
  string.est = "Estimate",
  string.ci = "Conf. Int.",
  string.p = "p-value",
  string.dv = "Response",
  string.pred = "Coefficients",
  string.interc = "Konstante",
  depvar.labels = c("Barthel-Index", "Negative Impact")
)
Coefficients Response
  Barthel-Index   Negative Impact
    Estimate Conf. Int. p-value   Estimate Conf. Int. p-value
Konstante   90.06 77.95 – 102.18 <.001   8.46 6.67 – 10.24 <.001
carer’ age   -0.22 -0.36 – -0.08 .002   0.01 -0.01 – 0.03 .206
average number of hours of care per week   -0.28 -0.31 – -0.24 <.001   0.02 0.01 – 0.02 <.001
carer’s gender   -0.26 -4.36 – 3.83 .900   0.57 -0.03 – 1.17 .061
carer’s level of education   -0.76 -3.55 – 2.02 .592   0.44 0.03 – 0.86 .034
Observations   821   832
R2 / adj. R2   .270 / .266   .079 / .075

Changing summary style and content

You can change the table style with specific parameters, e.g. to include CI into the same table cell as the estimates, print asterisks instead of numeric p-values etc.

sjt.lm(
  fit1,
  fit2,
  separate.ci.col = FALSE, # ci in same cell as estimates
  show.std = TRUE,         # also show standardized beta values
  p.numeric = FALSE        # "*" instead of numeric values
)       
    Total score BARTHEL INDEX   Negative impact with 7 items
    B (CI) std. Beta (CI)   B (CI) std. Beta (CI)
(Intercept)   90.06
(77.95 – 102.18) ***
    8.46
(6.67 – 10.24) ***
 
carer’ age   -0.22
(-0.36 – -0.08) **
-0.10
(-0.16 – -0.04)
  0.01
(-0.01 – 0.03) 
0.05
(-0.03 – 0.12)
average number of hours of care per week   -0.28
(-0.31 – -0.24) ***
-0.48
(-0.54 – -0.41)
  0.02
(0.01 – 0.02) ***
0.25
(0.18 – 0.32)
carer’s gender   -0.26
(-4.36 – 3.83) 
-0.00
(-0.06 – 0.06)
  0.57
(-0.03 – 1.17) 
0.06
(-0.00 – 0.13)
carer’s level of education   -0.76
(-3.55 – 2.02) 
-0.02
(-0.08 – 0.04)
  0.44
(0.03 – 0.86) *
0.07
(0.01 – 0.14)
Observations   821   832
R2 / adj. R2   .270 / .266   .079 / .075
Notes * p<.05   ** p<.01   *** p<.001

Custom variable labels

In the above example, the original variable labels are long and not so pretty. You can change variable labels either with sjmisc::set_label(), which will affect all future plots and tables, or pass own labels via pred.labels.

sjt.lm(fit1, fit2, pred.labels = c("Carer's Age", "Hours of Care", "Carer's Sex", "Educational Status"))
    Total score BARTHEL INDEX   Negative impact with 7 items
    B CI p   B CI p
(Intercept)   90.06 77.95 – 102.18 <.001   8.46 6.67 – 10.24 <.001
Carer’s Age   -0.22 -0.36 – -0.08 .002   0.01 -0.01 – 0.03 .206
Hours of Care   -0.28 -0.31 – -0.24 <.001   0.02 0.01 – 0.02 <.001
Carer’s Sex   -0.26 -4.36 – 3.83 .900   0.57 -0.03 – 1.17 .061
Educational Status   -0.76 -3.55 – 2.02 .592   0.44 0.03 – 0.86 .034
Observations   821   832
R2 / adj. R2   .270 / .266   .079 / .075

Compare models with different predictors

In some cases, for instance stepwise regressions, you have different predictors on the same response. The proper grouping of predictors, resp. rows, is done automatically.

First, let’s fit some example models.

# fit first model
fit1 <- lm(neg_c_7 ~ c160age + c172code + c161sex, data = efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c172code + c161sex + c12hour, data = efc)
# fit second model
fit3 <- lm(neg_c_7 ~ c160age + c172code + e42dep + tot_sc_e, data = efc)

Note that printing tables with fitted models, which have different predictors do not automatically detect variable labels (maybe this will be implemented in a future package version).

sjt.lm(fit1, fit2, fit3, 
       separate.ci.col = FALSE,
       show.aic = TRUE,
       show.fstat = TRUE)
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B (CI) p   B (CI) p   B (CI) p
(Intercept)   7.82
(6.00 – 9.65)
<.001   8.46
(6.67 – 10.24)
<.001   6.23
(4.76 – 7.69)
<.001
carer’ age   0.04
(0.02 – 0.06)
<.001   0.01
(-0.01 – 0.03)
.206   0.01
(-0.01 – 0.03)
.271
carer’s level of education   0.39
(-0.03 – 0.81)
.071   0.44
(0.03 – 0.86)
.034   0.24
(-0.15 – 0.64)
.230
carer’s gender   0.69
(0.07 – 1.31)
.028   0.57
(-0.03 – 1.17)
.061  
average number of hours of care per week     0.02
(0.01 – 0.02)
<.001  
elder’s dependency       1.50
(1.23 – 1.77)
<.001
Services for elderly       0.21
(0.01 – 0.41)
.038
Observations   832   832   833
R2 / adj. R2   .025 / .022   .079 / .075   .153 / .148
F-statistics   7.107***   17.730***   37.250***
AIC   4611.921   4566.622   4502.333

More space bewteen model columns

Especially when fitting and summarizing some more models, it might help to increase the distance between the columns that separate the models. This can be done by tweaking the css.separatorcol style-sheet:

sjt.lm(fit1, fit2, fit3, CSS = list(css.separatorcol = 'padding-right:1.5em; padding-left:1.5em;'))
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   7.82 6.00 – 9.65 <.001   8.46 6.67 – 10.24 <.001   6.23 4.76 – 7.69 <.001
carer’ age   0.04 0.02 – 0.06 <.001   0.01 -0.01 – 0.03 .206   0.01 -0.01 – 0.03 .271
carer’s level of education   0.39 -0.03 – 0.81 .071   0.44 0.03 – 0.86 .034   0.24 -0.15 – 0.64 .230
carer’s gender   0.69 0.07 – 1.31 .028   0.57 -0.03 – 1.17 .061    
average number of hours of care per week       0.02 0.01 – 0.02 <.001    
elder’s dependency           1.50 1.23 – 1.77 <.001
Services for elderly           0.21 0.01 – 0.41 .038
Observations   832   832   833
R2 / adj. R2   .025 / .022   .079 / .075   .153 / .148

Automatic grouping of categorical predictors

In case you have categorical variables with more than two factor levels, the sjt.lm() function automatically groups the category levels to give a better overview of predictors in the table.

By default, automatic grouping is activated. To disable this feature, use group.pred = FALSE as parameter.

To demonstrate this feature, we first convert two predictors to factors (what they actually are, indeed). To do this, we use the sjmisc::to_factor() function, which converts numerical variables into factors, however, does not remove the variable and value label attributes.

# make education categorical
efc$c172code <- to_factor(efc$c172code)
# make dependency categorical
efc$e42dep <- to_factor(efc$e42dep)
# fit first model again (with c172code as factor)
fit1 <- lm(barthtot ~ c160age + c12hour + c172code + c161sex + e42dep, data = efc)
# fit second model again (with c172code as factor)
fit2 <- lm(neg_c_7 ~ c160age + c12hour + c172code + c161sex + e42dep, data = efc)

Now we can print the table.

sjt.lm(fit1, fit2)
    Total score BARTHEL INDEX   Negative impact with 7 items
    B CI p   B CI p
(Intercept)   97.17 88.37 – 105.97 <.001   7.76 5.97 – 9.55 <.001
carer’ age   -0.06 -0.16 – 0.03 .203   0.00 -0.02 – 0.02 .683
average number of hours of care per week   -0.07 -0.10 – -0.04 <.001   0.01 0.00 – 0.01 .015
carer’s level of education
intermediate level of education   1.50 -1.60 – 4.60 .343   0.13 -0.50 – 0.76 .689
high level of education   0.66 -3.20 – 4.52 .738   0.72 -0.07 – 1.51 .074
carer’s gender   0.09 -2.74 – 2.93 .949   0.56 -0.02 – 1.13 .058
elder’s dependency
slightly dependent   -7.85 -12.86 – -2.83 .002   1.11 0.09 – 2.13 .033
moderately dependent   -19.49 -24.42 – -14.57 <.001   2.37 1.37 – 3.37 <.001
severely dependent   -56.87 -62.12 – -51.63 <.001   3.92 2.86 – 4.99 <.001
Observations   821   832
R2 / adj. R2   .653 / .650   .160 / .152

Removing estimates from the output

With remove.estimates, specific estimates can be removed from the table output. This may make sense in case you have stepwise regression models and only want to compare the varying predictors but not the controls. remove.estimates either accepts the row indices of the rows of the table output that should be removed, or the coefficient’s names.

When using numeric indices, the estimates’ index number relates to the same order as coef(fit).

Note that automatic grouping of categorical predictors (argument group.pred) does not yet work properly when removing estimates! See also Example 6 for further details.

data(efc)
# make education categorical
efc$c172code <- to_factor(efc$c172code)
# make education categorical
efc$e42dep <- to_factor(efc$e42dep)
# make prettier variable labels
set_label(efc$c172code) <- "Education"
set_label(efc$e42dep) <- "Dependency"
# fit first model
fit1 <- lm(neg_c_7 ~ c160age + c172code + c161sex, data = efc)
# fit second model
fit2 <- lm(neg_c_7 ~ c160age + c172code + c161sex + c12hour, data = efc)
# fit third model
fit3 <- lm(neg_c_7 ~ c160age + c172code + e42dep + tot_sc_e, data = efc)

Example 1: Complete table output

Here you have the complete table output. This helps you identify the row index numbers. Especially when you have multiple models with different predictors, the estimate’s position in the last model may differ from this estimate’s position in the table output.

sjt.lm(fit1, fit2, fit3, group.pred = FALSE)
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   8.40 6.72 – 10.08 <.001   9.18 7.53 – 10.83 <.001   8.48 6.99 – 9.97 <.001
carer’ age   0.04 0.02 – 0.06 <.001   0.01 -0.01 – 0.03 .306   0.01 -0.01 – 0.03 .384
Education (intermediate level of education)   0.16 -0.52 – 0.83 .652   0.12 -0.54 – 0.78 .728   0.08 -0.56 – 0.72 .806
Education (high level of education)   0.79 -0.05 – 1.64 .066   0.91 0.09 – 1.74 .030   0.52 -0.28 – 1.32 .203
carer’s gender   0.70 0.09 – 1.32 .025   0.59 -0.01 – 1.19 .053    
average number of hours of care per week       0.02 0.01 – 0.02 <.001    
Dependency (slightly dependent)           1.18 0.16 – 2.20 .024
Dependency (moderately dependent)           2.53 1.53 – 3.52 <.001
Dependency (severely dependent)           4.32 3.31 – 5.33 <.001
Services for elderly           0.21 0.01 – 0.41 .042
Observations   832   832   833
R2 / adj. R2   .026 / .021   .081 / .075   .154 / .147

Example 2: Removing the intercept

Note that currently the intercept cannot be removed from the model output. However, you can “fake” a removed intercept by setting the font size of the first row (with intercept) to zero via CSS = list(css.topcontentborder = "+font-size: 0px;").

sjt.lm(fit1, fit2, fit3, group.pred = FALSE, CSS = list(css.topcontentborder = "+font-size: 0px;"))
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   8.40 6.72 – 10.08 <.001   9.18 7.53 – 10.83 <.001   8.48 6.99 – 9.97 <.001
carer’ age   0.04 0.02 – 0.06 <.001   0.01 -0.01 – 0.03 .306   0.01 -0.01 – 0.03 .384
Education (intermediate level of education)   0.16 -0.52 – 0.83 .652   0.12 -0.54 – 0.78 .728   0.08 -0.56 – 0.72 .806
Education (high level of education)   0.79 -0.05 – 1.64 .066   0.91 0.09 – 1.74 .030   0.52 -0.28 – 1.32 .203
carer’s gender   0.70 0.09 – 1.32 .025   0.59 -0.01 – 1.19 .053    
average number of hours of care per week       0.02 0.01 – 0.02 <.001    
Dependency (slightly dependent)           1.18 0.16 – 2.20 .024
Dependency (moderately dependent)           2.53 1.53 – 3.52 <.001
Dependency (severely dependent)           4.32 3.31 – 5.33 <.001
Services for elderly           0.21 0.01 – 0.41 .042
Observations   832   832   833
R2 / adj. R2   .026 / .021   .081 / .075   .154 / .147

Example 3: Remove first coefficient (after intercept)

sjt.lm(fit1, fit2, fit3, group.pred = FALSE, remove.estimates = 2)
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   8.40 6.72 – 10.08 <.001   9.18 7.53 – 10.83 <.001   8.48 6.99 – 9.97 <.001
Education (intermediate level of education)   0.16 -0.52 – 0.83 .652   0.12 -0.54 – 0.78 .728   0.08 -0.56 – 0.72 .806
Education (high level of education)   0.79 -0.05 – 1.64 .066   0.91 0.09 – 1.74 .030   0.52 -0.28 – 1.32 .203
carer’s gender   0.70 0.09 – 1.32 .025   0.59 -0.01 – 1.19 .053    – 
average number of hours of care per week    –    0.02 0.01 – 0.02 <.001    – 
Dependency (slightly dependent)    –     –    1.18 0.16 – 2.20 .024
Dependency (moderately dependent)    –     –    2.53 1.53 – 3.52 <.001
Dependency (severely dependent)    –     –    4.32 3.31 – 5.33 <.001
Services for elderly    –     –    0.21 0.01 – 0.41 .042
Observations   832   832   833
R2 / adj. R2   .026 / .021   .081 / .075   .154 / .147

Example 4: Remove age and sex

sjt.lm(fit1, fit2, fit3, group.pred = FALSE, remove.estimates = c("c160age", "c161sex"))
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   8.40 6.72 – 10.08 <.001   9.18 7.53 – 10.83 <.001   8.48 6.99 – 9.97 <.001
Education (intermediate level of education)   0.16 -0.52 – 0.83 .652   0.12 -0.54 – 0.78 .728   0.08 -0.56 – 0.72 .806
Education (high level of education)   0.79 -0.05 – 1.64 .066   0.91 0.09 – 1.74 .030   0.52 -0.28 – 1.32 .203
average number of hours of care per week    –    0.02 0.01 – 0.02 <.001    – 
Dependency (slightly dependent)    –     –    1.18 0.16 – 2.20 .024
Dependency (moderately dependent)    –     –    2.53 1.53 – 3.52 <.001
Dependency (severely dependent)    –     –    4.32 3.31 – 5.33 <.001
Services for elderly    –     –    0.21 0.01 – 0.41 .042
Observations   832   832   833
R2 / adj. R2   .026 / .021   .081 / .075   .154 / .147

Example 5: Remove many esimates

sjt.lm(fit1, fit2, fit3, group.pred = FALSE, remove.estimates = c(2,5,6,10))
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   8.40 6.72 – 10.08 <.001   9.18 7.53 – 10.83 <.001   8.48 6.99 – 9.97 <.001
Education (intermediate level of education)   0.16 -0.52 – 0.83 .652   0.12 -0.54 – 0.78 .728   0.08 -0.56 – 0.72 .806
Education (high level of education)   0.79 -0.05 – 1.64 .066   0.91 0.09 – 1.74 .030   0.52 -0.28 – 1.32 .203
Dependency (slightly dependent)    –     –    1.18 0.16 – 2.20 .024
Dependency (moderately dependent)    –     –    2.53 1.53 – 3.52 <.001
Dependency (severely dependent)    –     –    4.32 3.31 – 5.33 <.001
Observations   832   832   833
R2 / adj. R2   .026 / .021   .081 / .075   .154 / .147

Example 6: Custom predictor labels

In most cases you need to define your own labels when removing estimates, especially when you have grouped categorical predictors, because automatic label detection is quite tricky in such situations. If you provide own labels, please note that grouped predictors’ headings (the variable name of the grouped, categorical variable) are still automatically set by the sjt.lm() function (variable labels are used, so use set_label() for those categorical predictors). All data rows in the table, i.e. for each coefficient appearing in the model, you need to specify a label string.

In the next example, we have seven table rows with data (excluding intercept): mid and hi education (categories of the variable Education), Hours of Care, slight, moderate and severe dependency (categories of the variable Dependency) and Service Usage. These ‘rows’ need to be labelled.

sjt.lm(fit1, fit2, fit3,
       pred.labels = c("mid education", "hi education", "Hours of Care", 
                       "slight dependency", "moderate dependency", 
                       "severe dependency", "Service Usage"),
       remove.estimates = c("c160age", "c161sex"))
    Negative impact with 7 items   Negative impact with 7 items   Negative impact with 7 items
    B CI p   B CI p   B CI p
(Intercept)   8.40 6.72 – 10.08 <.001   9.18 7.53 – 10.83 <.001   8.48 6.99 – 9.97 <.001
Education
mid education   0.16 -0.52 – 0.83 .652   0.12 -0.54 – 0.78 .728   0.08 -0.56 – 0.72 .806
hi education   0.79 -0.05 – 1.64 .066   0.91 0.09 – 1.74 .030   0.52 -0.28 – 1.32 .203
Hours of Care    –    0.02 0.01 – 0.02 <.001    – 
Dependency
slight dependency    –     –    1.18 0.16 – 2.20 .024
moderate dependency    –     –    2.53 1.53 – 3.52 <.001
severe dependency    –     –    4.32 3.31 – 5.33 <.001
Service Usage    –     –    0.21 0.01 – 0.41 .042
Observations   832   832   833
R2 / adj. R2   .026 / .021   .081 / .075   .154 / .147