Utilities and options for emmeans

Russ Lenth

2018-01-09

Contents

  1. Updating an emmGrid object
  2. Setting options and defaults
  3. Combining and subsetting emmGrid objects
  4. Adding grouping factors

Updating an emmGrid object

Several internal settings are saved when functions like ref_grid(), emmeans(), contrast(), etc. are run. Those settings can be manipulated via the update() method for emmGrids. To illustrate, consider the pigs dataset and model yet again:

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm.s <- emmeans(pigs.lm, "source")
pigs.emm.s
##  source   emmean         SE df lower.CL upper.CL
##  fish   3.394492 0.03668122 23 3.318612 3.470373
##  soy    3.667260 0.03744798 23 3.589793 3.744727
##  skim   3.796770 0.03938283 23 3.715300 3.878240
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

We see confidence intervals but not tests, by default. This happens as a result of internal settings in pigs.emm.s that are passed to summary() when the object is displayed. If we are going to work with this object a lot, we might want to change its internal settings rather than having to rely on explicitly calling summary() with several arguments. If so, just update the internal settings to what is desired; for example:

pigs.emm.s <- update(pigs.emm.s, infer = c(TRUE, TRUE), null = log(35))
pigs.emm.s
##  source   emmean         SE df lower.CL upper.CL     null t.ratio p.value
##  fish   3.394492 0.03668122 23 3.318612 3.470373 3.555348  -4.385  0.0002
##  soy    3.667260 0.03744798 23 3.589793 3.744727 3.555348   2.988  0.0066
##  skim   3.796770 0.03938283 23 3.715300 3.878240 3.555348   6.130  <.0001
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

See help("update.emmGrid") for details on the keywords that can be changed. Mostly, they are the same as the names of arguments in the functions that construct these objects.

Of course, we can always get what we want via calls to test(), confint() or summary() with appropriate arguments. But the update() function is more useful in sophisticated manipulations of objects, or when it is called implicitly via the options argument in emmeans() and other functions. Those options are passed to update() just before the object is returned. For example, we could have done the above update within the emmeans() call as follows:

emmeans(pigs.lm, "source", options = list(infer = c(TRUE, TRUE), null = log(35)))
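For comparison, here is a minimal sketch of the explicit route mentioned above, where the arguments are given in each call rather than stored in the object. The object names emm, both, tests and ints are ours, chosen only for illustration.

```r
library(emmeans)

# Same model as above; the pigs dataset ships with emmeans
pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
emm <- emmeans(pigs.lm, "source")

# Intervals and tests together, requested in the call itself
both <- summary(emm, infer = c(TRUE, TRUE), null = log(35))

# Tests only, against the same null value
tests <- test(emm, null = log(35))

# Intervals only
ints <- confint(emm, level = 0.95)

both
```

None of these calls changes emm itself, so the arguments have to be repeated each time they are needed.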

Setting options and defaults

Speaking of the options argument, note that the default in emmeans() is options = get_emm_option("emmeans"). Let’s see what that is:

get_emm_option("emmeans")
## $infer
## [1]  TRUE FALSE

So, by default, confidence intervals, but not tests, are displayed when the result is summarized. The reverse is true for results of contrast() (and therefore also for pairs(), which calls contrast()):

get_emm_option("contrast")
## $infer
## [1] FALSE  TRUE

There are also defaults for a newly constructed reference grid:

get_emm_option("ref_grid")
## $is.new.rg
## [1] TRUE
## 
## $infer
## [1] FALSE FALSE

The default is to display neither intervals nor tests when summarizing. In addition, the flag is.new.rg is set to TRUE, and that is why one sees a str() listing rather than a summary as the default when the object is simply shown by typing its name at the console.

The user may have other preferences. She may want to see both intervals and tests whenever contrasts are produced; and perhaps she also wants to always default to the response scale when transformations or links are present. We can change the defaults by setting the corresponding options; and that is done via the emm_options() function:

emm_options(emmeans = list(type = "response"),
            contrast = list(infer = c(TRUE, TRUE)))

Now, new emmeans() results and contrasts follow the new defaults:

pigs.anal.p <- emmeans(pigs.lm, consec ~ percent)
pigs.anal.p
## $emmeans
##  percent response       SE df lower.CL upper.CL
##        9 31.35290 1.281961 23 28.81003 34.12023
##       12 37.51952 1.439849 23 34.65613 40.61950
##       15 38.96664 1.704010 23 35.59637 42.65601
##       18 42.31561 2.241048 23 37.92458 47.21506
## 
## Results are averaged over the levels of: source 
## Confidence level used: 0.95 
## Intervals are back-transformed from the log scale 
## 
## $contrasts
##  contrast    ratio         SE df  lower.CL upper.CL t.ratio p.value
##  12 / 9   1.196684 0.06710564 23 1.0379600 1.379680   3.202  0.0113
##  15 / 12  1.038570 0.06042501 23 0.8960192 1.203799   0.650  0.8613
##  18 / 15  1.085945 0.07499759 23 0.9113765 1.293950   1.194  0.5199
## 
## Results are averaged over the levels of: source 
## Confidence level used: 0.95 
## Conf-level adjustment: mvt method for 3 estimates 
## Intervals are back-transformed from the log scale 
## P value adjustment: mvt method for 3 tests 
## Tests are performed on the log scale

Observe that the contrasts “inherited” the type = "response" default from the EMMs.

NOTE: Setting the above options does not change how existing emmGrid objects are displayed; it only affects ones constructed in the future.

There is one more option – summary – that overrides all other display defaults for both existing and future objects. For example, specifying emm_options(summary = list(infer = c(TRUE, TRUE))) will result in both intervals and tests being displayed, regardless of their internal defaults, unless infer is explicitly specified in a call to summary().
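The following is a small sketch of the summary option in action, under the assumption that it is run in a session where we are free to change the package options. The names old.opts, emm, s.default and s.explicit are illustrative only; the existing options are saved first and restored at the end.

```r
library(emmeans)

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)

# Save whatever emmeans options are currently set (NULL if none)
old.opts <- getOption("emmeans")

# This object is constructed BEFORE the summary option is set
emm <- emmeans(pigs.lm, "source")

emm_options(summary = list(infer = c(TRUE, TRUE)))

# The existing object now shows both intervals and tests
s.default <- summary(emm)

# An explicit infer argument still takes precedence
s.explicit <- summary(emm, infer = c(TRUE, FALSE))

# Put the options back the way they were
options(emmeans = old.opts)

s.default
```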

To temporarily revert to factory defaults in a single call to emmeans() or contrast() or pairs(), specify options = NULL in the call. To reset everything to factory defaults (which we do presently), null-out all of the emmeans package options:

options(emmeans = NULL)
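As a sketch of the temporary override just described, the code below sets a non-default option, then compares a call that uses it with one that specifies options = NULL. The names old.opts, emm.custom and emm.factory are ours, and the saved options are restored at the end.

```r
library(emmeans)

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)

# Save the current emmeans options so they can be restored afterwards
old.opts <- getOption("emmeans")

# Ask for response-scale results from emmeans() by default
emm_options(emmeans = list(type = "response"))

emm.custom  <- emmeans(pigs.lm, "source")                  # uses the new default
emm.factory <- emmeans(pigs.lm, "source", options = NULL)  # ignores it, this call only

# Restore the saved options
options(emmeans = old.opts)

names(summary(emm.custom))
names(summary(emm.factory))
```

The first summary should report a response column and the second an emmean column on the log scale.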

Combining and subsetting emmGrid objects

Two or more emmGrid objects may be combined using the rbind() or + methods. The most common reason (or perhaps the only good reason) to do this is to combine EMMs or contrasts into one family for purposes of applying a multiplicity adjustment to tests or intervals. A user may want to combine the three pairwise comparisons of sources with the three comparisons above of consecutive percents into a single family of six tests with a suitable multiplicity adjustment. This is done quite simply:

rbind(pairs(pigs.emm.s), pigs.anal.p[[2]])
##  contrast       estimate         SE df t.ratio p.value
##  fish - soy  -0.27276784 0.05293450 23  -5.153  0.0002
##  fish - skim -0.40227768 0.05415929 23  -7.428  <.0001
##  soy - skim  -0.12950983 0.05304280 23  -2.442  0.1364
##  12 - 9       0.17955450 0.05607632 23   3.202  0.0238
##  15 - 12      0.03784451 0.05818098 23   0.650  1.0000
##  18 - 15      0.08245025 0.06906208 23   1.194  1.0000
## 
## Results are averaged over some or all of the levels of: percent, source 
## Results are given on the log (not the response) scale. 
## P value adjustment: bonferroni method for 6 tests

The default adjustment is "bonferroni"; we could have specified something different via the adjust argument. An equivalent way to combine emmGrids is via the addition operator, in which case any desired options may be set afterward using update(). Below, we combine the same results into a family but ask for the “exact” multiplicity adjustment.

update(pigs.anal.p[[2]] + pairs(pigs.emm.s), adjust = "mvt")
##  contrast        ratio         SE df  lower.CL  upper.CL t.ratio p.value
##  12 / 9      1.1966841 0.06710564 23 1.0215876 1.4017916   3.202  0.0213
##  15 / 12     1.0385697 0.06042501 23 0.8813596 1.2238218   0.650  0.9681
##  18 / 15     1.0859446 0.07499759 23 0.8937042 1.3195371   1.194  0.7305
##  fish / soy  0.7612695 0.04029742 23 0.6556677 0.8838795  -5.153  0.0002
##  fish / skim 0.6687950 0.03622146 23 0.5740343 0.7791987  -7.428  <.0001
##  soy / skim  0.8785259 0.04659947 23 0.7564275 1.0203329  -2.442  0.1111
## 
## Results are averaged over some or all of the levels of: source, percent 
## Confidence level used: 0.95 
## Conf-level adjustment: mvt method for 6 estimates 
## Intervals are back-transformed from the log scale 
## P value adjustment: mvt method for 6 tests 
## Tests are performed on the log scale

Also evident in comparing these results is that settings are obtained from the first object combined. So in the second output, where the two objects are combined in reverse order, we get both confidence intervals and tests, and transformation to the response scale.
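As noted earlier, the adjust argument of rbind() can be given directly instead of calling update() afterward. Here is a brief sketch using the Holm adjustment; the choice of "holm" and the names emm.s, emm.p and fam are ours, purely for illustration.

```r
library(emmeans)

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
emm.s <- emmeans(pigs.lm, "source")
emm.p <- emmeans(pigs.lm, "percent")

# One family of six tests: three pairwise comparisons of sources
# plus three comparisons of consecutive percents
fam <- rbind(pairs(emm.s), contrast(emm.p, "consec"), adjust = "holm")
fam
```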

To subset an emmGrid object, just use the subscripting operator []. For instance,

pigs.emm.s[2:3]
##  source  emmean         SE df lower.CL upper.CL     null t.ratio p.value
##  soy    3.66726 0.03744798 23 3.589793 3.744727 3.555348   2.988  0.0066
##  skim   3.79677 0.03938283 23 3.715300 3.878240 3.555348   6.130  <.0001
## 
## Results are averaged over the levels of: percent 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95

Adding grouping factors

Sometimes, users want to group levels of a factor into a smaller number of groups. Those groups may then be, say, averaged separately and compared, or used as a by factor. The add_grouping() function serves this purpose. The function takes four arguments: the object, the name of the grouping factor to be created, the name of the reference factor that is being grouped, and a vector of level names of the grouping factor corresponding to levels of the reference factor. Suppose for example that we want to distinguish animal and non-animal sources of protein in the pigs example:

pigs.emm.ss <- add_grouping(pigs.emm.s, "type", "source",
                            c("animal", "vegetable", "animal"))
str(pigs.emm.ss)
## 'emmGrid' object with variables:
##     source = fish, soy, skim
##     type = animal, vegetable
## Nesting structure:  source %in% type
## Transformation: "log"

Note that the new object has a nesting structure (see more about this in the “messy-data” vignette), with the reference factor nested in the new grouping factor. Now we can obtain means and comparisons for each group:

emmeans(pigs.emm.ss, pairwise ~ type)
## $emmeans
##  type        emmean         SE df lower.CL upper.CL
##  animal    3.595631 0.02673860 23 3.540318 3.650944
##  vegetable 3.667260 0.03744798 23 3.589793 3.744727
## 
## Results are averaged over the levels of: percent, source 
## Results are given on the log (not the response) scale. 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast            estimate        SE df t.ratio p.value
##  animal - vegetable -0.071629 0.0455466 23  -1.573  0.1295
## 
## Results are averaged over the levels of: percent, source 
## Results are given on the log (not the response) scale.
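The grouped object can also be summarized at the level of the reference factor. The sketch below asks for the source means from the grouped object; because source is nested in type, we would expect the grouping factor to be listed alongside each source, though we have not shown that output here. The name src.in.type is illustrative.

```r
library(emmeans)

pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm.s <- emmeans(pigs.lm, "source")

# Same grouping as above: fish and skim are animal sources, soy is vegetable
pigs.emm.ss <- add_grouping(pigs.emm.s, "type", "source",
                            c("animal", "vegetable", "animal"))

# Means for each source, taken from the grouped (nested) object
src.in.type <- emmeans(pigs.emm.ss, "source")
src.in.type
```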