Getting started with amt

Johannes Signer

2018-02-08

Basics

Creating a track

The basic building blocks of amt are tracks. Tracks are data_frames with at least two columns that contain the coordinates: x_ and y_. A track behaves exactly like a data_frame (the only difference being that we added another S3 class).

library(amt)
df1 <- data_frame(x = 1:3, y = 1:3)
is.data.frame(df1)
## [1] TRUE
df1
## # A tibble: 3 x 2
##       x     y
##   <int> <int>
## 1     1     1
## 2     2     2
## 3     3     3
# Now we can create a track
tr1 <- mk_track(df1, x, y)
is.data.frame(tr1)
## [1] TRUE
tr1
## # A tibble: 3 x 2
##      x_    y_
##   <int> <int>
## 1     1     1
## 2     2     2
## 3     3     3

At the moment there are two types of tracks: track_xy (coordinates only) and track_xyt (coordinates plus a time stamp).

Whether a track_xy or a track_xyt is created is determined by whether or not a time stamp is passed to the function mk_track. In the previous example we only passed x and y coordinates. Hence a track_xy was created.

class(tr1)
## [1] "track_xy"   "tbl_df"     "tbl"        "data.frame"

To create a track_xyt we could do the following:

df1 <- data_frame(x = 1:3, y = 1:3, t = lubridate::ymd("2017-01-01") + lubridate::days(0:2))
tr2 <- mk_track(df1, x, y, t)
## .t found, creating `track_xyt`.
class(tr2)
## [1] "track_xyt"  "track_xy"   "tbl_df"     "tbl"        "data.frame"

From the output above we see that a track_xyt is also a track_xy. This means that all methods for track_xy also work for a track_xyt (but not the reverse).
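
The inheritance described above can be illustrated with plain S3 classes in base R. This is only a minimal sketch: the class names toy_xy and toy_xyt and the generics n_points and time_span are hypothetical and are not part of amt. The sketch only mirrors the order of the class vectors shown in the output above.

```r
# Toy S3 classes mimicking the class vectors shown above
# (hypothetical names, not amt's internals).
xy <- structure(
  data.frame(x_ = 1:3, y_ = 1:3),
  class = c("toy_xy", "data.frame")
)
xyt <- structure(
  data.frame(x_ = 1:3, y_ = 1:3, t_ = as.Date("2017-01-01") + 0:2),
  class = c("toy_xyt", "toy_xy", "data.frame")
)

# A generic with a method for the parent class only.
n_points <- function(x, ...) UseMethod("n_points")
n_points.toy_xy <- function(x, ...) nrow(x)

# A generic with a method for the child class only.
time_span <- function(x, ...) UseMethod("time_span")
time_span.toy_xyt <- function(x, ...) diff(range(x$t_))

n_xy <- n_points(xy)        # dispatches on "toy_xy"
n_xyt <- n_points(xyt)      # no toy_xyt method, so the toy_xy method is used
span_xyt <- time_span(xyt)  # dispatches on "toy_xyt"

# The reverse does not work: no time_span method applies to a toy_xy.
span_xy <- tryCatch(time_span(xy), error = function(e) "no applicable method")
```

Because "toy_xy" appears after "toy_xyt" in the class vector, S3 dispatch falls back to the parent method whenever the child class has none of its own.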

Adding additional information

We can also add information for each relocation (e.g., the id of the animal, or other sensor information such as the DOP). Any number of additional named columns can be passed to mk_track. By named we mean that columns should always be passed in the form col_name = content, to avoid confusion with the coordinates and the time stamp. We will extend the dummy example from above by passing two more columns (the id of the animal and its age).

df1 <- data_frame(x = 1:3, y = 1:3, t = lubridate::ymd("2017-01-01") + lubridate::days(0:2), 
                  id = 1, age = 4)

# first we only create a track_xy
tr3 <- mk_track(df1, x, y, id = id, age = age)
## .t missing, creating `track_xy`.
tr3
## # A tibble: 3 x 4
##      x_    y_    id   age
##   <int> <int> <dbl> <dbl>
## 1     1     1  1.00  4.00
## 2     2     2  1.00  4.00
## 3     3     3  1.00  4.00
# now let's create a track_xyt
tr4 <- mk_track(df1, x, y, t, id = id, age = age)
## .t found, creating `track_xyt`.
tr4
## # A tibble: 3 x 5
##      x_    y_ t_            id   age
##   <int> <int> <date>     <dbl> <dbl>
## 1     1     1 2017-01-01  1.00  4.00
## 2     2     2 2017-01-02  1.00  4.00
## 3     3     3 2017-01-03  1.00  4.00

Coordinate reference system

mk_track has one further optional argument (crs), which allows the user to set the coordinate reference system (CRS) of the track. The CRS needs to be provided as a valid proj4string; see the documentation of sp::CRS for further details.

An example with real data

The amt package includes relocation data of one red deer from northern Germany. We will use this data set to illustrate how to create a track.

We begin by loading and inspecting the data.

data(sh)
head(sh)
##   x_epsg31467 y_epsg31467        day     time
## 1     3558403     5999400 2009-02-13 00:02:23
## 2     3558548     5999099 2009-02-13 06:02:21
## 3     3558541     5999019 2009-02-13 12:01:51
## 4     3558453     5999026 2009-02-13 18:00:55
## 5     3558566     5999365 2009-02-14 00:01:36
## 6     3557836     5999185 2009-02-14 06:02:24

Before creating a track, we have to do some data cleaning:

  1. check if any coordinates are missing (and if so, remove the relocation),
  2. parse the date and time,
  3. create a time stamp,
  4. check for duplicated time stamps, and
  5. create two new columns for the id and month of the year.
# check if all observations are complete
all(complete.cases(sh)) # no action required
## [1] TRUE
# parse date and time
sh$day <- lubridate::ymd(sh$day)
sh$time <- lubridate::hms(sh$time)

# create a time stamp
sh$ts <- as.POSIXct(sh$day + sh$time)

# check for duplicated time stamps
any(duplicated(sh$ts))
## [1] TRUE
# We have some duplicated time stamps, these need to be removed prior to
# creating a track.
sh <- sh[!duplicated(sh$ts), ]

# create new columns
sh$id <- "Animal 1"
sh$month <- lubridate::month(sh$ts)
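
The time stamp and duplicate handling can also be sketched with base R only. The example below uses made-up values in the same layout as sh (it does not use the real data), and the UTC time zone is an assumption made purely for illustration.

```r
# Toy data in the same layout as `sh` (made-up values, not from the data set).
toy <- data.frame(
  x = c(10, 12, 12, 15),
  y = c(20, 21, 21, 25),
  day = c("2009-02-13", "2009-02-13", "2009-02-13", "2009-02-14"),
  time = c("00:02:23", "06:02:21", "06:02:21", "00:01:36"),
  stringsAsFactors = FALSE
)

# Combine day and time into one time stamp; UTC is an assumption made here.
toy$ts <- as.POSIXct(paste(toy$day, toy$time), tz = "UTC")

# duplicated() flags the second and later occurrences, so the first is kept.
dup <- duplicated(toy$ts)
toy_clean <- toy[!dup, ]
```

Note that duplicated() keeps the first relocation for each time stamp; whether the first or a later fix is the better one to keep depends on the data.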

Now we can create a track.

tr1 <- mk_track(sh, x_epsg31467, y_epsg31467, ts, id = id, month = month)
## .t found, creating `track_xyt`.

The column names of the data set already indicate the CRS of the data. We can add this information when creating a track.

tr1 <- mk_track(sh, x_epsg31467, y_epsg31467, ts, id = id, month = month, 
                crs = sp::CRS("+init=epsg:31467"))
## .t found, creating `track_xyt`.

A note on pipes (%>%)

amt was heavily inspired by the workflows suggested by the popular packages of the tidyverse. The above steps can easily be connected using pipes. Note that the result will be exactly the same.

data(sh)
tr2 <- sh %>% filter(complete.cases(.)) %>% 
  mutate(
    day = lubridate::ymd(day), 
    time = lubridate::hms(time), 
    ts = as.POSIXct(day + time), 
    id = "Animal 1", 
    month = lubridate::month(ts)
  ) %>% 
  filter(!duplicated(ts)) %>% 
  mk_track(x_epsg31467, y_epsg31467, ts, id = id, month = month, 
           crs = sp::CRS("+init=epsg:31467"))
## .t found, creating `track_xyt`.
tr2
## # A tibble: 1,493 x 5
##         x_      y_ t_                  id       month
##  *   <int>   <int> <dttm>              <chr>    <dbl>
##  1 3558528 5999094 2008-03-30 00:01:47 Animal 1  3.00
##  2 3558513 5999055 2008-03-30 06:00:54 Animal 1  3.00
##  3 3558564 5999146 2008-03-30 12:01:47 Animal 1  3.00
##  4 3558504 5999072 2008-03-30 18:01:24 Animal 1  3.00
##  5 3558495 5999051 2008-03-30 18:25:56 Animal 1  3.00
##  6 3558493 5999052 2008-03-30 18:26:05 Animal 1  3.00
##  7 3558489 5999051 2008-03-30 18:26:14 Animal 1  3.00
##  8 3558486 5999046 2008-03-30 18:26:24 Animal 1  3.00
##  9 3558484 5999052 2008-03-30 18:26:33 Animal 1  3.00
## 10 3558317 5998989 2008-03-30 18:38:01 Animal 1  3.00
## # ... with 1,483 more rows

Working with tracks

Utility functions

Basic manipulation

Remember that track_xy* objects behave like regular data.frames. This means that we can use all the data manipulation verbs that we are used to from base R or the tidyverse. For example, we can filter a track based on some characteristic. Here we extract all relocations from May.

tr3 <- tr2 %>% filter(month == 5)

# we are left with a track
class(tr3)
## [1] "track_xyt"  "track_xy"   "tbl_df"     "tbl"        "data.frame"

Transforming CRS

If we set the CRS when creating a track (we can verify this with has_crs), we can transform the CRS of the coordinates with the function transform_coords (a wrapper around sp::spTransform). For illustration we will transform the CRS of tr2 to geographical coordinates (EPSG:4326).

transform_coords(tr2, sp::CRS("+init=epsg:4326"))
## # A tibble: 1,493 x 5
##       x_    y_ t_                  id       month
##  * <dbl> <dbl> <dttm>              <chr>    <dbl>
##  1  9.89  54.1 2008-03-30 00:01:47 Animal 1  3.00
##  2  9.89  54.1 2008-03-30 06:00:54 Animal 1  3.00
##  3  9.89  54.1 2008-03-30 12:01:47 Animal 1  3.00
##  4  9.89  54.1 2008-03-30 18:01:24 Animal 1  3.00
##  5  9.89  54.1 2008-03-30 18:25:56 Animal 1  3.00
##  6  9.89  54.1 2008-03-30 18:26:05 Animal 1  3.00
##  7  9.89  54.1 2008-03-30 18:26:14 Animal 1  3.00
##  8  9.89  54.1 2008-03-30 18:26:24 Animal 1  3.00
##  9  9.89  54.1 2008-03-30 18:26:33 Animal 1  3.00
## 10  9.89  54.1 2008-03-30 18:38:01 Animal 1  3.00
## # ... with 1,483 more rows

Some initial data exploration

Several functions for calculating derived quantities are available. We will start with looking at step length. The function step_lengths can be used for this.

tr2 <- tr2 %>% mutate(sl_ = step_lengths(.))

If we look at a summary of sl_:

summary(tr2$sl_)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    0.00   35.01  105.33  249.07  297.75 4727.86       1

Note two things. First, there is one NA for step lengths. This is expected, because we are still in a point representation (i.e., there is no step length for the last relocation). Second, the range is quite large, from 0 to almost 5 km. Before looking at step lengths in any further detail, we have to make sure the sampling rate is more or less regular.
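
To see where the trailing NA comes from, here is a base R sketch on made-up coordinates. It assumes that a step length is the Euclidean distance between consecutive relocations; it is not the implementation of step_lengths.

```r
# Four made-up relocations; the animal does not move between points 2 and 3.
x <- c(0, 3, 3, 6)
y <- c(0, 4, 4, 8)

# Distance from each relocation to the next one. The last relocation has no
# successor, so its step length is NA.
sl <- c(sqrt(diff(x)^2 + diff(y)^2), NA)
```

With n relocations there are only n - 1 steps, which is why exactly one NA appears, and a step length of 0 simply means two consecutive fixes at the same position.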

The function summarize_sampling_rate provides an easy way to look at the sampling rate.

summarize_sampling_rate(tr2)
## # A tibble: 1 x 9
##   min         q1               median  mean  q3    max      sd     n unit 
##   <S3: table> <S3: table>      <S3: t> <S3:> <S3:> <S3:> <dbl> <int> <chr>
## 1 0.0025      1.99694444444444 2.0055… 6.33… 6.00… 3923…   102  1492 hour
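
The idea behind such a summary can be sketched in base R as the distribution of time differences between consecutive relocations. The time stamps below are made up, and the code is an illustration under that assumption, not the implementation of summarize_sampling_rate.

```r
# Four made-up time stamps: three regular 6-hour fixes and one extra fix
# 30 minutes after the third.
ts <- as.POSIXct(
  c("2008-03-30 00:00:00", "2008-03-30 06:00:00",
    "2008-03-30 12:00:00", "2008-03-30 12:30:00"),
  tz = "UTC"
)

# Time differences between consecutive relocations, expressed in hours.
dt_h <- as.numeric(diff(ts), units = "hours")

# Summary statistics of the sampling intervals.
rate <- summary(dt_h)
```

As in the output above, the number of intervals is one less than the number of relocations, and a minimum far below the median hints at bursts of fixes taken in quick succession.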

From tracks to steps

todo

How to deal with several animals

todo

Session

sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] bindrcpp_0.2    amt_0.0.2.0     survival_2.41-3 forcats_0.2.0  
##  [5] stringr_1.2.0   dplyr_0.7.4     purrr_0.2.4     readr_1.1.1    
##  [9] tidyr_0.8.0     tibble_1.4.2    ggplot2_2.2.1   tidyverse_1.2.1
## 
## loaded via a namespace (and not attached):
##  [1] progress_1.1.2.9002    reshape2_1.4.3         rematch2_2.0.1        
##  [4] splines_3.4.3          haven_1.1.1            lattice_0.20-35       
##  [7] colorspace_1.3-2       htmltools_0.3.6        yaml_2.1.16           
## [10] utf8_1.1.3             XML_3.98-1.9           rlang_0.1.6.9003      
## [13] pillar_1.1.0           withr_2.1.1            foreign_0.8-69        
## [16] glue_1.2.0             selectr_0.3-1          sp_1.2-7              
## [19] modelr_0.1.1           readxl_1.0.0           bindr_0.1             
## [22] plyr_1.8.4             munsell_0.4.3          gtable_0.2.0          
## [25] cellranger_1.1.0       rvest_0.3.2            psych_1.7.8           
## [28] evaluate_0.10.1        knitr_1.19             ansistrings_1.0.0.9000
## [31] parallel_3.4.3         broom_0.4.3            Rcpp_0.12.15          
## [34] scales_0.5.0           backports_1.1.2        jsonlite_1.5          
## [37] mnormt_1.5-5           hms_0.4.1              digest_0.6.15         
## [40] stringi_1.1.6          grid_3.4.3             rprojroot_1.3-2       
## [43] rgdal_1.2-16           cli_1.0.0.9001         tools_3.4.3           
## [46] magrittr_1.5           lazyeval_0.2.1         crayon_1.3.4          
## [49] pkgconfig_2.0.1        Matrix_1.2-12          xml2_1.2.0            
## [52] prettyunits_1.0.2      lubridate_1.7.2        assertthat_0.2.0      
## [55] rmarkdown_1.8          httr_1.3.1             rstudioapi_0.7        
## [58] R6_2.2.2               nlme_3.1-131           compiler_3.4.3