The mudata2 package is designed to be used as little as possible. That is, if you need to use data that is currently in mudata format, the functions in this package are designed to let you spend as little time as possible reading, subsetting, and inspecting your data. The steps are generally as follows:
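1. Read in the object using read_mudata()
2. Inspect the object using summary(), print(), autoplot(), distinct_locations(), and distinct_params()
3. View the embedded documentation using tbl_locations() and tbl_params()
4. Subset by parameter using select_params() or filter_params()
5. Subset by location using select_locations() or filter_locations()
6. Extract the data using tbl_data() or tbl_data_wide()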
In this vignette we will use the ns_climate dataset within the mudata2 package, which is a collection of monthly climate observations from Nova Scotia (Canada), sourced from Environment Canada using the rclimateca package.
library(mudata2)
data("ns_climate")
ns_climate
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "BADDECK 6297" ... and 13 more
## distinct_params(): "dir_of_max_gust", "extr_max_temp" ... and 9 more
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-01-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-02-01 NA M
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-03-01 NA M
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-04-01 NA M
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-05-01 NA M
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-06-01 NA M
## # ... with 1 more variables: flag_text <chr>
The ns_climate object is already an object in R, but if it wasn’t, you would need to use read_mudata() to read it in. If you’re curious what a mudata object looks like on disk, you could try using write_mudata() to find out. I tend to prefer writing to a directory rather than a JSON or ZIP file, but you can take your pick.
# write to directory
write_mudata(ns_climate, "ns_climate.mudata")
# write to ZIP
write_mudata(ns_climate, "ns_climate.mudata.zip")
# write to JSON
write_mudata(ns_climate, "ns_climate.mudata.json")
Then, you can read in the object using read_mudata():
# read from directory
read_mudata("ns_climate.mudata")
# read from ZIP
read_mudata("ns_climate.mudata.zip")
# read from JSON
read_mudata("ns_climate.mudata.json")
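As a quick check that nothing is lost on disk, a round trip can be sketched as follows. This is a minimal sketch, assuming the directory format shown above and write access to R's temporary directory; the names out_dir and ns_climate_copy are introduced here for illustration only.

```r
library(mudata2)
data("ns_climate")

# write to a throwaway directory inside tempdir() (illustrative path)
out_dir <- file.path(tempdir(), "ns_climate_roundtrip.mudata")
write_mudata(ns_climate, out_dir)

# read the object back in
ns_climate_copy <- read_mudata(out_dir)

# compare identifiers between the original and the copy
params_match <- setequal(distinct_params(ns_climate),
                         distinct_params(ns_climate_copy))
locations_match <- setequal(distinct_locations(ns_climate),
                            distinct_locations(ns_climate_copy))

# remove the temporary directory
unlink(out_dir, recursive = TRUE)
```

If both comparisons are TRUE, the copy carries the same parameters and locations as the original object.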
The three main ways to quickly inspect a mudata object are print(), summary(), and autoplot(). The print() function is what you get when you type the name of the object at the prompt, and gives a short summary of the object. The output suggests a couple of other ways to inspect the object, including distinct_locations(), which returns a character vector of location identifiers, and distinct_params(), which returns a character vector of parameter identifiers.
print(ns_climate)
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "BADDECK 6297" ... and 13 more
## distinct_params(): "dir_of_max_gust", "extr_max_temp" ... and 9 more
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-01-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-02-01 NA M
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-03-01 NA M
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-04-01 NA M
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-05-01 NA M
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-06-01 NA M
## # ... with 1 more variables: flag_text <chr>
The summary() function provides some numeric summaries by dataset, location, and parameter if the value column of the data table is numeric (if it isn’t, it provides counts instead).
summary(ns_climate)
## # A tibble: 137 x 7
## param location dataset mean_value
## <chr> <chr> <chr> <dbl>
## 1 dir_of_max_gust SABLE ISLAND 6454 ecclimate_monthly 19.77258
## 2 extr_max_temp ANNAPOLIS ROYAL 6289 ecclimate_monthly 19.93257
## 3 extr_max_temp BADDECK 6297 ecclimate_monthly 18.85291
## 4 extr_max_temp BEAVERBANK 6301 ecclimate_monthly 17.22857
## 5 extr_max_temp COLLEGEVILLE 6329 ecclimate_monthly 20.33914
## 6 extr_max_temp DIGBY 6338 ecclimate_monthly 19.04834
## 7 extr_max_temp KENTVILLE CDA 6375 ecclimate_monthly 21.00661
## 8 extr_max_temp MAHONE BAY 6396 ecclimate_monthly 20.76598
## 9 extr_max_temp MOUNT UNIACKE 6413 ecclimate_monthly 19.67059
## 10 extr_max_temp NAPPAN CDA 6414 ecclimate_monthly 19.33575
## # ... with 127 more rows, and 3 more variables: sd_value <dbl>, n <int>,
## # n_NA <int>
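Because summary() returns a tibble, it can be manipulated like any other data frame. The sketch below assumes that dplyr is installed, that the n and n_NA columns are named as in the output above, and that n counts all rows including missing values. The names ns_summary, most_complete, and frac_missing are introduced here for illustration.

```r
library(mudata2)
library(dplyr)
data("ns_climate")

ns_summary <- summary(ns_climate)

# rank each parameter/location series by its share of missing values
most_complete <- ns_summary %>%
  mutate(frac_missing = n_NA / n) %>%
  arrange(frac_missing) %>%
  select(param, location, n, frac_missing)

most_complete
```

Series near the top of most_complete have the fewest gaps, which can help when choosing a subset to plot or analyze.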
Finally, the autoplot() function attempts to choose a sensible way to plot the object. The smaller the subset, the more useful the plot, but it produces reasonable results for large objects as well. This function produces ggplot2 objects, which can be modified as such (e.g., + scale_y_reverse()).
autoplot(ns_climate)
## Using x = "date", y = "value"
## Using first 9 facets of 11. Use max_facets = FALSE to plot all facets
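To illustrate the point about smaller subsets, here is a minimal sketch that plots a single parameter at a single location and then modifies the result as a ggplot2 object. It uses select_params() and select_locations(), which are described later in this vignette, and assumes dplyr and ggplot2 are installed. The names kentville_temp and p, and the axis label, are illustrative choices.

```r
library(mudata2)
library(dplyr)
library(ggplot2)
data("ns_climate")

# one parameter at one location gives a single, readable panel
kentville_temp <- ns_climate %>%
  select_params(mean_temp) %>%
  select_locations(Kentville = starts_with("KENT"))

# autoplot() returns a ggplot2 object, so layers and labels can be added
p <- autoplot(kentville_temp) +
  labs(x = NULL, y = "Mean temperature (C)")

p
```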
You can have a look at the embedded documentation using tbl_params() and tbl_locations(), which contain any additional information about parameters and locations for which data are available. The identifiers (i.e., param and location columns) of these can be used to subset the object using select_*() functions; the tables themselves can be used to subset the object using the filter_*() functions.
# extract the parameters table
ns_climate %>% tbl_params()
## # A tibble: 11 x 4
## dataset param label
## <chr> <chr> <chr>
## 1 ecclimate_monthly mean_max_temp Mean Max Temp (C)
## 2 ecclimate_monthly mean_min_temp Mean Min Temp (C)
## 3 ecclimate_monthly mean_temp Mean Temp (C)
## 4 ecclimate_monthly extr_max_temp Extr Max Temp (C)
## 5 ecclimate_monthly extr_min_temp Extr Min Temp (C)
## 6 ecclimate_monthly total_rain Total Rain (mm)
## 7 ecclimate_monthly total_snow Total Snow (cm)
## 8 ecclimate_monthly total_precip Total Precip (mm)
## 9 ecclimate_monthly snow_grnd_last_day Snow Grnd Last Day (cm)
## 10 ecclimate_monthly dir_of_max_gust Dir of Max Gust (10's deg)
## 11 ecclimate_monthly spd_of_max_gust Spd of Max Gust (km/h)
## # ... with 1 more variables: unit <chr>
# extract the locations table
ns_climate %>% tbl_locations()
## # A tibble: 15 x 19
## dataset location name province
## <chr> <chr> <chr> <chr>
## 1 ecclimate_monthly ANNAPOLIS ROYAL 6289 ANNAPOLIS ROYAL NOVA SCOTIA
## 2 ecclimate_monthly BADDECK 6297 BADDECK NOVA SCOTIA
## 3 ecclimate_monthly BEAVERBANK 6301 BEAVERBANK NOVA SCOTIA
## 4 ecclimate_monthly COLLEGEVILLE 6329 COLLEGEVILLE NOVA SCOTIA
## 5 ecclimate_monthly DIGBY 6338 DIGBY NOVA SCOTIA
## 6 ecclimate_monthly KENTVILLE CDA 6375 KENTVILLE CDA NOVA SCOTIA
## 7 ecclimate_monthly MAHONE BAY 6396 MAHONE BAY NOVA SCOTIA
## 8 ecclimate_monthly MOUNT UNIACKE 6413 MOUNT UNIACKE NOVA SCOTIA
## 9 ecclimate_monthly NAPPAN CDA 6414 NAPPAN CDA NOVA SCOTIA
## 10 ecclimate_monthly PARRSBORO 6428 PARRSBORO NOVA SCOTIA
## 11 ecclimate_monthly PORT HASTINGS 6441 PORT HASTINGS NOVA SCOTIA
## 12 ecclimate_monthly SABLE ISLAND 6454 SABLE ISLAND NOVA SCOTIA
## 13 ecclimate_monthly ST MARGARET'S BAY 6456 ST MARGARET'S BAY NOVA SCOTIA
## 14 ecclimate_monthly SPRINGFIELD 6473 SPRINGFIELD NOVA SCOTIA
## 15 ecclimate_monthly UPPER STEWIACKE 6495 UPPER STEWIACKE NOVA SCOTIA
## # ... with 15 more variables: climate_id <chr>, station_id <int>,
## # wmo_id <int>, tc_id <chr>, latitude <dbl>, longitude <dbl>,
## # elevation <dbl>, first_year <int>, last_year <int>,
## # hly_first_year <int>, hly_last_year <int>, dly_first_year <int>,
## # dly_last_year <int>, mly_first_year <int>, mly_last_year <int>
You can subset mudata objects using select_params() and select_locations(), which use dplyr-like selection syntax to quickly subset mudata objects using the identifiers from distinct_params() and distinct_locations() (respectively).
# find out which parameters are available
ns_climate %>% distinct_params()
## [1] "dir_of_max_gust" "extr_max_temp" "extr_min_temp"
## [4] "mean_max_temp" "mean_min_temp" "mean_temp"
## [7] "snow_grnd_last_day" "spd_of_max_gust" "total_precip"
## [10] "total_rain" "total_snow"
# subset by parameter
ns_climate %>% select_params(mean_temp, total_precip)
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "BADDECK 6297" ... and 13 more
## distinct_params(): "mean_temp", "total_precip"
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_temp 1897-01-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_temp 1897-02-01 NA M
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_temp 1897-03-01 NA M
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_temp 1897-04-01 NA M
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_temp 1897-05-01 NA M
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_temp 1897-06-01 NA M
## # ... with 1 more variables: flag_text <chr>
You can also use the dplyr select helpers to select related params/locations…
ns_climate %>% select_params(contains("temp"))
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "BADDECK 6297" ... and 13 more
## distinct_params(): "extr_max_temp", "extr_min_temp" ... and 3 more
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-01-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-02-01 NA M
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-03-01 NA M
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-04-01 NA M
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-05-01 NA M
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-06-01 NA M
## # ... with 1 more variables: flag_text <chr>
…and rename params/locations on the fly.
ns_climate %>% select_locations(Kentville = starts_with("KENT"))
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "Kentville"
## distinct_params(): "extr_max_temp", "extr_min_temp" ... and 7 more
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly Kentville mean_max_temp 1913-01-01 NA M
## 2 ecclimate_monthly Kentville mean_max_temp 1913-02-01 NA M
## 3 ecclimate_monthly Kentville mean_max_temp 1913-03-01 NA M
## 4 ecclimate_monthly Kentville mean_max_temp 1913-04-01 9.7 <NA>
## 5 ecclimate_monthly Kentville mean_max_temp 1913-05-01 12.5 <NA>
## 6 ecclimate_monthly Kentville mean_max_temp 1913-06-01 19.9 <NA>
## # ... with 1 more variables: flag_text <chr>
To select params/locations based on the tbl_params() and tbl_locations() tables, you can use the filter_*() functions (note that last_year is a column in tbl_locations(), and unit is a column in tbl_params()):
# only use locations whose last data point was after 2000
ns_climate %>%
  filter_locations(last_year > 2000)
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "COLLEGEVILLE 6329" ... and 7 more
## distinct_params(): "dir_of_max_gust", "extr_max_temp" ... and 9 more
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-01-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-02-01 NA M
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-03-01 NA M
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-04-01 NA M
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-05-01 NA M
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-06-01 NA M
## # ... with 1 more variables: flag_text <chr>
# use only params measured in mm
ns_climate %>%
  filter_params(unit == "mm")
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "BADDECK 6297" ... and 13 more
## distinct_params(): "total_precip", "total_rain"
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 total_rain 1891-01-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 total_rain 1891-02-01 40.4 <NA>
## 3 ecclimate_monthly SABLE ISLAND 6454 total_rain 1891-03-01 32.0 <NA>
## 4 ecclimate_monthly SABLE ISLAND 6454 total_rain 1891-04-01 131.8 <NA>
## 5 ecclimate_monthly SABLE ISLAND 6454 total_rain 1891-05-01 44.7 <NA>
## 6 ecclimate_monthly SABLE ISLAND 6454 total_rain 1891-06-01 105.7 <NA>
## # ... with 1 more variables: flag_text <chr>
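Any column of the locations table can be used in the same way. The following is a minimal sketch, assuming the latitude and last_year columns listed in the tbl_locations() output; the 45 degree cutoff and the name recent_north are arbitrary choices made for illustration.

```r
library(mudata2)
data("ns_climate")

# list the columns that filter_locations() can refer to
names(tbl_locations(ns_climate))

# stations north of 45 degrees latitude whose record extends past 2000
recent_north <- filter_locations(ns_climate, latitude > 45 & last_year > 2000)

distinct_locations(recent_north)
```

Combining both conditions with & in a single expression keeps the sketch independent of how multiple arguments are handled.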
Similarly, we can subset parameters, locations, and the data table all at once using filter_data().
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
# extract only June observations from the data table
ns_climate %>%
  filter_data(month(date) == 6)
## A mudata object aligned along "date"
## distinct_datasets(): "ecclimate_monthly"
## distinct_locations(): "ANNAPOLIS ROYAL 6289", "BADDECK 6297" ... and 13 more
## distinct_params(): "dir_of_max_gust", "extr_max_temp" ... and 9 more
## src_tbls(): "data", "locations" ... and 3 more
##
## tbl_data() %>% head():
## # A tibble: 6 x 7
## dataset location param date value flag
## <chr> <chr> <chr> <date> <dbl> <chr>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-06-01 NA M
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1898-06-01 13.4 <NA>
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1899-06-01 14.4 <NA>
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1900-06-01 14.6 <NA>
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1901-06-01 15.3 <NA>
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1902-06-01 13.6 <NA>
## # ... with 1 more variables: flag_text <chr>
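Note that filter_data() on its own keeps every parameter. To restrict the result to June temperatures only, one option is to combine it with select_params(). This is a minimal sketch, assuming dplyr and lubridate are installed; the name june_temp is illustrative.

```r
library(mudata2)
library(dplyr)
library(lubridate)
data("ns_climate")

# keep temperature parameters, then keep only rows dated in June
june_temp <- ns_climate %>%
  select_params(contains("temp")) %>%
  filter_data(month(date) == 6)

# check which parameters and months remain
distinct_params(june_temp)
unique(month(tbl_data(june_temp)$date))
```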
The data is stored in the data table (i.e., tbl_data()) in parameter-long form (that is, one row per measurement rather than one row per observation). This has the advantage that information about each measurement (e.g., standard deviation, notes) can be stored next to the value; however, it is rarely the form required for analysis. To extract data in parameter-long form, you can use tbl_data():
ns_climate %>% tbl_data()
## # A tibble: 115,541 x 7
## dataset location param date value
## <chr> <chr> <chr> <date> <dbl>
## 1 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-01-01 NA
## 2 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-02-01 NA
## 3 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-03-01 NA
## 4 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-04-01 NA
## 5 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-05-01 NA
## 6 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-06-01 NA
## 7 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-07-01 NA
## 8 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-08-01 NA
## 9 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-09-01 NA
## 10 ecclimate_monthly SABLE ISLAND 6454 mean_max_temp 1897-10-01 12.2
## # ... with 115,531 more rows, and 2 more variables: flag <chr>,
## # flag_text <chr>
To extract data in a more standard parameter-wide form, you can use tbl_data_wide():
ns_climate %>% tbl_data_wide()
## # A tibble: 14,311 x 14
## dataset location date dir_of_max_gust
## * <chr> <chr> <date> <dbl>
## 1 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-01-01 NA
## 2 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-02-01 NA
## 3 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-03-01 NA
## 4 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-04-01 NA
## 5 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-05-01 NA
## 6 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-06-01 NA
## 7 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-07-01 NA
## 8 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-08-01 NA
## 9 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-09-01 NA
## 10 ecclimate_monthly ANNAPOLIS ROYAL 6289 1914-10-01 NA
## # ... with 14,301 more rows, and 10 more variables: extr_max_temp <dbl>,
## # extr_min_temp <dbl>, mean_max_temp <dbl>, mean_min_temp <dbl>,
## # mean_temp <dbl>, snow_grnd_last_day <dbl>, spd_of_max_gust <dbl>,
## # total_precip <dbl>, total_rain <dbl>, total_snow <dbl>
The tbl_data_wide() function isn’t limited to parameter-wide data - data can be anything-wide (Edzer Pebesma has a great discussion on this). Using tbl_data_wide() is identical to using tbl_data() and tidyr::spread(), with context-specific defaults.
ns_climate %>%
  select_params(mean_temp) %>%
  filter_data(year(date) == 1960) %>%
  tbl_data_wide(key = location)
## # A tibble: 12 x 16
## dataset param date `BADDECK 6297`
## * <chr> <chr> <date> <dbl>
## 1 ecclimate_monthly mean_temp 1960-01-01 -3.8
## 2 ecclimate_monthly mean_temp 1960-02-01 -1.2
## 3 ecclimate_monthly mean_temp 1960-03-01 -1.3
## 4 ecclimate_monthly mean_temp 1960-04-01 3.0
## 5 ecclimate_monthly mean_temp 1960-05-01 11.7
## 6 ecclimate_monthly mean_temp 1960-06-01 14.4
## 7 ecclimate_monthly mean_temp 1960-07-01 17.1
## 8 ecclimate_monthly mean_temp 1960-08-01 NA
## 9 ecclimate_monthly mean_temp 1960-09-01 15.2
## 10 ecclimate_monthly mean_temp 1960-10-01 8.7
## 11 ecclimate_monthly mean_temp 1960-11-01 4.6
## 12 ecclimate_monthly mean_temp 1960-12-01 -0.8
## # ... with 12 more variables: `COLLEGEVILLE 6329` <dbl>, `DIGBY
## # 6338` <dbl>, `KENTVILLE CDA 6375` <dbl>, `MAHONE BAY 6396` <dbl>,
## # `MOUNT UNIACKE 6413` <dbl>, `NAPPAN CDA 6414` <dbl>, `PARRSBORO
## # 6428` <dbl>, `PORT HASTINGS 6441` <dbl>, `SABLE ISLAND 6454` <dbl>,
## # `SPRINGFIELD 6473` <dbl>, `ST MARGARET'S BAY 6456` <dbl>, `UPPER
## # STEWIACKE 6495` <dbl>
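The relationship between tbl_data_wide() and tidyr::spread() described above can be sketched by doing the spread by hand. This sketch assumes dplyr and tidyr are installed, and that the default behaviour drops the flag and flag_text columns before spreading, which is inferred from the column counts in the printed output rather than from the package documentation. The names climate_subset, wide_builtin, and wide_manual are illustrative.

```r
library(mudata2)
library(dplyr)
library(tidyr)
data("ns_climate")

climate_subset <- ns_climate %>%
  select_params(mean_temp, total_precip)

# parameter-wide form using the built-in defaults
wide_builtin <- climate_subset %>%
  tbl_data_wide()

# the same idea by hand: keep identifier columns, then spread param/value
wide_manual <- climate_subset %>%
  tbl_data() %>%
  select(dataset, location, date, param, value) %>%
  spread(key = param, value = value)

# compare the shape of the two results
dim(wide_builtin)
dim(wide_manual)
```

Doing the spread manually is mainly useful when extra columns such as flag need to be kept or handled differently from the defaults.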
Using the pipe (%>%), we can string all the steps together concisely:
temp_1960 <- ns_climate %>%
  # pick parameters
  select_params(contains("temp")) %>%
  # pick locations
  select_locations(`Sable Island` = starts_with("SABLE"),
                   `Kentville` = starts_with("KENT"),
                   `Baddeck` = starts_with("BADD")) %>%
  # filter data table
  filter_data(year(date) == 1960) %>%
  # extract data in wide format
  tbl_data_wide()
temp_1960
## # A tibble: 36 x 8
## dataset location date extr_max_temp extr_min_temp
## * <chr> <chr> <date> <dbl> <dbl>
## 1 ecclimate_monthly Baddeck 1960-01-01 8.9 -16.7
## 2 ecclimate_monthly Baddeck 1960-02-01 6.1 -13.3
## 3 ecclimate_monthly Baddeck 1960-03-01 7.2 -9.4
## 4 ecclimate_monthly Baddeck 1960-04-01 16.7 -7.8
## 5 ecclimate_monthly Baddeck 1960-05-01 26.7 2.2
## 6 ecclimate_monthly Baddeck 1960-06-01 30.6 0.0
## 7 ecclimate_monthly Baddeck 1960-07-01 28.3 8.9
## 8 ecclimate_monthly Baddeck 1960-08-01 33.3 8.9
## 9 ecclimate_monthly Baddeck 1960-09-01 25.6 4.4
## 10 ecclimate_monthly Baddeck 1960-10-01 18.3 -0.6
## # ... with 26 more rows, and 3 more variables: mean_max_temp <dbl>,
## # mean_min_temp <dbl>, mean_temp <dbl>
We can then plot this data with ggplot2, which suggests that these three locations in the same province had more or less the same monthly temperature characteristics in 1960.
library(ggplot2)
ggplot(temp_1960,
       aes(x = date, y = mean_temp,
           ymin = extr_min_temp,
           ymax = extr_max_temp,
           col = location, fill = location)) +
  geom_ribbon(alpha = 0.2, col = NA) +
  geom_line()