Excerpt from the Gapminder data. The main object in this package is the `gapminder` data frame or “tibble”. There are other goodies, such as the data in tab-delimited form, a larger unfiltered dataset, premade color schemes for the countries and continents, and ISO 3166-1 country codes.
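As a rough sketch of how to reach two of those extras: the tab-delimited copy ships inside the installed package and the country codes are exposed as a data frame. The file location `extdata/gapminder.tsv` and the object name `country_codes` are given here as I understand the package layout, so treat them as assumptions and check them against your installed version.

```r
library(gapminder)

# Assumed location of the tab-delimited copy inside the installed package
tsv_path <- system.file("extdata", "gapminder.tsv", package = "gapminder")

# Read it back as a plain data frame (countries stay character, not factor)
gap_tsv <- read.delim(tsv_path, stringsAsFactors = FALSE)
str(gap_tsv)

# ISO 3166-1 codes, one row per country (assumed object name)
head(country_codes)
```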
The `gapminder` data frames include six variables (Gapminder.org documentation page):
variable | meaning |
---|---|
country | |
continent | |
year | |
lifeExp | life expectancy at birth |
pop | total population |
gdpPercap | per-capita GDP |
Per-capita GDP (Gross domestic product) is given in units of international dollars, “a hypothetical unit of currency that has the same purchasing power parity that the U.S. dollar had in the United States at a given point in time” – 2005, in this case.
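Because `gdpPercap` is a per-person figure, total GDP can be recovered by multiplying it by `pop`. A minimal sketch, where `gdp_total` is a hypothetical column name introduced only for this example:

```r
library(gapminder)

# Work on a copy so the packaged data frame is left untouched
gdp <- gapminder

# Total GDP in international dollars = population x per-capita GDP
# (`gdp_total` is a made-up column name for illustration)
gdp$gdp_total <- gdp$pop * gdp$gdpPercap

head(gdp[, c("country", "year", "pop", "gdpPercap", "gdp_total")], 3)
```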
The package contains two main data frames or tibbles:

- `gapminder`: 12 rows for each country (1952, 1957, …, 2007). It’s a subset of …
- `gapminder_unfiltered`: more lightly filtered and therefore about twice as many rows.

Note: this package exists for the purpose of teaching and making code examples. It is an excerpt of data found in specific spreadsheets on Gapminder.org circa 2010. It is not a definitive source of socioeconomic data and I don’t update it. Use other data sources if it’s important to have the current best estimate of these statistics.
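The shape of the two data frames can be checked directly. This is only a quick sketch; `rows_per_country` is a helper name made up for the example.

```r
library(gapminder)

# Number of rows for each country in the main data frame
rows_per_country <- table(gapminder$country)
range(rows_per_country)

# The years covered
sort(unique(gapminder$year))

# Compare the sizes of the filtered and unfiltered data
nrow(gapminder)
nrow(gapminder_unfiltered)
```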
Install `gapminder` from CRAN:
install.packages("gapminder")
Or you can install `gapminder` from GitHub:
devtools::install_github("jennybc/gapminder")
Load it and test drive with some data aggregation and plotting:
library("gapminder")
aggregate(lifeExp ~ continent, gapminder, median)
#> continent lifeExp
#> 1 Africa 47.7920
#> 2 Americas 67.0480
#> 3 Asia 61.7915
#> 4 Europe 72.2410
#> 5 Oceania 73.6650
library("dplyr")
gapminder %>%
  filter(year == 2007) %>%
  group_by(continent) %>%
  summarise(lifeExp = median(lifeExp))
#> # A tibble: 5 x 2
#> continent lifeExp
#> <fctr> <dbl>
#> 1 Africa 52.9265
#> 2 Americas 72.8990
#> 3 Asia 72.3960
#> 4 Europe 78.6085
#> 5 Oceania 80.7195
library("ggplot2")
ggplot(gapminder, aes(x = continent, y = lifeExp)) +
  geom_boxplot(outlier.colour = "hotpink") +
  geom_jitter(position = position_jitter(width = 0.1, height = 0), alpha = 1/4)
`country_colors` and `continent_colors` are provided as character vectors where elements are hex colors and the names are countries or continents.
head(country_colors, 4)
#> Nigeria Egypt Ethiopia Congo, Dem. Rep.
#> "#7F3B08" "#833D07" "#873F07" "#8B4107"
head(continent_colors)
#> Africa Americas Asia Europe Oceania
#> "#7F3B08" "#A50026" "#40004B" "#276419" "#313695"
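Since the vectors are named, a color can be looked up by country, and the whole vector can be handed to a manual ggplot2 scale. One possible way to use them, sketched with an arbitrary choice of plot; the object name `p` is just for the example:

```r
library(gapminder)
library(ggplot2)

# Look up a single country's color by name
country_colors[["Nigeria"]]

# Map each country's line to its premade color; the legend is hidden
# because there are far too many countries to list
p <- ggplot(gapminder,
            aes(x = year, y = lifeExp, group = country, colour = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~ continent) +
  scale_colour_manual(values = country_colors)

# print(p) renders the plot
```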