Nowcasting Package: simplest user guide

Guilherme Branco Gomes

2017-11-06

This package contains a collection of functions to estimate “forecasts” of macroeconomic variables in the near future or the recent past, in other words “nowcasting”. The econometric framework is the Dynamic Factor Model, a state-space model with many variables and common components, in which information is released in an unsynchronized way. This first version focuses on the case where the variable of interest is a quarterly time series and the regressors are monthly time series.

Theoretical framework:

The standard model is the Dynamic Factor Model (see note 1). It can be written in a state-space representation:

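\[X_t = C F_t + e_t\]

\[F_t = \sum_{j=1}^p A_j L^j F_t + B u_t\]

where \(X_t\) is the vector of observed series, \(F_t\) the vector of common factors, \(C\) the matrix of factor loadings, \(e_t\) the idiosyncratic component, \(A_j\) the autoregressive coefficient matrices, \(L\) the lag operator, \(p\) the lag order, and \(u_t\) the common shocks, loaded through \(B\).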
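
To make the two equations concrete, the following R sketch simulates a small model with two factors following a VAR(1) (so p = 1) and six observed series. All numerical values (A1, B, C, the noise scale) are invented for illustration, and nothing here calls the package.

```r
# Minimal simulation of the state-space model above (illustrative values only)
set.seed(1)
n_obs <- 200   # number of time periods
N <- 6         # number of observed series
r <- 2         # number of common factors

A1 <- matrix(c(0.7, 0.0,
               0.1, 0.5), nrow = r, byrow = TRUE)   # VAR(1) coefficients (p = 1)
B  <- diag(r)                                       # shock loading, identity here
C  <- matrix(rnorm(N * r), nrow = N)                # factor loadings

Fmat <- matrix(0, nrow = n_obs, ncol = r)           # factors, one row per period
for (t in 2:n_obs) {
  u_t <- rnorm(r)                                   # common shocks
  Fmat[t, ] <- A1 %*% Fmat[t - 1, ] + B %*% u_t     # transition equation
}

e <- matrix(rnorm(n_obs * N, sd = 0.5), nrow = n_obs)   # idiosyncratic component
X <- Fmat %*% t(C) + e                                   # measurement equation

dim(X)
```

Each row of X is one period of the observed panel; the estimation methods described next try to recover the factors from data of this kind.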

Models and visualization

A few words about each method of estimation

Two Stages

This method is based on Giannone et al. (2008) and Bańbura and Rünstler (2011). Briefly, the explanatory variables are all monthly, the dependent variable is quarterly, and the common factors are estimated by a two-stage method based on Principal Component Analysis and the Kalman filter. The aggregation is based on the Mariano and Murasawa (2003) approximation.
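
The first stage can be illustrated in base R: principal components are extracted from a standardized panel and used as preliminary factor estimates. This is only a sketch on simulated data, assuming a complete panel; it uses stats::prcomp rather than the package's internal routines, and the Kalman filter stage is not shown.

```r
# First-stage sketch: principal components as preliminary factor estimates
set.seed(42)
n_obs <- 120
f <- as.numeric(arima.sim(list(ar = 0.8), n = n_obs))   # one latent factor (simulated)

# Five observed series, each loading on the factor plus idiosyncratic noise
panel <- sapply(1:5, function(i) f * runif(1, 0.5, 1.5) + rnorm(n_obs, sd = 0.3))
colnames(panel) <- paste0("x", 1:5)

pca <- prcomp(panel, center = TRUE, scale. = TRUE)   # PCA on the standardized panel
r <- 1                                               # number of factors kept
factors_pc <- pca$x[, 1:r, drop = FALSE]             # estimated factors
loadings   <- pca$rotation[, 1:r, drop = FALSE]      # estimated loadings

abs(cor(factors_pc[, 1], f))   # sign is not identified, so compare in absolute value
```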

Quarterly factors

The explanatory variables are aggregated to quarterly quantities, and the factors are estimated at this frequency. A bridge equation is then estimated to explain and forecast the lower-frequency series.
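
A bridge equation is, in essence, a regression of the quarterly target on the quarterly factors, fitted on the quarters where the target is observed and used to predict the others. The base R sketch below assumes a single simulated factor and invented coefficients; the names factor_q, y_q and bridge are illustrative and not part of the package.

```r
# Bridge-equation sketch on simulated quarterly data
set.seed(7)
n_q <- 40
factor_q <- ts(rnorm(n_q), start = c(2008, 1), frequency = 4)   # stand-in quarterly factor
y_q <- 0.5 + 1.2 * factor_q + rnorm(n_q, sd = 0.3)              # quarterly target
y_q[(n_q - 1):n_q] <- NA                                        # last two quarters not yet released

dat <- data.frame(Y = as.numeric(y_q), X1 = as.numeric(factor_q))
bridge <- lm(Y ~ ., data = na.omit(dat))                        # fit on complete quarters only

# Predict the quarters where the target is still missing
nowcasts <- predict(bridge, newdata = dat[is.na(dat$Y), , drop = FALSE])
nowcasts
```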

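library(nowcasting)
pib<-BRGDP[,8]                       # GDP series (8th column of the panel)
y<-month2qtr(diff(diff(pib,3),12))   # difference at lags 3 and 12, then convert to quarterly
x<-Bpanel(BRGDP[,-8],rep(4,dim(BRGDP)[2]),aggregate = T)   # balanced panel of regressors, transformation code 4
## Warning in Bpanel(BRGDP[, -8], rep(4, dim(BRGDP)[2]), aggregate = T): 1
## serie(s) ruled out due to lack in observations (more than 1/3 is NA).
q<-1   # number of common shocks
r<-2   # number of factors
p<-1   # lag order of the factor dynamics
now_2sq<-nowcast(y,x,q,r,p,method = '2sq')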

The main output is an mts (multivariate time series) object containing the dependent variable together with its in-sample and out-of-sample estimates.

window(now_2sq$main,start=c(2017,1),frequency=4)
##            y       in        out
## 2017 Q1 3.51 2.332632         NA
## 2017 Q2 1.00 1.102109         NA
## 2017 Q3   NA       NA -0.5145013
## 2017 Q4   NA       NA -1.1060873
## 2018 Q1   NA       NA -0.9714737
## 2018 Q2   NA       NA -0.7363172
## 2018 Q3   NA       NA -0.5141008
nowcast.plot(now_2sq)

The estimated bridge equation is also available:

summary(now_2sq$reg)
## 
## Call:
## stats::lm(formula = Y ~ ., data = na.omit(data.frame(dados)))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.9387 -1.5141 -0.0262  1.1774  6.9170 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.04077    0.24487   0.166    0.868    
## X1           1.15204    0.25735   4.476 2.43e-05 ***
## X2          -5.01117    3.11085  -1.611    0.111    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.257 on 82 degrees of freedom
## Multiple R-squared:  0.2705, Adjusted R-squared:  0.2527 
## F-statistic: 15.21 on 2 and 82 DF,  p-value: 2.416e-06

The estimated factors are another output:

dfactors<-now_2sq$factors$dynamic_factors
window(dfactors,start=c(2017,1),end=c(2017,12),frequency=12)
##              Fator1      Fator2
## Jan 2017  2.3562489 -0.02063873
## Feb 2017  1.7409024 -0.07236217
## Mar 2017  1.7889105 -0.04609138
## Apr 2017  1.1561743 -0.09544473
## May 2017  0.9468537 -0.09199534
## Jun 2017  0.3631034 -0.12831952
## Jul 2017  0.3133052 -0.10331861
## Aug 2017 -0.4924744 -0.16164765
## Sep 2017 -1.4662869 -0.22628366
## Oct 2017 -1.8661595 -0.22018852
## Nov 2017 -1.6414041 -0.15255814
## Dec 2017 -1.4385843 -0.10186165
nowcast.plot(now_2sq,type='factors')

Monthly factors

The factors are estimated at a monthly frequency, aggregated to quarterly quantities, and then used to estimate the bridge equation.

This method makes it possible to build a monthly measure of a quarterly variable.

x<-Bpanel(BRGDP[,-8],rep(4,dim(BRGDP)[2]),aggregate = F)
## Warning in Bpanel(BRGDP[, -8], rep(4, dim(BRGDP)[2]), aggregate = F): 1
## serie(s) ruled out due to lack in observations (more than 1/3 is NA).
q<-1
r<-2
p<-1
now_2sm<-nowcast(y,x,q,r,p,method = '2sm')
nowcast.plot(now_2sm,type = 'month_y')

Expectation Maximization

This method is based on Bańbura et al. (2011). No bridge equation is needed: series of all frequencies can be used to estimate the factors jointly.

x<-Bpanel(BRGDP[,-8],rep(4,dim(BRGDP)[2]),aggregate = F)
## Warning in Bpanel(BRGDP[, -8], rep(4, dim(BRGDP)[2]), aggregate = F): 1
## serie(s) ruled out due to lack in observations (more than 1/3 is NA).
q<-1
r<-2
p<-1
now_em<-nowcast(y,x,q,r,p,'EM')

All three methods also return forecasts and nowcasts of the explanatory variables.

                serie1     serie12  serie1373  serie1453  serie1455  serie4394
2017-04-01   0.1463150  -0.0019756     -18573    -233900        4.4       6.15
2017-05-01   0.0998914  -0.0015804      55683     204451        0.8      -0.34
2017-06-01   0.2006708  -0.0035520     -45868      12963        0.2      -0.65
2017-07-01   0.0596820  -0.0005173       7061      61394        0.2       3.77
2017-08-01   0.0106851  -0.0031037      46739     -15090         NA      -7.60
2017-09-01  -0.0628380  -0.0029232         NA         NA         NA      -3.20
2017-10-01   0.0929364  -0.0003828         NA         NA         NA         NA

                serie1     serie12   serie1373    serie1453   serie1455   serie4394
2017-04-01   0.1463045  -0.0019756  -18575.455  -233885.604   4.3994361   6.1490789
2017-05-01   0.0998871  -0.0015807   55679.003   204432.528   0.7998648  -0.3396444
2017-06-01   0.2006487  -0.0035516  -45862.292    12935.740   0.1999513  -0.6503055
2017-07-01   0.0596828  -0.0005175    7060.997    61383.736   0.2000770   3.7694664
2017-08-01   0.0106836  -0.0031034   46731.551   -15080.380  -0.0335844  -7.5989564
2017-09-01  -0.0628212  -0.0029230  -12619.149    10293.632  -0.6178403  -3.1995361
2017-10-01   0.0929275  -0.0003828    5616.796    -4544.155   0.2606991   0.2600324

Data extraction and transformation

Data base available offline in the package

BRGDP

This panel contains a sample of Brazilian economic activity data extracted on 03/10/2017. It is available for practice and offline examples. The time series are identified by their codes in the Brazilian Central Bank Time Series Management System v2.1, shown in parentheses below.

  • Exchange rate - United States dollar in Brazilian currency (1);
  • Interest rate - CDI (12);
  • Vehicles production (1373);
  • Credit Sales Index (1453);
  • Retail sales (1455);
  • Current economic conditions index (4394);
  • Industrial production, general index (21859);
  • Quarterly GDP - observed data - GDP at market prices (22099).

Example of some series:

            serie1455  serie4394  serie21859  serie22099
2017-04-01       87.3      71.32        79.5          NA
2017-05-01       89.4      66.45        90.1          NA
2017-06-01       88.2      70.81        88.4      163.25
2017-07-01       89.9      73.52        92.3          NA
2017-08-01         NA      69.31        90.1          NA
2017-09-01         NA      70.10          NA          NA
2017-10-01         NA         NA          NA          NA

USGDP

This dataset contains data on the US economy taken from the replication files of Giannone et al. (2008). It is a list of two data frames.

USGDP$Base

It contains the time series with its values. Example of some series:

               IPTOT     IPFPS      IPFP      IPCG     IPDCG
2007-08-01  114.1040  113.9952  115.0203  109.5507  107.8738
2007-09-01  114.4008  114.2023  115.1328  109.3959  106.3782
2007-10-01  113.5893  113.2830  114.2329  108.5158  105.0395
2007-11-01  113.9019  113.4212  114.4993  108.3419  106.0263
2007-12-01        NA        NA        NA        NA        NA

USGDP$Legenda

It contains the legend with the specifications of the model and the series. Example of some series:

     col…  Transformation  mnemonic          Series                                             Release
189  189   2               PHBOS_HRSW        Outlook: Work hours                                Business Outlook Survey
190  190   1               DEFICIT           Federal govt. deficit or surplus (bil of $) (NSA)  Monthly Treasury Statement
191  191   1               MFG_MIDW_IND      Chicago Fed Midwest Mfg. Survey: General activity  Chicago Fed MM1 Survey
192  192   1               SALES_RETAIL_NOM  Sales: Retail & food services, total (mil of $)    Adv. Monthly sales
193  193   0               RGDPGR            Real GDP growth (annualized qurterly change)       GDP - release

Real Time Data Base

These functions create a real-time data base from series available in the Brazilian Central Bank Time Series Management System v2.1. They can be used to evaluate forecast models in real-time out-of-sample exercises.

Warning: We take no responsibility for delays in disclosure of new information or malfunctioning of the source API.

Collecting information now:

br_gdp<-base_extraction(series_code = 22099)
## 1 from 1 series extracted
window(br_gdp,start = c(2016,1),frequency = 12)
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 2016 161.17     NA     NA 162.82     NA     NA 164.38     NA     NA 161.54
## 2017 160.60     NA     NA 163.25     NA     NA     NA     NA     NA     NA
##         Nov    Dec
## 2016     NA     NA
## 2017     NA

What information was available yesterday?

We can answer this question for a limited set of time series. To list them, call the function RTDB with no arguments.

head(RTDB())
##   series_code
## 1           1
## 2          12
## 3         188
## 4         189
## 5         192
## 6         193

Retrieving the data of a specific vintage (see note 2):

serie1<-RTDB(series_code = 1,vintage ='2017-10-30')
window(serie1,start = c(2017,1),frequency = 12)
##                   Jan              Feb              Mar              Apr
## 2017 3.19660909090909 3.10419444444444 3.12793043478261 3.13617222222222
##                   May              Jun              Jul              Aug
## 2017 3.20950909090909 3.29536666666667 3.2061380952381  3.15091739130435
##                   Sep
## 2017 3.13479

Vintages for which this series is available:

tail(RTDB(series_code = 1))
##      vintages
## 65 2017-11-01
## 66 2017-11-02
## 67 2017-11-03
## 68 2017-11-04
## 69 2017-11-05
## 70 2017-11-06

Which series were available in a specific vintage:

head(RTDB(vintage ='2017-04-04'))
##   series_code
## 1           1
## 2          12
## 3         188
## 4         189
## 5         192
## 6         193

Pseudo RTDB

This function re-creates a pseudo real-time data base, supposing that there was no revision in the series and that the calendar of releases is fixed in a stylized way provided by the user. One can use this function to evaluate forecast models in pseudo real-time out-of-sample exercises.

Suppose all variables are released immediately after the end of the period, i.e. with 0 days of delay, and that we are on 2017-03-23.

prtdb<-PRTDB(mts = BRGDP,delay = rep(0,dim(BRGDP)[2]),vintage = '2017-03-23')
window(prtdb,start=c(2017,1),frequency=12)
##            serie1    serie12 serie1373 serie1453 serie1455 serie4394
## Jan 2017 3.196609 0.04904500    174713   1558773      88.1     68.19
## Feb 2017 3.104194 0.04779489    200385   1460795      81.1     74.55
## Mar 2017       NA         NA        NA        NA        NA        NA
##          serie21859 serie22099
## Jan 2017       77.6         NA
## Feb 2017       75.5         NA
## Mar 2017         NA         NA

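Now suppose the information is released in an asynchronous fashion, with a different delay (in days) for each series, and we are on 2016-12-04. In the output below, the series with a negative delay is already available for a month that has not yet ended.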

prtdb<-PRTDB(mts = BRGDP,delay = c(1,3,-50,6,60,15,120,0),vintage = '2016-12-04')
window(prtdb,start=c(2016,1),frequency=12)
##            serie1    serie12 serie1373 serie1453 serie1455 serie4394
## Jan 2016 4.052350 0.05248375    148693   1620912      89.2     57.08
## Feb 2016 3.973742 0.05246100    144183   1594130      84.2     66.52
## Mar 2016 3.703918 0.05246100    198830   1876637      90.4     53.50
## Apr 2016 3.565845 0.05246100    171517   1834999      85.8     51.88
## May 2016 3.539290 0.05246100    177159   1849636      87.1     47.35
## Jun 2016 3.424477 0.05246100    184483   1858193      85.7     52.36
## Jul 2016 3.275567 0.05246100    190612   1748880      87.2     51.30
## Aug 2016 3.209661 0.05246100    178704   1805584      87.1     54.69
## Sep 2016 3.256371 0.05246100    170304   1858652      84.0     58.68
## Oct 2016 3.185845 0.05211300    175710   1957165        NA     59.11
## Nov 2016 3.342030 0.05159100    216297        NA        NA        NA
## Dec 2016       NA         NA    199864        NA        NA        NA
##          serie21859 serie22099
## Jan 2016       76.2         NA
## Feb 2016       75.8         NA
## Mar 2016       83.7     161.17
## Apr 2016       83.0         NA
## May 2016       86.3         NA
## Jun 2016       87.7     162.82
## Jul 2016       89.6         NA
## Aug 2016         NA         NA
## Sep 2016         NA     164.38
## Oct 2016         NA         NA
## Nov 2016         NA         NA
## Dec 2016         NA         NA
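
The release rule behind these outputs can be sketched in base R. The helper below, pseudo_vintage, is hypothetical and not the package's PRTDB: it assumes an observation becomes available delay days after the last day of its reference month, which is consistent with the two examples above but is not taken from the package source.

```r
# Hypothetical helper (not the package's PRTDB): blank out observations that
# would not yet have been released on the vintage date.
pseudo_vintage <- function(mts, delay, vintage) {
  vintage <- as.Date(vintage)
  yr <- as.integer(floor(as.numeric(time(mts)) + 1e-8))   # calendar year of each row
  mo <- as.integer(cycle(mts))                            # calendar month of each row
  # Last day of the reference month = first day of the next month minus one day
  next_yr <- ifelse(mo == 12L, yr + 1L, yr)
  next_mo <- ifelse(mo == 12L, 1L, mo + 1L)
  period_end <- as.Date(sprintf("%d-%02d-01", next_yr, next_mo)) - 1
  vals <- matrix(as.numeric(mts), nrow = nrow(mts), dimnames = dimnames(mts))
  for (j in seq_len(ncol(vals))) {
    vals[period_end + delay[j] > vintage, j] <- NA        # assumed release rule
  }
  ts(vals, start = start(mts), frequency = frequency(mts))
}

# Toy monthly panel, Sep 2016 to Feb 2017
toy <- ts(cbind(a = 1:6, b = 11:16), start = c(2016, 9), frequency = 12)
pv <- pseudo_vintage(toy, delay = c(0, 60), vintage = "2016-12-04")
pv
```

With these inputs, series a keeps its values through November 2016 and series b only its September 2016 value, because its 60-day delay pushes the October release past the vintage date.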

Balanced Panel

This function transforms the original monthly time series into their stationary representation, following the user's specification. Time series with more than 1/3 of their observations missing (NAs) are dropped, and in the remaining series the missing values and outliers are replaced by approximated values.

The missing values and outliers are “corrected” following the same method available in the replication files of Giannone et al. (2008). Outliers are defined as observations that lie more than 4 IQR from the median. First, all missing values and outliers are replaced by the median. Then a centered moving average of degree k is calculated, forming a new panel. Finally, the missing values and outliers are replaced by the corresponding observations of this new panel. We have made one important modification to the outlier_correction function found in those files: here the median of an even-sized sample is the mean of the two most central values, rather than the larger of the two. Because of this modification, the results obtained with the original replication files (USGDP) differ slightly from those found here.
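
The correction steps can be sketched for a single series in base R. The function correct_series below is a hypothetical illustration, not the package's or the replication files' code; in particular, the default k = 3 and the treatment of the series edges (where the moving average is undefined, the median-filled value is kept) are assumptions made here.

```r
# Hypothetical helper following the steps described above
correct_series <- function(x, k = 3, n_iqr = 4) {
  med <- median(x, na.rm = TRUE)   # R averages the two central values for even-sized samples
  bad <- is.na(x) | abs(x - med) > n_iqr * IQR(x, na.rm = TRUE)   # missing values and outliers
  filled <- x
  filled[bad] <- med                                              # step 1: temporary median fill
  smooth <- as.numeric(stats::filter(filled, rep(1 / k, k), sides = 2))   # step 2: centered MA (k odd)
  smooth[is.na(smooth)] <- filled[is.na(smooth)]                  # edges: keep median-filled values
  out <- x
  out[bad] <- smooth[bad]                                         # step 3: replace flagged points only
  as.numeric(out)
}

x <- c(1.0, 1.2, 0.9, NA, 1.1, 50, 1.0, 0.8, 1.3, 1.1)
out <- correct_series(x)
out
```

In this toy series the NA in position 4 and the value 50 in position 6 are flagged; all other observations are returned unchanged.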

Finally, the monthly series can be aggregated to quarterly quantities following the Mariano and Murasawa (2003) approximation.
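
In its usual textbook form, this approximation expresses quarterly growth as a weighted sum of the current and four previous monthly growth rates, with weights (1, 2, 3, 2, 1)/3. The base R sketch below checks that identity on simulated data; whether the package applies exactly this scaling is an assumption not verified here.

```r
# Check: filtering monthly growth with weights (1,2,3,2,1)/3 reproduces the
# quarter-on-quarter change of the three-month average log-level.
set.seed(3)
log_level <- cumsum(rnorm(24, mean = 0.002, sd = 0.01))   # simulated monthly log-level
x_m <- c(NA, diff(log_level))                              # monthly growth rates

w <- c(1, 2, 3, 2, 1) / 3
agg <- as.numeric(stats::filter(x_m, w, sides = 1))        # weighted sum of x_t, ..., x_{t-4}

q_level  <- colMeans(matrix(log_level, nrow = 3))          # quarterly log-level (mean of 3 months)
q_growth <- diff(q_level)                                  # quarter-on-quarter growth

# Compare at the last month of each quarter, from the second quarter onwards
all.equal(agg[seq(6, 24, by = 3)], q_growth)
```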

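The transformation is specified by codes, as follows:

Code 1, monthly rate of change:

\[\frac{X_t - X_{t-1}}{X_{t-1}}\]

Code 2, monthly difference:

\[X_t - X_{t-1}\]

Code 3, monthly difference of the yearly rate of change:

\[\frac{X_t - X_{t-12}}{X_{t-12}} - \frac{X_{t-1} - X_{t-13}}{X_{t-13}}\]

Code 4, monthly difference of the yearly difference:

\[(X_t - X_{t-12}) - (X_{t-1} - X_{t-13})\]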
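
The four formulas can be written out in base R as follows. The helper names lagged and transform_series are hypothetical, and the mapping of the formulas to codes 1 to 4 in the order shown is an assumption; it is consistent with the code-4 value reported for serie1 in February 2017 in the output below (-0.01380675). The input values are the serie1 observations printed in the PRTDB example, with December 2016 left as NA because it is not shown there.

```r
# Hypothetical helpers; x is a numeric vector of monthly observations
lagged <- function(x, k) c(rep(NA, k), head(x, -k))   # x shifted back by k months

transform_series <- function(x, code) {
  x1 <- lagged(x, 1); x12 <- lagged(x, 12); x13 <- lagged(x, 13)
  switch(as.character(code),
    "1" = (x - x1) / x1,                        # monthly rate of change
    "2" = x - x1,                               # monthly difference
    "3" = (x - x12) / x12 - (x1 - x13) / x13,   # difference of yearly rates of change
    "4" = (x - x12) - (x1 - x13),               # difference of yearly differences
    stop("unknown transformation code"))
}

# serie1, Jan 2016 to Feb 2017 (Dec 2016 not shown in the text, hence NA)
serie1 <- c(4.052350, 3.973742, 3.703918, 3.565845, 3.539290, 3.424477,
            3.275567, 3.209661, 3.256371, 3.185845, 3.342030, NA,
            3.196609, 3.104194)
t4 <- transform_series(serie1, 4)
t4[14]   # February 2017
```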

bpanel<-Bpanel(BRGDP,rep(4,dim(BRGDP)[2]))
## Warning in Bpanel(BRGDP, rep(4, dim(BRGDP)[2])): 2 serie(s) ruled out due
## to lack in observations (more than 1/3 is NA).
window(bpanel,start=c(2017,1),end=c(2018,1),frequency=12)
##               serie1       serie12 serie1373 serie1453 serie1455 serie4394
## Jan 2017 -0.33687273 -0.0016607500    -30996    140891       4.8     -3.57
## Feb 2017 -0.01380675 -0.0012273611     30182    -71196      -2.0     -3.08
## Mar 2017  0.29355991 -0.0023528889    -19599    169280       0.2      5.26
## Apr 2017  0.14631497 -0.0019755556    -18573   -233900       4.4      6.15
## May 2017  0.09989139 -0.0015804444     55683    204451       0.8     -0.34
## Jun 2017  0.20067078 -0.0035520000    -45868     12963       0.2     -0.65
## Jul 2017  0.05968203 -0.0005172857      7061     61394       0.2      3.77
## Aug 2017  0.01068509 -0.0031037143     46739    -15090        NA     -7.60
## Sep 2017 -0.06283795 -0.0029232000        NA        NA        NA     -3.20
## Oct 2017  0.09293643 -0.0003828000        NA        NA        NA        NA
## Nov 2017          NA            NA        NA        NA        NA        NA
## Dec 2017          NA            NA        NA        NA        NA        NA
## Jan 2018          NA            NA        NA        NA        NA        NA

month2qtr

This function converts a monthly time series into a quarterly one by taking the value of the last month of each quarter to represent it.

Example:

gdp_month<-BRGDP[,'serie22099']
window(gdp_month,c(2016,1),frequency=12)
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 2016     NA     NA 161.17     NA     NA 162.82     NA     NA 164.38     NA
## 2017     NA     NA 160.60     NA     NA 163.25     NA     NA     NA     NA
##         Nov    Dec
## 2016     NA 161.54
## 2017
gdp_qtr<-month2qtr(gdp_month)
window(gdp_qtr,c(2016,1),frequency=4)
##        Qtr1   Qtr2   Qtr3   Qtr4
## 2016 161.17 162.82 164.38 161.54
## 2017 160.60 163.25     NA
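
For readers who want to see the operation spelled out, the same conversion can be reproduced in base R. This sketch is not the package's implementation and assumes the monthly series starts in January.

```r
# Base R equivalent: keep the observation in the last month of each quarter
gdp_m <- ts(c(NA, NA, 161.17, NA, NA, 162.82, NA, NA, 164.38, NA, NA, 161.54),
            start = c(2016, 1), frequency = 12)

last_month <- as.vector(cycle(gdp_m)) %% 3 == 0        # months 3, 6, 9 and 12
gdp_q <- ts(as.numeric(gdp_m)[last_month],
            start = c(2016, 1), frequency = 4)         # assumes the series starts in January
gdp_q
```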

qtr2month

This function converts a quarterly time series into a monthly one, reversing month2qtr: each quarterly value is placed in the last month of its quarter and the other months are set to NA.

Example:

gdp_month2<-qtr2month(gdp_qtr)
window(gdp_month2,c(2016,1),frequency=12)
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 2016     NA     NA 161.17     NA     NA 162.82     NA     NA 164.38     NA
## 2017     NA     NA 160.60     NA     NA 163.25     NA     NA     NA       
##         Nov    Dec
## 2016     NA 161.54
## 2017
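
The reverse operation can also be sketched in base R, again as an illustration rather than the package's code: each quarterly value is written to the third month of its quarter and the remaining months stay NA.

```r
# Base R equivalent: spread quarterly values over a monthly grid
gdp_q <- ts(c(161.17, 162.82, 164.38, 161.54), start = c(2016, 1), frequency = 4)

monthly <- rep(NA_real_, 3 * length(gdp_q))
monthly[seq(3, length(monthly), by = 3)] <- as.numeric(gdp_q)   # last month of each quarter

first_month <- 3 * (start(gdp_q)[2] - 1) + 1                     # first month of the starting quarter
gdp_m2 <- ts(monthly, start = c(start(gdp_q)[1], first_month), frequency = 12)
gdp_m2
```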

References:

Bańbura, M., & Rünstler, G. (2011). A look into the factor model black box: publication lags and the role of hard and soft data in forecasting GDP. International Journal of Forecasting, 27(2), 333-346.

Bańbura, M., Giannone, D., & Reichlin, L. (2011). Nowcasting. In M. P. Clements & D. F. Hendry (Eds.), Oxford Handbook on Economic Forecasting, 193-224.

Giannone, D., Reichlin, L., & Small, D. (2008). Nowcasting: The real-time informational content of macroeconomic data. Journal of Monetary Economics, 55(4), 665-676.

Mariano, R. S., & Murasawa, Y. (2003). A new coincident index of business cycles based on monthly and quarterly series. Journal of Applied Econometrics, 18(4), 427-443.

Stock, J. H., & Watson, M. (2011). Dynamic factor models. Oxford Handbook on Economic Forecasting.


  1. For more details see Stock and Watson (2011).

  2. Here a vintage is the day when the information was gathered.