The Particle Filter

Nonlinear DSGE Models

From Linear to Nonlinear DSGE Models

Some Prominent Examples

Particle Filter

Particle Filters

Filtering - General Idea

Bootstrap Particle Filter

Likelihood Approximation

The Role of Measurement Errors

Generic Particle Filter – Recursion

Asymptotics

Adapting the Generic PF

More on Conditionally-Linear Models

\begin{itemize}
\item Then,
\begin{eqnarray}
\lefteqn{\int h(m_{t},s_{t}) p(m_t,s_t|Y_{1:t-1}) d(m_t,s_t)} \nonumber \\
&=& \int \left[ \int h(m_{t},s_{t}) p(s_{t}|m_{t},Y_{1:t-1}) d s_{t} \right] p(m_{t}|Y_{1:t-1}) dm_{t} \nonumber \\
&\approx& \frac{1}{M} \sum_{j=1}^M \left[ \int h(m_{t}^j,s_{t}) p_N\big(s_t| \tilde{s}_{t|t-1}^j,P_{t|t-1}^j \big) ds_t \right] \omega_t^j W_{t-1}^j \label{eq_generalpfhtt1condlinear}
\end{eqnarray}
\item The likelihood approximation is based on the incremental weights
\begin{equation}
\tilde{w}_t^j = p_N \big(y_t|\tilde{y}_{t|t-1}^j,F_{t|t-1}^j \big) \omega_t^j.
\label{eq_generalpfincrweightcondlinear}
\end{equation}
\item Conditional on $\tilde{m}_t^j$ we can use the Kalman filter once more to update the information about $s_t$ in view of the current observation $y_t$:
\begin{equation}
\begin{array}{lcl}
\tilde{s}_{t|t}^j &=& \tilde{s}_{t|t-1}^j + P_{t|t-1}^j \Psi_2(\tilde{m}^j_t)' \big( F_{t|t-1}^j \big)^{-1} (y_t - \tilde{y}^j_{t|t-1}) \\
\tilde{P}_{t|t}^j &=& P^j_{t|t-1} - P^j_{t|t-1} \Psi_2(\tilde{m}^j_t)'\big(F^j_{t|t-1} \big)^{-1} \Psi_2(\tilde{m}^j_t) P_{t|t-1}^j.
\end{array}
\label{eq_pfupdatecondlinear}
\end{equation}
\end{itemize}

Particle Filter For Conditionally Linear Models

\begin{enumerate} \item {\bf Initialization.}

\item {\bf Recursion.} For $t=1,\ldots,T$:
\begin{enumerate}
	\item {\bf Forecasting $s_t$.} Draw $\tilde{m}_t^j$ from density $g_t(\tilde{m}_t|m_{t-1}^j,\theta)$,
	calculate the importance weights $\omega_t^j$ in~(\ref{eq_generalpfomegacondlinear}),
	and compute $\tilde{s}_{t|t-1}^j$ and $P_{t|t-1}^j$ according to~(\ref{eq_pfforeccondlinear}).
	An approximation of $\mathbb{E}[h(s_t,m_t)|Y_{1:t-1},\theta]$ is given by~(\ref{eq_generalpfhtt1condlinear}).
	\item {\bf Forecasting $y_t$.} Compute the incremental weights $\tilde{w}_t^j$
	according to~(\ref{eq_generalpfincrweightcondlinear}).
	Approximate the predictive density $p(y_t|Y_{1:t-1},\theta)$
	by
	\begin{equation}
	\hat{p}(y_t|Y_{1:t-1},\theta) = \frac{1}{M} \sum_{j=1}^M \tilde{w}^j_t W_{t-1}^j.
	\end{equation}
	\item {\bf Updating.} Define the normalized weights
	\begin{equation}
	\tilde{W}_t^j = \frac{\tilde{w}_t^j W_{t-1}^j}{\frac{1}{M} \sum_{j=1}^M \tilde{w}_t^j W_{t-1}^j}
	\end{equation}
	and compute $\tilde{s}_{t|t}^j$ and $\tilde{P}_{t|t}^j$ according to~(\ref{eq_pfupdatecondlinear}). An approximation of $\mathbb{E}[h(m_{t},s_{t})|Y_{1:t},\theta]$ can be obtained
	from $\{\tilde{m}_t^j,\tilde{s}_{t|t}^j,\tilde{P}_{t|t}^j,\tilde{W}_t^j\}$.
	\item {\bf Selection.}
\end{enumerate}

\end{enumerate}
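The recursion above translates almost line by line into code. The sketch below implements one time-$t$ step under the assumption of a hypothetical `model` object that supplies the proposal draw for $m_t$ together with its weight $\omega_t^j$ and the conditionally linear transition and measurement matrices; these names and signatures are illustrative, not part of the notes.

```python
# Minimal sketch of one recursion of the particle filter for a conditionally
# linear model.  The model-specific pieces (the proposal g_t, the weight
# omega_t^j, and the matrices returned by model.transition / model.measurement)
# are placeholders; names and signatures are illustrative assumptions.
import numpy as np
from scipy.stats import multivariate_normal as mvn

def cl_pf_step(y_t, m_prev, s_prev, P_prev, W_prev, model, rng):
    """One time-t step; particle j carries (m^j, s_{t|t}^j, P_{t|t}^j, W^j)."""
    M = len(W_prev)
    m_new = np.empty_like(m_prev)
    s_upd = np.empty_like(s_prev)
    P_upd = np.empty_like(P_prev)
    w_inc = np.empty(M)

    for j in range(M):
        # 1. Forecasting s_t: draw the nonlinear state and its importance weight
        m_j, omega_j = model.draw_m(m_prev[j], rng)          # g_t and omega_t^j
        Phi0, Phi1, Seps = model.transition(m_j)             # linear transition given m_j
        Psi0, Psi2, Su = model.measurement(m_j)              # linear measurement given m_j
        s_pred = Phi0 + Phi1 @ s_prev[j]                     # \tilde{s}_{t|t-1}^j
        P_pred = Phi1 @ P_prev[j] @ Phi1.T + Seps            # P_{t|t-1}^j

        # 2. Forecasting y_t: incremental weight from the Gaussian predictive density
        y_pred = Psi0 + Psi2 @ s_pred                        # \tilde{y}_{t|t-1}^j
        F_pred = Psi2 @ P_pred @ Psi2.T + Su                 # F_{t|t-1}^j
        w_inc[j] = mvn.pdf(y_t, mean=y_pred, cov=F_pred) * omega_j

        # 3. Updating: Kalman update of (s, P) conditional on m_j and y_t
        K = P_pred @ Psi2.T @ np.linalg.inv(F_pred)
        s_upd[j] = s_pred + K @ (y_t - y_pred)
        P_upd[j] = P_pred - K @ Psi2 @ P_pred
        m_new[j] = m_j

    # Likelihood increment and normalized weights (weights average to one)
    lik_t = np.mean(w_inc * W_prev)                          # \hat{p}(y_t|Y_{1:t-1})
    W_new = w_inc * W_prev / lik_t

    # 4. Selection: multinomial resampling, weights reset to one
    idx = rng.choice(M, size=M, p=W_new / W_new.sum())
    return m_new[idx], s_upd[idx], P_upd[idx], np.ones(M), lik_t
```

Resampling is carried out every period here for simplicity; in practice the selection step is often triggered only when the effective sample size falls below a threshold.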

Nonlinear and Partially Deterministic State Transitions

\begin{itemize} \item Example: \[ s_{1,t} = \Phi_1(s_{t-1},\epsilon_t), \quad s_{2,t} = \Phi_2(s_{t-1}), \quad \epsilon_t \sim N(0,1). \] \item Generic filter requires evaluation of $p(s_t|s_{t-1})$. \spitem Define $\varsigma_t = [s_t',\epsilon_t']'$ and add the identity $\epsilon_t = \epsilon_t$ to the state transition. \spitem Factorize the density $p(\varsigma_t|\varsigma_{t-1})$ as \[ p(\varsigma_t|\varsigma_{t-1}) = p^\epsilon(\epsilon_t) p(s_{1,t}|s_{t-1},\epsilon_t) p(s_{2,t}|s_{t-1}), \] where $p(s_{1,t}|s_{t-1},\epsilon_t)$ and $p(s_{2,t}|s_{t-1})$ are point masses. \item Sample the innovation $\epsilon_t$ from $g_t^\epsilon(\epsilon_t|s_{t-1})$. \item Then \[ \omega_t^j = \frac{ p(\tilde{\varsigma}^j_t|\varsigma^j_{t-1}) }{g_t (\tilde{\varsigma}^j_t|\varsigma^j_{t-1})} = \frac{ p^\epsilon( \tilde{\epsilon}_t^j) p(\tilde{s}_{1,t}^j|s^j_{t-1},\tilde{\epsilon}^j_t) p(\tilde{s}^j_{2,t}|s^j_{t-1}) } { g_t^\epsilon(\tilde{\epsilon}^j_t|s^j_{t-1}) p(\tilde{s}_{1,t}^j|s^j_{t-1},\tilde{\epsilon}^j_t) p(\tilde{s}^j_{2,t}|s^j_{t-1}) } = \frac{ p^\epsilon(\tilde{\epsilon}_t^j)}{g_t^\epsilon(\tilde{\epsilon}^j_t|s^j_{t-1})}. \label{eq_pfomegaepsilon} \] \end{itemize}
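As a concrete illustration of the cancellation in $\omega_t^j$, the forecasting step only ever evaluates the two innovation densities; the deterministic blocks are merely propagated. The sketch below uses hypothetical `Phi1`, `Phi2`, and `propose_eps` callables and assumes a scalar $N(0,1)$ innovation, as in the example.

```python
# Sketch of the forecasting step with a partially deterministic state
# transition.  Phi1, Phi2 (transition blocks) and propose_eps (the innovation
# proposal g_t^eps) are user-supplied placeholders; only the innovation
# densities enter omega_t^j because the point-mass terms cancel.
import numpy as np
from scipy.stats import norm

def forecast_partially_deterministic(s_prev, Phi1, Phi2, propose_eps, rng):
    """s_prev: (M, n_s) array of s_{t-1} particles; returns s_t particles and omega_t."""
    M = s_prev.shape[0]
    s_new, omega = [], np.empty(M)
    for j in range(M):
        eps_j, g_pdf = propose_eps(s_prev[j], rng)    # draw eps_t^j and its proposal density
        s1 = Phi1(s_prev[j], eps_j)                   # stochastic block of the transition
        s2 = Phi2(s_prev[j])                          # purely deterministic block
        s_new.append(np.concatenate([np.atleast_1d(s1), np.atleast_1d(s2)]))
        omega[j] = norm.pdf(eps_j) / g_pdf            # p^eps(eps) / g_t^eps(eps|s_{t-1}) for eps ~ N(0,1)
    return np.asarray(s_new), omega
```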

Degenerate Measurement Error Distributions

\begin{itemize} \item Our discussion of the conditionally-optimal importance distribution suggests that in the absence of measurement errors, one has to solve the system of equations \[ y_t = \Psi \big( \Phi( s_{t-1}^j,\tilde{\epsilon}_t^j) \big), \label{eq_pfepssystem} \] to determine $\tilde{\epsilon}_t^j$ as a function of $s_{t-1}^j$ and the current observation $y_t$. \spitem Then define \[ \omega_t^j = p^\epsilon(\tilde{\epsilon}_t^j) \quad \mbox{and} \quad \tilde{s}_t^j = \Phi( s_{t-1}^j,\tilde{\epsilon}_t^j). \] \item Difficulty: one has to find all solutions to a nonlinear system of equations. \spitem While resampling duplicates particles, the duplicated particles do not mutate, which can lead to a degeneracy. \end{itemize}
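A minimal sketch of that inversion step for a single particle, using a generic root finder (`Psi` and `Phi` are user-supplied model functions; i.i.d. $N(0,1)$ innovations are assumed). Note that `fsolve` returns at most one root, which is exactly the difficulty mentioned above.

```python
# Illustrative sketch (not from the notes): with no measurement error, the
# innovation eps_t^j must solve y_t = Psi(Phi(s_{t-1}^j, eps)).  A generic
# root finder delivers only one solution; finding all of them is the hard part.
import numpy as np
from scipy.optimize import fsolve
from scipy.stats import norm

def draw_eps_no_meas_error(y_t, s_prev_j, Psi, Phi, eps_guess):
    resid = lambda eps: Psi(Phi(s_prev_j, eps)) - y_t
    eps_j, info, flag, msg = fsolve(resid, eps_guess, full_output=True)
    if flag != 1:
        raise RuntimeError("no solution found for this particle: " + msg)
    omega_j = np.prod(norm.pdf(eps_j))      # p^eps(eps_t^j) for i.i.d. N(0,1) innovations
    s_j = Phi(s_prev_j, eps_j)              # \tilde{s}_t^j
    return eps_j, s_j, omega_j
```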

Next Steps

\begin{itemize} \item We will now apply PFs to linearized DSGE models. \item This allows us to compare the Monte Carlo approximation to the ``truth.'' \item Small-scale New Keynesian DSGE model \item Smets-Wouters model \end{itemize}

Illustration 1: Small-Scale DSGE Model

Parameter Values For Likelihood Evaluation

\begin{center}
\begin{tabular}{lcclcc}
\hline\hline
Parameter & $\theta^{m}$ & $\theta^{l}$ & Parameter & $\theta^{m}$ & $\theta^{l}$ \\ \hline
$\tau$ & 2.09 & 3.26 & $\kappa$ & 0.98 & 0.89 \\
$\psi_1$ & 2.25 & 1.88 & $\psi_2$ & 0.65 & 0.53 \\
$\rho_r$ & 0.81 & 0.76 & $\rho_g$ & 0.98 & 0.98 \\
$\rho_z$ & 0.93 & 0.89 & $r^{(A)}$ & 0.34 & 0.19 \\
$\pi^{(A)}$ & 3.16 & 3.29 & $\gamma^{(Q)}$ & 0.51 & 0.73 \\
$\sigma_r$ & 0.19 & 0.20 & $\sigma_g$ & 0.65 & 0.58 \\
$\sigma_z$ & 0.24 & 0.29 & $\ln p(Y|\theta)$ & -306.5 & -313.4 \\
\hline
\end{tabular}
\end{center}

Likelihood Approximation

\begin{center} \begin{tabular}{c} $\ln \hat{p}(y_t|Y_{1:t-1},\theta^m)$ vs. $\ln p(y_t|Y_{1:t-1},\theta^m)$ \\ \includegraphics[width=3.2in]{static/dsge1_me_paramax_lnpy.pdf} \end{tabular} \end{center}

Notes: The results depicted in the figure are based on a single run of the bootstrap PF (dashed, \(M=40,000\)), the conditionally-optimal PF (dotted, \(M=400\)), and the Kalman filter (solid).

Filtered State

\begin{center} \begin{tabular}{c} $\widehat{\mathbb{E}}[\hat{g}_t|Y_{1:t},\theta^m]$ vs. $\mathbb{E}[\hat{g}_t|Y_{1:t},\theta^m]$ \\ \includegraphics[width=3.2in]{static/dsge1_me_paramax_ghat.pdf} \end{tabular} \end{center}

Notes: The results depicted in the figure are based on a single run of the bootstrap PF (dashed, \(M=40,000\)), the conditionally-optimal PF (dotted, \(M=400\)), and the Kalman filter (solid).

Distribution of Log-Likelihood Approximation Errors

\begin{center} \begin{tabular}{c} Bootstrap PF: $\theta^m$ vs. $\theta^l$ \\ \includegraphics[width=3in]{static/dsge1_me_bootstrap_lnlhbias.pdf} \end{tabular} \end{center}

Notes: Density estimate of \(\hat{\Delta}_1 = \ln \hat{p}(Y_{1:T}|\theta)- \ln p(Y_{1:T}|\theta)\) based on \(N_{run}=100\) runs of the PF. Solid line is \(\theta = \theta^m\); dashed line is \(\theta = \theta^l\) (\(M=40,000\)).

Distribution of Log-Likelihood Approximation Errors

\begin{center} \begin{tabular}{c} $\theta^m$: Bootstrap vs. Cond. Opt. PF \\ \includegraphics[width=3in]{static/dsge1_me_paramax_lnlhbias.pdf} \end{tabular} \end{center}

Notes: Density estimate of \(\hat{\Delta}_1 = \ln \hat{p}(Y_{1:T}|\theta)- \ln p(Y_{1:T}|\theta)\) based on \(N_{run}=100\) runs of the PF. Solid line is bootstrap particle filter (\(M=40,000\)); dotted line is conditionally optimal particle filter (\(M=400\)).

Summary Statistics for Particle Filters

\begin{center}
\begin{tabular}{lrrr}
\hline\hline
 & Bootstrap & Cond. Opt. & Auxiliary \\ \hline
Number of Particles $M$ & 40,000 & 400 & 40,000 \\
Number of Repetitions & 100 & 100 & 100 \\ \hline
\multicolumn{4}{c}{High Posterior Density: $\theta = \theta^m$} \\ \hline
Bias $\hat{\Delta}_1$ & -1.39 & -0.10 & -2.83 \\
StdD $\hat{\Delta}_1$ & 2.03 & 0.37 & 1.87 \\
Bias $\hat{\Delta}_2$ & 0.32 & -0.03 & -0.74 \\ \hline
\multicolumn{4}{c}{Low Posterior Density: $\theta = \theta^l$} \\ \hline
Bias $\hat{\Delta}_1$ & -7.01 & -0.11 & -6.44 \\
StdD $\hat{\Delta}_1$ & 4.68 & 0.44 & 4.19 \\
Bias $\hat{\Delta}_2$ & -0.70 & -0.02 & -0.50 \\ \hline
\end{tabular}
\end{center}

Notes: \(\hat{\Delta}_1 = \ln \hat{p}(Y_{1:T}|\theta) - \ln p(Y_{1:T}|\theta)\) and \(\hat{\Delta}_2 = \exp[ \ln \hat{p}(Y_{1:T}|\theta) - \ln p(Y_{1:T}|\theta) ] - 1\). Results are based on \(N_{run}=100\) runs of the particle filters.
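For reference, the bias and standard-deviation entries in such tables can be computed from repeated filter runs along the following lines (a sketch; array names are illustrative).

```python
# Minimal sketch of the table statistics.  loglik_hat is a length-N_run array
# of ln p_hat(Y|theta) values from repeated particle-filter runs; loglik_kf is
# the exact Kalman-filter value ln p(Y|theta).
import numpy as np

def pf_summary(loglik_hat, loglik_kf):
    delta1 = loglik_hat - loglik_kf          # log-likelihood approximation errors
    delta2 = np.exp(delta1) - 1.0            # approximation errors of the likelihood itself
    return {"Bias D1": delta1.mean(),
            "StdD D1": delta1.std(ddof=1),
            "Bias D2": delta2.mean()}
```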

Great Recession and Beyond

\begin{center} \begin{tabular}{c} Mean of Log-likelihood Increments $\ln \hat{p}(y_t|Y_{1:t-1},\theta^m)$ \\ \includegraphics[width=3in]{static/dsge1_me_great_recession_lnpy.pdf} \end{tabular} \end{center}

Notes: Solid lines represent results from Kalman filter. Dashed lines correspond to bootstrap particle filter (\(M=40,000\)) and dotted lines correspond to conditionally-optimal particle filter (\(M=400\)). Results are based on \(N_{run}=100\) runs of the filters.

Great Recession and Beyond

\begin{center} \begin{tabular}{c} Mean of Log-likelihood Increments $\ln \hat{p}(y_t|Y_{1:t-1},\theta^m)$ \\ \includegraphics[width=2.9in]{static/dsge1_me_post_great_recession_lnpy.pdf} \end{tabular} \end{center}

Notes: Solid lines represent results from Kalman filter. Dashed lines correspond to bootstrap particle filter (\(M=40,000\)) and dotted lines correspond to conditionally-optimal particle filter (\(M=400\)). Results are based on \(N_{run}=100\) runs of the filters.

Great Recession and Beyond

\begin{center} \begin{tabular}{c} Log Standard Dev of Log-Likelihood Increments \\ \includegraphics[width=3in]{static/dsge1_me_great_recession_lnpy_lnstd.pdf} \end{tabular} \end{center}

Notes: Solid lines represent results from Kalman filter. Dashed lines correspond to bootstrap particle filter (\(M=40,000\)) and dotted lines correspond to conditionally-optimal particle filter (\(M=400\)). Results are based on \(N_{run}=100\) runs of the filters.

SW Model: Distr. of Log-Likelihood Approximation Errors

\begin{center} \begin{tabular}{c} BS ($M=40,000$) versus CO ($M=4,000$) \\ \includegraphics[width=3in]{static/sw_me_paramax_lnlhbias.pdf} \end{tabular} \end{center}

Notes: Density estimates of \(\hat{\Delta}_1 = \ln \hat{p}(Y|\theta)- \ln p(Y|\theta)\) based on \(N_{run}=100\). Solid densities summarize results for the bootstrap (BS) particle filter; dashed densities summarize results for the conditionally-optimal (CO) particle filter.

SW Model: Distr. of Log-Likelihood Approximation Errors

\begin{center} \begin{tabular}{c} BS ($M=400,000$) versus CO ($M=4,000$) \\ \includegraphics[width=3in]{static/sw_me_paramax_bs_lnlhbias.pdf} \end{tabular} \end{center}

Notes: Density estimates of \(\hat{\Delta}_1 = \ln \hat{p}(Y|\theta)- \ln p(Y|\theta)\) based on \(N_{run}=100\). Solid densities summarize results for the bootstrap (BS) particle filter; dashed densities summarize results for the conditionally-optimal (CO) particle filter.

SW Model: Summary Statistics for Particle Filters

\begin{center}
\begin{tabular}{lrrrr}
\hline\hline
 & \multicolumn{2}{c}{Bootstrap} & \multicolumn{2}{c}{Cond. Opt.} \\ \hline
Number of Particles $M$ & 40,000 & 400,000 & 4,000 & 40,000 \\
Number of Repetitions & 100 & 100 & 100 & 100 \\ \hline
\multicolumn{5}{c}{High Posterior Density: $\theta = \theta^m$} \\ \hline
Bias $\hat{\Delta}_1$ & -238.49 & -118.20 & -8.55 & -2.88 \\
StdD $\hat{\Delta}_1$ & 68.28 & 35.69 & 4.43 & 2.49 \\
Bias $\hat{\Delta}_2$ & -1.00 & -1.00 & -0.87 & -0.41 \\ \hline
\multicolumn{5}{c}{Low Posterior Density: $\theta = \theta^l$} \\ \hline
Bias $\hat{\Delta}_1$ & -253.89 & -128.13 & -11.48 & -4.91 \\
StdD $\hat{\Delta}_1$ & 65.57 & 41.25 & 4.98 & 2.75 \\
Bias $\hat{\Delta}_2$ & -1.00 & -1.00 & -0.97 & -0.64 \\ \hline
\end{tabular}
\end{center}

Notes: \(\hat{\Delta}_1 = \ln \hat{p}(Y_{1:T}|\theta) - \ln p(Y_{1:T}|\theta)\) and \(\hat{\Delta}_2 = \exp[ \ln \hat{p}(Y_{1:T}|\theta) - \ln p(Y_{1:T}|\theta) ] - 1\). Results are based on \(N_{run}=100\).

Tempered Particle Filter

The Key Idea

\begin{itemize}

\spitem Define
\begin{eqnarray*}
p_n(y_t|s_t,\theta) &\propto& {\color{blue}\phi_n^{d/2}}
|\Sigma_u(\theta)|^{-1/2}\exp \bigg\{ - \frac{1}{2} (y_t - \Psi(s_t,t;\theta))' \\
&& \times {\color{blue}\phi_n} \Sigma_u^{-1}(\theta)(y_t - \Psi(s_t,t;\theta)) \bigg\},
\end{eqnarray*}
where
\[
{\color{blue} \phi_1 < \phi_2 < \ldots < \phi_{N_\phi} = 1}.
\]
\item {\color{red} Bridge posteriors given $s_{t-1}$:}
\[
p_n(s_t|y_t,s_{t-1},\theta)
  \propto p_n(y_t|s_t,\theta) p(s_t|s_{t-1},\theta).
\]
\item {\color{red} Bridge posteriors given $Y_{1:t-1}$:}
\[
p_n(s_t|Y_{1:t}) = \int p_n(s_t|y_t,s_{t-1},\theta) p(s_{t-1}|Y_{1:t-1}) ds_{t-1}.
\]

\end{itemize}
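A minimal sketch of the tempered log measurement density defined above; the $2\pi$ constant, which drops out under proportionality, is included here so the function returns a proper Gaussian log density (`Psi` and `Sigma_u` are user-supplied).

```python
# Sketch of the tempered (log) measurement density used by the TPF.  Inflating
# Sigma_u by 1/phi_n (with phi_n < 1) makes the early bridge distributions
# easier for the bootstrap-style proposal to cover.
import numpy as np

def log_p_n(y_t, s_t, t, phi_n, Psi, Sigma_u):
    d = y_t.shape[0]
    err = y_t - Psi(s_t, t)
    Sig_inv = np.linalg.inv(Sigma_u)
    _, logdet = np.linalg.slogdet(Sigma_u)
    return (0.5 * d * np.log(phi_n)               # phi_n^{d/2}
            - 0.5 * d * np.log(2.0 * np.pi)       # Gaussian normalizing constant
            - 0.5 * logdet                        # |Sigma_u|^{-1/2}
            - 0.5 * phi_n * err @ Sig_inv @ err)  # tempered quadratic form
```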

Algorithm Overview

Overview

An Illustration: \(p_n(s_t|Y_{1:t})\), \(n=1,\ldots,N_\phi\).

\begin{center} \includegraphics[width=4in]{static/phi_evolution.pdf} \end{center}

Choice of \(\phi_n\)

\begin{itemize} \spitem Based on Geweke and Frischknecht (2014). \spitem {\color{blue} Express post-correction inefficiency ratio as} \[ \mbox{InEff}(\phi_n) = \frac{\frac{1}{M} \sum_{j=1}^M \exp [ -2(\phi_n-\phi_{n-1}) e_{j,t}] }{ \left(\frac{1}{M} \sum_{j=1}^M \exp [ -(\phi_n-\phi_{n-1}) e_{j,t}] \right)^2} \] where \[ e_{j,t} = \frac{1}{2} (y_t - \Psi(s_t^{j,n-1},t;\theta))' \Sigma_u^{-1}(y_t - \Psi(s_t^{j,n-1},t;\theta)). \] \item {\color{red} Pick target ratio $r^*$ and solve equation $\mbox{InEff}(\phi_n^*) = r^*$ for $\phi_n^*$.} \end{itemize}
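A sketch of the resulting adaptive choice of $\phi_n^*$, assuming $\mbox{InEff}(\phi_n)$ is increasing in $\phi_n$ on $(\phi_{n-1},1]$ so that a standard bracketing root finder applies (function and variable names are illustrative).

```python
# Sketch of the adaptive tempering schedule: find phi_n such that the
# post-correction inefficiency ratio hits the target r_star, or jump to
# phi_n = 1 if even the final stage already satisfies the target.
# e is the length-M array of e_{j,t} computed from the stage-(n-1) particles.
import numpy as np
from scipy.optimize import brentq

def ineff(phi_n, phi_prev, e):
    w = np.exp(-(phi_n - phi_prev) * e)       # incremental tempering weights
    return np.mean(w**2) / np.mean(w)**2

def next_phi(phi_prev, e, r_star):
    if ineff(1.0, phi_prev, e) <= r_star:     # last stage: go straight to phi = 1
        return 1.0
    # InEff(phi_prev) = 1 < r_star and InEff(1) > r_star, so a root is bracketed
    return brentq(lambda p: ineff(p, phi_prev, e) - r_star, phi_prev, 1.0)
```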

Small-Scale Model: PF Summary Statistics

\begin{tabular}{l@{\hspace{1cm}}r@{\hspace{1cm}}rrrr}
\hline\hline
 & BSPF & \multicolumn{4}{c}{TPF} \\ \hline
Number of Particles $M$ & 40k & 4k & 4k & 40k & 40k \\
Target Ineff. Ratio $r^*$ & & 2 & 3 & 2 & 3 \\ \hline
\multicolumn{6}{c}{High Posterior Density: $\theta = \theta^m$} \\ \hline
Bias & -1.4 & -0.9 & -1.5 & -0.3 & -.05 \\
StdD & 1.9 & 1.4 & 1.7 & 0.4 & 0.6 \\
$T^{-1}\sum_{t=1}^{T}N_{\phi,t}$ & 1.0 & 4.3 & 3.2 & 4.3 & 3.2 \\
Average Run Time (s) & 0.8 & 0.4 & 0.3 & 4.0 & 3.3 \\ \hline
\multicolumn{6}{c}{Low Posterior Density: $\theta = \theta^l$} \\ \hline
Bias & -6.5 & -2.1 & -3.1 & -0.3 & -0.6 \\
StdD & 5.3 & 2.1 & 2.6 & 0.8 & 1.0 \\
$T^{-1}\sum_{t=1}^{T}N_{\phi,t}$ & 1.0 & 4.4 & 3.3 & 4.4 & 3.3 \\
Average Run Time (s) & 1.6 & 0.4 & 0.3 & 3.7 & 2.9 \\ \hline
\end{tabular}

Computational Considerations

Parallel Particle Filtering

Parallel Resampling

Weight Balancing

Speed Gains from Parallelization (100 Likelihood Evaluations)

\vspace*{-0.25in}

\begin{center} \includegraphics[width=4.8in]{static/parallel_pf} \end{center}
