Cranking through Bayesian Calculus, Part III

Last time we talked about the regression model,

\begin{equation} \label{eq:regression} y_t = x_t'\beta + u_t, \quad u_t \sim iid N(0, \sigma^2). \end{equation}

We focused on the “new” parameter, \(\sigma^2\), and talked about how to construct a prior for it. We then described a few different parameterizations of the prior. Finally, we derived the posterior for \(\sigma^2\), under the likelihood defined in (\ref{eq:regression}) with the restriction that \(\beta=0\). Today we’re going to focus on jointly estimating the two parameters of our regression model: \((\beta,\sigma^2)\). We’ll refer to this vector of parameters as \(\theta\). Also, let’s make it explicit that \(\beta\) is a \(k \times 1\) vector; that is, there are \(k\) explanatory variables in our regression.

First, we’ll introduce a new distribution defined jointly over \((\beta,\sigma^2)\), which we’ll use as our prior distribution. To do so, we’ll factorize the prior as follows: \[ p(\beta,\sigma^2) = p(\beta|\sigma^2)p(\sigma^2). \] We say that \((\beta,\sigma^2)\) follows a normal inverse gamma distribution with parameters \((\nu_0, s_0^2, \mu_0, V_0)\) if \(\sigma^2\) follows an inverse gamma distribution with parameters \((\nu_0, s_0^2)\) and \(\beta\), conditional on \(\sigma^2\), follows a normal distribution with mean \(\mu_0\) and variance \(\sigma^2V_0.\) Using the above factorization, the joint density of the distribution can be written as:

\begin{multline*} p(\beta,\sigma^2) = (2\pi)^{-k/2} [\mbox{det}(\sigma^2 V_0)]^{-1/2}\exp\left\{-\frac12 (\beta - \mu_0)'[\sigma^2 V_0]^{-1}(\beta - \mu_0)\right\} \\ \times \frac{(\nu_0/2)^{\nu_0/2}}{\Gamma(\nu_0/2)}s_0^{\nu_0}(1/\sigma^2)^{\nu_0/2+1}\exp\left\{-\nu_0 s_0^2 / (2\sigma^2)\right\} \end{multline*}

Problem 1: Write two scripts in Python or R that (1) generate draws from this distribution and (2) evaluate the log pdf of the distribution at given values of \((\beta,\sigma^2)\).
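Here is one possible sketch for Problem 1 in Python, using only numpy. The function names `nig_rvs` and `nig_logpdf` are made up for this post, and the code assumes the inverse gamma portion has shape \(\nu_0/2\) and scale \(\nu_0 s_0^2/2\), which is what the exponential term in the density above implies. Under that assumption, \(\sigma^2\) can be drawn as \(\nu_0 s_0^2\) divided by a \(\chi^2_{\nu_0}\) variate.

```python
import math
import numpy as np


def nig_rvs(nu0, s02, mu0, V0, size=1, seed=None):
    """Draw `size` pairs (beta, sigma2) from the normal inverse gamma distribution."""
    rng = np.random.default_rng(seed)
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    V0 = np.atleast_2d(np.asarray(V0, dtype=float))
    # sigma2 ~ IG(nu0/2, nu0*s02/2), drawn as nu0*s02 / chi2(nu0)
    sigma2 = nu0 * s02 / rng.chisquare(nu0, size=size)
    # beta | sigma2 ~ N(mu0, sigma2 * V0), built from a Cholesky factor of V0
    chol = np.linalg.cholesky(V0)
    z = rng.standard_normal((size, mu0.size))
    beta = mu0 + np.sqrt(sigma2)[:, None] * (z @ chol.T)
    return beta, sigma2


def nig_logpdf(beta, sigma2, nu0, s02, mu0, V0):
    """Log density of the normal inverse gamma at a single point (beta, sigma2)."""
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    V0 = np.atleast_2d(np.asarray(V0, dtype=float))
    k = mu0.size
    dev = beta - mu0
    quad = dev @ np.linalg.solve(V0, dev)
    _, logdet_V0 = np.linalg.slogdet(V0)
    # log N(beta; mu0, sigma2 * V0); note log det(sigma2 * V0) = k log sigma2 + log det V0
    log_normal = -0.5 * (k * math.log(2 * math.pi) + k * math.log(sigma2)
                         + logdet_V0 + quad / sigma2)
    # log IG(sigma2; a, b) with a = nu0/2 and b = nu0*s02/2
    a, b = nu0 / 2.0, nu0 * s02 / 2.0
    log_ig = a * math.log(b) - math.lgamma(a) - (a + 1) * math.log(sigma2) - b / sigma2
    return log_normal + log_ig


if __name__ == "__main__":
    beta, sigma2 = nig_rvs(10, 2.0, [0.0, 1.0], np.eye(2), size=5, seed=0)
    for b, s in zip(beta, sigma2):
        print(b, s, nig_logpdf(b, s, 10, 2.0, [0.0, 1.0], np.eye(2)))
```

A quick sanity check on the sampler: for \(\nu_0 > 2\), the sample mean of the \(\sigma^2\) draws should settle near \(\nu_0 s_0^2/(\nu_0-2)\).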


Note that in our formulation, we’ve constructed \(\beta\) conditional on \(\sigma^2\). It’s also interesting to examine the marginal distribution of \(\beta\), \[ p(\beta) = \int p(\beta,\sigma^2)d\sigma^2. \]

Problem 2: Derive the marginal distribution of \(\beta\) by integrating out \(\sigma^2\). Validate your derivation by comparing a density estimated from the simulations in Problem 1 to the analytic formulation. What is the name of this distribution?
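One way to check a candidate answer to Problem 2 numerically is sketched below. It assumes the derivation lands on a \(k\)-variate Student-t with \(\nu_0\) degrees of freedom, location \(\mu_0\), and scale matrix \(s_0^2 V_0\), so treat that as a hypothesis to test rather than a given. Instead of a kernel density estimate, the check averages the conditional normal density \(p(\beta|\sigma^2)\) over draws of \(\sigma^2\), which is a Monte Carlo version of the integral above. The function names are made up for this post.

```python
import math
import numpy as np


def mvt_logpdf(beta, nu, loc, scale):
    """Log density of a k-variate Student-t with nu degrees of freedom."""
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    loc = np.atleast_1d(np.asarray(loc, dtype=float))
    scale = np.atleast_2d(np.asarray(scale, dtype=float))
    k = loc.size
    dev = beta - loc
    quad = float(dev @ np.linalg.solve(scale, dev))
    _, logdet = np.linalg.slogdet(scale)
    return (math.lgamma((nu + k) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * k * math.log(nu * math.pi) - 0.5 * logdet
            - 0.5 * (nu + k) * math.log1p(quad / nu))


def mc_marginal_pdf(beta, nu0, s02, mu0, V0, n_draws=200_000, seed=0):
    """Monte Carlo estimate of p(beta): average N(beta; mu0, sigma2*V0) over sigma2 draws."""
    rng = np.random.default_rng(seed)
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    V0 = np.atleast_2d(np.asarray(V0, dtype=float))
    k = mu0.size
    dev = beta - mu0
    quad = dev @ np.linalg.solve(V0, dev)
    _, logdet_V0 = np.linalg.slogdet(V0)
    # sigma2 ~ IG(nu0/2, nu0*s02/2), drawn as nu0*s02 / chi2(nu0)
    sigma2 = nu0 * s02 / rng.chisquare(nu0, size=n_draws)
    log_norm = -0.5 * (k * np.log(2 * np.pi * sigma2) + logdet_V0 + quad / sigma2)
    return np.exp(log_norm).mean()


if __name__ == "__main__":
    nu0, s02 = 5, 2.0
    mu0 = np.array([0.0, 1.0])
    V0 = np.array([[1.0, 0.3], [0.3, 2.0]])
    point = np.array([0.5, 0.0])
    print("Monte Carlo:", mc_marginal_pdf(point, nu0, s02, mu0, V0))
    print("Student-t  :", math.exp(mvt_logpdf(point, nu0, mu0, s02 * V0)))
```

If the two printed numbers disagree by more than Monte Carlo noise, either the derivation or the parameterization of the candidate density is off.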


The normal inverse gamma prior is convenient because it’s conjugate for the normal regression model in (\ref{eq:regression}). This means that the posterior distribution of the parameters is also a normal inverse gamma distribution.

Problem 3: Derive the posterior distribution for the model in (\ref{eq:regression}). Use a normal inverse gamma prior with parameters \((\nu_0, s_0^2, \mu_0, V_0)\). For notation, let \(X = [x_1, \ldots, x_T]'\) and \(Y = [y_1,\ldots,y_T]'\).
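For checking a derivation, here is a sketch of the conjugate update in Python. The posterior parameters are written \((\nu_1, s_1^2, \mu_1, V_1)\), which is notation introduced here rather than taken from earlier in the series. Completing the square in \(\beta\) after multiplying the prior by the likelihood suggests \(V_1 = (V_0^{-1} + X'X)^{-1}\), \(\mu_1 = V_1(V_0^{-1}\mu_0 + X'Y)\), \(\nu_1 = \nu_0 + T\), and \(\nu_1 s_1^2 = \nu_0 s_0^2 + Y'Y + \mu_0'V_0^{-1}\mu_0 - \mu_1'V_1^{-1}\mu_1\). The function name `nig_posterior` is hypothetical.

```python
import numpy as np


def nig_posterior(X, Y, nu0, s02, mu0, V0):
    """Return (nu1, s12, mu1, V1) for a normal inverse gamma prior and normal likelihood."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    V0 = np.atleast_2d(np.asarray(V0, dtype=float))
    T = X.shape[0]

    V0_inv = np.linalg.inv(V0)
    V1_inv = V0_inv + X.T @ X              # posterior precision (up to sigma2)
    V1 = np.linalg.inv(V1_inv)
    mu1 = V1 @ (V0_inv @ mu0 + X.T @ Y)    # precision-weighted mix of prior and data

    nu1 = nu0 + T
    # Leftover terms after completing the square in beta
    s12 = (nu0 * s02 + Y @ Y + mu0 @ V0_inv @ mu0 - mu1 @ V1_inv @ mu1) / nu1
    return nu1, s12, mu1, V1


if __name__ == "__main__":
    # Tiny example: intercept-only regression with two observations
    X = np.array([[1.0], [1.0]])
    Y = np.array([1.0, 3.0])
    print(nig_posterior(X, Y, nu0=2, s02=1.0, mu0=[0.0], V0=[[1.0]]))
```

In the tiny example the update can be done by hand: \(V_1 = 1/3\), \(\mu_1 = 4/3\), \(\nu_1 = 4\), and \(s_1^2 = 5/3\).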


Let’s run a Bayesian regression! The data in the table below come from T. Haavelmo, “Methods of Measuring the Marginal Propensity to Consume,” J. Am. Statist. Assoc, 42, p. 88 (1947). We’ll use (\ref{eq:regression}) to relate income, \(y_t\), to a constant and “autonomous” investment, the independent variable. The coefficient associated with investment is termed the investment multiplier.

Problem 4: Pick a parameterization of the normal inverse gamma distribution that is not very informative; that is, it doesn’t impose strong beliefs about the plausible values of the coefficients. Let’s center the prior for \(\beta\) at \(\mu_0 = 0\), and for the inverse gamma portion set \(s_0^2 =600\). What should you do with \(\nu_0\) and \(V_0\)? Construct the posterior distribution for \(\beta\) and \(\sigma^2\). What is the posterior mean of \(\beta_2\), the coefficient associated with investment? What happens when you increase the “strength” of the prior, by increasing \(\nu_0\) or decreasing \(V_0\)?

Table 1: Haavelmo's Data on Income and Investment

Year	Income	Investment
1922	433	39
1923	483	60
1924	479	42
1925	486	52
1926	494	47
1927	498	51
1928	511	45
1929	534	60
1930	478	39
1931	440	41
1932	372	22
1933	381	17
1934	419	27
1935	449	33
1936	511	48
1937	520	51
1938	477	33
1939	517	46
1940	548	54
1941	629	100
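
To get started on Problem 4, here is a sketch that loads Table 1 and applies the standard conjugate update for a normal inverse gamma prior. The specific choices \(\nu_0 = 1\) and \(V_0 = 10^4 I\) for the loose prior, and \(V_0 = 10^{-4} I\) for the tight one, are illustrative assumptions and not values from the text, and `nig_posterior` is a hypothetical helper name. With the loose prior, the posterior mean of \(\beta\) should sit very close to the least-squares estimate.

```python
import numpy as np

# Table 1: year, income, investment
data = np.array([
    [1922, 433, 39], [1923, 483, 60], [1924, 479, 42], [1925, 486, 52],
    [1926, 494, 47], [1927, 498, 51], [1928, 511, 45], [1929, 534, 60],
    [1930, 478, 39], [1931, 440, 41], [1932, 372, 22], [1933, 381, 17],
    [1934, 419, 27], [1935, 449, 33], [1936, 511, 48], [1937, 520, 51],
    [1938, 477, 33], [1939, 517, 46], [1940, 548, 54], [1941, 629, 100],
], dtype=float)
income, investment = data[:, 1], data[:, 2]
X = np.column_stack([np.ones(len(income)), investment])  # constant and investment


def nig_posterior(X, Y, nu0, s02, mu0, V0):
    """Conjugate update; returns posterior (nu, s2, mu, V) in the prior's notation."""
    V0_inv = np.linalg.inv(V0)
    V1_inv = V0_inv + X.T @ X
    V1 = np.linalg.inv(V1_inv)
    mu1 = V1 @ (V0_inv @ mu0 + X.T @ Y)
    nu1 = nu0 + X.shape[0]
    s12 = (nu0 * s02 + Y @ Y + mu0 @ V0_inv @ mu0 - mu1 @ V1_inv @ mu1) / nu1
    return nu1, s12, mu1, V1


mu0 = np.zeros(2)
s02 = 600.0

# Loose prior (assumed values): few prior "observations", large prior variance
nu_loose, s2_loose, mu_loose, V_loose = nig_posterior(
    X, income, nu0=1, s02=s02, mu0=mu0, V0=1e4 * np.eye(2))

# Tight prior (assumed values): beta pulled hard toward mu0 = 0
nu_tight, s2_tight, mu_tight, V_tight = nig_posterior(
    X, income, nu0=1, s02=s02, mu0=mu0, V0=1e-4 * np.eye(2))

ols = np.linalg.lstsq(X, income, rcond=None)[0]

if __name__ == "__main__":
    print("OLS estimate             :", ols)
    print("posterior mean, loose V0 :", mu_loose)
    print("posterior mean, tight V0 :", mu_tight)
    # For nu > 2 the mean of the inverse gamma posterior is nu * s2 / (nu - 2)
    print("posterior mean of sigma^2:", nu_loose * s2_loose / (nu_loose - 2))
```

Rerunning with different values of \(\nu_0\) and \(V_0\) is a quick way to see how much the prior is driving the answer.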