# Inferring posterior forms

Inferring posterior forms for Gaussian distributions can get tricky because of the sheer number of terms involved. There is, however, a simple trick that saves a lot of calculation, and this post presents it.

This post specifically addresses two topics: Bayesian inference for the Gaussian, and maximum-likelihood parameter estimation for a multivariate Gaussian.

Let $${\bf X}$$ be a random vector that follows a multivariate Gaussian distribution, i.e.,

$$
{\bf X} \sim \mathcal{N}({\bf \mu}, {\bf \Sigma})
$$

where $${\bf \mu} = (\mu_1, \dots, \mu_D)^T$$ is the mean vector and $${\bf \Sigma}$$ is the $$D \times D$$ covariance matrix. If $${\bf x}$$ is an instantiation of the above random variable, its density is

$$
p({\bf x}) = \frac{1}{(2\pi)^{D/2} |{\bf \Sigma}|^{1/2}} \exp\left( -\frac{1}{2} ({\bf x} - {\bf \mu})^T {\bf \Sigma}^{-1} ({\bf x} - {\bf \mu}) \right)
$$

Expanding the exponent term gives

$$
-\frac{1}{2} {\bf x}^T {\bf \Sigma}^{-1} {\bf x} + {\bf x}^T {\bf \Sigma}^{-1} {\bf \mu} + \text{constant}
$$

where $$\text{constant}$$ collects the terms that do not depend on $$\bf x$$. All we care about in the exponent are the coefficients of the quadratic term and the linear term: the quadratic coefficient gives $${\bf \Sigma}^{-1}$$, and the linear coefficient gives $${\bf \Sigma}^{-1}{\bf \mu}$$.
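To make the trick concrete, here is a small NumPy sketch (the mean and covariance values are made up for illustration): writing the exponent as $$-\frac{1}{2} {\bf x}^T A {\bf x} + {\bf x}^T b$$, we recover $${\bf \Sigma} = A^{-1}$$ and $${\bf \mu} = A^{-1} b$$.

```python
import numpy as np

# An arbitrary example mean and covariance (assumed values for illustration).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])

# Expanding -0.5 (x - mu)^T Sigma^{-1} (x - mu) gives
#   -0.5 x^T A x + x^T b + constant,  with
A = np.linalg.inv(Sigma)   # coefficient of the quadratic term
b = A @ mu                 # coefficient of the linear term

# The trick: read the parameters back off the coefficients.
Sigma_recovered = np.linalg.inv(A)
mu_recovered = Sigma_recovered @ b

print(np.allclose(Sigma_recovered, Sigma))  # True
print(np.allclose(mu_recovered, mu))        # True
```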

## Bayesian Inference for the Gaussian

Let us confine ourselves to the univariate Gaussian distribution and divide the derivation into three cases.

### case 1: $$\sigma^2$$ is known

We shall find the parameter $$\mu$$ given $$N$$ independent observations $${\bf X} = \{x_1, x_2, \dots, x_N\}$$. Then, the likelihood becomes

$$
p({\bf X} \mid \mu) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left( -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - \mu)^2 \right)
$$

Choosing the prior such that it forms a conjugate pair with the likelihood,

$$
p(\mu) = \mathcal{N}(\mu \mid \mu_0, \sigma_0^2)
$$

Now, the posterior is given by

$$
p(\mu \mid {\bf X}) \propto p({\bf X} \mid \mu)\, p(\mu) \propto \exp\left( -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - \mu)^2 - \frac{1}{2\sigma_0^2} (\mu - \mu_0)^2 \right)
$$

Looking at the sum in the exponent above, it is clearly of quadratic form in $$\mu$$, confirming that the posterior is a Gaussian. Let the posterior Gaussian be represented by $$\mathcal{N}(\mu \mid \mu_N, \sigma_N^2)$$. To find its parameters, we look for the quadratic and linear terms in $$\mu$$ in the exponent. Writing the exponent,

$$
-\frac{\mu^2}{2} \left( \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \right) + \mu \left( \frac{\mu_0}{\sigma_0^2} + \frac{\sum_{n=1}^{N} x_n}{\sigma^2} \right) + \text{constant}
$$

Comparing the quadratic term with that of $$\mathcal{N}(\mu \mid \mu_N, \sigma_N^2)$$ gives

$$
\frac{1}{\sigma_N^2} = \frac{1}{\sigma_0^2} + \frac{N}{\sigma^2}
$$

Next, comparing the linear term gives

$$
\frac{\mu_N}{\sigma_N^2} = \frac{\mu_0}{\sigma_0^2} + \frac{N \mu_{ML}}{\sigma^2}
$$

where $$\mu_{ML} = \frac{1}{N} \sum_{n=1}^{N} x_n$$.

And hence, $$\mu_N$$ becomes

$$
\mu_N = \frac{\sigma^2}{N\sigma_0^2 + \sigma^2}\, \mu_0 + \frac{N\sigma_0^2}{N\sigma_0^2 + \sigma^2}\, \mu_{ML}
$$
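As a quick sanity check, here is a NumPy sketch (the prior and noise values are made up for illustration) that compares the closed-form $$\mu_N$$ and $$\sigma_N^2$$ against a brute-force normalisation of likelihood times prior on a grid:

```python
import numpy as np

# Toy numbers: the prior and noise values below are assumed for illustration.
rng = np.random.default_rng(0)
sigma2 = 1.5                      # known noise variance
mu0, sigma0_2 = 0.0, 2.0          # prior mean and variance for mu
x = rng.normal(1.0, np.sqrt(sigma2), size=20)
N, mu_ml = len(x), x.mean()

# Posterior parameters read off from the quadratic and linear coefficients.
sigmaN_2 = 1.0 / (1.0 / sigma0_2 + N / sigma2)
muN = (sigma2 * mu0 + N * sigma0_2 * mu_ml) / (N * sigma0_2 + sigma2)

# Numerical check: normalise likelihood * prior on a grid of mu values.
grid = np.linspace(-5.0, 5.0, 20001)
log_post = (-0.5 / sigma2 * ((x[:, None] - grid[None, :]) ** 2).sum(axis=0)
            - 0.5 / sigma0_2 * (grid - mu0) ** 2)
post = np.exp(log_post - log_post.max())
dx = grid[1] - grid[0]
post /= post.sum() * dx
mean_num = (grid * post).sum() * dx
var_num = ((grid - mean_num) ** 2 * post).sum() * dx
print(np.isclose(mean_num, muN, atol=1e-4), np.isclose(var_num, sigmaN_2, atol=1e-4))
```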

### case 2: $$\mu$$ is known

In this case, we will find the precision $$\lambda \equiv \sigma^{-2}$$ given $$N$$ independent observations $${\bf X} = \{x_1, x_2, \dots, x_N\}$$. Then, the likelihood becomes

$$
p({\bf X} \mid \lambda) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \lambda^{-1}) \propto \lambda^{N/2} \exp\left( -\frac{\lambda}{2} \sum_{n=1}^{N} (x_n - \mu)^2 \right)
$$

To form the conjugate pair, choose the prior such that it is proportional to $$\lambda^a \exp(-b\lambda)$$, which is a gamma distribution.

Choosing the above gamma distribution with parameters $$a_0, b_0$$ as the prior, i.e.,

$$
p(\lambda) = Gamma(\lambda \mid a_0, b_0) = \frac{b_0^{a_0}}{\Gamma(a_0)} \lambda^{a_0 - 1} \exp(-b_0 \lambda)
$$

gives the posterior the form

$$
p(\lambda \mid {\bf X}) \propto \lambda^{a_0 + N/2 - 1} \exp\left( -\left( b_0 + \frac{1}{2} \sum_{n=1}^{N} (x_n - \mu)^2 \right) \lambda \right)
$$

which is again a gamma distribution, say $$Gamma(\lambda \mid a_N, b_N)$$. Comparing to the standard gamma distribution,

$$
a_N = a_0 + \frac{N}{2}, \qquad b_N = b_0 + \frac{1}{2} \sum_{n=1}^{N} (x_n - \mu)^2
$$
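This update can also be checked numerically. The sketch below (with made-up data and prior values) verifies that the grid-normalised posterior has mean $$a_N / b_N$$, the mean of a $$Gamma(a_N, b_N)$$ distribution:

```python
import numpy as np

# Toy numbers: known mean, true precision, and prior (a0, b0) are assumed for illustration.
rng = np.random.default_rng(1)
mu = 2.0                                            # known mean
x = rng.normal(mu, 1.0 / np.sqrt(4.0), size=30)     # data with true precision 4
N, a0, b0 = len(x), 2.0, 1.0

# Conjugate update read off from the posterior exponent.
aN = a0 + N / 2.0
bN = b0 + 0.5 * ((x - mu) ** 2).sum()

# Numerical check: the mean of Gamma(aN, bN) is aN / bN.
lam = np.linspace(1e-6, 30.0, 300001)
log_post = (aN - 1.0) * np.log(lam) - bN * lam
post = np.exp(log_post - log_post.max())
dlam = lam[1] - lam[0]
post /= post.sum() * dlam
mean_num = (lam * post).sum() * dlam
print(np.isclose(mean_num, aN / bN, atol=1e-3))
```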

### case 3: When both $$\mu$$ and $$\sigma^2$$ are unknown

Again using $$\lambda \equiv \sigma^{-2}$$, for $$N$$ independent observations $${\bf X} = \{x_1, x_2, \dots, x_N\}$$ the likelihood is given by

$$
p({\bf X} \mid \mu, \lambda) \propto \lambda^{N/2} \exp\left( -\frac{\lambda}{2} \sum_{n=1}^{N} (x_n - \mu)^2 \right) = \left[ \lambda^{1/2} \exp\left( -\frac{\lambda \mu^2}{2} \right) \right]^{N} \exp\left( \lambda \mu \sum_{n=1}^{N} x_n - \frac{\lambda}{2} \sum_{n=1}^{N} x_n^2 \right)
$$

Now the prior is chosen, similar to before, to mirror the functional form of the likelihood:

$$
p(\mu, \lambda) \propto \left[ \lambda^{1/2} \exp\left( -\frac{\lambda \mu^2}{2} \right) \right]^{\beta} \exp(c \lambda \mu - d \lambda)
$$

where $$c, d, \beta$$ are constants. Now the posterior is given by

$$
p(\mu, \lambda \mid {\bf X}) \propto \left[ \lambda^{1/2} \exp\left( -\frac{\lambda \mu^2}{2} \right) \right]^{\beta + N} \exp\left( \left( c + \sum_{n} x_n \right) \lambda \mu - \left( d + \frac{1}{2} \sum_{n} x_n^2 \right) \lambda \right)
$$

which, after completing the square in $$\mu$$, is of the form of a product of a Gaussian and a gamma, i.e., $$\mathcal{N}(\mu \mid \mu_N, (\lambda_N \lambda)^{-1})\, Gamma(\lambda \mid a_N, b_N)$$, and the parameters are given by

$$
\lambda_N = \beta + N, \qquad \mu_N = \frac{c + \sum_n x_n}{\beta + N}, \qquad a_N = \frac{1 + \beta + N}{2}, \qquad b_N = d + \frac{1}{2} \sum_n x_n^2 - \frac{\left( c + \sum_n x_n \right)^2}{2(\beta + N)}
$$
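This update can be sanity-checked numerically too. The sketch below (with made-up data and prior constants $$\beta, c, d$$) compares the closed-form $$\mu_N$$ and $$a_N / b_N$$ against moments of the unnormalised posterior computed on a 2-D grid:

```python
import numpy as np

# Toy data and prior constants beta, c, d (assumed for illustration).
rng = np.random.default_rng(3)
x = rng.normal(1.0, 0.5, size=25)
N, Sx, Sx2 = len(x), x.sum(), (x ** 2).sum()
beta, c, d = 1.0, 0.0, 1.0

# Closed-form posterior parameters.
lamN = beta + N
muN = (c + Sx) / lamN
aN = (1 + beta + N) / 2.0
bN = d + 0.5 * Sx2 - (c + Sx) ** 2 / (2.0 * lamN)

# Numerical check: moments on a 2-D grid over (mu, lambda).
mu_g = np.linspace(-2.0, 4.0, 601)[None, :]
lam_g = np.linspace(1e-6, 20.0, 2001)[:, None]
log_p = ((beta + N) * (0.5 * np.log(lam_g) - 0.5 * lam_g * mu_g ** 2)
         + (c + Sx) * lam_g * mu_g - (d + 0.5 * Sx2) * lam_g)
p = np.exp(log_p - log_p.max())
p /= p.sum()   # grid-cell area cancels in the expectations below

print(np.isclose((p * mu_g).sum(), muN, atol=2e-3))        # E[mu]     vs mu_N
print(np.isclose((p * lam_g).sum(), aN / bN, atol=1e-2))   # E[lambda] vs a_N / b_N
```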

## Maximum likelihood parameters for a multivariate Gaussian

Let $${\bf X} = \{x_1, x_2, \dots, x_N\}$$ be a set of independent observations. The likelihood function is given by

$$
p({\bf X} \mid {\bf \mu}, {\bf \Sigma}) = \prod_{n=1}^{N} \frac{1}{(2\pi)^{D/2} |{\bf \Sigma}|^{1/2}} \exp\left( -\frac{1}{2} (x_n - {\bf \mu})^T {\bf \Sigma}^{-1} (x_n - {\bf \mu}) \right)
$$

We shall estimate the parameters of the above density model by first taking the log likelihood:

$$
\ln p({\bf X} \mid {\bf \mu}, {\bf \Sigma}) = -\frac{ND}{2} \ln(2\pi) - \frac{N}{2} \ln |{\bf \Sigma}| - \frac{1}{2} \sum_{n=1}^{N} (x_n - {\bf \mu})^T {\bf \Sigma}^{-1} (x_n - {\bf \mu})
$$

#### Estimating the parameter $$\bf \mu$$

Differentiating the above equation with respect to $${\bf \mu}$$ gives

$$
\frac{\partial}{\partial {\bf \mu}} \ln p({\bf X} \mid {\bf \mu}, {\bf \Sigma}) = \sum_{n=1}^{N} {\bf \Sigma}^{-1} (x_n - {\bf \mu})
$$

Equating the above to zero,

$$
{\bf \mu}_{ML} = \frac{1}{N} \sum_{n=1}^{N} x_n
$$

#### Estimating the parameter $$\bf \Sigma$$

Differentiating the log likelihood with respect to $${\bf \Sigma}$$ and equating to zero, we get

$$
{\bf \Sigma}_{ML} = \frac{1}{N} \sum_{n=1}^{N} (x_n - {\bf \mu}_{ML})(x_n - {\bf \mu}_{ML})^T
$$
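These two estimators are easy to verify in NumPy (the data below are made up for illustration); note that `np.cov` with `bias=True` also divides by $$N$$ rather than $$N - 1$$, matching the maximum-likelihood formula:

```python
import numpy as np

# Toy correlated 3-D data (assumed values for illustration).
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3)) @ np.array([[1.0, 0.2, 0.0],
                                          [0.0, 1.0, 0.5],
                                          [0.0, 0.0, 1.0]]) + np.array([1.0, -1.0, 0.5])

# Maximum-likelihood estimates from the formulas above.
mu_ml = X.mean(axis=0)
diff = X - mu_ml
Sigma_ml = diff.T @ diff / len(X)

# Sanity check against NumPy's built-in biased covariance.
print(np.allclose(mu_ml, np.mean(X, axis=0)))         # True
print(np.allclose(Sigma_ml, np.cov(X.T, bias=True)))  # True
```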