Bayes’ Rule
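Bayes’ rule has the form
$$\mathbb{P}\left(\mathcal{H}_{i} | \mathcal{D}\right) = \frac{\mathbb{P}\left(\mathcal{D} | \mathcal{H}_{i}\right) \mathbb{P}\left(\mathcal{H}_{i}\right)}{\mathbb{P}(\mathcal{D})}$$
where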
$\mathbb{P}\left(\mathcal{H}_{i} | \mathcal{D}\right)$ is the posterior probability of a hypothesis $\mathcal{H}_{i}$ (i.e. the probability of $\mathcal{H}_{i}$ after we know the data)
$\mathbb{P}\left(\mathcal{D} | \mathcal{H}_{i}\right)$ is the likelihood of the data given the hypothesis. Note that this is computed from the forward problem
$\mathbb{P}\left(\mathcal{H}_{i}\right)$ is the prior probability (i.e. the probability of $\mathcal{H}_{i}$ before we know the data)
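$\mathbb{P}(\mathcal{D})$ is the evidence. It is the normalising constant, given (for a discrete set of hypotheses) by $\mathbb{P}(\mathcal{D}) = \sum_{j} \mathbb{P}\left(\mathcal{D} | \mathcal{H}_{j}\right) \mathbb{P}\left(\mathcal{H}_{j}\right)$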
In most of our tasks, what we want is the posterior probability.
Bayes’ rule expresses the posterior in terms of the forward problem (the likelihood), which we know how to compute.
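As a small illustration of the rule, the sketch below computes the posterior over a discrete set of hypotheses. The two-coin setup, the numbers, and the function name `posterior` are assumptions made up for this example; they do not come from the course material.

```python
def posterior(priors, likelihoods):
    """Return P(H_i | D) for each hypothesis H_i using Bayes' rule."""
    # Numerators P(D | H_i) * P(H_i): likelihood (forward problem) times prior.
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Evidence P(D): the normalising constant, summed over all hypotheses.
    evidence = sum(joint)
    return [j / evidence for j in joint]


# Hypothetical setup: H_0 is a fair coin, H_1 is a coin that lands heads 80% of the time.
priors = [0.5, 0.5]
# Data D: a single toss that came up heads, so P(D | H_0) = 0.5 and P(D | H_1) = 0.8.
likelihoods = [0.5, 0.8]

post = posterior(priors, likelihoods)
print(post)  # posterior probabilities of H_0 and H_1; they sum to 1
```

Observing heads shifts belief towards the biased coin: the posterior of $\mathcal{H}_{1}$ becomes $0.4 / 0.65 = 8/13$, up from the prior of $0.5$.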
Prior and Posterior
(to be continued)
When the posterior belongs to the same family of distributions as the prior, the prior and the likelihood are said to be conjugate. The prior is then called the conjugate prior for that likelihood.
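A standard example of conjugacy, chosen here as an assumption since the notes do not name one, is a Beta prior on the heads probability of a coin. With a Bernoulli (coin toss) likelihood, the posterior is again a Beta distribution, so the update reduces to adding counts to the prior parameters. The function name and the numbers below are hypothetical.

```python
def beta_bernoulli_update(alpha, beta, heads, tails):
    """Posterior Beta parameters after observing coin tosses.

    Prior: Beta(alpha, beta). Likelihood: independent Bernoulli tosses.
    Posterior: Beta(alpha + heads, beta + tails), the same family as the prior.
    """
    return alpha + heads, beta + tails


# Hypothetical prior Beta(2, 2) and data of 7 heads and 3 tails.
alpha0, beta0 = 2.0, 2.0
alpha_n, beta_n = beta_bernoulli_update(alpha0, beta0, heads=7, tails=3)

# Mean of a Beta(a, b) distribution is a / (a + b).
posterior_mean = alpha_n / (alpha_n + beta_n)
print(alpha_n, beta_n, posterior_mean)
```

Because the posterior stays in the Beta family, it can serve as the prior for the next batch of observations without any numerical integration.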
In the xxxxxx, we want to maximize the posterior probability (MAP). This differs from fully Bayesian inference, which keeps the whole posterior distribution rather than a single point estimate (2017-2018 exam paper AML).
MAP and MLE
MAP: maximum a posteriori estimation (maximize the posterior distribution)
MLE: maximum likelihood estimation (maximize the likelihood)
In MAP, we place a prior distribution $p(w)$ on the parameter, combine it with the observations $X$, and then maximize the posterior distribution $p(w|X) \propto p(X|w)p(w)$ to obtain the parameter $w$.
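Maximum likelihood estimation finds the parameter $\theta$ that maximizes the likelihood function $P(x_0|\theta)$. Maximum a posteriori estimation instead finds the $\theta$ that maximizes $P(x_0|\theta)P(\theta)$, which is proportional to the posterior. The resulting $\theta$ must therefore not only make the likelihood large; the prior probability of $\theta$ itself must also be large.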
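The difference between the two estimates can be seen with a minimal sketch that maximizes each objective over a grid of candidate values. The coin-toss data, the Beta(2, 2) prior, and the grid resolution are assumptions chosen for illustration; a grid search is used only for transparency and is not how these estimates are usually computed.

```python
import math


def log_likelihood(theta, heads, tails):
    # log P(x | theta) for independent coin tosses, up to an additive constant.
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def log_prior(theta, alpha, beta):
    # log of an unnormalised Beta(alpha, beta) prior P(theta).
    return (alpha - 1.0) * math.log(theta) + (beta - 1.0) * math.log(1.0 - theta)


# Hypothetical data and prior.
heads, tails = 7, 3
alpha, beta = 2.0, 2.0

# Candidate values of theta, strictly inside (0, 1) so the logs are finite.
grid = [i / 1000 for i in range(1, 1000)]

# MLE: maximize the likelihood alone.
theta_mle = max(grid, key=lambda t: log_likelihood(t, heads, tails))

# MAP: maximize likelihood times prior, i.e. the sum of their logs.
theta_map = max(
    grid,
    key=lambda t: log_likelihood(t, heads, tails) + log_prior(t, alpha, beta),
)

print(theta_mle, theta_map)
```

With these numbers the MLE sits at the observed frequency of heads (0.7), while the MAP estimate is pulled towards the prior mode of 0.5 and lands near $2/3$. As more data arrive, the likelihood dominates and the two estimates move closer together.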
Reference
- Bishop PRML
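- A Detailed Explanation of Maximum Likelihood Estimation (MLE) and Maximum A Posteriori Estimation (MAP) (original title: 详解最大似然估计(MLE)、最大后验概率估计(MAP))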
- Adam slide bayes_prn comp6208