# Log-Sum-Exp trick for getting posterior probabilities from small likelihoods.

###### PUBLISHED ON OCT 10, 2019 — BIOINFORMATICS, STATISTICS

To formulate the problem, consider calculating the posterior probability given a series of weighted likelihoods. Let $x_i$ be the likelihood for observation $i$. We wish to calculate the posterior probability as:

$$p_i = \frac{x_i}{\sum_j x_j}$$

As Robert Eisele very nicely discusses in his blog post, we can run into underflow issues. Suppose you have the following vector of log-likelihoods (i.e., the values $\text{log}\,x_i$):

```r
x <- c(-9000, -3000, -1000)
exp(x)/sum(exp(x))
# [1] NaN NaN NaN
```
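To see why this fails: on IEEE-754 doubles, `exp()` underflows to exactly zero once its argument drops much below `log(.Machine$double.xmin)`, which is roughly -708. A quick base-R check:

```r
## The smallest positive normalized double is about 2.2e-308, so exp()
## of anything far below log(.Machine$double.xmin) is exactly 0.
log(.Machine$double.xmin)  # about -708
exp(-9000)                 # exactly 0, so the ratio above is 0/0 = NaN
```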


As we can see, every exponentiated value underflows to zero, so the normalization returns 0/0 = NaN. However, consider making the calculation in log-space:

\begin{align} \text{log}\left(\frac{x_i}{\sum_j x_j}\right) &= \text{log}x_i - \text{log}\sum_j x_j \\ &= \text{log}x_i - \text{log}\sum_j e^{\text{log}x_j} \\ \end{align}

Given the proof from Robert Eisele, we know:

$$\text{log}\sum_i e^{y_i} = \text{log}\sum_i e^{y_i - a} + a$$

for arbitrary a. Let a be the maximum log-likelihood value, and we get:

\begin{align} \text{log}\left(\frac{x_i}{\sum_j x_j}\right) &= \text{log}x_i - \left(\text{log}\sum_j e^{\text{log}x_j - \operatorname{max}_j(\text{log}x_j)} + \operatorname{max}_j(\text{log}x_j) \right) \\ p_i & = \text{exp}\left\{ \text{log}x_i - \left(\text{log}\sum_j e^{\text{log}x_j - \operatorname{max}_j(\text{log}x_j)} + \operatorname{max}_j(\text{log}x_j) \right) \right\} \end{align}
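Before wrapping this in a function, the shift identity above can be sanity-checked numerically in base R, using a small vector where the direct computation is still tractable:

```r
## Check: log(sum(exp(y))) equals log(sum(exp(y - a))) + a, with a = max(y)
y <- c(-9, -3, -1)
a <- max(y)
direct  <- log(sum(exp(y)))
shifted <- log(sum(exp(y - a))) + a
all.equal(direct, shifted)
# [1] TRUE
```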

Finally, a simple R function to do the work:

```r
getPost <- function(l) {
  ## l: vector of log-likelihood values
  exp(l - (log(sum(exp(l - max(l)))) + max(l)))
}
getPost(x)
# [1] 0 0 1
```


And we can verify that it gives exactly the same values as the direct calculation when the likelihoods are tractable:

```r
y <- c(-9, -3, -1)
exp(y)/sum(exp(y))
# [1] 0.0002953872 0.1191677110 0.8805369018
getPost(y)
# [1] 0.0002953872 0.1191677110 0.8805369018
```
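As a side note, the expression inside `getPost` simplifies algebraically: with $m = \operatorname{max}_j(\text{log}\,x_j)$, we have $\text{exp}\left\{\text{log}\,x_i - \left(\text{log}\sum_j e^{\text{log}x_j - m} + m\right)\right\} = e^{\text{log}x_i - m} / \sum_j e^{\text{log}x_j - m}$. So an equivalent "shift then normalize" form of the function (the name `getPost2` is just for illustration) is:

```r
## Equivalent form of getPost(): subtracting max(l) makes the largest
## exponent exactly 0, so the sum is at least 1 and can never underflow.
getPost2 <- function(l) {
  z <- exp(l - max(l))
  z / sum(z)
}
getPost2(c(-9000, -3000, -1000))
# [1] 0 0 1
```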