Log-likelihood and component responsibilities under a Gaussian mixture
The responsibility R(k,n) is the posterior probability that component k of the Gaussian mixture generated data point n. It is calculated from the squared distances between data point n and the centres of the mixture components, 1..K, and from the inverse variance, beta, common to all components.
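The calculation described above can be sketched as follows. This is an illustration only, not the code of gtm_resp: it assumes isotropic Gaussians with equal mixing coefficients for all K components (an assumption not stated in this text), and the numeric values of DIST and beta are made up. Under that assumption the normalising constants of the Gaussians cancel, so only the exponentials of the scaled squared distances remain.

```matlab
% Sketch only: responsibilities for an isotropic Gaussian mixture with
% equal mixing coefficients (assumed). DIST and beta follow the argument
% names used in this document; the values are hypothetical.
DIST = [0.5 2.0 4.5;    % squared distances: K = 2 components (rows)
        3.0 0.1 4.5];   % by N = 3 data points (columns)
beta = 2;               % inverse variance shared by all components

K = size(DIST, 1);
R = exp(-(beta / 2) * DIST);           % unnormalised Gaussian kernels, K-by-N
R = R ./ repmat(sum(R, 1), K, 1);      % normalise so each column sums to 1
```

Each column of R then sums to 1, and a data point equidistant from both centres (the third column here) is shared equally between the components.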
[llh, R] = gtm_resp(DIST, minDist, maxDist, beta, D, mode)
[llh, R] = gtm_resp(DIST, beta, D)
DIST
- a K-by-N matrix in which element (k,n) is the squared distance between the centre of component
k and the data point n.
minDist, maxDist
- vectors containing the minimum and maximum of each column in DIST, respectively;
1-by-N; required iff mode > 0.
beta
- a scalar value of the inverse variance common to all components of the mixture.
D
- dimensionality of the space in which the data and the Gaussian mixture live; necessary to calculate
the correct log-likelihood.
mode
- optional argument used to control the mode of calculation; it can be set to 0, 1 or 2, corresponding
to increasingly elaborate measures taken to reduce numerical error; mode = 0 is fast but less
accurate, mode = 2 is slow but more accurate; the default mode is 0.
llh
- the log-likelihood of data under the Gaussian mixture
R
- a K-by-N responsibility matrix; R(k,n) is the responsibility taken by mixture component
k for data point n.
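The roles of D and minDist can be illustrated with a short sketch. It is not the implementation of gtm_resp, and this text does not say exactly what modes 1 and 2 do; the sketch simply shows one common way of reducing numerical error, namely subtracting each column's minimum distance before exponentiating and adding it back in the log domain. Equal mixing coefficients are assumed, the input values are made up, and maxDist is not used here.

```matlab
% Sketch only: log-likelihood and responsibilities with a per-column shift
% by the minimum squared distance. Equal mixing coefficients 1/K are an
% assumption; all numeric values are hypothetical.
DIST = [0.5 2.0 4.5;    % K-by-N squared distances
        3.0 0.1 4.5];
beta = 2;               % inverse variance common to all components
D = 3;                  % dimensionality of the data space
[K, N] = size(DIST);

minDist = min(DIST, [], 1);                  % 1-by-N column minima
shifted = DIST - repmat(minDist, K, 1);      % smallest entry per column is 0
E = exp(-(beta / 2) * shifted);              % largest entry per column is 1
colSum = sum(E, 1);                          % always >= 1, never underflows to 0
R = E ./ repmat(colSum, K, 1);               % responsibilities, columns sum to 1

% Add the shift back in the log domain; D enters only through the
% Gaussian normalising constant (beta / (2*pi))^(D/2).
llh = sum(log(colSum) - (beta / 2) * minDist) ...
      + N * ((D / 2) * log(beta / (2 * pi)) - log(K));
```

Without the shift, exp(-(beta / 2) * DIST) can underflow to zero for every component when beta or the distances are large, which makes both the normalisation of R and the logarithm in llh ill-defined.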
'llh' is the first output argument because 'R' is not needed in the fairly common task of calculating the log-likelihood of a data set under a given model. This allows calls like:
llh = gtm_resp(...);