Fisher-Rao metric
Calyampudi Radhakrishna Rao (2009), Scholarpedia, 4(2):7085. doi:10.4249/scholarpedia.7085, revision #91265.
The Fisher–Rao metric is a choice of Riemannian metric in the space of probability distributions. The derived geodesic distance, known as Rao distance, provides a measure of difference between two probability distributions.
Definition
Let \(P\) be the space of probability distributions \(p(x,\theta)\), where \(x\) is a random variable and \(\theta=(\theta_1,\cdots,\theta_p)\in R^p\) is a continuous \(p\)-vector parameter. Under suitable regularity conditions, the Fisher information is defined to be the \(p \times p\) matrix
\[ I = \left[ i_{rs} \right], \quad i_{rs} = E \left[ \left(\frac{\partial \log p}{\partial \theta_r}\right) \left(\frac{\partial \log p}{\partial \theta_s}\right) \right]. \]
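As a concrete illustration of this definition, the following minimal Python sketch approximates the Fisher information matrix of the univariate normal family with \(\theta = (\mu, \sigma)\) by numerical quadrature. The choice of family, the function name `fisher_information_normal`, and the grid settings are assumptions made for this example and do not come from the article. The result can be compared with the known closed form \(\mathrm{diag}(1/\sigma^2,\, 2/\sigma^2)\).

```python
import numpy as np

def fisher_information_normal(mu, sigma, half_width=12.0, n=200001):
    """Approximate I = [i_rs] for N(mu, sigma^2) with theta = (mu, sigma)."""
    # Grid covering mu +/- half_width * sigma; the density is negligible outside it.
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, n)
    dx = x[1] - x[0]
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    # Score vector: partial derivatives of log p with respect to mu and sigma.
    score = np.stack([
        (x - mu) / sigma ** 2,
        ((x - mu) ** 2 - sigma ** 2) / sigma ** 3,
    ])
    # i_rs = E[score_r * score_s], approximated by a Riemann sum over the grid.
    return (score[:, None, :] * score[None, :, :] * p).sum(axis=-1) * dx

I = fisher_information_normal(mu=1.0, sigma=2.0)
print(I)  # should be close to diag(1/sigma^2, 2/sigma^2)
```

For families without analytic score functions, the same expectation could instead be estimated by simulation or by finite differences of \(\log p\).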
The Fisher–Rao metric is defined as
\[\tag{1} ds^2 = \sum_{r} \sum_{s} i_{rs} \, d \theta_r \, d \theta_s \]
which is a measure of the change in \(p(x,\theta)\) when \(\theta\) is replaced by \(\theta + \delta \theta\), where \(\delta \theta = (\delta \theta_1,\cdots,\delta \theta_p)\) represents a small change in \(\theta\). The geodesic distance between two probability distributions induced by the metric (1), with the Levi-Civita connection associated with the Fisher information matrix, is defined as the Rao distance. The metric and distance, which were originally introduced in connection with the statistical problems of classification and cluster analysis, have found applications in the discussion of some problems in quantum mechanics, in detecting structures in images, and so on.
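For a one-parameter family, reading (1) as the squared line element, the Rao distance reduces to the integral of \(\sqrt{i_{11}(\theta)}\, d\theta\) between the two parameter values. The sketch below checks this for the Bernoulli family, where \(i_{11}(\theta) = 1/(\theta(1-\theta))\) and the integral has the closed form \(2\,|\arcsin\sqrt{\theta_1} - \arcsin\sqrt{\theta_2}|\). The choice of family and all function names are assumptions made for illustration, not part of the article.

```python
import numpy as np

def fisher_information_bernoulli(theta):
    """Fisher information of one Bernoulli(theta) observation: 1 / (theta (1 - theta))."""
    return 1.0 / (theta * (1.0 - theta))

def rao_distance_numeric(a, b, n=100001):
    """Integrate sqrt(I(theta)) d theta between a and b with the trapezoidal rule."""
    theta = np.linspace(min(a, b), max(a, b), n)
    ds = np.sqrt(fisher_information_bernoulli(theta))
    h = theta[1] - theta[0]
    return h * (ds.sum() - 0.5 * (ds[0] + ds[-1]))

def rao_distance_closed_form(a, b):
    """Closed form for the Bernoulli family: 2 |arcsin sqrt(a) - arcsin sqrt(b)|."""
    return 2.0 * abs(np.arcsin(np.sqrt(a)) - np.arcsin(np.sqrt(b)))

d_num = rao_distance_numeric(0.2, 0.7)
d_exact = rao_distance_closed_form(0.2, 0.7)
print(d_num, d_exact)  # the two values should agree closely
```

With more than one parameter the geodesic is in general no longer a straight segment in \(\theta\), so the distance has to be obtained from the geodesic equations or from a known closed form for the family.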
History
When I joined the Indian Statistical Institute (ISI) in 1943 as a research scholar, the Director of the Institute, P. C. Mahalanobis, asked me to make a cluster analysis of different castes and tribes in an Indian State using a measure of distance between two populations. This needed a choice of distance measure between two populations based on some anthropometric measurements taken on individuals. Mahalanobis suggested representing a population by the mean values of the measurements taken on individuals in a coordinate space with oblique axes and computing the straight-line distance. I felt that the Mahalanobis distance is appropriate when the measurements have a multivariate normal distribution, and I looked for a general measure that can be computed for any specified probability distribution of the measurements. I suggested the differential geometric approach in my 1945 paper (Bull. Cal. Math. Soc., 37, 81-91) by considering the space of probability distributions. I used the Fisher information matrix in defining the metric, so it was called the Fisher–Rao metric. Differential geometry was not well known at that time, and in order to compute the geodesic distance from the metric, I had to learn the mathematics from papers on relativity describing the Einstein metric.
It was only 30 years later that my work received attention. Efron (1975) carried the argument a step forward when he introduced, in effect, a new affine connection on the parameter space manifold, and thus shed light on the role of the embedding curvature of the statistical model in the relevant space of probability distributions. The work of Efron has been followed up and extended by a number of authors, for example Amari (1982) and Kass and Vos (1997). Atkinson and Mitchell (1981) computed the Rao distance for a number of probability distributions.
Applications
During the last 10 years, the Fisher–Rao metric and Rao distance have found numerous applications. Brody and Hughston (1998) used these concepts in developing a concise geometric formulation of statistical estimation theory and applying the results to quantum statistical inference. In another paper, Brody and Hughston (2001) applied these concepts to the theory of interest rates in economics. The use of the Fisher–Rao metric in the detection of structures in images is detailed in a series of papers by Maybank (2004, 2005, 2007, 2008a). Applications in econometrics are described in a book by Uwe Jensen (1992). An interesting development is the use of the Rao measure, defined by \([I(\theta)]^{1/2} d \theta\), as the appropriate prior distribution in Bayesian analysis by Maybank (2007). Some invariance properties of the Fisher–Rao metric justifying its use in statistical inference are discussed in Maybank (2008b). Crooks (2007) characterises the Fisher–Rao metric and Rao distance as more general and fundamental than the thermodynamic definition of length.
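To make the Rao measure concrete, the sketch below evaluates it for the Bernoulli family, an example chosen here for illustration and not taken from the cited papers. In that case \([I(\theta)]^{1/2} = 1/\sqrt{\theta(1-\theta)}\), whose total mass on \((0,1)\) is \(\pi\), so the normalized prior is the Beta(1/2, 1/2) density. The function name and the quadrature settings are assumptions. For \(p > 1\) the sketch would need the square root of the determinant of \(I(\theta)\), which is how the formula is read here.

```python
import numpy as np

def rao_prior_density(theta):
    """Rao measure of the Bernoulli family, normalized to integrate to 1 on (0, 1)."""
    # sqrt(I(theta)) = 1 / sqrt(theta (1 - theta)); its total mass on (0, 1) is pi,
    # so the normalized density is that of a Beta(1/2, 1/2) distribution.
    return 1.0 / (np.pi * np.sqrt(theta * (1.0 - theta)))

# Numerical check of the normalization. The substitution theta = sin(phi)^2
# removes the integrable singularities at theta = 0 and theta = 1.
n = 10000
phi = (np.arange(n) + 0.5) * (np.pi / 2) / n   # midpoints of (0, pi/2)
theta = np.sin(phi) ** 2
jacobian = 2.0 * np.sin(phi) * np.cos(phi)      # d theta / d phi
total_mass = np.sum(rao_prior_density(theta) * jacobian) * (np.pi / 2) / n
print(total_mass)  # should be close to 1
```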
References
- Amari, S. (1985), Differential Geometric Methods in Statistics, Lecture Notes in Statistics 28, Berlin, Springer-Verlag.
- Brody, D.C. and Hughston, L.P. (1998), Proc. Roy. Soc. London A, 454, 2455-2475.
- Brody, D.C. and Hughston, L.P. (2001), Proc. Roy. Soc. London A, 457, 1343-1367.
- Crooks, G.E. (2007), Physical Review Letters, 99, 100602, 1-4.
- Kass, R.E. and Vos, P.W. (1997), Geometrical Foundations of Asymptotic Inference.
- Maybank, S.J. (2003), Proc. Roy. Soc. London A, 459, 1829-1849.
- Maybank, S.J. (2004), IEEE Trans. Pattern Anal. Mach. Intell., 26, 1579-1589.
- Maybank, S.J. (2005), Int. J. Comp. Vision, 63, 191-206.
- Maybank, S.J. (2007), Int. J. Comp. Vision, 72, 287-307.
- Maybank, S.J. (2008a), Neurocomputing, 71, 2033-2028.
- Maybank, S.J. (2008b), Mathematics Today, December 2008, 7-9.
- Jensen, U. (1992), Herleitung, Berechnung und ökonomische Anwendung von Rao-Distanzen, Verlag Joseph Eul.
Internal references
- Lawrence M. Ward (2008) Attention. Scholarpedia, 3(10):1538.
- Calyampudi Radhakrishna Rao (2008) Cramér-Rao bound. Scholarpedia, 3(8):6533.
- Calyampudi Radhakrishna Rao (2008) Rao-Blackwell theorem. Scholarpedia, 3(8):7039.
Further reading
- Burbea, J. and Rao, C.R. (1982), J. Multivariate Analysis, 12, 575-596.
- Burbea, J. and Rao, C.R. (1984), Prob. Math. Stat., 3, 241-258.
- Mayer-Wolf, E. (1990), Ann. Prob., 18, 840-850.
External links
- C.R. Rao's web page
- Chronology of probabilists and statisticians