Second order efficiency
Calyampudi Radhakrishna Rao (2009), Scholarpedia, 4(3):7084. doi:10.4249/scholarpedia.7084
The Fisher–Rao Theorem provides an asymptotic bound on the loss of information incurred in replacing the sample by an estimator of the unknown parameter. The Rao Theorem provides a lower bound on the asymptotic variance of an estimator up to terms of \(O(1/n^2)\ .\)
Definition
Denote by \(L(X,\theta)\) the likelihood based on a sample of \(n\!\) independent observations, \(x_1,\cdots,x_n\) which we represent by \(X\!\ .\) Further let
\[ Z(\theta) = \frac{\partial \log L}{\partial \theta}, \quad ni(\theta) = V \left[Z(\theta) \right] \]
where \(ni(\theta)\) is Fisher information in the sample. Let \(T\) be an estimator of \(\theta\) and \(M(T,\theta)\) be the likelihood based on \(T\ .\) Define:
\[ ni_T = V \left[ \frac{\partial \log M}{\partial \theta} \right] \]
as the information contained in \(T\ .\) Fisher (1925) considered
\[ E^\prime = \lim_{n \to \infty} n(i-i_T) \]
as a measure of the efficiency of \(T\) and showed that the maximum likelihood estimator has the minimum value of \(E^\prime\) in the estimation of the parameter of a \(k\)-category multinomial distribution, with cell probabilities as functions of \(\theta\ .\)
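As a simple illustration (not part of the original formulation), consider \(n\) independent Bernoulli observations with success probability \(\theta\) and take \(T=\bar{x}\ ,\) the maximum likelihood estimator. Then
\[ Z(\theta) = \sum_{j=1}^n \frac{x_j-\theta}{\theta(1-\theta)}, \quad ni(\theta) = \frac{n}{\theta(1-\theta)}, \]
and since \(T\) is sufficient (the likelihood based on \(T\) is that of \(nT \sim \mathrm{Binomial}(n,\theta)\)), it carries the full information, \(i_T = i\ ,\) so that \(E^\prime = 0\ .\) An estimator that is not sufficient would in general have \(E^\prime > 0\ .\)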
Rao (1961) defined first order efficiency of an estimator \(T\) as the property
\[ \mid n^{-1/2} Z(\theta) - \alpha - \beta n^{1/2} (T-\theta) \mid \to 0 \] in probability as \(n \to \infty\ ,\) for an appropriate choice of \(\alpha\) and \(\beta\ ,\) which, as Doob (1934) showed, implies \(i_T \to i\) as \(n \to \infty\ .\)
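For instance (a standard expansion, not spelled out in the original), under the usual regularity conditions the maximum likelihood estimator \(\hat{\theta}\) satisfies
\[ n^{-1/2} Z(\theta) = i\, n^{1/2}(\hat{\theta}-\theta) + o_p(1), \]
so that the first order efficiency condition holds with \(\alpha = 0\) and \(\beta = i\ .\)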
Rao (1961) defined the second order efficiency as
\[ E = \min_{\lambda} V_a \left[ Z(\theta) - n^{1/2}\alpha - n\beta(T-\theta) -n\lambda(T-\theta)^2 \right] \]
where \(V_a\) stands for asymptotic variance.
The two concepts of Fisher (1925) and of Rao (1961) are similar, but in particular cases \(E\) and \(E^\prime\) may not be the same, as pointed out by Efron (1975). However, Fisher (1925) reported \(E\) as his computation of \(E^\prime\ .\) In the case of a multinomial distribution with class probabilities \(\pi_1(\theta),\pi_2(\theta),\cdots,\pi_k(\theta)\ ,\) \(E\) has the lower bound obtained by Fisher and Rao
\[\tag{1} \frac{\mu_{02}-2\mu_{21}+\mu_{40}}{i} - i - \frac{\mu_{11}^2 + \mu_{30}^2 - 2\mu_{11}\mu_{30}}{i^2} \]
where
\[ \mu_{rs} = \sum \pi_j \left( \frac{\pi^\prime_j}{\pi_j} \right)^r\left( \frac{\pi^{\prime\prime}_j}{\pi_j} \right)^s \]
which is attained by the maximum likelihood estimator. Efron (1975) called the result (1) the Fisher–Rao Theorem. He extended the computations to the exponential family and identified the expression \(E\) as the curvature of the family of distributions at \(\theta\!\ .\) In another paper, Rao (1961) obtained the expansion of the asymptotic variance of a consistent estimator, corrected for bias of \(O(1/n)\ ,\) up to terms of \(O(1/n^2)\) as
\[\tag{2} \frac{1}{ni} + \frac{\phi}{n^2} + o(1/n^2) \]
and showed that the minimum value of \(\phi\) is
\[\tag{3} \frac{E}{i^2} + \frac{\mu_{11}^2}{2i^4} \]
which is attained by the maximum likelihood estimator (MLE). Ghosh and Subramanyam (1974) identified the results (2) and (3) as the Rao Theorem. They clarified the computations of \(E^\prime\) and \(E\) and extended the results to the exponential family of distributions.
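The following is a minimal numerical sketch (not from the original article) of how the lower bound (1), and from it the minimum value (3) of \(\phi\ ,\) can be evaluated for a given multinomial family. The Hardy–Weinberg trinomial with cell probabilities \(\theta^2, 2\theta(1-\theta), (1-\theta)^2\) is used purely as a hypothetical example; note that \(i = \mu_{20}\ .\)

```python
import numpy as np

# Hypothetical example (not from the article): Hardy-Weinberg trinomial with
# cell probabilities pi_1 = t^2, pi_2 = 2t(1-t), pi_3 = (1-t)^2.
def cells(t):
    pi   = np.array([t**2, 2*t*(1 - t), (1 - t)**2])   # pi_j(theta)
    dpi  = np.array([2*t, 2 - 4*t, 2*t - 2])            # pi'_j(theta)
    ddpi = np.array([2.0, -4.0, 2.0])                   # pi''_j(theta)
    return pi, dpi, ddpi

def mu(r, s, t):
    """mu_{rs} = sum_j pi_j (pi'_j / pi_j)^r (pi''_j / pi_j)^s."""
    pi, dpi, ddpi = cells(t)
    return np.sum(pi * (dpi / pi)**r * (ddpi / pi)**s)

def fisher_rao_bound(t):
    """Lower bound (1) for the second order efficiency E; here i = mu_{20}."""
    i = mu(2, 0, t)
    return ((mu(0, 2, t) - 2*mu(2, 1, t) + mu(4, 0, t)) / i
            - i
            - (mu(1, 1, t)**2 + mu(3, 0, t)**2 - 2*mu(1, 1, t)*mu(3, 0, t)) / i**2)

def phi_min(t):
    """Minimum value (3) of phi in the variance expansion (2)."""
    i, E = mu(2, 0, t), fisher_rao_bound(t)
    return E / i**2 + mu(1, 1, t)**2 / (2 * i**4)

theta = 0.3
print("E lower bound:", fisher_rao_bound(theta))
print("min phi      :", phi_min(theta))
```

Replacing the three functions in `cells` by the probabilities and derivatives of any other one-parameter multinomial family gives the corresponding bounds.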
History
After Fisher (1922) introduced maximum likelihood as a general method of estimation of unknown parameters, asserting that it provides estimators which are consistent and have the least asymptotic variance, several papers appeared questioning Fisher's claims. Examples were given of other methods of estimation which yield estimators with the same or better properties. This motivated the author to make a deeper investigation of the properties of estimators and of methods of estimation. In a series of papers, Rao (1960, 1961, 1962, 1963) introduced the concepts of Fisher consistency (which places a restriction on the estimating function), first order efficiency, second order efficiency, and correction for bias up to \(O(1/n)\ .\) These concepts bring out maximum likelihood estimates as having better properties than those obtained by other proposed methods.
Applications
Second Order Efficiency (SOE) provides an effective measure for choosing an estimator that gives the best possible summary of the data for drawing inference. Berkson (1955) claimed that the minimum logit chi-square estimator performs better than the maximum likelihood (ML) estimator. Ghosh and Subramanyam (1974) showed that the ML estimator corrected for bias has better performance in terms of SOE.
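As an informal illustration (a minimal simulation sketch, not from the original article), one can compare the finite-sample bias and mean squared error of the two estimators in a simple one-parameter logistic model at hypothetical design points. The 0.5 continuity correction used for the empirical logits is a simplifying assumption; this sketch does not compute second order efficiency itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setting (not from the article): one-parameter logistic model
# P(y = 1 | x) = 1 / (1 + exp(-theta * x)) at fixed design points,
# with m Bernoulli trials per point.
x = np.array([-1.0, -0.5, 0.5, 1.0])
theta_true, m, n_sim = 1.0, 50, 2000

def mle(counts):
    """Maximum likelihood estimate of theta via Newton's method on the score."""
    t = 0.0
    for _ in range(50):
        p = 1.0 / (1.0 + np.exp(-t * x))
        score = np.sum(x * (counts - m * p))      # d log L / d theta
        info = np.sum(m * x**2 * p * (1 - p))     # Fisher information
        t += score / info
    return t

def min_logit_chisq(counts):
    """Berkson-style minimum logit chi-square: weighted least squares of
    empirical logits on x, with a 0.5 continuity correction (an assumption)."""
    p_hat = (counts + 0.5) / (m + 1.0)
    logit = np.log(p_hat / (1 - p_hat))
    w = m * p_hat * (1 - p_hat)                   # approximate inverse variances
    return np.sum(w * x * logit) / np.sum(w * x**2)

est_ml, est_mlc = [], []
for _ in range(n_sim):
    p = 1.0 / (1.0 + np.exp(-theta_true * x))
    counts = rng.binomial(m, p)
    est_ml.append(mle(counts))
    est_mlc.append(min_logit_chisq(counts))

for name, e in [("ML", est_ml), ("min logit chi-square", est_mlc)]:
    e = np.asarray(e)
    print(f"{name:22s} bias={e.mean()-theta_true:+.4f}  mse={np.mean((e-theta_true)**2):.5f}")
```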
References
- Berkson, J. (1955), J. Am. Statist. Ass., 50, 130-136.
- Efron, B. (1975), Ann. Statist., 3, 1189-1242.
- Clarke, B. and Ghosal, S. (2008), IMS Collections, 3, 1-18.
- Fisher, R.A. (1925), Proc. Camb. Phil. Soc., 22, 700-724.
- Fisher, R.A. (1922), Phil. Trans. Roy. Soc. London, Series A, 222, 309-368.
- Ghosh, J. and Subramanyam, K. (1974), Sankhya, A, 36, 325-358.
- Ghosh, J. and Sinha, B.K. (1982), Calcutta Statist. Assoc. Bull., 31, 151-158.
- Ghosh, J.K., Sinha, B.K. and Wieand (1980), Annals of Statistics.
- Ghosh, J.K., Sinha, B.K. and Joshi (1982), Proc. Third Purdue Symposium.
- Ghosh, J.K. (1994), Higher Order Asymptotics, IMS Monograph, 4.
- Rao, C.R. (1945), Bull. Calcutta Math. Soc., 37, 81-91.
- Rao, C.R. (1960), Proc. 32nd Session of ISI.
- Rao, C.R. (1961), Sankhya, 24, 73-102.
- Rao, C.R. (1961), Proc. 4th Berkeley Symposium, 1, 531-546.
- Rao, C.R. (1962), J. Roy. Statist. Soc. B, 24, 46-63.
- Rao, C.R. (1963), Sankhya, 25, 189-206.
Further reading
There is an extensive literature arising out of the work of Fisher and Rao on SOE. The main contributors are Efron (1975), the authors who contributed to the discussion of Efron's paper, and Ghosh with a number of collaborators listed in Clarke and Ghosal (2008). These papers raise a number of questions, some of which are not yet resolved. Ghosh and Sinha (1982) showed that the MLE does not have third order efficiency. See Rao (1945) for some basic results on estimation.

Rao's theorem has been extended well beyond curved exponential families, using Bayesian methods. Ghosh and Subramanyam (1974) show that Bayes estimates can be approximated up to second order by a function of the MLE alone; the derivatives at the MLE are not required. (Such results do not hold for Bayes tests.) It is conjectured there that this can be used to prove Rao's Theorem under general regularity conditions and for loss functions more general than squared error. This program is implemented in Ghosh, Sinha and Wieand (1980). A slightly different proof is offered in Ghosh, Sinha and Joshi (1982). Essentially the same result was obtained by Takeuchi and Akahira, and by Bickel, Götze and van Zwet. References to all the above papers are available in the monograph by Ghosh (1994) on Higher Order Asymptotics.

It may be noted that Rao's second order efficiency is usually called third order efficiency by other authors. If one considers the asymptotic expansion of the expected squared error loss of an estimator up to \(O(1/n^2)\ ,\) one gets two terms in powers of \(1/n\ ,\) hence the term second order. If one approaches through Edgeworth expansions, one has three terms in powers of \(1/\sqrt{n}\ ,\) hence third order. Second order efficiency in this latter sense is different from Rao's.
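Schematically (the notation here is illustrative, not from the original), the two counting conventions correspond to
\[ E_\theta\left[(T-\theta)^2\right] = \frac{a_1(\theta)}{n} + \frac{a_2(\theta)}{n^2} + o(n^{-2}), \]
two terms in powers of \(1/n\) (hence "second order"), versus an Edgeworth-type expansion of the distribution,
\[ P_\theta\left(\sqrt{ni}\,(T-\theta) \le z\right) = \Phi(z) + \frac{q_1(z,\theta)}{\sqrt{n}} + \frac{q_2(z,\theta)}{n} + o(n^{-1}), \]
three terms in powers of \(1/\sqrt{n}\) (hence "third order").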