Cramér-Rao bound

Calyampudi Radhakrishna Rao (2008), Scholarpedia, 3(8):6533. doi:10.4249/scholarpedia.6533

Curator: Calyampudi Radhakrishna Rao

The Cramér-Rao bound is an inequality that provides a lower bound on the variance of an unbiased estimator of a deterministic parameter.

Definition

Let $$f(x, \theta)$$ be the probability density at observed data $$x\!\ ,$$ where $$\theta = ( \theta_1,\cdots, \theta_p )$$ is an unknown p-vector parameter. The $$p \times p$$ information matrix on $$\theta$$ is defined by $$I(\theta) = (i_{rs})$$ where $$i_{rs} = E\left[\left(\frac{\partial \log f}{\partial \theta_r}\right)\left(\frac{\partial \log f}{\partial \theta_s}\right)\right]$$ and $$E$$ stands for expectation.
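The definition of $$I(\theta)$$ as the expected outer product of the score vector can be checked numerically; a minimal sketch (my own illustration, not from the article) for a single observation from $$N(\mu, \sigma^2)$$ with $$\theta = (\mu, \sigma^2)\ ,$$ where the closed form is $$\mathrm{diag}(1/\sigma^2,\ 1/(2\sigma^4))\ :$$

```python
import numpy as np

# Monte Carlo estimate of the information matrix I(theta) = E[score score']
# for one observation x ~ N(mu, sigma^2), theta = (mu, sigma^2).
# Known closed form: diag(1/sigma^2, 1/(2 sigma^4)).
rng = np.random.default_rng(1)
mu, sigma2 = 1.0, 2.0
x = rng.normal(mu, np.sqrt(sigma2), size=1_000_000)

score_mu = (x - mu) / sigma2                               # d log f / d mu
score_s2 = -0.5 / sigma2 + (x - mu) ** 2 / (2 * sigma2**2)  # d log f / d sigma^2
scores = np.stack([score_mu, score_s2])                    # shape (2, N)

I_hat = scores @ scores.T / x.size   # empirical E[score score']
I_exact = np.diag([1 / sigma2, 1 / (2 * sigma2**2)])
print(np.round(I_hat, 3))            # close to I_exact = diag(0.5, 0.125)
```

For a sample of $$n$$ independent observations the information is $$n$$ times this matrix, which is the form used in the Applications section below.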

Let $$T(x) = (T_1(x),\cdots, T_p(x))$$ be an unbiased estimator of $$\theta = (\theta_1,\cdots,\theta_p)\ ,$$ and let $$C(T) = (j_{rs})\ ,$$ $$j_{rs} = E[(T_r - \theta_r)(T_s - \theta_s)]\ ,$$ be its covariance matrix. The Cramér-Rao bound in a general form is

The $$p \times p$$ matrix $$C(T) - I^{-1}$$ is non-negative definite.

In case $$\theta$$ is a one dimensional parameter, the bound becomes $\tag{1} V(T) \ge 1/I$

where $$I = E\left[\left(\frac{\partial \log f}{ \partial \theta}\right)^2\right]\ .$$ More generally, if $$E(T_i) = g_i(\theta)$$ and $$G(\theta)$$ is the matrix with the $$(r,s)$$ term as $$\frac{\partial g_r(\theta )} {\partial \theta_s}$$ then (Bull. Cal. Math. Soc., 1945)

$C(T) - G I^{-1} G'$ is nonnegative definite

The result holds even when $$I$$ is singular, with the inverse of $$I$$ replaced by a generalised inverse. Denoting $$I^{-1}=(i^{rs})$$ and using the multiparameter CRB ($$C(T)-I^{-1}$$ nonnegative definite), we have the CRB for a single parameter when there are other unknown parameters: $V(T_r)=E(T_r-\theta_r)^2\ge i^{rr}\ge 1/i_{rr}\ ,$ so that the lower bound for an unbiased estimate of $$\theta_r$$ is possibly higher when the other parameters are unknown. Using the general CRB ($$C(T)-GI^{-1}G^\prime$$ nonnegative definite) and writing $$g(\theta)-\theta=(g_1(\theta)-\theta_1,\ldots,g_p(\theta)-\theta_p)$$ for the bias of $$T=(T_1,\ldots,T_p)\ ,$$ the mean square error matrix satisfies $M(T)=E[(T-\theta)(T-\theta)^\prime]=C(T)+(g(\theta)-\theta)(g(\theta)-\theta)^\prime\ ,$ so that $$M(T)-(g(\theta)-\theta)(g(\theta)-\theta)^\prime-GI^{-1}G^\prime$$ is nonnegative definite, which is the CRB for the mean square error of $$T$$ as an estimate of $$\theta=(\theta_1,\ldots,\theta_p)$$ (Rao, 1952).
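The one-parameter bound (1) can be illustrated numerically; a minimal sketch using a Bernoulli example of my own (not from the article), where the sample proportion attains the bound exactly:

```python
# For a sample of size n from Bernoulli(p), the score per observation is
# x/p - (1-x)/(1-p), giving Fisher information I(p) = n / (p (1 - p)).
# The sample proportion is unbiased with variance p (1 - p) / n = 1/I,
# so it attains the Cramér-Rao bound (1) exactly.
def fisher_info_bernoulli(p, n):
    # I = E[(d/dp log f)^2] summed over n independent observations
    return n / (p * (1.0 - p))

def var_sample_proportion(p, n):
    # variance of the unbiased estimator (x_1 + ... + x_n) / n
    return p * (1.0 - p) / n

p, n = 0.3, 50
crb = 1.0 / fisher_info_bernoulli(p, n)
assert abs(var_sample_proportion(p, n) - crb) < 1e-15
```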

History

Figure 1: Article in the newspaper Times of India, 1988.

The origin of the Cramér-Rao Bound (CRB), as reported in the newspaper Times of India, dated December 31, 1988 (under the heading "The top ten greatest contributions to Indian science"), is as follows. "At the young age of 24, Calyampudi Radhakrishna Rao was giving a course on estimation to Master's students of Calcutta University. He proved in his class a result first obtained by R.A. Fisher regarding the lower bound for the variance of an estimator for large samples. When a student asked, 'why don't you prove it for finite samples?', Rao went back home, worked all night and next day proved what is now known as the Cramér-Rao inequality for finite samples." About the same time, the CRB for one parameter was reported in a paper by Fréchet (1943).

Applications

The CRB is useful in determining whether a given estimator has the minimum variance, or how close it is to the best possible one. For instance, if the sample is from a normal distribution, $$N( \mu, \sigma^2)\ ,$$ then $I = \begin{pmatrix} \frac{n}{\sigma^2} & 0 \\ 0 & \frac{n}{2 \sigma^4} \end{pmatrix}$ and $I^{-1} = \begin{pmatrix} \frac{\sigma^2}{n} & 0 \\ 0 & \frac{2 \sigma^4}{n} \end{pmatrix}$

If $$\overline{x} = \frac{ x_1+x_2+ \ldots +x_n }{ n }$$ is an estimator of $$\mu\ ,$$ then $V \left ( \overline{x} \right ) = \frac{\sigma^2}{n} = CRB$

so that $$\overline{x}$$ is the best possible estimator of the parameter $$\mu$$ in terms of variance.

If $$s^2 =\frac{\left(x_1 - \overline{x}\right)^2 + \ldots + \left(x_n-\overline{x} \right)^2}{n-1} \ ,$$ is an estimate of $$\sigma^2\ ,$$ then $V \left( s^2 \right) = \frac{2 \sigma^4}{n-1} > \frac{2 \sigma^4}{n} (CRB).$ However, the ratio $$\frac{CRB}{V \left( s^2 \right)} =\frac{n-1}{n}\ ,$$ which is close to 1 as $$n$$ becomes large.
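The two comparisons above can be checked by simulation; a sketch of my own (the parameter values $$\mu = 0\ ,$$ $$\sigma = 2\ ,$$ $$n = 10$$ are arbitrary):

```python
import numpy as np

# Monte Carlo check: for N(mu, sigma^2) samples of size n, the variance of
# xbar matches its CRB sigma^2/n, while the variance of the unbiased s^2 is
# 2 sigma^4/(n-1), slightly above its CRB 2 sigma^4/n.
rng = np.random.default_rng(0)
mu, sigma, n, reps = 0.0, 2.0, 10, 200_000
x = rng.normal(mu, sigma, size=(reps, n))

xbar = x.mean(axis=1)          # unbiased estimator of mu
s2 = x.var(axis=1, ddof=1)     # unbiased estimator of sigma^2

crb_mu = sigma**2 / n          # CRB for mu
crb_s2 = 2 * sigma**4 / n      # CRB for sigma^2

print(xbar.var(), crb_mu)                         # nearly equal: xbar attains the bound
print(s2.var(), 2 * sigma**4 / (n - 1), crb_s2)   # s2 misses the bound by the factor n/(n-1)
```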

The CRB, although originally introduced in estimation theory, has found many applications in statistical inference and other areas. It has been found useful in proving certain propositions in limit theorems, asymptotic inference, decision theory, signal processing and density estimation. Van Trees, in his book "Detection, Estimation and Modulation Theory" (Van Trees, 1968), presented a global CRB $\tag{2} E(x-E(x|y))^2 \ge \frac{1}{E\left[\left(\frac{\partial}{\partial x} \log p(x,y)\right)^2\right]}$

where $$p(x,y)$$ is the joint density of $$(x,y)\ .$$ The above inequality and its generalization to the multivariate case have been used in bounding Bayes risk (Bobrovsky et al., 1987).
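In the simplest Gaussian case both sides of (2) are available in closed form and the bound holds with equality; a sketch under a model of my own choosing, $$x \sim N(0, \tau^2)$$ and $$y \mid x \sim N(x, \sigma^2)\ :$$

```python
import numpy as np

# Van Trees bound (2) in a Gaussian model: x ~ N(0, tau2), y | x ~ N(x, sigma2).
# Then d/dx log p(x, y) = (y - x)/sigma2 - x/tau2, whose mean square is
# 1/sigma2 + 1/tau2, while the Bayes MSE E(x - E(x|y))^2 equals the posterior
# variance (1/sigma2 + 1/tau2)^(-1), so (2) holds with equality here.
tau2, sigma2 = 4.0, 1.0
rng = np.random.default_rng(2)
x = rng.normal(0.0, np.sqrt(tau2), size=500_000)
y = x + rng.normal(0.0, np.sqrt(sigma2), size=x.size)

score = (y - x) / sigma2 - x / tau2        # d/dx log p(x, y) for this model
bound = 1.0 / np.mean(score**2)            # right side of (2), Monte Carlo

mmse = 1.0 / (1.0 / tau2 + 1.0 / sigma2)   # left side of (2): posterior variance
print(mmse, bound)                          # nearly equal
```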

An interesting result due to A.J. Stam (Stam, 1959) is the derivation of the Weyl-Heisenberg uncertainty principle in physics using a specific version of the CRB. Further applications in physics of the CRB and of Fisher information, as a concept underlying well known physical theories, can be found in the book by B. Roy Frieden, "Physics from Fisher Information" (Frieden, 1998).

Recent developments include the quantum Cramér-Rao bound in the estimation of manifolds in quantum physics, by Brody and Hughston (1998), and the concept of the Cramér-Rao functional based on the Cramér-Rao bound, by Mayer-Wolf (1990). The CRB has also been extended to the estimation of "manifolds" as "complexified and intrinsic" CRBs and used in signal processing (Smith, 2005).

References

• BZ Bobrovsky, E Mayer-Wolf and M Zakai. Ann. Statist., 15, 1421-1438, 1987.
• DC Brody and LP Hughston. Proc. Roy. Soc., 454, 2445-2475, 1998.
• M Fréchet. Revue Inst. de Stat., 11, 182-205, 1943.
• B Roy Frieden. Physics from Fisher Information. Cambridge University Press, 1998.
• E Mayer-Wolf. Ann. Prob., 18, 840-850, 1990.
• CR Rao. Bull. Cal. Math. Soc., 37, 81-91, 1945.
• CR Rao. Advanced Statistical Methods in Biometric Research. Wiley, 1952.
• ST Smith. IEEE Transactions on Signal Processing, 1597-1609 and 1610-1630, 2005.
• AJ Stam. Information and Control, 2, 101-112, 1959.
• HL Van Trees. Detection, Estimation and Modulation Theory, Part 1. Wiley, 1968.

Further reading

• A Bera, ET Interview with CR Rao, Econometric Theory (2003), 19, 329-398.
• MH Degroot, A conversation with CR Rao, Statistical Science (1987), 53-67.
• CR Rao, Linear Statistical Inference and its Applications, Wiley, 1973.
• T Soderstrom and P Stoica, System Identification, Prentice Hall, 1988.