Chapter 10 Estimation and Precision

10.1 Estimators and their sampling distributions

$Y_i \stackrel{i.i.d}{\sim} N(\mu, \sigma^2), i=1,2,\dots,n$

\begin{aligned} E(\bar{Y}) &= E(\frac{1}{n}\sum Y_i) \\ &= \frac{1}{n}E(\sum Y_i) \\ &= \frac{1}{n}\sum E(Y_i) \\ &= \frac{1}{n}n\mu = \mu \\ Var(\bar{Y}) &= Var(\frac{1}{n}\sum Y_i) \\ \because Y_i \;\text{are} &\; \text{independent} \\ &= \frac{1}{n^2}\sum Var(Y_i) \\ &= \frac{1}{n^2} n Var(Y_i) \\ &= \frac{\sigma^2}{n} \end{aligned}
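The two results above can be checked numerically. The following is a minimal simulation sketch, not part of the original derivation: the helper name `simulate_sample_means` and the values of `mu`, `sigma`, `n` and `reps` are illustrative assumptions. It draws many samples from $$N(\mu, \sigma^2)$$ and compares the empirical mean and variance of $$\bar{Y}$$ with $$\mu$$ and $$\sigma^2/n$$.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Return the means of `reps` independent samples of size n from N(mu, sigma^2)."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

mu, sigma, n = 5.0, 2.0, 16  # illustrative values, not from the text
means = simulate_sample_means(mu, sigma, n, reps=20000)

# Empirical counterpart of E(Ybar); compare with mu
print("mean of sample means:", statistics.fmean(means))
# Empirical counterpart of Var(Ybar); compare with sigma**2 / n
print("variance of sample means:", statistics.pvariance(means))
```

With a fixed seed the run is reproducible; the two printed values should sit close to $$\mu$$ and $$\sigma^2/n$$, with the gap shrinking as `reps` grows.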

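Standardise $$\bar{Y}$$ by defining

$Z = \frac{\bar{Y}-\mu}{\sqrt{Var(\bar{Y})}}$

Because $$\bar{Y}$$ is a linear combination of independent normal variables, it is itself normally distributed, and so is $$Z$$:

\begin{aligned} E(Z) &= \frac{1}{\sqrt{Var(\bar{Y})}}E[\bar{Y}-\mu] \\ &= \frac{1}{\sqrt{Var(\bar{Y})}}[\mu-\mu] = 0 \\ Var(Z) &= \frac{1}{Var(\bar{Y})}Var[\bar{Y}-\mu] \\ &= \frac{1}{Var(\bar{Y})}Var(\bar{Y}) =1 \\ \therefore Z \;&\sim N(0,1) \end{aligned}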

$Prob(\bar{Y} > \mu+c) = 0.025 \\ Prob(\bar{Y} < \mu-c) = 0.025$

$L=\bar{Y}-c \Rightarrow Prob(L>\mu)=0.025 \\ U=\bar{Y}+c \Rightarrow Prob(U<\mu)=0.025$

\begin{aligned} Prob(\bar{Y}>\mu+c)=Prob(\bar{Y}-\mu>c) \;&= 0.025 \\ \Rightarrow Prob\left(\frac{\bar{Y}-\mu}{\sqrt{Var(\bar{Y})}} > \frac{c}{\sqrt{Var(\bar{Y})}}\right) \;&= 0.025 \\ \Rightarrow Prob\left(Z>\frac{c}{\sqrt{Var(\bar{Y})}}\right) \;&= 0.025 \\ \text{we have proved}\; Z &\sim N(0,1) \\ \text{we also know}\; Prob(Z>1.96) \;&= 0.025 \\ \text{so let}\; \frac{c}{\sqrt{Var(\bar{Y})}} &= 1.96 \\ \Rightarrow c &= 1.96\sqrt{Var(\bar{Y})} \\ \text{the 95\% confidence interval for}\;&\text{the population mean}\;\mu\;\text{is}\\ \bar{Y}\pm1.96\sqrt{Var(\bar{Y})} &= \bar{Y}\pm 1.96\frac{\sigma}{\sqrt{n}} \end{aligned}
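The interval $$\bar{Y}\pm1.96\,\sigma/\sqrt{n}$$ can be computed directly when $$\sigma$$ is known. The sketch below is an illustration under that assumption; the function name `ci95_known_sigma` and the simulation settings are hypothetical choices, not from the text. It also estimates how often the interval contains the true $$\mu$$ across repeated samples.

```python
import math
import random
import statistics

def ci95_known_sigma(sample, sigma):
    """95% confidence interval for the mean, assuming sigma is known."""
    n = len(sample)
    ybar = statistics.fmean(sample)
    c = 1.96 * sigma / math.sqrt(n)  # c = 1.96 * sqrt(Var(Ybar))
    return ybar - c, ybar + c

rng = random.Random(1)
mu, sigma, n, reps = 10.0, 3.0, 25, 10000  # illustrative values

covered = 0
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    lower, upper = ci95_known_sigma(sample, sigma)
    if lower <= mu <= upper:
        covered += 1

coverage = covered / reps
print("empirical coverage:", coverage)  # compare with the nominal 0.95
```

The empirical coverage is the proportion of simulated intervals that contain $$\mu$$; it is the interval that varies from sample to sample, not $$\mu$$.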

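10.2 Properties of estimators

1. What determines whether an estimator is good, and whether it is practical to use?
2. When alternative estimators are available, how should we choose among them?
3. When the situation is complex, how do we find a suitable estimator?

10.2.1 Bias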

$E(T)=\theta$

$bias(T) = E(T)-\theta$

$T\;\text{is an}\;\textbf{unbiased}\;\text{estimator for}\;\theta\;\text{if}\\ E(T)=\theta\\ T\;\text{is an}\;\textbf{asymptotically unbiased}\;\text{estimator for}\;\theta\;\text{if}\\ \lim_{n\rightarrow\infty}E(T)=\theta$

10.2.3 Relative efficiency of the mean and the median

For normally distributed data and large $$n$$, the sample median $$\dot{Y}$$ has variance

$Var(\dot{Y})\approx\frac{\pi}{2}\frac{\sigma^2}{n}\approx1.571\frac{\sigma^2}{n}$

$\frac{Var(\dot{Y})}{Var(\bar{Y})}\approx1.571$
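The ratio of roughly 1.571 can be illustrated by simulation. This is a rough sketch under assumed settings (normal data, `n = 101`, `reps = 5000`, all illustrative): for each simulated sample it records both the mean and the median, then compares their empirical variances.

```python
import random
import statistics

rng = random.Random(2)
sigma, n, reps = 1.0, 101, 5000  # illustrative values; odd n gives a unique median

means, medians = [], []
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Empirical Var(median) / Var(mean); compare with pi / 2
ratio = statistics.pvariance(medians) / statistics.pvariance(means)
print("variance ratio (median / mean):", ratio)
```

Because the $$\pi/2$$ factor is a large-sample result, the simulated ratio for a finite $$n$$ will only be close to it, not equal.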

10.2.4 Mean square error (MSE)

$MSE(T)=E[(T-\theta)^2]$

\begin{aligned} MSE(T) &= E[(T-\theta)^2] \\ &= E\{[T-E(T)+E(T)-\theta]^2\} \\ &= E\{[T-E(T)]^2+[E(T)-\theta]^2 \\ & \;\;\;\;\; \;\;+2[T-E(T)][E(T)-\theta]\} \\ &= E\{[T-E(T)]^2\}+E\{[E(T)-\theta]^2\} + 0\\ & \;\;\;\;\; \;\;(\text{note that}\;E(T)-\theta\;\text{is a constant and}\;E[T-E(T)]=0) \\ &= Var(T) + [bias(T)]^2 \end{aligned}
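The decomposition of the MSE into variance plus squared bias can be seen numerically with a deliberately biased estimator. The sketch below uses a hypothetical shrunken mean $$T = 0.9\,\bar{Y}$$ as the estimator of $$\theta$$; this estimator and all numeric settings are assumptions made for illustration only.

```python
import random
import statistics

rng = random.Random(3)
theta, sigma, n, reps = 4.0, 2.0, 10, 20000  # illustrative values
shrink = 0.9  # hypothetical shrinkage factor, chosen to make T biased

estimates = []
for _ in range(reps):
    sample = [rng.gauss(theta, sigma) for _ in range(n)]
    estimates.append(shrink * statistics.fmean(sample))

# Empirical MSE: average squared distance from the true theta
mse = statistics.fmean([(t - theta) ** 2 for t in estimates])
# Empirical variance and bias of T
var_t = statistics.pvariance(estimates)
bias_t = statistics.fmean(estimates) - theta

print("MSE:", mse)
print("Var + bias^2:", var_t + bias_t ** 2)
```

The two printed numbers agree up to floating-point rounding, because the same algebraic identity holds for the empirical moments as for the expectations.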

10.3 Estimating the population variance; degrees of freedom

$V_{\mu}=\frac{1}{n}\sum_{i=1}^n(Y_i-\mu)^2$

\begin{aligned} V_{\mu} &= \frac{1}{n}\sum_{i=1}^n(Y_i-\mu)^2 \\ \text{we need to prove}\;& E(V_{\mu}) = \sigma^2 \\ \Rightarrow E(V_{\mu}) &= \frac{1}{n}\sum_{i=1}^nE[(Y_i-\mu)^2] \\ &= \frac{1}{n}\sum_{i=1}^nVar(Y_i) \\ &= \frac{1}{n}\sum_{i=1}^n\sigma^2 \\ &= \sigma^2 \end{aligned}

$V_{n}=\frac{1}{n}\sum_{i=1}^n(Y_i-\bar{Y})^2$

$V_{n-1}=\frac{1}{n-1}\sum_{i=1}^n(Y_i-\bar{Y})^2=\frac{n}{n-1}V_n$

\begin{aligned} V_{\mu} &= \frac{1}{n}\sum_{i=1}^n(Y_i-\mu)^2 \\ &= \frac{1}{n}\sum_{i=1}^n[(Y_i-\bar{Y})+(\bar{Y}-\mu)]^2 \\ &= \frac{1}{n}\sum_{i=1}^n[(Y_i-\bar{Y})^2+(\bar{Y}-\mu)^2\\ &\;\;\;\;\;\;\;\;\;\;\;\;+2(Y_i-\bar{Y})(\bar{Y}-\mu)]\\ &=\frac{1}{n}\sum_{i=1}^n(Y_i-\bar{Y})^2+\frac{1}{n}\sum_{i=1}^n(\bar{Y}-\mu)^2\\ &\;\;\;\;\;\;\;\;\;\;\;\;+\frac{2}{n}(\bar{Y}-\mu)\sum_{i=1}^n(Y_i-\bar{Y}) \\ &= V_n+(\bar{Y}-\mu)^2 \\ &\;\;\;\;\;\;\;\;\;\;\;\;(\text{note that}\;\sum_{i=1}^n(Y_i-\bar{Y})=0) \\ \Rightarrow V_n &= V_{\mu}-(\bar{Y}-\mu)^2 \\ \therefore E(V_n)&= E(V_{\mu}) - E[(\bar{Y}-\mu)^2] \\ &= Var(Y)-Var(\bar{Y}) \\ &= \sigma^2-\frac{\sigma^2}{n} \\ &= \sigma^2(\frac{n-1}{n}) \end{aligned}

\begin{aligned} E[\frac{n}{n-1}V_n] &= \frac{n}{n-1}E[V_n] =\sigma^2 \\ \Rightarrow E[V_{n-1}] &= \sigma^2 \end{aligned}
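The bias of the divisor-$$n$$ estimator and the unbiasedness of the divisor-$$(n-1)$$ estimator can be checked by simulation. In this sketch, `statistics.pvariance` plays the role of $$V_n$$ and `statistics.variance` the role of $$V_{n-1}$$; the sample size and parameter values are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(4)
mu, sigma, n, reps = 0.0, 2.0, 5, 20000  # illustrative values; sigma^2 = 4

v_n, v_n_minus_1 = [], []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    v_n.append(statistics.pvariance(sample))         # divisor n
    v_n_minus_1.append(statistics.variance(sample))  # divisor n - 1

mean_v_n = statistics.fmean(v_n)
mean_v_n_minus_1 = statistics.fmean(v_n_minus_1)

# Compare with sigma^2 * (n - 1) / n and with sigma^2 respectively
print("average V_n:", mean_v_n)
print("average V_{n-1}:", mean_v_n_minus_1)
```

A small $$n$$ is used on purpose: the factor $$(n-1)/n$$ is far from 1 there, so the downward bias of $$V_n$$ is easy to see.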

10.4 Sampling distribution of the sample variance

$$S^2$$ is the usual notation for the sample variance, replacing the $$V_{n-1}$$ used above.

$S^2=\frac{1}{n-1}\sum_{i=1}^n(Y_i-\bar{Y})^2$

\begin{aligned} Var(S) &=E(S^2)-[E(S)]^2 \\ \Rightarrow [E(S)]^2 &=E(S^2)-Var(S) \\ \because E(S^2) &=\sigma^2 \\ \therefore [E(S)]^2 &=\sigma^2-Var(S) \\ E(S) &=\sqrt{\sigma^2-Var(S)} \le \sigma \end{aligned}
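The derivation above implies that $$S$$ tends to underestimate $$\sigma$$ even though $$S^2$$ is unbiased for $$\sigma^2$$. A minimal simulation sketch, with illustrative values for `sigma`, `n` and `reps`, makes the gap visible for a small sample size.

```python
import random
import statistics

rng = random.Random(5)
sigma, n, reps = 2.0, 5, 20000  # illustrative values

s_values = []
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    s_values.append(statistics.stdev(sample))  # S, using divisor n - 1

mean_s = statistics.fmean(s_values)
mean_s2 = statistics.fmean([s ** 2 for s in s_values])

print("average S:", mean_s)      # expected to fall below sigma
print("average S^2:", mean_s2)   # compare with sigma ** 2
```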

$\frac{n-1}{\sigma^2}S^2\sim \chi_{n-1}^2\\ Var(S^2)=\frac{2\sigma^4}{n-1}$

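$$\chi^2_m$$: the chi-squared distribution with $$m$$ degrees of freedom (Section 11). Its shape is skewed to the right, and it approaches the normal distribution as the degrees of freedom increase.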
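The distributional result for $$S^2$$ can be checked through its first two moments. The sketch below relies on the standard facts that a chi-squared variable with $$m$$ degrees of freedom has mean $$m$$ and variance $$2m$$; the values of `sigma`, `n` and `reps` are illustrative assumptions, and the data are simulated as normal, which the result requires.

```python
import random
import statistics

rng = random.Random(6)
sigma, n, reps = 1.5, 10, 20000  # illustrative values

s2_values = []
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    s2_values.append(statistics.variance(sample))  # S^2, divisor n - 1

# (n - 1) S^2 / sigma^2 should behave like chi-squared with n - 1 df
scaled = [(n - 1) * s2 / sigma ** 2 for s2 in s2_values]
mean_scaled = statistics.fmean(scaled)
var_scaled = statistics.pvariance(scaled)
var_s2 = statistics.pvariance(s2_values)

print("mean of scaled S^2:", mean_scaled)     # compare with n - 1
print("variance of scaled S^2:", var_scaled)  # compare with 2 * (n - 1)
print("variance of S^2:", var_s2)             # compare with 2 * sigma**4 / (n - 1)
```

Matching moments do not prove the full distributional claim, but a clear mismatch would signal an error in the formulae or in the simulation.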