# Chapter 75 Analysis strategy and model checking in survival analysis

## 75.4 Key points in model checking

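1. Does the model as a whole fit the data reasonably well?
2. Are there extreme observations that unduly influence the fitted model?
3. Do the explanatory variables, especially continuous ones, enter the model in the correct functional form?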

## 75.5 Checking the proportional hazards assumption

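1. Plot simple survival curves using non-parametric methods;
2. Use a statistical test to assess whether the effect of an explanatory variable on the hazard interacts with time;
3. Plot residuals.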
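The first approach can be sketched with a small Kaplan-Meier estimator. Under proportional hazards, the curves of log(-log S(t)) for two groups should be roughly parallel when plotted against time (or log time). The function names `kaplan_meier` and `cloglog` and the toy data below are hypothetical, chosen only for illustration; this is a minimal sketch, not a replacement for a survival analysis package.

```python
import math

def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (t, S(t)) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def cloglog(curve):
    """log(-log S(t)) for the points where 0 < S(t) < 1."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]

# Hypothetical toy data: 1 = event, 0 = censored.
group_a = kaplan_meier([2, 3, 5, 7, 8], [1, 1, 0, 1, 1])
group_b = kaplan_meier([4, 6, 9, 10, 12], [1, 0, 1, 1, 0])

for t, value in cloglog(group_a):
    print("A", t, round(value, 3))
for t, value in cloglog(group_b):
    print("B", t, round(value, 3))
```

Plotting the two printed series against time gives the informal visual check; clearly non-parallel or crossing curves suggest that the hazard ratio changes over time.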

### 75.5.1 Statistical tests of the proportional hazards assumption

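Extend the model with an interaction between the explanatory variable and time:

$h(t|x) = h_0(t)\exp\{ \beta x + \gamma (x\times t)\}$

Under proportional hazards the interaction coefficient is zero ($\gamma = 0$), which can be assessed with a Wald test, because under this null hypothesis:

$\frac{\hat\gamma}{SE(\hat\gamma)} \sim N(0,1)$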
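As a minimal sketch of this test, the ratio of the estimated interaction coefficient to its standard error can be compared with the standard normal distribution. The function name `wald_test` and the numbers passed to it are hypothetical; in practice the estimate and its standard error would come from the fitted model.

```python
import math

def wald_test(gamma_hat, se):
    """Return the Wald z statistic and its two-sided p-value under N(0, 1)."""
    z = gamma_hat / se
    # P(|Z| > |z|) for a standard normal Z, via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical estimate and standard error, for illustration only.
z, p_value = wald_test(0.15, 0.05)
print(z, p_value)
```

A small p-value is evidence that the effect of the variable changes with time, that is, evidence against proportional hazards for that variable.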

### 75.5.2 Plotting Schoenfeld residuals

The Schoenfeld residual compares the observed value of the explanatory variable for the case at a given event time with the weighted average of the explanatory variable in the risk set at that time. The residuals should not show any dependence on time; such dependence would indicate that the proportional hazards assumption is not met.

It is actually more convenient to use the “scaled Schoenfeld residuals”. Under the proportional hazards assumption, the scaled Schoenfeld residuals have a mean equal to the true log hazard ratio, and their average value over time can be interpreted as the time-varying log hazard ratio. A plot of the scaled Schoenfeld residuals against time is therefore directly informative about how the log hazard ratio changes over time. It is useful to show a smoothed average curve on these plots.
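To make the definition concrete, here is a minimal sketch of the unscaled Schoenfeld residuals for a single covariate. It assumes no tied event times, takes the coefficient `beta` as given rather than estimating it, and uses hypothetical toy data; the function name `schoenfeld_residuals` is also an illustrative choice. The scaling step is not shown.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Unscaled Schoenfeld residuals for one covariate, as (event time, residual).

    For each subject with an event, the residual is the subject's covariate
    value minus the exp(beta * x)-weighted average of the covariate over the
    risk set (all subjects still under observation at that event time).
    """
    residuals = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # residuals are defined only at event times
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk_set]
        weighted_mean = sum(w * x[j] for w, j in zip(weights, risk_set)) / sum(weights)
        residuals.append((t_i, x[i] - weighted_mean))
    return sorted(residuals)

# Hypothetical toy data: 1 = event, 0 = censored; beta = 0 is an assumed value,
# not a fitted estimate, so the weighted average reduces to a simple mean.
times = [1, 2, 3, 4, 5]
events = [1, 1, 0, 1, 1]
x = [1, 0, 1, 0, 1]
residuals = schoenfeld_residuals(times, events, x, beta=0.0)
for t, r in residuals:
    print(t, round(r, 3))
```

Plotting the residuals against event time, with a smoothed curve added, gives the visual check described above: a trend over time points to non-proportional hazards.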

## 75.6 Other interesting methods for assessing model fit

### 75.6.1 Martingale residuals: assessing the functional form of continuous variables

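Plots of Martingale residuals can be used to check whether a continuous variable has entered the model in the correct form. A continuous variable sometimes needs a quadratic or higher-order polynomial term, or a transformation such as the logarithm, before its relationship with the survival outcome is adequately captured.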

A Martingale residual is a residual for an event process: it is the difference between what happened to a person (whether they had the event or not) and what is predicted to happen to that person under the fitted model. The Martingale residual for individual $$i$$ is:

$r_{M_i} = \delta_i - \hat H_0(t_i)\exp(\hat\beta x_i)$

where $$\delta_i$$ indicates whether individual $$i$$ had the event (1) or was censored (0), $$t_i$$ is the event or censoring time, $$x_i$$ denotes the explanatory variable (or more generally a vector of explanatory variables), and $$\hat H_0(t_i)$$ is the estimated baseline cumulative hazard at time $$t_i$$. If the model is correct, the Martingale residuals should average to 0 and show no systematic trend when plotted against a continuous variable.
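The formula can be sketched for a single covariate as follows. This sketch assumes the Breslow estimator for the baseline cumulative hazard (an assumption; the text does not say which estimator is used) and takes `beta` as given. The function names and toy data are hypothetical. With the Breslow estimator the residuals sum to zero by construction, which makes a convenient sanity check.

```python
import math

def breslow_cumulative_hazard(times, events, x, beta):
    """Return a function H0(t): Breslow estimate of the baseline cumulative hazard."""
    risk_scores = [math.exp(beta * xi) for xi in x]
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    increments = []
    for t_k in event_times:
        n_events = sum(1 for t, d in zip(times, events) if t == t_k and d == 1)
        # Sum of exp(beta * x) over everyone still at risk at t_k.
        at_risk_total = sum(r for t, r in zip(times, risk_scores) if t >= t_k)
        increments.append((t_k, n_events / at_risk_total))

    def cumulative_hazard(t):
        return sum(step for t_k, step in increments if t_k <= t)

    return cumulative_hazard

def martingale_residuals(times, events, x, beta):
    """delta_i - H0(t_i) * exp(beta * x_i) for every individual."""
    h0 = breslow_cumulative_hazard(times, events, x, beta)
    return [d - h0(t) * math.exp(beta * xi) for t, d, xi in zip(times, events, x)]

# Hypothetical toy data: 1 = event, 0 = censored; beta is an assumed value.
times = [1, 2, 3, 4, 5]
events = [1, 1, 0, 1, 1]
x = [1, 0, 1, 0, 1]
r_m = martingale_residuals(times, events, x, beta=0.5)
print([round(r, 3) for r in r_m])
print(abs(sum(r_m)) < 1e-9)
```

To assess functional form, these residuals would be plotted against the continuous variable with a smoother; a clear curve suggests that a transformation or extra polynomial terms may be needed.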

### 75.6.2 Deviance residuals: identifying individuals for whom the model does not provide a good fit

$r_{D_i} = \text{sign}(r_{M_i})[-2\{r_{M_i} + \delta_i\log(\delta_i - r_{M_i})\}]^{\frac{1}{2}}$
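As a small sketch, the transformation from a Martingale residual to a deviance residual can be written directly from the formula. The function name `deviance_residual` is hypothetical, the input values are illustrative, and the term involving the logarithm is taken to be 0 for censored individuals (the usual convention when the event indicator is 0).

```python
import math

def deviance_residual(r_m, delta):
    """Deviance residual from a Martingale residual r_m and event indicator delta.

    Assumes r_m < 1 when delta == 1, so that the logarithm is defined.
    """
    # delta * log(delta - r_m) contributes only for individuals with an event.
    log_term = math.log(delta - r_m) if delta == 1 else 0.0
    inner = -2.0 * (r_m + log_term)
    inner = max(inner, 0.0)  # guard against tiny negative values from rounding
    if r_m == 0:
        return 0.0
    return math.copysign(math.sqrt(inner), r_m)

# Illustrative Martingale residuals (hypothetical values).
d_event = deviance_residual(0.8, 1)       # had the event earlier than predicted
d_censored = deviance_residual(-0.45, 0)  # censored, fewer events than predicted
print(round(d_event, 4), round(d_censored, 4))
```

The transformation makes the residuals more symmetric around 0 than the Martingale residuals, so individuals with large absolute values stand out as poorly fitted.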