class: center, middle, inverse, title-slide # Understand Logistic Regression ### 王 超辰 | Chaochen Wang ### 2019-07-08 17:00~18:00
@CSS
--- class: middle # Objectives - Use a logistic regression model to compare the log odds of disease in two groups `\(\Rightarrow\)` estimate a crude odds ratio for **a binary exposure**. - Use a logistic regression model to compare the odds of disease for a categorical exposure with **2 or more levels**. - Use a logistic regression model to examine the association between an exposure and an outcome **adjusting for confounders**, assuming no effect modification. --- class:middle ## Odds and Odds ratio (OR) - The odds of a disease is defined as: `$$\mathbf{Odds} = \frac{n_{D+}}{n_{D-}}$$` <br> where `\(D+\)` indicates individuals **with disease (such as lung cancer)** <br> `\(D-\)` indicates individuals **without the disease** - The ratio of the odds between groups is a **measure for comparing the frequency of disease:** `$$\mathbf{OR} = \frac{\mathbf{Odds}_{E+}}{\mathbf{Odds}_{E-}}$$` <br> where `\(E+\)` indicates individuals **with exposure (such as smoking)** <br> `\(E-\)` indicates individuals **without the exposure** --- class:middle ## Logistic regression model in a nutshell $$ `\begin{aligned} \mathbf{Odds}_{E+} & = \mathbf{Odds}_{E-} \times \mathbf{OR} \\ \Rightarrow \log(\mathbf{Odds}_{E+}) & = \log(\mathbf{Odds}_{E-}) + \log(\mathbf{OR}) \end{aligned}` $$ Where, <br> `\(\log(\mathbf{Odds}_{E-})\)` is called the **"Baseline log odds in the unexposed group"** <br> `\(\log(\mathbf{OR})\)` is referred as the **"log odds ratio of the exposure"** This is an example of a logistic regression model. --- class: middle, inverse, center # The end --- class: middle ## Logistic regression model **It models the log odds of the disease:** - Among unexposed individuals `\((E-)\)`: <br> log odds of disease = Baseline - Among exposed individuals `\((E+)\)`: <br> log odds of disease = Baseline + Exposure Where, **"Exposure" is the log odds ratio** and the model is **additive**. So, in any kind of statistical software, logistic regression model does everything on the log odds scale. We/the software can (manually/automatically) convert it back to odds ratio scale, the thing we use to report. --- class: middle, inverse, center # Why log odds of the disease? --- class:middle ## Why log odds of the disease? - Linear regression is appropriate for outcome variables like birth weight/height/BMI etc., which are continuous and follows a [normal distribution](https://ja.wikipedia.org/wiki/%E6%AD%A3%E8%A6%8F%E5%88%86%E5%B8%83). - When the outcome variables are binary response (died/alive, infected/uninfected, diseased/healthy, succeeded/failed, etc.), it is easier to model the log odds. Because log odds can take any value, positive or negative. But percentages of died/infected/diseased/succeeded are constrained between 0 and 1. - All logs are base `\(e\)`. --- class:middle background-image: url("./pic/LogReg_1.png") background-position: 50% 50% background-size: contain --- class:middle # Example - [Data](https://github.com/winterwang/logistic-reg-CSS/raw/gh-pages/dataset/mortality.dta) from a study conducted in a rural area of northern Nigeria in the late 1980s to early 1990s. - This cohort study with fixed follow-up time included persons aged `\(\geqslant\)` 15 years who all underwent an eye examination by ophthalmic (眼科) nurses. - Individuals were classified as visually impaired or with normal vision. - Communities were followed-up over a period of 3 years and deaths were identified. - We will study the relationship between **the outcome: dead (=1) or alive (=0)**, and **two exposure factors**: - visual impairment as a binary variable (0=unimpaired, 1=visually impaired (including blind)) - microfilarial infection grouped into 4 levels (negative, <10, 10-49, ≥50 mf/mg). --- class:middle ## Estimate the odds, log odds, and the odds ratio "by hand" ```r # install the package if needed by install.packages("haven") library(haven) mortality <- read_dta("dataset/mortality.dta") # change your path accordingly # install the package if needed by install.packages("Epi") library(Epi) tab <- stat.table(index = list(Died = died, "Visually impaired" = vimp), contents = list(count(),percent(died)), data = mortality, margins = T) print(tab, digits = 2) ``` ``` ## -------------------------------- ## ----Visually impaired---- ## Died 0 1 Total ## -------------------------------- ## 0 3874.00 287.00 4161.00 ## 97.56 87.77 96.81 ## ## 1 97.00 40.00 137.00 ## 2.44 12.23 3.19 ## ## ## Total 3971.00 327.00 4298.00 ## 100.00 100.00 100.00 ## -------------------------------- ``` --- class:middle ### Exercise 1: - Fill in the **risk, odds and log odds** of death in the visually impaired and unimpaired individuals. | | Unimpaired | Visually impaired | Total | | :------- | :--------: | :---------------: | :------: | | Risk | | | `\(3.19\%\)` | | Odds | | | `\(0.0329\)` | | Log odds | | | `\(-3.414\)` | - What is the Odds Ratio? - What is the log(odds ratio)? --- class:middle ### Exercise 1: the answers - Fill in the **risk, odds and log odds** of death in the visually impaired and unimpaired individuals. | | Unimpaired | Visually impaired | Total | | :------- | :------------------: | :------------------: | :------: | | Risk | `\(2.44\%\)` | `\(12.23\%\)` | `\(3.19\%\)` | | Odds | `\(97/3874 = 0.0250\)` | `\(40/287 = 0.1394\)` | `\(0.0329\)` | | Log odds | `\(\log(0.0250)=-3.68\)` | `\(\log(0.1394)=-1.97\)` | `\(-3.414\)` | - What is the Odds Ratio? `$$0.1394/0.0250 = 5.576$$` - What is the log(odds ratio)? `$$\log(5.576) = -1.97 - (-3.68) = 1.71$$` --- class:middle ### Fit the model in R (1) ```r vimp_logm0 <- glm(died ~ vimp, data = mortality, family = binomial(link = "logit")) summary(vimp_logm0) ``` ``` ## Call: ## glm(formula = died ~ vimp, family = binomial(link = "logit"), ## data = mortality) ## ## Deviance Residuals: ## Min 1Q Median 3Q Max ## -0.5108 -0.2224 -0.2224 -0.2224 2.7247 ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## `(Intercept) -3.6873 0.1028 -35.870 <2e-16 ***` ## `vimp 1.7167 0.1976 8.687 <2e-16 ***` ## --- ## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 ## ## (Dispersion parameter for binomial family taken to be 1) ## ## Null deviance: 1213.8 on 4297 degrees of freedom ## Residual deviance: 1154.7 on 4296 degrees of freedom ## AIC: 1158.7 ## ## Number of Fisher Scoring iterations: 6 ``` ??? - The first column notes the names of the explanatory variable `vimp`. - The first row `(Intercept)` refers to the baseline (log odds in the baseline group) - The second column, labelled `Estimate` is the estimates of the logOdds (unimpaired) logOR for visual impairment --- class:middle ### Recall that the logistic regression model is written as: `$$\log(\mathbf{Odds}_{E+}) = \log(\mathbf{Odds}_{E-}) + \log(\mathbf{OR})$$` Therefore, we can write down our model: `$$\log(\mathbf{Odds}) = -3.687 + 1.717 \times \mathbf{Vimp}$$` Hence, the log odds: - for those with **unimpaired vision**: `\(\log(\mathbf{Odds}) = -3.687 + 1.717 \times 0 = -3.687\)`; - for those with **visually impaired**: `\(\log(\mathbf{Odds}) = -3.687 + 1.717 \times 1 = -1.970\)` The odds ratio (OR) = `\(\exp(1.717) = 5.567\)` --- class:middle ### Look at the output again ``` ## Call: ## glm(formula = died ~ vimp, family = binomial(link = "logit"), ## data = mortality) ## {ommitted} ## Coefficients: ## Estimate `Std. Error` z value Pr(>|z|) ## (Intercept) -3.6873 `0.1028` -35.870 <2e-16 *** ## vimp 1.7167 `0.1976` 8.687 <2e-16 *** ## {ommitted} ``` - `Std.Error` can be used to calculate the 95% confidence interval (CI): `$$\log\mathbf{OR} \pm 1.96\times\mathbf{S.E}$$` Therefore, - 95%CI for logOR of `vimp`: `\(1.7167 \pm 1.96 \times 0.1976 = (1.329, 2.104)\)` - 95%CI for OR of `vimp`: `\((\exp(1.329) = 3.78, \exp(2.104) = 8.19)\)` --- class:middle ### Fit the model in R (2), turn logOR to OR ```r # install the package if needed by install.packages("epiDisplay") library(epiDisplay) logistic.display(vimp_logm0, decimal = 3) ``` ``` ## ## Logistic regression predicting died ## ## `OR(95%CI)` P(Wald's test) P(LR-test) ## `vimp: 1 vs 0 5.566 (3.779,8.199) < 0.001 < 0.001` ## ## Log-likelihood = -577.36596 ## No. of observations = 4298 ## AIC value = 1158.73192 ``` --- class: middle ## Logistic model for comparison of more than two exposure groups When the predictor (explanatory) variable has more than 2 levels, all OR are **relative to the baseline group**. ```r tab <- stat.table(index = list(Died = died, "Microfilarial infection mf/mg (grouped)" = mfgrp), contents = list(count(),percent(died)), data = mortality, margins = T) print(tab, digits = 2) ``` ``` ## --------------------------------------------------- ## --Microfilarial infection mf/mg (grouped)--- ## Died Uninfected < 10 10-49 50+ Total ## --------------------------------------------------- ## 0 1274.00 1609.00 996.00 193.00 4161.00 ## 97.77 96.29 96.79 95.54 96.81 ## ## 1 29.00 62.00 33.00 9.00 137.00 ## 2.23 3.71 3.21 4.46 3.19 ## ## ## Total 1303.00 1671.00 1029.00 202.00 4298.00 ## 100.00 100.00 100.00 100.00 100.00 ## --------------------------------------------------- ``` --- class:middle ### Exercise 2: - Complete this table by calculating the **odds, log odds, odds ratios** (compared to microfilarial group uninfected [0]) and log odds ratios as required. | | uninfected | `\(<10\)` | `\(10\sim 49\)` | `\(\geqslant50\)` | | :------------- | :--------: | :------: | :---------: | :-----------: | | Risk | `\(2.23\%\)` | `\(3.71\%\)` | `\(3.21\%\)` | `\(4.46\%\)` | | Odds | `\(0.0228\)` | `\(0.0385\)` | `\(0.0331\)` | `\(0.0466\)` | | Odds ratio | `\(1\)` | `\(1.693\)` | `\(1.456\)` | | | Log odds | `\(-3.783\)` | | | `\(-3.065\)` | | Log Odds ratio | `\(0\)` | `\(0.526\)` | `\(0.375\)` | | --- class:middle ### Exercise 2: the answers - Complete this table by calculating the **odds, log odds, odds ratios** (compared to microfilarial group uninfected [0]) and log odds ratios as required. | | uninfected | `\(<10\)` | `\(10\sim 49\)` | `\(\geqslant50\)` | | :------------- | :--------: | :----------------------: | :---------------------: | :----------------------: | | Risk | `\(2.23\%\)` | `\(3.71\%\)` | `\(3.21\%\)` | `\(4.46\%\)` | | Odds | `\(0.0228\)` | `\(0.0385\)` | `\(0.0331\)` | `\(0.0466\)` | | Odds ratio | `\(1\)` | `\(1.693\)` | `\(1.456\)` | `\(0.0466/0.0228\)` `\(=2.044\)` | | Log odds | `\(-3.781\)` | `\(\log(0.0385)\)` `\(=-3.257\)` | `\(\log(0.0466)\)` `\(-3.066\)` | `\(-3.066\)` | | Log Odds ratio | `\(0\)` | `\(0.526\)` | `\(0.375\)` | `\(\log(2.044)\)` `\(= 0.715\)` | --- class:middle ### Fit the model in R (3) ```r vimp_logm1 <- glm(died ~ mfgrp, data = mortality, family = binomial(link = "logit")) summary(vimp_logm1) ``` ``` ##Call: ##glm(formula = died ~ mfgrp, family = binomial(link = "logit"), ## data = mortality) ## ##Deviance Residuals: ## Min 1Q Median 3Q Max ##-0.3019 -0.2750 -0.2553 -0.2122 2.7587 ## ##Coefficients: ## Estimate Std. Error z value Pr(>|z|) ##`(Intercept) -3.7826 0.1878 -20.143 <2e-16 ***` ##`mfgrp< 10 0.5264 0.2281 2.308 0.0210 *` ##`mfgrp10-49 0.3754 0.2580 1.455 0.1457` ##`mfgrp50+ 0.7172 0.3893 1.842 0.0655 .` ##--- ##Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 ## ##(Dispersion parameter for binomial family taken to be 1) ## ## Null deviance: 1180.4 on 4204 degrees of freedom ##Residual deviance: 1173.7 on 4201 degrees of freedom ## (93 observations deleted due to missingness) ##AIC: 1181.7 ## ##Number of Fisher Scoring iterations: 6 ``` --- class:middle ### Dummy variables for the categorical variables ``` ## id mfgrp mfgrp_< 10 mfgrp_10-49 mortality_50+ ## 1 Uninfected 0 0 0 ## 2 < 10 1 0 0 ## 3 < 10 1 0 0 ## 4 10-49 0 1 0 ## 5 < 10 1 0 0 ## 6 <NA> 0 0 0 ## 14 Uninfected 0 0 0 ## 15 <NA> 0 0 0 ## 16 < 10 1 0 0 ## 17 < 10 1 0 0 ## 18 <NA> 0 0 0 ## 19 < 10 1 0 0 ## 22 < 10 1 0 0 ## 23 Uninfected 0 0 0 ## 24 < 10 1 0 0 ## 29 <NA> 0 0 0 ## 30 10-49 0 1 0 ## 31 Uninfected 0 0 0 ## 33 10-49 0 1 0 ## 34 Uninfected 0 0 0 ## 35 10-49 0 1 0 ## 38 50+ 0 0 1 ``` --- class:middle ### Recall that the logistic regression model is written as: `$$\log(\mathbf{Odds}_{E+}) = \log(\mathbf{Odds}_{E-}) + \log(\mathbf{OR})$$` Therefore, we can write down our model as: `\(\log(\mathbf{Odds}) = -3.7826 + 0.5264 \times\)` `mfgrp_< 10` `\(+ 0.3754\times\)` `mfgrp_10-49` `\(+ 0.7172\times\)` `mortality_50+` --- class:middle ### Fit the model in R (4), turn logOR to OR ```r logistic.display(vimp_logm1, decimal = 3) ``` ``` ##Logistic regression predicting died ## ## `OR(95%CI) P(Wald's test) P(LR-test)` ##`mfgrp: ref.=Uninfected 0.0822` ## `< 10 1.693 (1.083,2.647) 0.021` ## `10-49 1.456 (0.878,2.414) 0.1457` ## `50+ 2.049 (0.955,4.394) 0.0655` ## ##Log-likelihood = -586.86507 ##No. of observations = 4205 ##AIC value = 1181.73014 ``` --- class: middle, inverse, center # Logistic Regression 2: models with more than one variable --- class: middle ## Number of deaths, by visual impairment and age group ```r mortality %>% group_by(agegrp, died, vimp) %>% summarise(n = n()) ``` ``` ## # A tibble: 16 x 4 ## # Groups: agegrp, died [8] ## agegrp died vimp n ## <dbl+lbl> <dbl> <dbl+lbl> <int> ## 1 0 [15-34] 0 0 [Normal] 2301 ## 2 0 [15-34] 0 1 [Visually impaired] 22 ## 3 0 [15-34] 1 0 [Normal] 29 ## 4 0 [15-34] 1 1 [Visually impaired] 2 ## 5 1 [35-54] 0 0 [Normal] 1271 ## 6 1 [35-54] 0 1 [Visually impaired] 124 ## 7 1 [35-54] 1 0 [Normal] 38 ## 8 1 [35-54] 1 1 [Visually impaired] 10 ## 9 2 [55-64] 0 0 [Normal] 212 ## 10 2 [55-64] 0 1 [Visually impaired] 69 ## 11 2 [55-64] 1 0 [Normal] 15 ## 12 2 [55-64] 1 1 [Visually impaired] 11 ## 13 3 [65+] 0 0 [Normal] 90 ## 14 3 [65+] 0 1 [Visually impaired] 72 ## 15 3 [65+] 1 0 [Normal] 15 ## 16 3 [65+] 1 1 [Visually impaired] 17 ``` --- class:middle, center ## Number of deaths, by visual impairment and age group <table class="table table-striped table-bordered" style="margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="border-bottom:hidden" colspan="1"></th> <th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="8"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">died (0 = no, 1 = yes)</div></th> </tr> <tr> <th style="border-bottom:hidden" colspan="1"></th> <th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">agegrp 15-34</div></th> <th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">agegrp 35-54</div></th> <th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">agegrp 55-64</div></th> <th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">agegrp 65+</div></th> </tr> <tr> <th style="text-align:right;"> vimp </th> <th style="text-align:right;"> 0 </th> <th style="text-align:right;"> 1 </th> <th style="text-align:right;"> 0 </th> <th style="text-align:right;"> 1 </th> <th style="text-align:right;"> 0 </th> <th style="text-align:right;"> 1 </th> <th style="text-align:right;"> 0 </th> <th style="text-align:right;"> 1 </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0 </td> <td style="text-align:right;"> 2301 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 1271 </td> <td style="text-align:right;"> 38 </td> <td style="text-align:right;"> 212 </td> <td style="text-align:right;"> 15 </td> <td style="text-align:right;"> 90 </td> <td style="text-align:right;"> 15 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 22 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:right;"> 124 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 69 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> 17 </td> </tr> </tbody> </table> ``` ## --------------------------- ## ------Died------- ## Visually 0 1 ## impaired ## --------------------------- ## 0 3874.00 97.00 ## 97.56 2.44 ## ## 1 287.00 40.00 ## 87.77 12.23 ## --------------------------- ``` --- class: middle ### The odds of visual impairment increase with age. - agegrp = 13-34: `$$(22 + 2) / (29 + 2301) = 0.010$$` - agegrp = 35-54: `$$(124 + 10) / (1271 + 38) = 0.102$$` - agegrp = 55-64: `$$(69 + 11) / (212 + 15) = 0.352$$` - agegrp = 65+: `$$(72 + 17) / (90 + 15) = 0.847$$` --- class: middle ### The odds of death also increase with age. - agegrp = 13-34: `$$(29 + 2) / (2301 + 22) = 0.013$$` - agegrp = 35-54: `$$(38 + 10) / (1271 + 124) = 0.034$$` - agegrp = 54-64: `$$(15 + 11) / (212 + 69) = 0.093$$` - agegrp = 65+: `$$(15 + 17) / (90 + 72) = 0.198$$` --- class: middle ### Exercise 3: - Fill in the **odds of death odds ratios** of death in the visually impaired and unimpaired individuals, by **age group**. | agegrp | vimp = 0 | vimp = 1 | odds ratio | | :------- | :------: | :------: | :--------: | | 0 (15-34) | 0.01260 | | | | 1 (35-54) | 0.02990 | | | | 2 (55-64) | 0.07075 | | | | 3 (65+) | 0.16667 | | | --- class: middle ### Exercise 3: answers - Fill in the **odds of death odds ratios** of death in the visually impaired and unimpaired individuals, by **age group**. | agegrp | vimp = 0 | vimp = 1 | odds ratio | | :------- | :------: | :------: | :--------: | | 0 (15-34) | 0.01260 | 0.0909 | 7.214 | | 1 (35-54) | 0.02990 | 0.08065 | 2.6973 | | 2 (55-64) | 0.07075 | 0.15942 | 2.2533 | | 3 (65+) | 0.16667 | 0.23611 | 1.4166 | --- class:middle ## Adjusting for confouding using logistic regression ```r vimp_logm2 <- glm(died ~ vimp + as.factor(agegrp), data = mortality, family = binomial(link = "logit")) library(epiDisplay) logistic.display(vimp_logm2, decimal = 3) ``` ``` ## ## OR lower95ci upper95ci Pr(>|Z|) ## vimp 2.202266 1.408816 3.442592 5.327933e-04 ## as.factor(agegrp)1 2.354987 1.484023 3.737116 2.774844e-04 ## as.factor(agegrp)2 5.415484 3.082732 9.513468 4.199207e-09 ## as.factor(agegrp)3 9.901309 5.541161 17.692306 9.839612e-15 ``` --- class: middle ## What do we conclude from this output? - after controlling for age there is strong evidence of an association between visual impairment and odds of death. - On average, being visually impaired is associated with approximately a doubling in the odds of death. Comparing the adjusted OR (2.20) with the crude odds ratio that we obtained in the previous session (5.57) suggests **important confounding by age**, such that a failure to take age into account in the analysis leads to **an overestimate of the average effect of visual impairment**. --- class: middle ### Look at the output from another format ```r vimp_logm2 <- glm(died ~ vimp + as.factor(agegrp), data = mortality, family = binomial(link = "logit")) summary(vimp_logm2, decimal = 3) ``` ``` ## ## Call: ## glm(formula = died ~ vimp + as.factor(agegrp), family = binomial(link = "logit"), ## data = mortality) ## ## Deviance Residuals: ## Min 1Q Median 3Q Max ## -0.7109 -0.2473 -0.1619 -0.1619 2.9468 ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## `(Intercept) -4.3286 0.1809 -23.931 < 2e-16 ***` ## `vimp 0.7895 0.2279 3.464 0.000533 ***` ## `as.factor(agegrp)1 0.8565 0.2356 3.635 0.000277 ***` ## `as.factor(agegrp)2 1.6893 0.2875 5.876 4.20e-09 ***` ## `as.factor(agegrp)3 2.2927 0.2962 7.741 9.84e-15 ***` ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## (Dispersion parameter for binomial family taken to be 1) ## ## Null deviance: 1213.8 on 4297 degrees of freedom ## Residual deviance: 1091.7 on 4293 degrees of freedom ## AIC: 1101.7 ``` ??? how do we explain the output on log scale? --- #### we can write down the formula for this model as: `$$\log{\mathbf{odds}_{\mathbf{death}}} = \log{\mathbf{odds}_{\mathbf{baseline}}} + \log{\mathbf{OR}_{Vimp}} + \log{\mathbf{OR}_{agegrp}}$$` | Component of model | Parameter name in output | Estimate | Interpretation | |--------------------|--------------------------|:----------:|--------------------------------------| | Baseline | (intercept) | -4.3286 | log-Odds in baseline group | | visual impairment | Normal (baseline, =0) | (0, fixed) | log(1) = 0 | | | Visually impaired (=1) | 0.7895 | log-Odds ratio, vimp=1 vs vimp=0 | | Age group | 15-34 (base, =0) | (0, fixed) | log(1) = 0 | | | 35-54 (=1) | 0.8565 | log-Odds ratio, agegrp 1 vs agegrp 0 | | | 55-64 (=2) | 1.6893 | log-Odds ratio, agegrp 2 vs agegrp 0 | | | 65+ (=3) | 2.2927 | log-Odds ratio, agegrp 3 vs agegrp 0 | --- class: middle ### Look at the observed odds and odds ratios again | agegrp | vimp = 0 | vimp = 1 | odds ratio | | :------- | :------: | :------: | :--------: | | 0 (15-34) | 0.01260 | 0.0909 | 7.214 | | 1 (35-54) | 0.02990 | 0.08065 | 2.6973 | | 2 (55-64) | 0.07075 | 0.15942 | 2.2533 | | 3 (65+) | 0.16667 | 0.23611 | 1.4166 | - we make no assumption about how the effects of visual impairment and age group combine. - However, in the logistic regression model that we have just fitted, **we made the assumption that the effect of visual impairment (i.e. the odds ratio) is the same in each age group**: i.e. we assumed that the effect of visual impairment (measured as an odds ratio) is not modified (changed) by/does not vary with age group: i.e. that age group is not an effect modifier for/does not interact with visual impairment. --- class: middle, center # The end ## slide address: https://wangcc.me/logistic-reg-CSS/