# R in Action reading notes (6), Chapter 7: Basic statistical analysis (Part 2)

7.2 Frequency tables and contingency tables

`> library(vcd)`

`> head(Arthritis)`

` ID Treatment Sex Age Improved`

`1 57 Treated Male 27 Some`

`2 46 Treated Male 29 None`

`3 77 Treated Male 30 None`

`4 17 Treated Male 32 Marked`

`5 36 Treated Male 46 Marked`

`6 23 Treated Male 58 Marked`

7.2.1 Generating frequency tables

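table(var1, var2, …, varN) creates an N-way contingency table from N categorical variables (factors)

xtabs(formula, data) creates an N-way contingency table from a formula and a matrix or data frame

prop.table(table, margins) expresses the table entries as fractions of the marginal table defined by margins

margin.table(table, margins) computes the sums of the table entries for the marginal table defined by margins

ftable(table) creates a compact "flat" contingency table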
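To make the `margins` argument concrete, here is a minimal base R sketch. The counts are typed in by hand to match the Treatment by Improved table used later in these notes, and the object names (`counts`, `by_hand`) are only illustrative.

```r
# Hand-entered 2 x 3 table of counts (rows: Treatment, columns: Improved).
counts <- as.table(matrix(c(29, 7, 7,
                            13, 7, 21),
                          nrow = 2, byrow = TRUE,
                          dimnames = list(Treatment = c("Placebo", "Treated"),
                                          Improved  = c("None", "Some", "Marked"))))

# margins = 1 refers to rows, margins = 2 to columns.
row_totals <- margin.table(counts, 1)   # one total per Treatment level
row_props  <- prop.table(counts, 1)     # each row divided by its row total

# The same proportions computed by hand, to show what prop.table() does.
by_hand <- sweep(counts, 1, rowSums(counts), "/")

print(row_totals)
print(round(row_props, 3))
print(all.equal(unclass(row_props), unclass(by_hand)))
```

With `margins = 1` every row of the result sums to 1; with `margins = 2` every column does.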

1. One-way tables

`> mytable<-with(Arthritis,table(Improved))`

`> mytable`

`Improved`

` None Some Marked `

` 42 14 28 `

`> prop.table(mytable)`

`Improved`

` None Some Marked `

`0.5000000 0.1666667 0.3333333 `

2. Two-way tables

`> mytable<-xtabs(~Treatment+Improved,data=Arthritis)`

`> mytable`

` Improved`

`Treatment None Some Marked`

` Placebo 29 7 7`

` Treated 13 7 21`

`> margin.table(mytable,1)`

`Treatment`

`Placebo Treated `

` 43 41 `

`> prop.table(mytable,1)`

` Improved`

`Treatment None Some Marked`

` Placebo 0.6744186 0.1627907 0.1627907`

` Treated 0.3170732 0.1707317 0.5121951`

`> margin.table(mytable,2)`

`Improved`

` None Some Marked `

` 42 14 28 `

`> prop.table(mytable,2)`

` Improved`

`Treatment None Some Marked`

` Placebo 0.6904762 0.5000000 0.2500000`

` Treated 0.3095238 0.5000000 0.7500000`

`> prop.table(mytable)`

` Improved`

`Treatment None Some Marked`

` Placebo 0.34523810 0.08333333 0.08333333`

` Treated 0.15476190 0.08333333 0.25000000`

`> addmargins(mytable)`

` Improved`

`Treatment None Some Marked Sum`

` Placebo 29 7 7 43`

` Treated 13 7 21 41`

` Sum 42 14 28 84`

`> addmargins(prop.table(mytable))`

` Improved`

`Treatment None Some Marked Sum`

` Placebo 0.34523810 0.08333333 0.08333333 0.51190476`

` Treated 0.15476190 0.08333333 0.25000000 0.48809524`

` Sum 0.50000000 0.16666667 0.33333333 1.00000000`

`> addmargins(prop.table(mytable,1),2) # adds only the row sums (the Sum column)`

` Improved`

`Treatment None Some Marked Sum`

` Placebo 0.6744186 0.1627907 0.1627907 1.0000000`

` Treated 0.3170732 0.1707317 0.5121951 1.0000000`

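`> library(gmodels) # CrossTable() comes from the gmodels package`

`> CrossTable(Arthritis$Treatment,Arthritis$Improved)`

The CrossTable() function has many options and can do many things, such as computing row, column, and cell percentages.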

3. Multidimensional tables

`> mytable<-xtabs(~Treatment+Sex+Improved,data=Arthritis)`

`> mytable`

`, , Improved = None`

` Sex`

`Treatment Female Male`

` Placebo 19 10`

` Treated 6 7`

`, , Improved = Some`

` Sex`

`Treatment Female Male`

` Placebo 7 0`

` Treated 5 2`

`, , Improved = Marked`

` Sex`

`Treatment Female Male`

` Placebo 6 1`

` Treated 16 5`


`> ftable(mytable)`

` Improved None Some Marked`

`Treatment Sex `

`Placebo Female 19 7 6`

` Male 10 0 1`

`Treated Female 6 5 16`

` Male 7 2 5`

`> margin.table(mytable,c(1,3)) # marginal frequencies for Treatment x Improved`

` Improved`

`Treatment None Some Marked`

` Placebo 29 7 7`

` Treated 13 7 21`
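The call `margin.table(mytable, c(1, 3))` keeps dimensions 1 and 3 and sums over the remaining one (Sex). As a rough sketch of that idea without the vcd package, the three-way counts shown above can be entered by hand and collapsed with `apply()`; the name `counts3` is just a placeholder.

```r
# Hand-entered counts in column-major order:
# Treatment varies fastest, then Sex, then Improved.
counts3 <- array(c(19, 6, 10, 7,    # Improved = None
                   7, 5, 0, 2,      # Improved = Some
                   6, 16, 1, 5),    # Improved = Marked
                 dim = c(2, 2, 3),
                 dimnames = list(Treatment = c("Placebo", "Treated"),
                                 Sex       = c("Female", "Male"),
                                 Improved  = c("None", "Some", "Marked")))

# Keep dimensions 1 and 3; sum over dimension 2 (Sex).
via_margin <- margin.table(counts3, c(1, 3))
via_apply  <- apply(counts3, c(1, 3), sum)

print(via_margin)
print(all.equal(unclass(via_margin), via_apply, check.attributes = FALSE))
```

Both routes should give back the two-way Treatment by Improved table from earlier in this section.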

7.2.2 Tests of independence

1. Chi-square test of independence

`> library(vcd)`

`> mytable<-xtabs(~Treatment+Improved,data=Arthritis)`

`> chisq.test(mytable)`

` Pearson's Chi-squared test`

`data: mytable`

`X-squared = 13.055, df = 2, p-value = 0.001463 # Treatment and Improved are not independent`
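To see where the statistic comes from, the sketch below recomputes it by hand from the observed counts, using the usual expected-count formula (row total times column total divided by the grand total). The counts are copied from the two-way table above; the variable names are illustrative.

```r
observed <- matrix(c(29, 7, 7,
                     13, 7, 21),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Treatment = c("Placebo", "Treated"),
                                   Improved  = c("None", "Some", "Marked")))

n <- sum(observed)

# Expected counts under independence: (row total * column total) / n.
expected <- outer(rowSums(observed), colSums(observed)) / n

# Pearson chi-square statistic, degrees of freedom, and upper-tail p-value.
x2      <- sum((observed - expected)^2 / expected)
df      <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

print(round(c(X2 = x2, df = df, p = p_value), 6))

# Cross-check against the built-in test.
builtin <- chisq.test(observed)
print(all.equal(unname(builtin$statistic), x2))
```

The hand calculation should agree with the `chisq.test()` output shown above (X-squared about 13.055 on 2 degrees of freedom).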

2. Fisher's exact test

`> fisher.test(mytable)`

` Fisher's Exact Test for Count Data`

`data: mytable`

`p-value = 0.001393`

`alternative hypothesis: two.sided`

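3. Cochran-Mantel-Haenszel test

The mantelhaen.test() function performs a Cochran-Mantel-Haenszel chi-square test, whose null hypothesis is that two nominal variables are conditionally independent within each stratum of a third variable.

`> mytable<-xtabs(~Treatment+Improved+Sex,data=Arthritis) # the test needs a three-way table; Sex is the stratum`

`> mantelhaen.test(mytable)`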

` Cochran-Mantel-Haenszel test`

`data: mytable`

`Cochran-Mantel-Haenszel M^2 = 14.6323, df = 2,`

`p-value = 0.0006647`

7.2.3 Measures of association

`> mytable<-xtabs(~Treatment+Improved,data=Arthritis)`

`> assocstats(mytable)`

` X^2 df P(> X^2)`

`Likelihood Ratio 13.530 2 0.0011536`

`Pearson 13.055 2 0.0014626`

`Phi-Coefficient : 0.394 `

`Contingency Coeff.: 0.367 `

`Cramer's V : 0.394 `
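The three coefficients are all simple functions of the Pearson chi-square statistic. The sketch below applies the standard textbook formulas to the values reported above (X^2 = 13.055, n = 84, a 2 x 3 table); it is a hand check of the arithmetic, not a look inside `assocstats()`.

```r
x2 <- 13.055   # Pearson chi-square reported above
n  <- 84       # total number of patients in the table
r  <- 2        # number of rows (Treatment)
k  <- 3        # number of columns (Improved)

phi         <- sqrt(x2 / n)                          # phi coefficient
contingency <- sqrt(x2 / (x2 + n))                   # contingency coefficient
cramers_v   <- sqrt(x2 / (n * min(r - 1, k - 1)))    # Cramer's V

print(round(c(phi = phi, contingency = contingency, cramers_v = cramers_v), 3))
```

Because min(r - 1, k - 1) is 1 for this table, Cramer's V and phi coincide here.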

7.2.5 Converting tables to flat format

`> table2flat<-function(mytable){`

`+ df<-as.data.frame(mytable)`

`+ rows<-dim(df)[1]`

`+ cols<-dim(df)[2]`

`+ x<-NULL`

`+ for(i in 1:rows){`

`+ for(j in seq_len(df$Freq[i])){ # seq_len() skips cells whose count is 0`

`+ row<-df[i,c(1:(cols-1))] # every column except the last one (Freq)`

`+ x<-rbind(x,row)`

`+ }`

`+ }`

`+ row.names(x)<-c(1:dim(x)[1])`

`+ return(x)`

`+ }`

`> treatment<-rep(c("Placebo","Treated"),times=3)`

`> improved<-rep(c("None","Some","Marked"),each=2)`

`> Freq<-c(29,13,7,7,7,21)`

`> mytable<-data.frame(treatment,improved,Freq) # data.frame() keeps Freq numeric; cbind() would turn it into text`

`> mydata<-table2flat(mytable)`

`> head(mydata)`

` treatment improved`

`1 Placebo None`

`2 Placebo None`

`3 Placebo None`

`4 Placebo None`

`5 Placebo None`

`6 Placebo None`
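The nested loop grows the result one row at a time, which gets slow for large counts. A possible vectorized alternative repeats each row index by its frequency with `rep()`. The helper name `expand_counts` and its `freq_col` argument are hypothetical, and the counts are the ones from the two-way Treatment by Improved table earlier in these notes.

```r
# Hypothetical helper: expects a data frame with a numeric frequency column.
expand_counts <- function(df, freq_col = "Freq") {
  # Repeat row i exactly df[[freq_col]][i] times (zero counts drop out).
  idx <- rep(seq_len(nrow(df)), times = df[[freq_col]])
  out <- df[idx, setdiff(names(df), freq_col), drop = FALSE]
  row.names(out) <- NULL   # renumber rows 1..n
  out
}

counts <- data.frame(
  treatment = rep(c("Placebo", "Treated"), times = 3),
  improved  = rep(c("None", "Some", "Marked"), each = 2),
  Freq      = c(29, 13, 7, 7, 7, 21)
)

flat <- expand_counts(counts)
print(nrow(flat))
print(table(flat$treatment, flat$improved))
```

Tabulating `flat` again should recover the original cell counts, which is a quick way to check the round trip.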
