[R] Nicely formatted summary table with mean, standard deviation or number and proportion

Keith Wong

2007-05-13

Dear all,

The incredibly useful Hmisc package provides a way to generate
summary tables that can be typeset in LaTeX. The Alzola and Harrell book
"An Introduction to S and the Hmisc and Design Libraries" provides an
example that reports the three quartiles for continuous variables, and
numbers and percentages for categorical variables: summary() with
method = 'reverse'.

I wonder if there is a way to change it so the mean and standard
deviation are reported instead for continuous variables.

I illustrate my question below using an example from the book.

Thank you.

Keith


> ####
> library(Hmisc)
>
> set.seed(173)
> sex = factor(sample(c("m", "f"), 500, replace = TRUE))
> age = rnorm(500, 50, 5)
> treatment = factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))
> summary(sex ~ treatment, fun = table)
sex   N=500

+---------+-------+---+---+---+
|         |       |N  |f  |m  |
+---------+-------+---+---+---+
|treatment|Drug   |263|140|123|
|         |Placebo|237|133|104|
+---------+-------+---+---+---+
|Overall  |       |500|273|227|
+---------+-------+---+---+---+
>
>
>
> (x = summary(treatment ~ age + sex, method = "reverse"))
> # generates quartiles for continuous variables


Descriptive Statistics by treatment

+-------+--------------+--------------+
|       |Drug          |Placebo       |
|       |(N=263)       |(N=237)       |
+-------+--------------+--------------+
|age    |46.5/49.9/53.2|46.7/50.0/53.4|
+-------+--------------+--------------+
|sex : m|  47% (123)   |  44% (104)   |
+-------+--------------+--------------+
>
>
> # latex(x) generates a very nicely formatted table
> # but I'd like "mean (standard deviation)" instead of quartiles.



> # this function is from http://tolstoy.newcastle.edu.au/R/e2/help/06/11/4713.html
> g <- function(y) {
+  s <- apply(y, 2,
+         function(z) {
+           z <- z[!is.na(z)]
+           n <- length(z)
+           # every branch returns the same three values: N, Mean, SD
+           if(n==0) c(N=0, Mean=NA, SD=NA) else
+           if(n==1) c(N=1, Mean=z, SD=NA) else {
+            m <- mean(z)
+            s <- sd(z)
+            c(N=n, Mean=m, SD=s)
+           }
+         })
+  w <- as.vector(s)
+  names(w) <- as.vector(outer(rownames(s), colnames(s), paste, sep=''))
+  w
+ }

>
> summary(treatment ~ age + sex, method = "reverse", fun = g)
> # does not work: the 'fun' (or 'FUN') argument is ignored.


Descriptive Statistics by treatment

+-------+--------------+--------------+
|       |Drug          |Placebo       |
|       |(N=263)       |(N=237)       |
+-------+--------------+--------------+
|age    |46.5/49.9/53.2|46.7/50.0/53.4|
+-------+--------------+--------------+
|sex : m|  47% (123)   |  44% (104)   |
+-------+--------------+--------------+
>
>
> (x1 = summarize(cbind(age), llist(treatment), FUN = g,
+                 stat.name=c("n", "mean", "sd")))
  treatment   n mean   sd
1      Drug 263 49.9 4.94
2   Placebo 237 50.1 4.97
>
> # this works, but the table is rotated, and the count data have to be
> # treated separately.
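
To make the layout I am after concrete, here is a minimal base R sketch (no Hmisc involved) that puts the variables in rows and the treatment groups in columns, with "mean (SD)" for the continuous variable and "n (%)" for the factor. The helper names mean_sd and n_pct are my own placeholders, not functions from any package, and the result is a plain character matrix rather than something latex() knows how to typeset, so this only illustrates the target and does not replace summary(..., method = "reverse").

```r
# Simulated data along the lines of the example above.
set.seed(173)
sex       <- factor(sample(c("m", "f"), 500, replace = TRUE))
age       <- rnorm(500, 50, 5)
treatment <- factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))

# Hypothetical helper: "mean (SD)" of numeric x within each level of g.
mean_sd <- function(x, g, digits = 1) {
  sapply(split(x, g), function(z) {
    z <- z[!is.na(z)]
    paste0(formatC(mean(z), format = "f", digits = digits), " (",
           formatC(sd(z),   format = "f", digits = digits), ")")
  })
}

# Hypothetical helper: "n (pct%)" for one level of factor f within each level of g.
n_pct <- function(f, g, level) {
  sapply(split(f, g), function(z) {
    z <- z[!is.na(z)]
    n <- sum(z == level)
    paste0(n, " (", round(100 * n / length(z)), "%)")
  })
}

# One row per variable, one column per treatment group.
tab <- rbind("age"     = mean_sd(age, treatment),
             "sex : m" = n_pct(sex, treatment, "m"))
colnames(tab) <- paste0(levels(treatment),
                        " (N=", as.vector(table(treatment)), ")")
print(tab, quote = FALSE)
```

Because sample() behaves differently across R versions, the numbers this prints need not match the transcript above; what matters is the orientation of the table.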



--
Keith Wong
PhD candidate
Sleep & Circadian Research Group
Woolcock Institute of Medical Research

email  keithw@(protected)
Phone  +61 2 9515 8981
Fax   +61 2 9515 7070
Mail   PO Box M77, Missenden Road NSW 2050, Australia
