Hello All,
Using the standard "summary" function in 'R', I ran across some odd
behavior that I cannot understand. Easy to reproduce:
Typing:
summary(c(6,207936))
Yields:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      6   51990  104000  104000  156000  207900
None of these values is correct except for the minimum. If I run
"quantile(c(6, 207936))", it gives the correct values. I originally
presumed that summary was merely calling "quantile" when given a numeric
vector, but this doesn't seem to be the case.
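For what it's worth, a quick check that may narrow this down: rounding the exact quantile values to 4 significant digits with signif() reproduces the summary output above digit for digit. That suggests, though this sketch does not confirm it, that the difference comes from rounding for display rather than from a different calculation. The variable names (x, q, exact, rounded) are just for illustration, and how summary() handles digits may differ between R versions.

```r
x <- c(6, 207936)

# Exact values from quantile() with its default settings
q <- quantile(x)
print(q)

# Same six quantities that summary() reports, in the same order:
# Min, 1st Qu., Median, Mean, 3rd Qu., Max
exact <- c(q[1:3], mean(x), q[4:5])

# Round each value to 4 significant digits
rounded <- signif(unname(exact), 4)
print(rounded)
# values: 6 51990 104000 104000 156000 207900
```

summary() also takes a digits argument, so something like summary(x, digits = 10) may be worth trying to see whether the unrounded values come back.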
Does anyone know what's going on here? On a related note, what is the
statistically correct way to calculate the 1st and 3rd quartiles when
only 2 values are present? I presume one takes the mid-point between
the median (also calculated) and the min or max. So in this case, 51988.5
for the 1st and 155953.5 for the 3rd (which is what quantile calculates).
But taking 25% and 75% of the sum of the 2 also seems "reasonable". Either
way, "summary" is calculating the wrong number, and most disturbing is that
it mis-calculates the max.
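To make the two candidate calculations concrete, here is a minimal sketch of both for this data, plus a look at how the answer varies across the definitions that quantile() exposes through its type argument (types 1 to 9). The names q1_mid, q3_mid, q1_sum, q3_sum and by_type are made up for this example; the output of the last step is not listed here because it differs by type.

```r
x <- c(6, 207936)
med <- median(x)                      # 103971

# Convention A: mid-point between the median and each extreme
q1_mid <- (min(x) + med) / 2          # 51988.5
q3_mid <- (med + max(x)) / 2          # 155953.5

# Convention B: 25% and 75% of the sum of the two values
q1_sum <- 0.25 * sum(x)               # 51985.5
q3_sum <- 0.75 * sum(x)               # 155956.5

# quantile() supports nine definitions; compare the quartiles under each.
# Columns are types 1..9, rows are the 25% and 75% quantiles.
by_type <- sapply(1:9, function(t) {
  quantile(x, probs = c(0.25, 0.75), type = t)
})
print(by_type)
```

Convention A agrees with quantile()'s default (type 7) for this data; convention B does not correspond to that default.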
Regards,
Mike
"Telescopes and bathyscaphes and sonar probes of Scottish lakes,
Tacoma Narrows bridge collapse explained with abstract phase-space maps,
Some x-ray slides, a music score, Minard's Napoleonic war:
The most exciting frontier is charting what's already here."
-- xkcd