Java Mailing List Archive

http://www.r-help.com/

Home » Home (12/2007) » R Help for Statistical Computing »

[R] cluster

WeiWei Shi

2005-07-25

Replies:

Dear listers:

Here I have a question on clustering methods available in R. I am
trying to down-sampling the majority class in a classification problem
on an imbalanced dataset. Since I don't want to lose information in
the original dataset, I don't want to use naive down-sampling: I think
using clustering on the majority class' side to select
"representative" samples might help. So, my question is, which
clustering method should be tested to get the best result. I think the
key thing might be the selection of "distance" considering the next
step in which I would like to use decision trees.

Please share your experience in using clustering (Any available
implementation outside R is also welcome)

weiwei
--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III

______________________________________________
R-help@(protected)
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
©2008 r-help.com - Jax Systems, LLC, U.S.A.