
Re: [R] Off topic:Spam on R-help increase?

Marc Schwartz

2007-03-10

On Sat, 2007-03-10 at 10:17 -0500, François Pinard wrote:
> [Marc Schwartz]
>
> >The "Human Spam Filter" (aka Martin) [...]
>
> The R mailing list has, indeed, been remarkably spam-free and
> well-managed, as far as I can see. I do hope, however, that Martin
> does not have to do the filtering himself -- it would be just daunting!
>
> In any case, Martin, a lot of thanks from me!

The comment was somewhat "tongue-in-cheek".

While a major proportion of spam can be filtered using automated tools,
it takes a significant amount of manual effort to configure the tools to
achieve the level of cleansing that we observe here.

On my system (laptop running FC6 Linux), I am using SpamAssassin with
Bayesian filtering enabled, along with remote spam checks such as DCC,
Razor, Pyzor and some RBLs.

I also recently started using FuzzyOCR (as a plug-in to SA) to enhance
the filtering of spam containing only graphic content. These e-mails
are, of course, specifically designed to defeat text-based spam
filtering.

However, I still get some that come through despite the above. There are
also 'borderline' e-mails that require manually running the spam/ham
learning scripts.
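
To illustrate why manually feeding borderline messages to the learner
matters, here is a minimal sketch of a Bayesian token filter in Python.
It is a toy under stated assumptions, not SpamAssassin's actual Bayes
implementation: the class name TinyBayesFilter, its methods, and the
sample messages are all invented for illustration.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes filter (hypothetical; not SpamAssassin's code)."""

    def __init__(self):
        # Number of messages of each class in which a token appeared.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Manually train on one message; label is 'spam' or 'ham'."""
        self.counts[label].update(set(text.lower().split()))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from Laplace-smoothed token frequencies."""
        log_odds = math.log(
            (self.messages["spam"] + 1) / (self.messages["ham"] + 1)
        )
        for token in set(text.lower().split()):
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


# Made-up training data: one spam message and one ham message.
bayes = TinyBayesFilter()
bayes.learn("cheap pills buy now", "spam")
bayes.learn("lm regression output question", "ham")

# A borderline message that shares tokens with both classes.
borderline = "buy the regression book now"
before = bayes.spam_probability(borderline)

# Manually telling the filter it was ham lowers the estimate afterwards.
bayes.learn(borderline, "ham")
after = bayes.spam_probability(borderline)
```

With so little training data the borderline message initially leans
towards spam; after the manual correction the estimate drops, which is
the effect the learning step is meant to have.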

To increase the filtering effectiveness to the level we see here, I
would have to spend a fair amount of time writing custom rules for SA,
and this is where, I have no doubt, Martin spends a lot of his time on
list management.
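
As a rough picture of what custom rules involve, the sketch below scores
a message by summing the scores of every rule whose pattern matches and
comparing the total against a threshold. It is written in Python rather
than in SA's own rule syntax, and the rule names, patterns, scores and
threshold are all hypothetical, chosen only to show the idea.

```python
import re

# Hypothetical rules: (name, pattern, score). Negative scores favour ham.
RULES = [
    ("HYP_PILLS", re.compile(r"\bcheap\s+(pills|meds)\b", re.I), 2.5),
    ("HYP_URGENT", re.compile(r"\bact\s+now\b", re.I), 1.5),
    ("HYP_R_TERMS", re.compile(r"\b(data\.frame|lm\(|glm\()", re.I), -2.0),
]

THRESHOLD = 3.0  # assumed cut-off for this sketch only


def score_message(text):
    """Return (total score, names of the rules that matched)."""
    hits = [(name, score) for name, pattern, score in RULES if pattern.search(text)]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_spam(text):
    """A message is flagged once its total score reaches the threshold."""
    return score_message(text)[0] >= THRESHOLD
```

Each new spam campaign tends to need new patterns, and each pattern has
to be checked against legitimate list traffic, which is where the manual
effort goes.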

HTH,

Marc Schwartz

______________________________________________
R-help@(protected)
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.