The Cellar - SpamAssassin help please

I am trying to use sa-learn for the first time here at work and I'd like a little guidance on the proper usage of the tool. I have considerable documentation here, but the part I'm stuck on is how much, and what, to edit out of the spam (as opposed to ham) so that I don't leave in something I should have taken out. For instance, my boss's email address: I don't want him inadvertently tagged as a spammer. And I don't want to edit out anything that incriminates the real spammers. Basically, I'm having trouble comprehending the verbose email headers.

My brain hurts.

Update:

I have decided to reduce my overthinking on this problem. I had been swept away by the documentation and had turned on the Headers in pine and saw a ton of stuff I wanted to segregate into incriminating and irrelevant with precision.

I discarded that ideal in favor of plain view, exporting the message to a file and deleting the outermost layers of forwarding to leave behind only the original From: To: Subject: etc section. Then I fed each of those files to the SpamAssassin learning module via sa-learn --spam --file filename(s).
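For anyone who would rather not peel the forwarding layers off by hand, here is a rough Python sketch of the same idea. It assumes the spam was forwarded as an attachment (a message/rfc822 part); a message forwarded inline, pasted into the body, would not be handled. The function name innermost_message is made up for this example.

```python
import email
from email import policy


def innermost_message(raw_bytes):
    """Return the deepest message/rfc822 attachment, or the message itself.

    Assumption: each forwarding layer wraps the original as an attached
    message/rfc822 part. Inline forwards are returned unchanged.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    while True:
        inner = None
        for part in msg.walk():
            if part is not msg and part.get_content_type() == "message/rfc822":
                # The payload of a message/rfc822 part is a one-item list
                # holding the wrapped message.
                inner = part.get_payload(0)
                break
        if inner is None:
            return msg  # no further forwarding layer found
        msg = inner  # peel one layer and look again
```

Writing the result of innermost_message(...).as_bytes() to a file would give something that could then be fed to sa-learn the same way as the hand-edited files.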

Now I have to go through and balance all those spams with a similar quantity of Hams (the logical complement to spam: wanted, valid email) so the Bayesian filters have the right tools to work with.

Usually, SA is intelligent enough to ignore everything that is irrelevant. I sort my incoming email into a Spam mailbox and a set of "other" mailboxes, then call 'sa-learn --mbox --showdots --ham' on the "others" and 'sa-learn --mbox --showdots --spam' on the spammy ones. I don't play with headers; in fact it's usually best to leave them all in place. I usually use mutt as my mailreader (when logged in via ssh).
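A small Python sketch of that workflow, in case it helps to script it. It only counts the messages in each mbox (to see how balanced the ham and spam piles are) and builds the sa-learn command line using the flags quoted above; it does not run sa-learn itself. The helper names and the mailbox paths in the usage note are placeholders, not anything from a real setup.

```python
import mailbox


def count_messages(mbox_path):
    """Count the messages in one mbox file (the file must already exist)."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        return len(box)
    finally:
        box.close()


def sa_learn_argv(kind, mbox_path):
    """Build the argument list for training on one mbox as ham or spam."""
    if kind not in ("ham", "spam"):
        raise ValueError("kind must be 'ham' or 'spam'")
    return ["sa-learn", "--mbox", "--showdots", "--" + kind, mbox_path]
```

Something like subprocess.run(sa_learn_argv("spam", "Mail/Spam"), check=True) would then do the actual training, assuming sa-learn is on the PATH and Mail/Spam is where the spam mailbox lives.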

Even after a mail has gone through SA and has been marked up, you can re-send it through SA and SA will parse it correctly (i.e. without its own markup). This is also useful if you've inadvertently tagged a ham as a spam, or vice versa. You can re-sa-learn it, and it will be removed from the wrong database and deposited in the correct one.

The thing I've been dreading is upgrading SA. I'm currently using version 3.0.2, and they're up to 3.1.5... The problem is that they changed database formats (I think) and I'd lose all my aggregate info. Uck.

I was less worried about cluttering the sa db with irrelevant information than I was about giving it mixed information, by training as spam a spam message sent to my boss and forwarded to me to process. I wanted to guarantee that the boss wouldn't be learned as a spam source, but I didn't want to give any of the spamminess a free ride. Properly dividing the content of the message was my dilemma.

btw, thanks for your reply, I was worried I was the only spamassassin user in the house.

bump

I am in need of assistance with my spamassassin installation.

I am trying to whitelist a very specific sender. I have written two rules to try to achieve this effect. Here are the model and the rules. I have obfuscated the real domains.

Model:

Code:

#################
#
#   don't mark XYZ company's mail
#
#################

header XYZ_RULE             From =~ /\@xyz.com/
describe XYZ_RULE           From XYZ company
score XYZ_RULE              -10

Here are the two rules:

Code:

##################
#
# Let ABC company mail through
#
#  bigv 08 mar 2010
#
##################

whitelist_from              *@ecomm-mail.abc.com

header ABC_RULE             From =~ /\@abc.com/
describe ABC_RULE           From ABC
score ABC_RULE              -50

I've saved my work in the local.cf file. I've forced the system to reread the rules. I've restarted the whole system. Still, mail addressed to me from this domain does not reach me. I have "--lint"ed the local.cf file though I confess I don't know how to interpret the results.

I think that's it. I need some help please. Post your reply here or send me a private message if you wish to connect. I'll make it worthwhile for the person who solves this problem.

Thanks in advance.

Delete spamassassin spyware.

I WIN, I WIN, I WIN. . . exactly what did I win?

What does the spam scoring and list of rules matched look like? Or is the message not even making it to your server?
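If the message does reach the server, the X-Spam-Status header that SA adds is where the score and the matched rules show up. A rough Python sketch for pulling those out is below. The header layout it expects (a Yes/No verdict, then score= or hits=, then tests=) is an assumption based on typical SA output and may differ with a customized report template; the function name and the sample value in the usage note are invented for illustration.

```python
import re


def parse_spam_status(value):
    """Split an X-Spam-Status value into verdict, score and matched tests."""
    # Unfold continuation lines, then close the gaps after commas so a test
    # list wrapped across lines becomes one comma-separated token.
    flat = re.sub(r",\s+", ",", " ".join(value.split()))
    verdict = flat.split(",", 1)[0].strip()
    score = re.search(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)", flat)
    tests = re.search(r"\btests=(\S*)", flat)
    names = tests.group(1).split(",") if tests else []
    return {
        "verdict": verdict,
        "score": float(score.group(1)) if score else None,
        "tests": [n for n in names if n and n != "none"],
    }
```

For a made-up value such as "No, score=-9.2 required=5.0 tests=BAYES_50,XYZ_RULE autolearn=no", the tests list would show at a glance whether the custom rule fired at all.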

I think the issue is with the regex: what you have written will allow
someone@abc.com
but not
someone@mailserver.abc.com

Try modding the regexp to something like
/\@(?:[\w-]+\.)*abc\.com/

You are also missing escapes for the periods, so my version escapes the @-sign, the period(s) and allows for an arbitrary number of subdomain steps.
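A quick way to see the difference is to try both patterns against a few addresses. The sketch below uses Python's re module as a stand-in for the Perl regexes SA uses, which behave the same for patterns this simple. The subdomain-tolerant pattern is written here as \@(?:[\w-]+\.)*abc\.com, one way to spell the idea; the hyphen is included so that hosts like ecomm-mail.abc.com match. The sample addresses are made up.

```python
import re

# The pattern from the posted rule: unescaped period, no room for subdomains.
ORIGINAL = re.compile(r"\@abc.com")

# Assumed subdomain-tolerant variant: zero or more "label." steps after the
# at-sign, where a label may contain word characters and hyphens.
TOLERANT = re.compile(r"\@(?:[\w-]+\.)*abc\.com")


def hits(pattern, address):
    """True if the pattern matches anywhere in the address."""
    return pattern.search(address) is not None


SAMPLES = [
    "someone@abc.com",
    "someone@mailserver.abc.com",
    "someone@ecomm-mail.abc.com",
    "someone@abcXcom",      # only the unescaped period lets this through
    "someone@notabc.com",   # a different domain that merely ends in abc.com
]

if __name__ == "__main__":
    for address in SAMPLES:
        print(address, hits(ORIGINAL, address), hits(TOLERANT, address))
```

Neither pattern is anchored at the end, so something like someone@abc.com.example.net would still match both.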

I love regex.

Or you could be less explicit and say
/[\.\@]abc\.com$/

That matches any address that ends in (at-sign | period) + "abc.com" regardless of what else it starts with.
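One thing to watch with the $ anchor: it only works if the text being matched really ends with the domain. A full From header often looks like "Name <user@host>", which ends in ">", so the pattern would miss it. The Python sketch below shows the effect and pulls out the bare address first with email.utils.parseaddr. As far as I know, SA's rough equivalent is writing the rule against From:addr instead of From, but treat that as something to check against the docs. The sample headers and the function name are made up.

```python
import re
from email.utils import parseaddr

# Same idea as the rule above: (period or at-sign) + "abc.com" at the very end.
ENDS_WITH = re.compile(r"[\.\@]abc\.com$")


def sender_matches(from_header):
    """Apply the end-anchored pattern to the bare address, not the raw header."""
    _display_name, address = parseaddr(from_header)
    return ENDS_WITH.search(address) is not None


if __name__ == "__main__":
    raw = "ABC Shop <news@ecomm-mail.abc.com>"
    print(ENDS_WITH.search(raw) is not None)  # raw header ends in ">"
    print(sender_matches(raw))                # bare address ends in abc.com
```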

>>>>>>>> WOOSH >>>>>>>>

The question was way over my head. As I scroll down I see apparently humor is too because pie could save the day with a funny. You're smart + funny pie. :)