The Cellar

SpamAssassin help please (http://cellar.org/showthread.php?t=11661)

BigV 09-05-2006 07:57 PM

SpamAssassin help please
 
I am trying to use sa-learn for the first time here at work and I'd like a little guidance as to the proper usage of the tool. I have considerable documentation here, but the part I'm stuck on is how much / what to edit out of the Spam (as opposed to Ham) so that I don't leave something in that I should have left out. For instance, I don't want to leave in my boss's email address so that he is inadvertently tagged as a spammer. And I don't want to edit out anything that incriminates the real spammers. Basically, I'm having trouble comprehending the verbose email headers.

My brain hurts.

BigV 09-11-2006 03:28 PM

Update:

I have decided to reduce my overthinking on this problem. I had been swept away by the documentation and had turned on the Headers in pine and saw a ton of stuff I wanted to segregate into incriminating and irrelevant with precision.

I discarded that ideal in favor of plain view, exporting the message to a file and deleting the outermost layers of forwarding to leave behind only the original From: To: Subject: etc section. Then I fed each of those files to the SpamAssassin learning module via sa-learn --spam --file filename(s).
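The manual trimming described above could also be scripted in some cases. What follows is only a minimal Python sketch, and it assumes the spam was forwarded as a message/rfc822 attachment rather than pasted inline (the thread does not say which). The function name innermost_forwarded and all sample addresses are made up for illustration.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def innermost_forwarded(raw):
    """Return the last message/rfc822 attachment found while walking the
    MIME tree (the innermost one for a simple chain of forwards), or the
    message itself if nothing was attached."""
    msg = message_from_string(raw)
    found = msg
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the attached message.
            found = part.get_payload(0)
    return found


# Made-up example: a spam sent to the boss, then forwarded as an attachment.
spam = MIMEText("Buy now!")
spam["From"] = "spammer@example.net"
spam["To"] = "boss@example.org"
spam["Subject"] = "Cheap pills"

forward = MIMEMultipart()
forward["From"] = "boss@example.org"
forward["To"] = "bigv@example.org"
forward["Subject"] = "Fwd: Cheap pills"
forward.attach(MIMEText("Please deal with this."))
forward.attach(MIMEMessage(spam))

original = innermost_forwarded(forward.as_string())
print(original["From"], "/", original["Subject"])
```

The extracted message could then be written out with original.as_string() and handed to sa-learn --spam --file as before. Inline forwards would need a different approach.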

Now I have to go through and balance all those spams with a similar quantity of Hams (the logical complement to spam: wanted, valid email) so the Bayesian filters have the right tools to work with.

Pie 09-12-2006 12:01 PM

Usually, SA is intelligent enough to ignore everything that is irrelevant. I sort my incoming email into a Spam mailbox and a set of "other" mailboxes, then call 'sa-learn --mbox --showdots --ham' on the "others" and 'sa-learn --mbox --showdots --spam' on the spammy ones. I don't play with headers; in fact it's usually best to leave them all in place. I usually use mutt as my mailreader (when logged in via ssh).

Even after a mail has gone through SA and has been marked up, you can re-send it through SA and SA will parse it correctly (i.e. without its own markup). This is also useful if you've inadvertently tagged a ham as a spam, or vice versa. You can re-sa-learn it, and it will be removed from the wrong database and deposited in the correct one.
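For anyone who wants to script the mailbox training described above, here is a rough Python sketch. It only wraps the sa-learn options already quoted in this post (--spam, --ham, --mbox, --showdots); the helper names and the mailbox paths in the comments are hypothetical.

```python
import shutil
import subprocess


def sa_learn_argv(kind, mbox_path):
    """Build the sa-learn command line for one mbox file.

    kind is "spam" or "ham"; mbox_path is the mailbox to learn from.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", "--showdots", mbox_path]


def train(kind, mbox_path):
    """Run sa-learn if it is installed; otherwise just show the command."""
    argv = sa_learn_argv(kind, mbox_path)
    if shutil.which(argv[0]) is None:
        print("sa-learn not found; would run:", " ".join(argv))
        return None
    return subprocess.run(argv, check=False).returncode


# Hypothetical usage (paths are placeholders, not from the thread):
#   train("spam", "/home/me/mail/Spam")
#   train("ham", "/home/me/mail/Inbox")
print(" ".join(sa_learn_argv("spam", "/home/me/mail/Spam")))
```

Running the same mailbox again with the other kind corresponds to the re-learning step mentioned above.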

The thing I've been dreading is upgrading SA. I'm currently using version 3.0.2, and they're up to 3.1.5... The problem is that they changed database formats (I think) and I'd lose all my aggregate info. Uck.

BigV 09-12-2006 03:24 PM

I was less worried about cluttering the sa db with irrelevant information than I was about giving it mixed information, by training as spam a spam message sent to my boss and forwarded to me to process. I wanted to guarantee that the boss wouldn't be learned as a spam source but I didn't want to give any of the spamminess a free ride. Properly dividing the content of the message was my dilemma.

btw, thanks for your reply, I was worried I was the only spamassassin user in the house.

BigV 03-08-2010 02:52 PM

bump

I am in need of assistance with my spamassassin installation.

I am trying to whitelist a very specific sender. I have written two rules to try to achieve this effect. Here are the model and the rules. I have obfuscated the real domains.

Model:

Code:

#################
#
#  don't mark XYZ company's mail
#
#################

header XYZ_RULE            From =~ /\@xyz.com/
describe XYZ_RULE          From XYZ company
score XYZ_RULE                -10

Here are the two rules:

Code:

##################
#
# Let ABC company mail through
#
#  bigv 08 mar 2010
#
##################

whitelist_from                  *@ecomm-mail.abc.com

header ABC_RULE            From =~ /\@abc.com/
describe ABC_RULE          From ABC
score ABC_RULE                -50

I've saved my work in the local.cf file. I've forced the system to reread the rules. I've restarted the whole system. Still, mail addressed to me from this domain does not reach me. I have "--lint"ed the local.cf file though I confess I don't know how to interpret the results.

I think that's it. I need some help please. Post your reply here or send me a private message if you wish to connect. I'll make it worthwhile for the person who solves this problem.

Thanks in advance.

classicman 03-08-2010 02:59 PM

Delete spamassassin spyware.

I WIN, I WIN, I WIN. . . exactly what did I win?

SteveDallas 03-10-2010 12:54 PM

What does the spam scoring and list of rules matched look like? Or is the message not even making it to your server?
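If the message does reach the server, the score and matched rules are normally recorded in the X-Spam-Status header that SpamAssassin adds. Below is a small Python sketch for pulling them out of a saved message. The sample message is invented for illustration, and the parsing assumes the usual "score=... required=... tests=..." layout of SpamAssassin 3.x; a customized header template would need different patterns.

```python
import re
from email import message_from_string

# Invented sample; the header is folded across two lines as mail software often does.
SAMPLE = (
    "From: Orders <orders@ecomm-mail.abc.com>\n"
    "To: bigv@example.org\n"
    "Subject: Your order\n"
    "X-Spam-Status: Yes, score=6.1 required=5.0 tests=HTML_MESSAGE,\n"
    "\tMIME_HTML_ONLY,URIBL_BLACK autolearn=no version=3.2.5\n"
    "\n"
    "body text\n"
)


def spam_status(raw):
    """Return score, threshold and matched rule names, or None if untagged."""
    status = message_from_string(raw).get("X-Spam-Status")
    if status is None:
        return None
    # Undo header folding: rejoin a rule list split after a comma,
    # and turn any other line break into a single space.
    flat = re.sub(r",\s*\n\s*", ",", status)
    flat = re.sub(r"\s*\n\s*", " ", flat).strip()
    score = re.search(r"score=(-?\d+(?:\.\d+)?)", flat)
    required = re.search(r"required=(-?\d+(?:\.\d+)?)", flat)
    tests = re.search(r"tests=(\S*)", flat)
    return {
        "flagged": flat.lower().startswith("yes"),
        "score": float(score.group(1)) if score else None,
        "required": float(required.group(1)) if required else None,
        "tests": [t for t in tests.group(1).split(",") if t] if tests else [],
    }


print(spam_status(SAMPLE))
```

If a whitelist rule is working, its name (ABC_RULE or a whitelist entry) should show up in the tests list; if it is absent, the rule never matched.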

Pie 03-10-2010 02:02 PM

I think the issue is with the regex: what you have written will allow
someone@abc.com
but not
someone@mailserver.abc.com

Try modding the regexp to something like
/\@[\w+\.]*abc\.com/

You are also missing escapes for the periods, so my version escapes the @-sign, the period(s) and allows for an arbitrary number of subdomain steps.

I love regex.
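A quick way to check these patterns before putting them in local.cf is to run them against a few sample addresses. The sketch below uses Python's re module as a stand-in for the Perl regexes SpamAssassin uses (these particular patterns behave the same in both). The addresses are made up, and the third pattern is a hypothetical variant that is not from this thread.

```python
import re

PATTERNS = {
    # The rule as originally posted (dot left unescaped).
    "original": r"\@abc.com",
    # The pattern suggested in this post.
    "suggested": r"\@[\w+\.]*abc\.com",
    # Hypothetical variant: allows hyphens in subdomain labels and requires
    # abc.com to start right after the @ or after a dot.
    "variant": r"\@(?:[\w-]+\.)*abc\.com\b",
}

ADDRESSES = [
    "someone@abc.com",
    "someone@mailserver.abc.com",
    "orders@ecomm-mail.abc.com",
    "someone@notabc.com",
]


def matches(pattern_name, address):
    """True if the named pattern is found anywhere in the address."""
    return re.search(PATTERNS[pattern_name], address) is not None


for address in ADDRESSES:
    hits = [name for name in PATTERNS if matches(name, address)]
    print(f"{address:30} {', '.join(hits) or 'no match'}")
```

On these samples the original pattern only hits the bare domain. The suggested pattern picks up mailserver.abc.com but misses the hyphenated ecomm-mail subdomain, because the hyphen is not in the character class, and it also hits notabc.com.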

Pie 03-10-2010 02:05 PM

Or you could be less explicit and say
/[\.\@]abc\.com$/

That matches any address that ends in (at-sign | period) + "abc.com" regardless of what else it starts with.
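One thing to watch with the $ anchor: a From header often carries a display name and angle brackets, so the text being matched may not end in "abc.com". SpamAssassin header rules can be written against From:addr to test only the address. The Python sketch below imitates that with email.utils.parseaddr; it is an approximation for experimenting, not SpamAssassin's own parsing, and the sample headers are invented.

```python
import re
from email.utils import parseaddr

PATTERN = r"[\.\@]abc\.com$"

FROM_HEADERS = [
    "someone@mailserver.abc.com",
    "Orders <orders@ecomm-mail.abc.com>",
    '"ABC Sales" <sales@abc.com>',
    "someone@notabc.com",
]


def hits_full_header(header_value):
    """Match against the whole header text, display name and brackets included."""
    return re.search(PATTERN, header_value) is not None


def hits_address_only(header_value):
    """Match against just the email address pulled out of the header."""
    address = parseaddr(header_value)[1]
    return re.search(PATTERN, address) is not None


for value in FROM_HEADERS:
    print(f"{value:40} full={hits_full_header(value)} addr={hits_address_only(value)}")
```

With these samples, the bracketed forms only match once the address is isolated, and notabc.com is rejected either way.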

classicman 03-10-2010 03:30 PM

>>>>>>>> WOOSH >>>>>>>>

Pie 03-10-2010 03:42 PM

http://imgs.xkcd.com/comics/regular_expressions.png

skysidhe 03-10-2010 09:12 PM

The question was way over my head. As I scroll down I see that apparently the humor is too, because Pie could save the day with a funny. You're smart + funny, Pie. :)


All times are GMT -5.