The Cellar  

Old 09-05-2006, 07:57 PM   #1
BigV
Goon Squad Leader
 
Join Date: Nov 2004
Location: Seattle
Posts: 27,063
SpamAssassin help please

I am trying to use sa-learn for the first time here at work and I'd like a little guidance on the proper usage of the tool. I have considerable documentation here, but the part I'm stuck on is how much, and what, to edit out of the spam (as opposed to ham) so that I don't leave in something I should have taken out: for instance, my boss's email address, which could get him inadvertently tagged as a spammer. And I don't want to edit out anything that incriminates the real spammers. Basically, I'm having trouble comprehending the verbose email headers.

My brain hurts.
__________________
Be Just and Fear Not.
Old 09-11-2006, 03:28 PM   #2
BigV
Goon Squad Leader
 
Join Date: Nov 2004
Location: Seattle
Posts: 27,063
Update:

I have decided to reduce my overthinking on this problem. I had been swept away by the documentation and had turned on the Headers in pine and saw a ton of stuff I wanted to segregate into incriminating and irrelevant with precision.

I discarded that ideal in favor of plain view, exporting the message to a file and deleting the outermost layers of forwarding to leave behind only the original From: To: Subject: etc section. Then I fed each of those files to the SpamAssassin learning module via sa-learn --spam --file filename(s).

Now I have to go through and balance all those spams with a similar quantity of Hams (the logical complement to spam: wanted, valid email) so the Bayesian filters have the right tools to work with.
Old 09-12-2006, 12:01 PM   #3
Pie
Gone and done
 
Join Date: Sep 2001
Posts: 4,808
Usually, SA is intelligent enough to ignore everything that is irrelevant. I sort my incoming email into a Spam mailbox and a set of "other" mailboxes, then call 'sa-learn --mbox --showdots --ham' on the "others" and 'sa-learn --mbox --showdots --spam' on the spammy ones. I don't play with headers; in fact it's usually best to leave them all in place. I usually use mutt as my mailreader (when logged in via ssh).

Even after a mail has gone through SA and has been marked up, you can re-send it through SA and SA will parse it correctly (i.e. without its own markup). This is also useful if you've inadvertently tagged a ham as a spam, or vice versa. You can re-sa-learn it, and it will be removed from the wrong database and deposited in the correct one.
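
For anyone who wants to script the training runs described above, here is a minimal sketch. It only builds the sa-learn command lines from the flags already quoted in this thread (--spam, --ham, --mbox, --showdots) and does not run anything. The function name and the mailbox paths are made up for illustration.

```python
import shlex

def sa_learn_command(kind, paths, mbox=False, showdots=True):
    """Build (but do not run) an sa-learn command line.

    kind  -- "spam" or "ham"
    paths -- message files, or mbox files when mbox=True
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", "--" + kind]
    if mbox:
        cmd.append("--mbox")
    if showdots:
        cmd.append("--showdots")
    cmd.extend(paths)
    return cmd

# Hypothetical mailbox paths, for illustration only.
spam_cmd = sa_learn_command("spam", ["Mail/Spam"], mbox=True)
ham_cmd = sa_learn_command("ham", ["Mail/inbox", "Mail/lists"], mbox=True)

print(shlex.join(spam_cmd))
print(shlex.join(ham_cmd))
```

On a machine with SpamAssassin installed, each list could be handed to subprocess.run(cmd, check=True). Retraining a misfiled message is the same call with the other kind.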

The thing I've been dreading is upgrading SA. I'm currently using version 3.0.2, and they're up to 3.1.5... The problem is that they changed database formats (I think) and I'd lose all my aggregate info. Uck.
__________________
per·son \ˈpər-sən\ (noun) - an ephemeral collection of small, irrational decisions
The fun thing about evolution (and science in general) is that it happens whether you believe in it or not.
Old 09-12-2006, 03:24 PM   #4
BigV
Goon Squad Leader
 
Join Date: Nov 2004
Location: Seattle
Posts: 27,063
I was less worried about cluttering the sa db with irrelevant information than I was about giving it mixed information, by training as spam a spam message sent to my boss and forwarded to me to process. I wanted to guarantee that the boss wouldn't be learned as a spam source, but I didn't want to give any of the spamminess a free ride. Properly dividing the content of the message was my dilemma.

btw, thanks for your reply, I was worried I was the only spamassassin user in the house.
Old 03-08-2010, 02:52 PM   #5
BigV
Goon Squad Leader
 
Join Date: Nov 2004
Location: Seattle
Posts: 27,063
bump

I am in need of assistance with my spamassassin installation.

I am trying to whitelist a very specific sender. I have written two rules to try to achieve this effect. Here are the model and the rules. I have obfuscated the real domains.

Model:

Code:
#################
#
#   don't mark XYZ company's mail
#
#################

header XYZ_RULE             From =~ /\@xyz.com/
describe XYZ_RULE           From XYZ company
score XYZ_RULE                -10
Here are the two rules:

Code:
##################
#
# Let ABC company mail through
#
#  bigv 08 mar 2010
#
##################

whitelist_from                  *@ecomm-mail.abc.com

header ABC_RULE             From =~ /\@abc.com/
describe ABC_RULE           From ABC
score ABC_RULE                -50
I've saved my work in the local.cf file. I've forced the system to reread the rules. I've restarted the whole system. Still, mail addressed to me from this domain does not reach me. I have "--lint"ed the local.cf file though I confess I don't know how to interpret the results.

I think that's it. I need some help please. Post your reply here or send me a private message if you wish to connect. I'll make it worthwhile for the person who solves this problem.

Thanks in advance.
Old 03-08-2010, 02:59 PM   #6
classicman
barely disguised asshole, keeper of all that is holy.
 
Join Date: Nov 2007
Posts: 23,401
Delete spamassassin spyware.

I WIN, I WIN, I WIN. . . exactly what did I win?
__________________
"like strapping a pillow on a bull in a china shop" Bullitt
Old 03-10-2010, 12:54 PM   #7
SteveDallas
Your Bartender
 
Join Date: Jan 2002
Location: Philly Burbs, PA
Posts: 7,651
What does the spam scoring and list of rules matched look like? Or is the message not even making it to your server?
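
If the message does reach the server, the scoring and matched rules normally show up in the X-Spam-Status header that SpamAssassin adds. Here is a rough sketch for pulling the rule list out of that header so you can see whether ABC_RULE fired. The sample header value and its score are invented for illustration. As far as I know, USER_IN_WHITELIST is the rule a whitelist_from entry triggers in SpamAssassin 3.x.

```python
import re

def parse_spam_status(value):
    """Split an X-Spam-Status header value into verdict, score and rule names."""
    # Folded headers may break the rule list after a comma; glue it back together.
    flat = re.sub(r",\s+", ",", value.strip())
    verdict = flat.split(",", 1)[0]
    score = re.search(r"\bscore=(-?\d+(?:\.\d+)?)", flat)
    tests = re.search(r"\btests=(\S*)", flat)
    rules = [t for t in tests.group(1).split(",") if t] if tests else []
    return verdict, (float(score.group(1)) if score else None), rules

# Invented example value, folded across two lines the way mail headers often are.
sample = ("No, score=-97.4 required=5.0 tests=BAYES_00,HTML_MESSAGE,\n"
          "\tUSER_IN_WHITELIST autolearn=no version=3.2.5")

verdict, score, rules = parse_spam_status(sample)
print(verdict, score, rules)
print("ABC_RULE fired:", "ABC_RULE" in rules)
```

If neither the custom rule nor the whitelist rule appears in the list, the From address probably does not look the way the rules expect.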
Old 03-10-2010, 02:02 PM   #8
Pie
Gone and done
 
Join Date: Sep 2001
Posts: 4,808
I think the issue is with the regex: what you have written will allow
someone@abc.com
but not
someone@mailserver.abc.com

Try modding the regexp to something like
/\@[\w+\.]*abc\.com/

You are also missing escapes for the periods, so my version escapes the @-sign, the period(s) and allows for an arbitrary number of subdomain steps.
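
A quick way to check these patterns is to run them against a few sample addresses. The sketch below uses Python's re module, which treats these particular patterns the same way Perl does. The sample addresses are made up. One thing it shows: the character-class pattern also matches someone@notabc.com, because [\w+\.]* can swallow the "not". The third pattern is only one possible tightening (an assumption, not something from the SpamAssassin docs) that requires each subdomain to be a whole dot-terminated label.

```python
import re

# Patterns copied from the thread.
original = re.compile(r"\@abc.com")             # rule as posted, dots unescaped
suggested = re.compile(r"\@[\w+\.]*abc\.com")   # character-class version
# Assumption: one possible tightening, where every subdomain is a whole label.
stricter = re.compile(r"\@(?:[\w-]+\.)*abc\.com")

# Made-up sample addresses.
samples = [
    "someone@abc.com",
    "someone@mailserver.abc.com",
    "someone@notabc.com",
    "someone@abcxcom.example",
]

def hits(pattern):
    """Return the sample addresses that the pattern matches anywhere."""
    return [s for s in samples if pattern.search(s)]

for name, pattern in [("original", original),
                      ("suggested", suggested),
                      ("stricter", stricter)]:
    print(name, hits(pattern))
```

The original rule misses the mailserver subdomain and, because of the unescaped dot, accepts abcxcom.example. The stricter version accepts only the first two samples.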

I love regex.
Old 03-10-2010, 02:05 PM   #9
Pie
Gone and done
 
Join Date: Sep 2001
Posts: 4,808
Or you could be less explicit and say
/[\.\@]abc\.com$/

That matches any address that ends in (at-sign | period) + "abc.com" regardless of what else it starts with.
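
One caveat with the $ anchor: a From header usually carries a display name and angle brackets, so the header value ends in ">" rather than in "abc.com". The sketch below shows the difference in Python, using a made-up header value built on the ecomm-mail subdomain from the whitelist line. In SpamAssassin itself, I believe writing the rule against From:addr instead of From compares only the bare address, which is the case where the anchored pattern works.

```python
import re
from email.utils import parseaddr

anchored = re.compile(r"[\.\@]abc\.com$")

# Made-up header value, for illustration only.
header_value = '"ABC Orders" <orders@ecomm-mail.abc.com>'
address = parseaddr(header_value)[1]   # just the address part

full_match = bool(anchored.search(header_value))
addr_match = bool(anchored.search(address))

print(address)
print("full header:", full_match)   # the value ends in ">", so no match
print("address only:", addr_match)
```

The leading [\.\@] also keeps lookalikes such as notabc.com out, since the character before "abc.com" has to be a dot or an at-sign.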
Old 03-10-2010, 03:30 PM   #10
classicman
barely disguised asshole, keeper of all that is holy.
 
Join Date: Nov 2007
Posts: 23,401
>>>>>>>> WOOSH >>>>>>>>
Old 03-10-2010, 03:42 PM   #11
Pie
Gone and done
 
Join Date: Sep 2001
Posts: 4,808
Old 03-10-2010, 09:12 PM   #12
skysidhe
~~Life is either a daring adventure or nothing.~~
 
Join Date: Apr 2006
Posts: 6,828
The question was way over my head. As I scroll down I see apparently humor is too because pie could save the day with a funny. You're smart + funny pie.
All times are GMT -5.