Deleting spam from the server

Jazsnap

Hi,

We have set up an anti-spam proxy server using SpamAssassin to append
'***SPAM***' to the subject line of suspected spam emails. Would it be
possible to configure Outlook 2002 (e.g. via some kind of rule) to delete
these emails from the mail server without downloading them to the user's PC?

TIA,
Jason
 
Roady [MVP]

Assuming POP3? No, but you can configure a rule on the server side mailbox
to delete those messages so they won't be in the Inbox and the POP3 protocol
won't see those messages.
 
Jazsnap

Roady [MVP] said:
Assuming POP3? No, but you can configure a rule on the server side
mailbox to delete those messages so they won't be in the Inbox and the
POP3 protocol won't see those messages.

Unfortunately the messages aren't marked as SPAM until they leave the mail
server and hit our proxy server en route to the client machines.

I'm not sure if the proxy server can be configured to delete or reroute SPAM
from there. We're using an Ubuntu box running proxy software called
Delegate with SpamAssassin, which I'm not overly familiar with. Thanks
anyway.

Jase
 
VanguardLH

Jazsnap said:
Unfortunately the messages aren't marked as SPAM until they leave the mail
server & hit our proxy server en route to the client machines.

Does this anti-spam proxy only look at the headers of the e-mail to
determine if it is spam? If it looks at the body then obviously the
e-mail *is* being fully downloaded already. If the proxy isn't the one
initiating the mail session with the mail server (i.e., it establishes a
mail session at the request of the e-mail client) then the e-mail client
doesn't know about the proxy and is going to retrieve e-mails as it
normally does - and that means it will retrieve the entire message from
the "server" (which happens to be your proxy).

If the e-mail client were capable of using the "TOP [n]" command to
retrieve just the headers (and optionally the first n lines of the body)
then the entire e-mail does not get downloaded (unless, of course, n is
equal to or larger than the number of lines in the e-mail, if n is
specified). Outlook doesn't use the TOP command. It just uses the RETR
command which retrieves the entire e-mail (headers and body).
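
For illustration, here is a minimal Python sketch of what a TOP-capable client or mail monitor could do: request headers only with TOP and mark tagged messages for deletion on the server, without ever issuing RETR. It assumes the standard-library poplib module. The function name delete_tagged, the host name, and the credentials are hypothetical; the tag is the '***SPAM***' marker from the original question. It is a sketch under those assumptions, not a drop-in tool.

```python
import poplib
from email.parser import BytesParser

SPAM_TAG = "***SPAM***"  # subject marker from the original question


def delete_tagged(conn, tag=SPAM_TAG):
    """Mark for deletion every message whose Subject contains `tag`.

    `conn` is a logged-in poplib.POP3 (or POP3_SSL) object. Only TOP is
    used, so message bodies are never retrieved with RETR.
    Returns the list of message numbers that were marked.
    """
    deleted = []
    _resp, listing, _octets = conn.list()
    for entry in listing:
        msg_num = int(entry.split()[0])
        # TOP <msg> 0 asks for the headers plus zero lines of the body.
        _resp, lines, _octets = conn.top(msg_num, 0)
        headers = BytesParser().parsebytes(b"\r\n".join(lines), headersonly=True)
        # Note: MIME-encoded subjects are not decoded in this sketch.
        subject = str(headers.get("Subject", ""))
        if tag in subject:
            conn.dele(msg_num)  # takes effect when the session ends with QUIT
            deleted.append(msg_num)
    return deleted


# Hypothetical usage; host and credentials are placeholders:
# conn = poplib.POP3("mail.example.com")
# conn.user("jason")
# conn.pass_("secret")
# print(delete_tagged(conn))
# conn.quit()
```

This only helps if the tag is already present in the headers returned for TOP. In the setup described in this thread the tag is added by the proxy, so the client would have to connect through the proxy, and the proxy may still pull the whole message from the real server in order to scan it.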

Spam e-mails are usually small (often a lot smaller than non-spam
e-mails). It will take many of them to even show up on radar regarding
their bandwidth consumption to one mailbox. So any bandwidth saved by the
user's e-mail client not downloading the spam depends on the size of
those e-mails, and most spam is tiny. Also, the body of a spam
e-mail could be smaller than, or not much larger than, the number of bytes
consumed by its headers, so you would only save half, or less, of the
bandwidth needed to download an already tiny message. If you want to
save bandwidth on huge e-mails (whether spam or ham) then
configure Outlook's Send/Receive group to only download messages under
some maximum size. If the user wants those large messages downloaded to
Outlook, they will have to mark them and then download them (a manual
process). However, that also means those large messages remain on the
server until downloaded, and that could eat up the user's mailbox quota.

SpamAssassin uses a Bayesian scheme to weight keywords. The problem
with any Bayesian scheme is double-weighting for e-mail clients that do
support using the TOP command. If the n parameter is used to include a
portion of the body of the e-mail, or if the Bayes weighting includes
checking the content of the headers, a TOP command (to see what is on
the mail server) that is then followed by a RETR command (to
retrieve those e-mails) will result in TWICE adding the keywords for
that e-mail into the Bayes database. The double entry is eliminated only
if the TOP command is used without its n parameter and the Bayes
scheme does NOT include the headers in its checking, or if a RETR for
the e-mail does not follow the TOP command (like using TOP and then
using "delete from server"). Even then it is eliminated
ONLY for the spam. The ham will still end up sending a TOP followed by
a RETR, so you overly bias the Bayes database with the ham keywords. If
you don't want the Bayes database to get skewed by ham getting recorded
twice then you need to use an e-mail client that doesn't use TOP to see
what is in the mailbox.

So have Outlook do its thing by downloading the entire message. That
ensures the Bayes database doesn't get skewed. Use a rule to
permanently delete items with the spam tag. However, since the Bayesian
scheme is a guessing scheme based on the weighting of only *some*
keywords selected from a message and is based on a past history of
weighting, the users probably should NOT be deleting their suspect e-mails.
Instead their rule should move the suspect e-mails into the Junk folder.
Then use auto-archiving on the Junk folder to delete items a day, or
two, after they have been received (I set it to 3 days). The rule that
moves the spam-tagged e-mails into the Junk folder can also "mark as
read" those items so the user isn't drawn into looking into the Junk
folder - unless they were actually expecting an e-mail, it isn't in
their Inbox, and then they check the Junk folder for a false positive
(something that all Bayes schemes will have).
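
The rule logic above can be modeled in a few lines of Python. This is only a sketch of the policy (move tagged mail to Junk, mark it read, purge Junk after 3 days); it does not use Outlook's rule engine or any Outlook API, and the Message class and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

SPAM_TAG = "***SPAM***"              # subject marker added by the proxy
JUNK_RETENTION = timedelta(days=3)   # the 3-day setting mentioned above


@dataclass
class Message:
    subject: str
    received: datetime
    folder: str = "Inbox"
    read: bool = False


def apply_junk_rule(msg):
    """Move a tagged Inbox message to Junk and mark it as read."""
    if msg.folder == "Inbox" and SPAM_TAG in msg.subject:
        msg.folder = "Junk"
        msg.read = True
    return msg


def purge_junk(messages, now):
    """Return the messages that survive: Junk items older than the
    retention period are dropped, everything else is kept."""
    return [
        m for m in messages
        if not (m.folder == "Junk" and now - m.received > JUNK_RETENTION)
    ]
```

The point of the delay is that a false positive sits in Junk for a few days, where the user can still find it, instead of vanishing immediately.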

If what you asked were possible (it is with other e-mail clients or
e-mail monitors that can use the TOP command) and what SpamAssassin
*thought* was spam got deleted from the user's mailbox, just how is that
user going to recover from false positives? What if they actually
*want* to get those e-mails that SpamAssassin has classified as spam,
like when they subscribed to a coupon mailing list? It's not like their Safe
Senders or other whitelists are usable in a SpamAssassin that is not
running on the user's own host where their e-mail client runs. What if
the user would actually like to participate in the anti-spam fight by
sending abuse reports to the sender's e-mail provider or their upstream
provider and getting blacklists updated, like SpamCop?

SpamAssassin as a proxy is the same as when I used SpamPal, which also
runs as a proxy. Both merely tag the SUSPECT e-mails (and Bayesian can
be added to SpamPal with an add-on). Neither of them can guarantee an
e-mail is spam or ham. It's a guessing scheme based on historical
weighting of some keywords. It is
up to the user via rules to decide what to do with those suspect
e-mails. It might help your mindset if you stopped calling it spam
and started calling them suspect e-mails. Not everything that
SpamAssassin, K9, SpamPal w/Bayes, SpamBayes, or other Bayes-oriented
filtering schemes mark as spam is actually spam, nor is what counts as spam
to you or another user necessarily considered spam by the actual recipient
of that e-mail. Leave the choice to the recipient of the e-mail. You are
getting too much in the way of delivering their e-mails and making
choices for them that they may not want or that can interfere with their
wanted e-mails.
 
lainy

I don't understand why you cannot delete emails prior to downloading by
highlighting them and then going to Edit and hitting Delete. No matter what I
do, when I highlight an email it automatically downloads. I even changed it
in Options to not automatically download when highlighting.

 
VanguardLH

lainy said:
I don't understand why you cannot delete emails prior to downloading by
highlighting it and then going to edit and hitting delete.

And just how do you see the email without it already being in Outlook? If
you can see it then it has been downloaded.

lainy said:
No matter what I do when I highlight the email it automatically downloads.
I even changed it in options to not automatically download when
highlighting?

If you see it in the messages pane, it's already been downloaded.
Highlighting doesn't do a mail poll. Highlighting doesn't download.
Highlighting selects the item (which will then display in the Preview pane
if you have it enabled). Highlighting is just, well, highlighting what is
already there.

You can configure Outlook to download headers only. However, that means
when you want to read an e-mail, you have to mark it and then manually
download the marked items. So, yes, you can get Outlook to download headers
only, but it is a real pain to then have to individually mark each
item whose body you want and then manually download the
marked items.
 
