Deleting junk email based on content

Charles Edwards

Hi, I'm getting plenty of spam. It frequently gives me little to build a
rule on that would filter it out for any action, but often enough it
contains hyperlinks.

Is it possible to create a rule that filters on part of a hyperlink?

E.g. I've had one linking to http://58e4.batbe.ru/ and I'd like to be able
to filter on the .ru, but I can't figure out how to do so: a rule that
deletes messages with *.ru in the message body doesn't work.
 
VanguardLH

It will probably only work if the e-mail is sent as plain text. If it is
HTML formatted, there are lots of ways of obfuscating the HTML code so
strings cannot be properly parsed. That is, what you see rendered is not
necessarily the same as the content in the code. HTML tags can be used to
slice up a string, portions of the string could be in different cells inside
a table, the string could be separated across divisions in the page, etc.
Besides, your filtering is very basic (and Microsoft doesn't support regular
expressions in its rules, which would give you more robust conditions for
testing strings). Checking for just ".ru" would also trigger on "that's a
bummer...rush to get a new valve". You don't get to specify what types of
strings or characters come before and after the string you are looking for.
With a regex like \bhttp(|s)://\S+\.ru\b you would be far more likely to
match only URLs that end with the .ru TLD (yeah, I figure you don't know
regex but decided to show an example anyway).
However, even regex cannot overcome the obfuscation possible in HTML
formatted e-mails. You'll be wasting a lot of your time trying to filter
spam based on the body of your e-mails. That's only feasible for legitimate
e-mails where something inside is consistent and the message is plain text
or uses nothing funny in its HTML, like weekly status reports you get at
work.
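
To make the difference concrete, here is a minimal Python sketch (Outlook
rules cannot run this; it only illustrates the matching logic). It compares
a plain substring test with the regex from the post. The function names and
the obfuscated HTML sample are made up for illustration.

```python
import re

# The regex from the post: "http" or "https", then non-space characters,
# then ".ru" followed by a word boundary.
RU_URL = re.compile(r"\bhttp(|s)://\S+\.ru\b", re.IGNORECASE)


def naive_match(body: str) -> bool:
    """Plain substring test, roughly what a 'body contains' rule does."""
    return ".ru" in body.lower()


def regex_match(body: str) -> bool:
    """True only if the body contains a URL-like string ending in .ru."""
    return RU_URL.search(body) is not None


plain_spam = "Cheap pills at http://58e4.batbe.ru/ today only"
innocent = "that's a bummer...rush to get a new valve"

# Hypothetical obfuscation: a character entity in the href and tags
# splitting the visible text. A mail client would still render the link
# text as http://58e4.batbe.ru/, but the raw source never contains ".ru".
obfuscated_html = (
    '<a href="http://58e4.batbe.&#114;u/">'
    "http://58e4.<b></b>batbe.<span>r</span>u/</a>"
)

if __name__ == "__main__":
    for label, body in [
        ("plain spam", plain_spam),
        ("innocent", innocent),
        ("obfuscated HTML", obfuscated_html),
    ]:
        print(label, naive_match(body), regex_match(body))
```

The substring test fires on the innocent sentence, the regex does not, and
neither one catches the obfuscated HTML source.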

Filtering spam based on its content is best left to a Bayesian filter (like
the Junk filter built into Outlook 2003 and later, but there are other
[better] Bayesian filters available in anti-spam products). Rather than
attempt to decipher the content of an e-mail, it is usually more accurate to
detect spam based on where it originated by using public DNSBLs
(blacklists), like those from Spamhaus or SpamCop. Outlook doesn't support
DNSBLs but many anti-spam products do (e.g., SpamPal). If you decide to get
a decent anti-spam product that uses DNSBLs to augment your e-mail client,
you might choose to participate in reporting spam, like at SpamCop, to help
update their blacklist. They also send an abuse report to the sender's
e-mail service, but don't rely on that having any effect. It's updating the
blacklist that is important. Your reports help others avoid the spam, and
their reports help you avoid it, because every report updates the
blacklists.
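
For the curious, a DNSBL check is just a DNS lookup: the sending server's
IPv4 octets are reversed, the blacklist's zone is appended, and an answer
means the address is listed. Below is a rough Python sketch of that idea.
The function names are invented for illustration, and treating any answer
as "listed" is a simplification, since real lists use specific return codes
and may restrict queries made through public resolvers.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError if not IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Simplified check: any A record for the query name counts as listed.

    This performs a live DNS lookup, so the result depends on the network
    and on the blacklist's current data.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No such name (or the lookup failed): treat as not listed.
        return False
```

For example, dnsbl_query_name("192.0.2.1", "zen.spamhaus.org") gives
"1.2.0.192.zen.spamhaus.org", and is_listed() would then ask DNS whether
that name exists. Products like SpamPal do this for you against the lists
you select.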

Did you enable the anti-spam filter on your e-mail account at the server?
Use the webmail interface to your e-mail account and look at its settings to
confirm that the e-mail provider's anti-spam filter is enabled (and how it
is configured). If they have a spam filter, it is very likely their webmail
client provides a "This is spam" button. If you configure your local e-mail
client to "leave messages on server" (only needed for POP, not for IMAP),
you can then use the webmail client to your account to mark such items as
spam. That will help update your e-mail provider's anti-spam filter.
 
