Rule to test on blank body misfires

Vanguard

Outlook 2002 SP-2

I have the following rule to check if the Subject header is blank:

Apply this rule after the message arrives
permanently delete it
except if the subject contains 'a' ... 'z' or '0' ... '9'
stop processing more rules

Microsoft has yet to provide regular expressions in rules, so the list of
characters above is shorthand. The actual rule is a bunch of ORs, one for
every alphabetic and numeric character. If the Subject header has no
alphanumeric characters, that message gets permanently deleted. If the sender
uses only other characters, like a Subject of "~!#$%^&*()_+", then I still
don't want it since it has an unintelligible Subject header.
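
For readers who want to see the intent of the rule outside Outlook, here is a minimal sketch in Python of the same test, assuming the goal is simply "no letter or digit anywhere in the text". The function name is_effectively_blank is made up for illustration, and this is not something the Rules Wizard can run.

```python
import re

# One character class stands in for the long chain of ORs in the rule.
ALNUM = re.compile(r"[A-Za-z0-9]")

def is_effectively_blank(text: str) -> bool:
    """Return True when the text has no ASCII letter or digit at all."""
    return ALNUM.search(text or "") is None

if __name__ == "__main__":
    for subject in ["", "~!#$%^&*()_+", "Re: lunch at 12"]:
        print(repr(subject), is_effectively_blank(subject))
```

A Subject made only of punctuation counts as blank here, which matches what the rule is meant to do.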

The above rule works as expected so I defined a similar rule to test when
the body of the message is blank. It is identical except it tests using
"except if the body contains" clause. However, this body-is-blank rule
misfires. Out of, say, a dozen messages in my Inbox, about 2 will get
triggered on and deleted. Those messages have LOTS of text in them so they
are *not* blank. When testing, I change the "permanently delete" to just
"delete" so they get moved into the Deleted Items folder from where I can
retrieve them to test again.

It is almost as though the "except if subject contains" clause checks on
characters but the "except if body contains" clause is checking on words.
In the mails on which the body-is-blank rule misfires, there is no use of
the word "a" or "I" or any other single-character word. There are lots of
words in the message but none that is a single character. So it is as if the
rule is looking for words that are single characters instead of seeing that
the specified characters are used within words.
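
To make the suspected difference concrete, here is a small Python sketch of the two matching styles. It only illustrates the guess described above and is not a statement of how Outlook actually evaluates the clause; both function names are hypothetical.

```python
import re
import string

# The characters the rule lists: 'a'..'z' and '0'..'9'.
CHARS = set(string.ascii_lowercase + string.digits)

def matches_as_characters(text: str) -> bool:
    """True if any listed character appears anywhere, even inside a word."""
    return any(ch in CHARS for ch in text.lower())

def matches_as_words(text: str) -> bool:
    """True only if a listed character stands alone as a whole word."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return any(len(tok) == 1 for tok in tokens)

if __name__ == "__main__":
    body = "Meeting moved to Tuesday, please confirm."
    print(matches_as_characters(body))  # True: plenty of letters
    print(matches_as_words(body))       # False: no one-letter word
```

Under the word-style test, a body full of text but without a stand-alone "a" or "I" would look blank to the exception, so the rule would delete it, which is the misfire seen here.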
 
Vanguard

Diane Poremsky said:
Are the messages it misfires on HTML formatted? Outlook doesn't do well on
scanning bodies of HTML formatted messages.

I recommend using Sue's simple rules
(http://www.slipstick.com/rules/junkmail.htm#sue) along with your blank
rules.


I checked and, yep, they are HTML formatted. Outlook doesn't show me the
real raw source of an e-mail but instead its "converted" format (unlike in
OE where I can see the raw format). So I cannot tell if the body is just
text within <HTML>...</HTML> tags or if there is an actual MIME section for
the HTML part. However, not all text is contained within the tags; i.e., most
of the text is outside of the tag delimiters (obviously all text that I can
see when viewing the e-mail is not within the tags themselves).

It is surprising that Outlook cannot scan the text that it displays. While
it can be forgiven that it doesn't check its rule clauses against text
within a tag, there is no reason why it cannot check the text not within a
tag. Definitely some shortsightedness here. After all, HTML is not some
compiled binary format. HTML is just plain text itself. You can even write
HTML in Notepad, edlin, or any text-only editor. Guess I cannot define a
rule to check on a blank body in Outlook.
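
As a rough illustration that the text outside the tags is easy to get at, here is a sketch using Python's standard html.parser module. It assumes the raw HTML body is available as a string, which Outlook's rules do not give you; the names VisibleText, visible_text and body_is_blank are invented for this example.

```python
import re
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect the text that sits between tags, ignoring the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)

def body_is_blank(html: str) -> bool:
    """True when the text outside the tags has no ASCII letter or digit."""
    return re.search(r"[A-Za-z0-9]", visible_text(html)) is None

if __name__ == "__main__":
    print(body_is_blank("<html><body><p>&nbsp;</p></body></html>"))
    print(body_is_blank("<html><body><p>See you at 5</p></body></html>"))
```

This simple version also counts the contents of script and style blocks as text, so a real filter would need to skip those.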

Thanks for the help.
 
