Email Address Validation

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi everyone,

I have a text box bound to a memo field. My end users enter email
distribution lists into it and eventually click a button that sends an email
to these addresses.

Is there a "email address validation" routine?

The problem is that sometimes apostrophes, quotes, or worse, carriage
returns, are getting into this field, probably from my users copying/pasting
the addresses into this field from God knows what. These weird characters are
causing pretty bizzare behavior in Exchange, and I have to get this settled
once and for all. The business situation is such that I can't limit them to a
drop down list. There are hundreds of possible addressess, and a typical
email goes to 5 to 20 of them in just about every combination imaginable. Of
course this is a pain to get straight, which is why my end users are making
shortcut lists of common entries in Word, or copying from old email, or
whatever. Hence the weird characters.

So far, my amateur VBA skills allowed me to figure out I can use the InStr()
function to look for Chr(13) after the field is updated, and then give a
MsgBox to tell the user to fix it. I am hoping there is a way to avoid doing
this for every ascii character that is not a letter, number or puncutation
mark, but I will do it if necessary. However, this method does not guarantee
my end users will actually fix anything - or more to the point, will
understand exactly what needs fixing. I need a bullet proof solution. About 1
in 500 is blowing up, and that's enough to cause whole project to boil over.

What do you think?
 
You can place a Validation Rule on the field in your table.

Something like this:
Is Null Or ((Like "*?@?*.?*") And (Not Like "*[ ,;]*"))

Is Null means the field is not required. (Important in older versions of
Access.)

The Like part gives at least 1 character before the @, another before the
dot, and another after it.

The illegal characters go inside the square brackets in the Not Like part.
 
Jay wrote:
The problem is that sometimes apostrophes, quotes, or worse, carriage
returns, are getting into this field, probably from my users copying/pasting
the addresses into this field from God knows what.

AFAIK apostrophes/single quotes are legal in an email address.

You need validation rules (CHECK constraints) in the database to ensure
only valid data gets in. Here some suggestions (I'll break my own rule
and post ANSI mode only wildcard characters <g>):

CREATE TABLE ClientContacts (Email VARCHAR(255) NOT NULL);

ALTER TABLE ClientContacts ADD CONSTRAINT
ClientContacts__Email__pattern
CHECK (Email LIKE '%[A-Z]%@%[A-Z]%.%[A-Z]%')

ALTER TABLE ClientContacts ADD CONSTRAINT
ClientContacts__Email__characters
CHECK (Email NOT LIKE '%[!0-9A-Z''.@_-]%')

ALTER TABLE ClientContacts ADD CONSTRAINT
ClientContacts__Email__one_commercial_at
CHECK (Email NOT LIKE '%@%@%')

ALTER TABLE ClientContacts ADD CONSTRAINT
ClientContacts__Email__mailbox_name_length
CHECK ((INSTR(1, Email, '@') - 1) BETWEEN 1 AND 127)

ALTER TABLE ClientContacts ADD CONSTRAINT
ClientContacts__Email__domain_name_length
CHECK (LEN(Email) - INSTR(1, Email, '@') BETWEEN 3 AND 127)
These weird characters are
causing pretty bizzare behavior in Exchange, and I have to get this settled
once and for all.

OK, so if the apostrophes/single quotes cause specific problems then
add further rules e.g.

ALTER TABLE ClientContacts ADD CONSTRAINT
ClientContacts__Email__ms_exchange_characters
CHECK (Email NOT LIKE '%''%')

In contrast, it could be that you need to allow greater flexibility
than the standard email validation allows e.g. profile names/aliases
are legal, you are obliged to persist contact details even if it is
known to be wrong, etc. In which case, you'll need to drop the above
constraints and go for some more general ones. Be warned it is usually
easier to test for the presence of only legal characters than it is to
test for any illegal character.

Jamie.

--
 
Hi Jay,

It strikes me that you could simplify the problem. Checking whether an
arbitrary string of characters is a well-formed email address is
horribly difficult (the rules seem to be at
http://www.faqs.org/rfcs/rfc2822.html ) and even when you've done it you
still don't know whether it's valid in the sense of being a route to the
person you want to reach.

Instead, you might consider by looking for a "@" and then working
outwards from there until you find something that can't be part of an
"ordinary" email address. By "ordinary" I mean something like this
(e-mail address removed)
rather than
"Joe and Helen Blow" <[email protected]>
or one of the other many variations.

That's a comparatively simple regex job, maybe something like this
(based on one in _Mastering Regular Expressions_) would be enough:

(\w[-.&\w]*@[-0-9a-z]+(?:\.[-0-9a-z]+)*\.[a-z]{2,6})
 
I am replying to you since the original message was not read by my
newsreader.

If Jay were to search the internet then the results will include samples of
the kind that Jay asked for. Just don't Google for this since they might
lock you out. It is bizarre to me that Google locked me out for this. The
following were among the results when I used Yahoo.

http://aspfaqs.com/aspfaqs/ShowFAQ.asp?FAQID=47
http://lampwriter.com/email


John Nurick said:
Hi Jay,

It strikes me that you could simplify the problem. Checking whether an
arbitrary string of characters is a well-formed email address is
horribly difficult (the rules seem to be at
http://www.faqs.org/rfcs/rfc2822.html ) and even when you've done it you
still don't know whether it's valid in the sense of being a route to the
person you want to reach.

Instead, you might consider by looking for a "@" and then working
outwards from there until you find something that can't be part of an
"ordinary" email address. By "ordinary" I mean something like this
(e-mail address removed)
rather than
"Joe and Helen Blow" <[email protected]>
or one of the other many variations.

That's a comparatively simple regex job, maybe something like this
(based on one in _Mastering Regular Expressions_) would be enough:

(\w[-.&\w]*@[-0-9a-z]+(?:\.[-0-9a-z]+)*\.[a-z]{2,6})



Hi everyone,

I have a text box bound to a memo field. My end users enter email
distribution lists into it and eventually click a button that sends an
email
to these addresses.

Is there a "email address validation" routine?

The problem is that sometimes apostrophes, quotes, or worse, carriage
returns, are getting into this field, probably from my users
copying/pasting
the addresses into this field from God knows what. These weird characters
are
causing pretty bizzare behavior in Exchange, and I have to get this
settled
once and for all. The business situation is such that I can't limit them
to a
drop down list. There are hundreds of possible addressess, and a typical
email goes to 5 to 20 of them in just about every combination imaginable.
Of
course this is a pain to get straight, which is why my end users are
making
shortcut lists of common entries in Word, or copying from old email, or
whatever. Hence the weird characters.

So far, my amateur VBA skills allowed me to figure out I can use the
InStr()
function to look for Chr(13) after the field is updated, and then give a
MsgBox to tell the user to fix it. I am hoping there is a way to avoid
doing
this for every ascii character that is not a letter, number or puncutation
mark, but I will do it if necessary. However, this method does not
guarantee
my end users will actually fix anything - or more to the point, will
understand exactly what needs fixing. I need a bullet proof solution.
About 1
in 500 is blowing up, and that's enough to cause whole project to boil
over.

What do you think?
 

I tried each of these on two valid email addresses: one of my own, and
one from my addressbook. Nothing fancy, just
(e-mail address removed)
where xxx is a a string of characters in the range a-z, yyy is another
such, and tld is a valid top level domain (e.g. "com", "se").

The "lampwriter" page said they were both invalid. The "aspfaqs" code
accepted one but rejected the other.

Ah well, one out of four ain't bad<g>.
 
John Nurick said:
I tried each of these on two valid email addresses: one of my own, and
one from my addressbook. Nothing fancy, just
(e-mail address removed)
where xxx is a a string of characters in the range a-z, yyy is another
such, and tld is a valid top level domain (e.g. "com", "se").

The "lampwriter" page said they were both invalid. The "aspfaqs" code
accepted one but rejected the other.

Ah well, one out of four ain't bad<g>.


I don't understand. Were they truly valid email addresses? When you say "xxx
is a a string of characters in the range a-z, yyy is another such" you imply
that the characters were relatively random; that the rejected addresses were
not addresses that email would be accepted for.

The aspfaqs.com sample is sample code. It would be interesting to determine
why it rejected one.

The important issue is that there are many samples available for searching.
I did not look as thoroughly as I would if I needed an answer.
 
I don't understand. Were they truly valid email addresses? When you say "xxx
is a a string of characters in the range a-z, yyy is another such" you imply
that the characters were relatively random; that the rejected addresses were
not addresses that email would be accepted for.

Not at all. They were (and are) genuine working email addresses. I've
received a dozen messages on one of them overnight. I'm not going to
post them here, for obvious reasons.
The aspfaqs.com sample is sample code. It would be interesting to determine
why it rejected one.

No problems: the sample code rejects the new TLDs. The other code you
cited uses lookup techniques that probably worked ten years ago but are
routinely blocked for security reasons by modern email servers.
The important issue is that there are many samples available for searching.
I did not look as thoroughly as I would if I needed an answer.

Fair enough. But it's a safe bet that all the other samples out there
are equally ineffective when it comes to validating email addresses in
the only sense that really matters - namely discovering whether they
reach the person you want to reach.
 
John Nurick said:
Fair enough. But it's a safe bet that all the other samples out there
are equally ineffective when it comes to validating email addresses in
the only sense that really matters - namely discovering whether they
reach the person you want to reach.

When I reported to Microsoft that there is a bug in a sample program, they
just said that it is a sample program. In other words, it is up to us to fix
the bug.

A sample is a sample. People usually don't complain when a sample needs to
be fixed; they fix them themselves.

The main part of the aspfaqs.com sample is a regex. Was it not you that
suggested a regex?
 
The main part of the aspfaqs.com sample is a regex. Was it not you that
suggested a regex?

Up to a point. I suggested a regex as a possible solution to the OP's
problem, not as a way of actually vaidating email addresses. (Wiser
heads than mine have concluded that it's not possible for a regex to do
this, even in the weak sense of testing for RFC822 compliance as against
making sure that there is a mailbox with that address.)

But at least the regex I posted in this thread does "accept" the valid
address that the other two rejected<g>.
 
Avoid typing the same text again and again
Stop wasting your time on mouse movements
Open favorite web pages with a single hotkey press
Record keystrokes and play them back with a single hotkey press
 
Back
Top