Like operator with accented character

John · Jan 22, 2006

Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude all
characters listed but the query also excludes accented characters? Is there
a way that accented characters are not included in this filter?

Thank again.

Regards

Tom Ellison · Jan 22, 2006

Dear John:

First, I suggest you look at the Character Map utility (Start / Programs /
Accessories / System Tools / Character Map). There you can identify and
insert the desired characters into the expression. If the characters you
want are consecutive, you can insert a range of them using the hypenated
syntax.

I don't know if it's significant, but it bothers me how you open your set.
The "!-!" seems at least ambiguous. The initial ! is normally interpreted
as "NOT". A non-escaped hyphen is normally a range designator. But there
is no antecedent character to define the range, so it can't be a range. If
you mean it to be a literal hyphen, you should escape it.

Just for the sake of clarity, I suggest you put the ranges first:

[!0-9A-Z

and follow that with all the single characters you want. Don't forget to
escape any special characters.

Tom Ellison

John Vinson · Jan 22, 2006

Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude all
characters listed but the query also excludes accented characters? Is there
a way that accented characters are not included in this filter?

Yes. The ASCII value for accented characters is is different than the
non-accented characters: é is 233, e is 65.

I'm not certain how the LIKE operator handles upper vs. lower case (e
and E are different ASCII values as well), but try

Like "*[!-!0-9A-ZÀ-Ý &,.@/'():+]*"

You may want to use some VBA code to include the characters you want
to include, rather than excluding the thousands of possible Unicode
characters that might get typed by imaginative users. Šå?ï?fæç?ò?ÿ?

John W. Vinson[MVP]

Tom Ellison · Jan 22, 2006

Dear John:

Another idea might be to use the same keyboard map and font that your users
do in typing these characters. Then you'd be sure to type them so they
match what has been entered.

Tom Ellison

Tom Ellison said:
Dear John:

First, I suggest you look at the Character Map utility (Start / Programs /
Accessories / System Tools / Character Map). There you can identify and
insert the desired characters into the expression. If the characters you
want are consecutive, you can insert a range of them using the hypenated
syntax.

I don't know if it's significant, but it bothers me how you open your set.
The "!-!" seems at least ambiguous. The initial ! is normally interpreted
as "NOT". A non-escaped hyphen is normally a range designator. But there
is no antecedent character to define the range, so it can't be a range.
If you mean it to be a literal hyphen, you should escape it.

Just for the sake of clarity, I suggest you put the ranges first:

[!0-9A-Z

and follow that with all the single characters you want. Don't forget to
escape any special characters.

Tom Ellison

John said:

Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude
all characters listed but the query also excludes accented characters? Is
there a way that accented characters are not included in this filter?

Thank again.

Regards

Click to expand...

John · Jan 23, 2006

Hi John

The accented characters are being included in this filter without being
there. My problem is why they are being treated as being included in the
filter while I have not included them in this filter.

Thanks

Regards

John Vinson said:
Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude
all
characters listed but the query also excludes accented characters? Is
there
a way that accented characters are not included in this filter?

Click to expand...

Yes. The ASCII value for accented characters is is different than the
non-accented characters: é is 233, e is 65.

I'm not certain how the LIKE operator handles upper vs. lower case (e
and E are different ASCII values as well), but try

Like "*[!-!0-9A-ZÀ-Ý &,.@/'():+]*"

You may want to use some VBA code to include the characters you want
to include, rather than excluding the thousands of possible Unicode
characters that might get typed by imaginative users. Så?ï?fæç?ò?ÿ?

John W. Vinson[MVP]

John Nurick · Jan 23, 2006

Tom,

I was the one who recommended [!-... after reading the Help topic and
testing. ! in the first position negates the match, so the - is then the
first character in the class and as such matches itself and doesn't try
to be a range operator.

Also both the Help topic and experimentation suggest that the Like
operator in VBA doesn't have an escape mechanism in character classes.
What do you have in mind? Is this one of the superior features of
MSDE/SQLS?

John said:
Dear John:

First, I suggest you look at the Character Map utility (Start / Programs /
Accessories / System Tools / Character Map). There you can identify and
insert the desired characters into the expression. If the characters you
want are consecutive, you can insert a range of them using the hypenated
syntax.

I don't know if it's significant, but it bothers me how you open your set.
The "!-!" seems at least ambiguous. The initial ! is normally interpreted
as "NOT". A non-escaped hyphen is normally a range designator. But there
is no antecedent character to define the range, so it can't be a range. If
you mean it to be a literal hyphen, you should escape it.

Just for the sake of clarity, I suggest you put the ranges first:

[!0-9A-Z

and follow that with all the single characters you want. Don't forget to
escape any special characters.

Tom Ellison

John said:

Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude
all characters listed but the query also excludes accented characters? Is
there a way that accented characters are not included in this filter?

Thank again.

Regards

Click to expand...

Tom Ellison · Jan 23, 2006

Dear John:

Ouch! I didn't mean to step on your recommendation. I don't see that in
this thread. Was there a previous thread? Or has my OE just managed to
break up another thread without letting me know?

Sorry, but using the hyphen this way (since it's one of the special
characters in defining sets of characters inside the square brackets) just
looks wierd to me. So, the hyphen can be literal or used to define a
"range" of character values, without really clear context.

Yes, MSDE does use an escape character to remove any ambiguity. I believe
it would be required in all cases.

So, !-! doesn't mean from ! to !. I didn't expect it would. It just looks
like it does.

Well, if it is working except for the accented characters, that much is
good.

Now John (the OP) has written:

The accented characters are being included in this filter without being
there. My problem is why they are being treated as being included in the
filter while I have not included them in this filter.

So, I believe he is saying that a letter with an accent is being treated as
that letter without an accent. Is that right, John?

It makes sense that it would. That's actually the way I would expect it to
function. For sorting and searching functions, I beleive these characters
are being mapped. You may need some other kind of test if you need
something that deep. Probably you'll be writing a custom function for it.

Tom Ellison

John Nurick said:
Tom,

I was the one who recommended [!-... after reading the Help topic and
testing. ! in the first position negates the match, so the - is then the
first character in the class and as such matches itself and doesn't try
to be a range operator.

Also both the Help topic and experimentation suggest that the Like
operator in VBA doesn't have an escape mechanism in character classes.
What do you have in mind? Is this one of the superior features of
MSDE/SQLS?

Dear John:

First, I suggest you look at the Character Map utility (Start / Programs /
Accessories / System Tools / Character Map). There you can identify and
insert the desired characters into the expression. If the characters you
want are consecutive, you can insert a range of them using the hypenated
syntax.

I don't know if it's significant, but it bothers me how you open your set.
The "!-!" seems at least ambiguous. The initial ! is normally interpreted
as "NOT". A non-escaped hyphen is normally a range designator. But there
is no antecedent character to define the range, so it can't be a range.
If
you mean it to be a literal hyphen, you should escape it.

Just for the sake of clarity, I suggest you put the ranges first:

[!0-9A-Z

and follow that with all the single characters you want. Don't forget to
escape any special characters.

Tom Ellison

John said:

Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude
all characters listed but the query also excludes accented characters?
Is
there a way that accented characters are not included in this filter?

Thank again.

Regards

Click to expand...

Click to expand...

John Nurick · Jan 23, 2006

Hi Tom,

Ouch! I didn't mean to step on your recommendation.

I didn't think of it that way, just wanted to take my share of the

flak said:
I don't see that in
this thread. Was there a previous thread? Or has my OE just managed to
break up another thread without letting me know?

Another thread, subject "Selection by non-alphanumeric characters".

Sorry, but using the hyphen this way (since it's one of the special
characters in defining sets of characters inside the square brackets) just
looks wierd to me. So, the hyphen can be literal or used to define a
"range" of character values, without really clear context.

The logic seems clear to me: if it's the first or last character in the
character class it can't be defining a range so it must be itself.

Yes, MSDE does use an escape character to remove any ambiguity. I believe
it would be required in all cases.

I'm coming at this from the regex end, where escapes generally don't
work in character classes.

So, !-! doesn't mean from ! to !. I didn't expect it would. It just looks
like it does.

Well, if it is working except for the accented characters, that much is
good.

Now John (the OP) has written:

The accented characters are being included in this filter without being
there. My problem is why they are being treated as being included in the
filter while I have not included them in this filter.

So, I believe he is saying that a letter with an accent is being treated as
that letter without an accent. Is that right, John?

As I understand it, it's the other way round. VBA's Like operator not
only treats accented characters as different from their unaccented
relatives, but also sorts them according to their position in the
character set, not in the alphabet. So the class [A-Z] does not include
the character Á (A-acute).

According to Help, if you set Option Compare Text, Like then uses the
collation order from the current locale, which means (e.g.) that if
you're French Á goes between A and B. But it doesn't seem to do any good
in English-speaking locales.

It makes sense that it would. That's actually the way I would expect it to
function.

If only!

John Nurick said:
For sorting and searching functions, I beleive these characters
are being mapped. You may need some other kind of test if you need
something that deep. Probably you'll be writing a custom function for it.

Tom Ellison

John Nurick said:

Tom,

I was the one who recommended [!-... after reading the Help topic and
testing. ! in the first position negates the match, so the - is then the
first character in the class and as such matches itself and doesn't try
to be a range operator.

Also both the Help topic and experimentation suggest that the Like
operator in VBA doesn't have an escape mechanism in character classes.
What do you have in mind? Is this one of the superior features of
MSDE/SQLS?

Dear John:

First, I suggest you look at the Character Map utility (Start / Programs /
Accessories / System Tools / Character Map). There you can identify and
insert the desired characters into the expression. If the characters you
want are consecutive, you can insert a range of them using the hypenated
syntax.

I don't know if it's significant, but it bothers me how you open your set.
The "!-!" seems at least ambiguous. The initial ! is normally interpreted
as "NOT". A non-escaped hyphen is normally a range designator. But there
is no antecedent character to define the range, so it can't be a range.
If
you mean it to be a literal hyphen, you should escape it.

Just for the sake of clarity, I suggest you put the ranges first:

[!0-9A-Z

and follow that with all the single characters you want. Don't forget to
escape any special characters.

Tom Ellison

Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude
all characters listed but the query also excludes accented characters?
Is
there a way that accented characters are not included in this filter?

Thank again.

Regards

Click to expand...

Click to expand...

John · Jan 24, 2006

John/Tom

In like [!A-Z] does exclude for example é (e with accent) but not in
[!A-DF-Z]. I don't know how to tell access to treat é differently from e or
E in the query. Any ideas?

Thanks

Regards

John Vinson said:
Hi

I am using Like "*[!-!0-9A-Z &,.@/'():+]*" in my select query to exclude
all
characters listed but the query also excludes accented characters? Is
there
a way that accented characters are not included in this filter?

Click to expand...

Yes. The ASCII value for accented characters is is different than the
non-accented characters: é is 233, e is 65.

I'm not certain how the LIKE operator handles upper vs. lower case (e
and E are different ASCII values as well), but try

Like "*[!-!0-9A-ZÀ-Ý &,.@/'():+]*"

You may want to use some VBA code to include the characters you want
to include, rather than excluding the thousands of possible Unicode
characters that might get typed by imaginative users. Så?ï?fæç?ò?ÿ?

John W. Vinson[MVP]

John Nurick · Jan 24, 2006

John/John/Tom:

In like [!A-Z] does exclude for example é (e with accent) but not in
[!A-DF-Z]. I don't know how to tell access to treat é differently from e or
E in the query. Any ideas?

On my system (UK regional settings), Like *does* treats accented
characters differently from the unaccented equivalents, e.g.:

?"a" like "[!A-Z]"
False
?"á" like "[!A-Z]"
True
?"E" like "[!A-Z]"
False
?"É" like "[!A-Z]"
True
?"É" like "E"
False

So I think we must be at cross purposes, or else have different regional
settings (see what I've posted about Option Compare Text and Locale
elsewhere in this thread and your other one).

With English-speaking regional settings, there seems no way of getting
Like to match accented characters with their unaccented counterparts.
The VBScript regular expression engine, which is often an alternative,
doesn't help here (though the more modern engines in Perl and .NET take
it in their stride).

So you'll have to provide Like with either a character class that
specifies the acceptable characters, or a negated (i.e. [!...]) class
that specifies the unacceptable ones, using multiple ranges within the
class, e.g this, which i think copes with most western European
languages.
[!0-9A-ZÀ-ÏÑ-Ýß]
But you'll need to test using your fonts/character sets and data. I
think someone has already recommended charmap.exe.

Michel Walsh · Jan 24, 2006

Hi,

Depends of who evaluates the expression:

? eval(" 'à' LIKE '[!A-Z]' ")
0

? "à" LIKE "[!A-Z]"
True

Vanderghast, Access MVP

John Nurick said:
John/John/Tom:

In like [!A-Z] does exclude for example é (e with accent) but not in
[!A-DF-Z]. I don't know how to tell access to treat é differently from e
or
E in the query. Any ideas?

Click to expand...

On my system (UK regional settings), Like *does* treats accented
characters differently from the unaccented equivalents, e.g.:

?"a" like "[!A-Z]"
False
?"á" like "[!A-Z]"
True
?"E" like "[!A-Z]"
False
?"É" like "[!A-Z]"
True
?"É" like "E"
False

So I think we must be at cross purposes, or else have different regional
settings (see what I've posted about Option Compare Text and Locale
elsewhere in this thread and your other one).

With English-speaking regional settings, there seems no way of getting
Like to match accented characters with their unaccented counterparts.
The VBScript regular expression engine, which is often an alternative,
doesn't help here (though the more modern engines in Perl and .NET take
it in their stride).

So you'll have to provide Like with either a character class that
specifies the acceptable characters, or a negated (i.e. [!...]) class
that specifies the unacceptable ones, using multiple ranges within the
class, e.g this, which i think copes with most western European
languages.
[!0-9A-ZÀ-ÏÑ-Ýß]
But you'll need to test using your fonts/character sets and data. I
think someone has already recommended charmap.exe.

John Spencer · Jan 24, 2006

VBA has a LIKE operator and SQL has a Like operator. They are similar but
do behave differently.

Create a module with a function "LIKE" the following.

Option Compare Binary 'Note this is needed to force Binary comparison.
Option Explicit

Public Function fAlike(StrA, StrLike)
'John Spencer
'Binary sensitive like comparison.
'Not fully tested
'If Nulls are involved want to return NULL vice true or false
'so this is consistent with other functions

If IsNull(StrA) Or IsNull(StrLike) Then
fAlike = Null
Else
fAlike = StrA Like StrLike
End If

End Function

In SQL (query)
"AAbc" Like "a*" is True

Using the above VBA function
fAlike("AAbc","a*) will return false.

So the OP needs to create this function (or one similar) in a module and
call it in his query.

fAlike([SomeField], "*[!-!0-9A-Z a-z&,.@/'():+]*" )

Michel Walsh said:
Hi,

Depends of who evaluates the expression:

? eval(" 'à' LIKE '[!A-Z]' ")
0

? "à" LIKE "[!A-Z]"
True

Vanderghast, Access MVP

John Nurick said:

John/John/Tom:

In like [!A-Z] does exclude for example é (e with accent) but not in
[!A-DF-Z]. I don't know how to tell access to treat é differently from e
or
E in the query. Any ideas?

Click to expand...

On my system (UK regional settings), Like *does* treats accented
characters differently from the unaccented equivalents, e.g.:

?"a" like "[!A-Z]"
False
?"á" like "[!A-Z]"
True
?"E" like "[!A-Z]"
False
?"É" like "[!A-Z]"
True
?"É" like "E"
False

So I think we must be at cross purposes, or else have different regional
settings (see what I've posted about Option Compare Text and Locale
elsewhere in this thread and your other one).

With English-speaking regional settings, there seems no way of getting
Like to match accented characters with their unaccented counterparts.
The VBScript regular expression engine, which is often an alternative,
doesn't help here (though the more modern engines in Perl and .NET take
it in their stride).

So you'll have to provide Like with either a character class that
specifies the acceptable characters, or a negated (i.e. [!...]) class
that specifies the unacceptable ones, using multiple ranges within the
class, e.g this, which i think copes with most western European
languages.
[!0-9A-ZÀ-ÏÑ-Ýß]
But you'll need to test using your fonts/character sets and data. I
think someone has already recommended charmap.exe.

Click to expand...

John Nurick · Jan 24, 2006

Depends of who evaluates the expression:

? eval(" 'à' LIKE '[!A-Z]' ")
0

? "à" LIKE "[!A-Z]"
True

Thanks for that, Michel. I'd never got that far into the Eval()
documentation.

Michel Walsh · Jan 25, 2006

Hi,

It is not because of eval(), by itself, own importance, but because it is an
open door to what Jet-SQL would understand. A typical example, with iif,
where Jet sees it as a statement and executes only the required parts; in
VBA, iif is a standard function and thus evaluates each argument BEFORE
running the body of the function.

? eval("iif(-1, 1, 1/0)")
1
? iif(-1, 1, 1/0)

? eval("iif(0, 1, 1/0)")

The first sentence works fine, the second and third sentences produce the
same error, since they both need to evaluate 1/0; while the first sentence
DOES NOT. The use of EVAL() shows what Jet-SQL would do, kind of.

Vanderghast, Access MVP

John Nurick said:
Depends of who evaluates the expression:

? eval(" 'à' LIKE '[!A-Z]' ")
0

? "à" LIKE "[!A-Z]"
True

Click to expand...

Thanks for that, Michel. I'd never got that far into the Eval()
documentation.

UTF-8 and diacritics combining characters	5	Dec 19, 2008
TXT Files with French accented characters	2	Oct 19, 2007
WHERE's against words with accents	5	Jul 2, 2009
Accent insensitive	6	Nov 16, 2009
characters missing	5	Oct 27, 2008
Accented characters	1	Sep 27, 2007
Un-accent characters	1	Nov 16, 2004
Weird: Union query turns number into character	2	Feb 3, 2004

Like operator with accented character

John

Tom Ellison

John Vinson

Tom Ellison

John

John Nurick

Tom Ellison

John Nurick

John

John Nurick

Michel Walsh

John Spencer

John Nurick

Michel Walsh

Ask a Question

Similar Threads