String.Contains case insensitive?

mp

Is there a way to make String.Contains do a case-insensitive comparison,
like an equivalent of VB6's Option Compare Text in a code module?
I've read the help on the overload of .Contains that takes an
IEqualityComparer object, but I don't understand how to make that work.
From the help:
<quote
This interface allows the implementation of customized equality comparison
for collections. That is, you can create your own definition of equality for
type T, and specify that this definition be used with a collection type that
accepts the IEqualityComparer<T> generic interface. In the .NET
Framework, constructors of the Dictionary<TKey, TValue> generic
collection type accept this interface.

end quote>

But I'm comparing a string and a substring rather than a collection, so I
don't know how to translate that to my case.

For example, in the following snippet, how can I eliminate the else if by
making the comparison case insensitive,
particularly if the string is not hard-coded but comes from a run-time
variable?

string line;
//populate line variable
//look for substring
if (line.Contains("defun"))
{
    listBox1.Items.Add(line);
}
else if (line.Contains("Defun"))
{
    listBox1.Items.Add(line);
}
 
Sreenivas

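Go for regular expressions:

using System.Text.RegularExpressions;

// Regex.Escape makes stringToMatch match as literal text rather than as a pattern
if (Regex.IsMatch(inputStringObject, Regex.Escape(stringToMatch), RegexOptions.IgnoreCase))
{
    // match found
}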
 
Sreenivas

I think regular expressions are overkill in this case.
Do it this way:
// the search text must itself be lower case
if (line.ToLower().Contains("defun"))
{
    listBox1.Items.Add(line);
}
 
Peter Duniho

Is there a way to make String.Contains do a case insensitive comparison?
like an equivalent of vb6 Option CompareText in a code module?

It is odd that Contains() doesn't include an overload that takes a
StringComparison value. It's so odd, we only a month ago had this same
discussion:
http://groups.google.com/group/micr...read/thread/ebc4258ed2c4270f/c48b6c15df01e8bf
I've read the help on the overload of .Contains that uses a
IEqualityComparer object but don't understand how to make that work
from the help:

Careful there. That's not an overload. It's an extension method. The
String class has only one Contains() method. But, String implements
IEnumerable<Char>, and the Enumerable class has a Contains<T>() extension
method that can be used with an IEnumerable<Char>.

As for the IEqualityComparer<T> interface:

but i'm comparing a string and a substring rather than a collection so don't
know enough how to translate that to

Actually, if you use the Enumerable.Contains() method, you are comparing a
collection, even if you didn't realize it. :) But, this particular
method isn't useful as a replacement for String.Contains(), because it can
only find a single element in the collection, and a single element is just
one character.

Even if Enumerable contained a method that allowed you to look for a
subsequence in a given sequence, there would be other issues related to
implementing the IEqualityComparer<char> interface. But those issues are
moot, since as far as I can tell from your question, you are looking to
search for multi-character strings within other strings.
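
To make the single-element limitation concrete, here is a minimal sketch. The CaseInsensitiveCharComparer class is hypothetical (it is not part of the framework), and it compares characters by invariant upper-casing purely for illustration, which is itself one of the simplifications the thread argues about. It shows that Enumerable.Contains can look for one char in a string, but has no way to look for a substring.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer, written only for this illustration.
public sealed class CaseInsensitiveCharComparer : IEqualityComparer<char>
{
    public bool Equals(char x, char y)
    {
        // Simplistic per-character rule: compare invariant upper-case forms.
        return char.ToUpperInvariant(x) == char.ToUpperInvariant(y);
    }

    public int GetHashCode(char c)
    {
        return char.ToUpperInvariant(c).GetHashCode();
    }
}

public static class EnumerableContainsDemo
{
    public static void Main()
    {
        string line = "(Defun foo ())";

        // A string is an IEnumerable<char>, so the element being searched
        // for is a single char; there is no way to pass "defun" here.
        bool hasD = Enumerable.Contains(line, 'd', new CaseInsensitiveCharComparer());
        Console.WriteLine(hasD);
    }
}
```

The call is written as Enumerable.Contains(...) rather than line.Contains(...) only to make it obvious which method is being used.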

Anyway, hopefully the previous message thread we had last month will be
useful to you.

Pete
 
Peter Duniho

I think regular expressions are overkill in this case.
Do it this way:
if (line.ToLower().Contains("defun"))
{
    listBox1.Items.Add(line);
}

Regex might be a little bit of overkill as compared to the most efficient
alternatives, but simply changing the case is a much poorer solution.
Please refer to the previous discussion (I provided a link in my other
reply to the OP) for details on why.

I would definitely use Regex before I would be willing to simply convert
the case.

Pete
 
Morten Wennevik [C# MVP]

Hi,

It's very easy, actually

// This Contains overload is the LINQ extension method, so it needs: using System.Linq;
List<string> list = new List<string> { "hello" };
bool b = list.Contains("Hello", StringComparer.CurrentCultureIgnoreCase);
 
Morten Wennevik [C# MVP]

Oh, my bad,

I thought you said list.Contains()

String.Contains is almost as easy, although you are forced to write your own
extension, as the provided comparer can only do a single char comparison.

Writing your own extensions is easy, though:

static class MyExtensions
{
    public static bool ContainsIgnoreCase(this string source, string target)
    {
        return source.ToUpper().Contains(target.ToUpper());
    }
}

...

string s = "Hello";
bool b = s.ContainsIgnoreCase("ELLO");

--
Happy Coding!
Morten Wennevik [C# MVP]


 
Peter Duniho

[...]
static class MyExtensions
{
    public static bool ContainsIgnoreCase(this string source, string target)
    {
        return source.ToUpper().Contains(target.ToUpper());
    }
}

I reiterate, one more time: do not use case conversions as a way of
implementing case-insensitive comparisons. It's unreliable as a
general-purpose solution.
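
One way to see how culture affects the case-conversion approach is the sketch below. It is only an illustration under stated assumptions: the class and method names are made up for this example, the "tr-TR" culture must be available on the runtime, and it may not be the exact failure the earlier thread documented. It passes the culture explicitly, whereas the parameterless ToUpper() in the code above picks up whatever the current culture happens to be.

```csharp
using System;
using System.Globalization;

public static class CasingDemo
{
    // Case-conversion approach, with the culture made explicit.
    public static bool ContainsViaToUpper(string source, string target, CultureInfo culture)
    {
        return source.ToUpper(culture).Contains(target.ToUpper(culture));
    }

    // Comparison-based approach: no converted copies, no culture involved.
    public static bool ContainsViaIndexOf(string source, string target)
    {
        return source.IndexOf(target, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static void Main()
    {
        // Assumption: the runtime has Turkish culture data installed.
        CultureInfo turkish = new CultureInfo("tr-TR");

        // Under Turkish rules 'i' upper-cases to dotted 'İ' (U+0130), so on a
        // runtime with full culture data the upper-cased target is expected
        // not to be found inside "TITLE".
        Console.WriteLine(ContainsViaToUpper("TITLE", "title", turkish));

        // The same strings under the invariant culture, then an ordinal
        // case-insensitive search that ignores culture altogether.
        Console.WriteLine(ContainsViaToUpper("TITLE", "title", CultureInfo.InvariantCulture));
        Console.WriteLine(ContainsViaIndexOf("TITLE", "title"));
    }
}
```

The point of the sketch is that the first approach can give different answers for the same two strings depending on the culture in effect, while the ordinal comparison does not.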
 
Morten Wennevik [C# MVP]

I disagree.

The articles you refer to claim that the case rules in .NET break for Turkish. As
far as I can tell, my code works as expected, even for Turkish.

The letter 'i' is equal to lower case 'I' in English culture, but not in
Turkish culture.
If a Turkish word containing 'i' is entered into an application running under
English culture, the application should follow English casing rules. If the
application is meant to work in Turkey, it should use the
Turkish culture. If the application needs to serve mixed cultures, case
comparison between cultures is a bad idea no matter how you do it.

If there are cultures where ToUpper or ToLower gives wrong results, that would
be a bug and should be fixed.
--
Happy Coding!
Morten Wennevik [C# MVP]


 
mp

Is there a way to make String.Contains do a case insensitive comparison?
I think regular expressions are overkill in this case.
Do it this way:
if (line.ToLower().Contains("defun"))
{
    listBox1.Items.Add(line);
}


that's a good idea, thanks
mark
 
Peter Duniho

I disagree.

You can disagree all you like. It doesn't undo the documented problem.

It is true that if you can limit the input to ASCII, or some particular
culture, it is possible to get away with using case-conversion. But .NET
operates on Unicode strings, and provides no concise way to represent
those limitations. Furthermore, when you are posting to a public
newsgroup, you have no way to guarantee that code that uses
case-conversion will be used in this limited way even by convention.

It is nearly as easy to do the comparison _correctly_ as it is to do it
incorrectly. It makes no sense to prefer to write the code to do it
incorrectly.

Pete
 
Harlan Messinger

Peter said:
You can disagree all you like. It doesn't undo the documented problem.

It's the same problem no matter how you factor it. In the other thread
you wrote that "It's the fact that no matter what the culture, simply
changing the case of the string and comparing will not necessarily
produce the same results as doing an actual "case insensitive"
comparison for that culture." I don't see that a "case insensitive
comparison" between a target character and a character that has been
read in is different from comparing the target character to both the
read-in character and its alternate-case counterpart.

If you've got input that's permitted to be fuzzy on case, allowing "X"
as well as "x" and "N" instead of "n", then your code is going to have
to be able to figure out that "PORTRAIT" and "pOrTrAiT" and "porTRaIT"
are all the same as "portrait". Whatever comparison you do is going to
incorporate some assumption as to which characters correspond in an
upper- and lower-case relationship, whether you have
s1.ToLower().Contains(s2.ToLower()) or you use a case-insensitive
regular expression or you hard-code a letter-by-letter comparison. You
can't handle every case: your Turk could type "portrait" or "portraıt"
or "PORTRAIT" or "PORTRAİT" into the configuration file. So you have to
make an assumption, and make it explicit what that assumption is, and
then feel free to code against what that assumption is and treat
anything else as an error condition.

Or you can specify that configuration data and user input are going to
be treated as case sensitive and then not worry about case at all.
 
Sreenivas

Regex might be a little bit of overkill as compared to the most efficient 
alternatives, but simply changing the case is a much poorer solution.  
Please refer to the previous discussion (I provided a link in my other  
reply to the OP) for details on why.

I would definitely use Regex before I would be willing to simply convert  
the case.
Pete

Pete, I have a question. Does RegexOptions.IgnoreCase (in the code
snippet I posted) guarantee correct comparison of all Unicode characters?
In other words, does my code snippet work for all Unicode
characters?
Thanks,
Sreenivas Reddy Thatiparthy.
 
Peter Duniho

Pete , i have a question. Does RegexOptions.IgnoreCase( in the code
snippet i posted ) guarantee the comparison of all Unicode characters?
What i am asking is , is my code snippet works for all unicode
characters?

It had better. The Regex class is defined to operate on .NET strings,
which are Unicode. If you find it doesn't, you should report that as a
bug.
 
Morten Wennevik [C# MVP]

Yes, there is documentation that says this is a problem, and I agree that for
many it really is a problem, but it is more a problem of ignorance than of
the programming language or of how Unicode works. If you rely on learning
"safe ways" to deal with multiple cultures, you will be none the wiser when you
stumble upon a new scenario. Instead, be aware that cultures are different.
Language rules differ. Sorting rules differ. Number styles differ. A comma is
not always a thousands separator. A dot is not always a decimal separator, etc.
If you try to sort "bbb" "aaa" "ccc" alphabetically, you won't even get the same
sort order in every culture. If you expect cross-culture input, you need to be
aware of this and not do cross-culture comparison; so many rules differ that
you will have to stick to a single culture.
 
D A

Another option is to use IndexOf. It has an overload that takes a StringComparison argument, which determines whether the comparison is case sensitive or not...
2 cents...
Dave
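
A minimal sketch of the IndexOf suggestion, applied to the original question. The StringSearch class and the sample strings are made up for this example, and StringComparison.OrdinalIgnoreCase is an assumption about what the caller wants: it ignores culture entirely, whereas StringComparison.CurrentCultureIgnoreCase would follow the user's culture.

```csharp
using System;

public static class StringSearch
{
    // Hypothetical helper: case-insensitive substring test built on
    // String.IndexOf(string, StringComparison), with no case conversion.
    public static bool ContainsIgnoreCase(this string source, string target)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (target == null) throw new ArgumentNullException("target");

        // IndexOf returns -1 when target does not occur in source.
        return source.IndexOf(target, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}

public static class Program
{
    public static void Main()
    {
        string line = "(Defun c:hello ()";   // sample data, made up
        string keyword = "defun";            // could equally come from a run-time variable

        // One test replaces the if / else if pair from the question.
        if (line.ContainsIgnoreCase(keyword))
        {
            Console.WriteLine(line);
        }
    }
}
```

In the original snippet, the body of the if would be listBox1.Items.Add(line) rather than Console.WriteLine.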
[... quoted earlier posts trimmed: mp's original question; Peter Duniho, Wednesday, September 23, 2009 1:36 AM; Morten Wennevik [C# MVP], Wednesday, September 23, 2009 2:04 AM, 2:17 AM and 3:52 AM, and Friday, September 25, 2009 3:58 AM ...]
 
mp

D A said:
Another option is to use IndexOf. Allows for argument to determine whether
comparison is case sensitive or not...
2 cents...
Dave

thanks for the thoughts
and coincidentally now 14 months after that original post, i'm back looking
at similar issues.
mark
 
