Subtle String Question

Mike Labosh

In T-SQL, consider this table:

CREATE TABLE stringTest (
    String1 VARCHAR(5),
    String2 VARCHAR(5)
)
GO
INSERT stringTest VALUES ('', 'A')
INSERT stringTest VALUES ('A', '')
GO
SELECT
    CASE
        WHEN String1 > String2
            THEN 'String1 Is Greater'
        WHEN String2 > String1
            THEN 'String2 Is Greater'
    END
FROM stringTest
GO

I ask because in T-SQL, it's faster to say:
WHERE String1 > String2
than it is to say:
WHERE String1 <> String2

I believe it's faster because > is SARGable and <> is not. The two are not
interchangeable in general, so I only use this technique when I want to
compare to a zero-length string (ZLS), as in

String1 > '' -- faster than String1 <> ''

SOOOOO Then...


Consider this VB Snip:

Private Sub stringGreaterThan()

Dim s1 As String = ""
Dim s2 As String = "A"

Console.WriteLine(Convert.ToBoolean(s1 > s2))
Console.WriteLine(Convert.ToBoolean(s2 > s1))

End Sub

This method yields the same result as its T-SQL cousin, but does the
performance behave the same way? i.e.

Are either of these actually faster than the other?

s1 > ""
vs
s1 <> ""

I know I'm nit-picking here, but I have this giant name parser program
that, under certain conditions, needs to sniff individual characters and
strings. I'm already using ByRef and StringBuilder where I can to get
speed, but anything I can conjure up to squeeze a few more cycles per
iteration out of this big fat loop would help. Saving 1 ms per iteration
while processing a million records would make this thing fly.

--
Peace & happy computing,

Mike Labosh, MCSD

"When you kill a man, you're a murderer.
Kill many, and you're a conqueror.
Kill them all and you're a god." -- Dave Mustane
 
david

Consider this VB Snip:

Private Sub stringGreaterThan()

Dim s1 As String = ""
Dim s2 As String = "A"

Console.WriteLine(Convert.ToBoolean(s1 > s2))
Console.WriteLine(Convert.ToBoolean(s2 > s1))

End Sub

This method yields the same result as its T-SQL cousin, but does the
performance behave the same way? i.e.

Are either of these actually faster than the other?

If you're just comparing to the empty string, the fastest is...

If s.Length > 0 Then
....

If comparing two different pre-ordered strings, then you're right...

s1 > s2
is a bit faster than

s1 <> s2

Of course, this only works if you already know the string order. If you
have to do two comparisons, then which is faster depends entirely
on the strings.
I know I'm nit-picking here,

You are, which is fine, but I strongly suspect you're trying to squeeze
performance out of the wrong place.
but I have this giant name parser program,
that, under certain conditions needs to sniff individual characters and
strings. I'm already using ByRef and

I can't imagine why ByRef would speed you up. If anything I suspect
it's an insignificantly tiny bit slower.
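
For anyone who wants to try the Length check suggested above, here is a minimal sketch. The helper name HasText is made up for illustration, and the Nothing guard is an assumption on my part (s.Length throws on a Nothing reference, while VB's <> "" does not). IsNot needs VB 2005 or later.

```vbnet
Imports System

Module EmptyCheckSketch

    ' Hypothetical helper: True when s holds at least one character.
    ' The Nothing test avoids a NullReferenceException from s.Length.
    Public Function HasText(ByVal s As String) As Boolean
        Return s IsNot Nothing AndAlso s.Length > 0
    End Function

    Sub Main()
        Console.WriteLine(HasText(""))      ' False
        Console.WriteLine(HasText("A"))     ' True
        Console.WriteLine(HasText(Nothing)) ' False
    End Sub

End Module
```

If the strings can never be Nothing, the guard can be dropped and the test is just s.Length > 0.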
 
David Browne

Mike Labosh said:
Are either of these actually faster than the other?

s1 > ""
vs
s1 <> ""

You can sometimes get a small improvement by checking the relative lengths
of strings before you compare them.

Private Function IsEqual(ByVal s1 As String, ByVal s2 As String) As Boolean
Return s1.Length = s2.Length AndAlso s1 = s2
End Function

David
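
A possible variant of the IsEqual idea above, as a sketch only. It assumes that Nothing and "" should count as equal (which is how VB's = operator treats them) and that a case-sensitive ordinal comparison is what you want, as under Option Compare Binary. The name IsEqualFast is hypothetical.

```vbnet
Imports System

Module EqualitySketch

    ' Variant of the IsEqual idea that tolerates Nothing.
    ' Assumption: Nothing and "" count as equal, matching VB's "=" operator.
    Public Function IsEqualFast(ByVal s1 As String, ByVal s2 As String) As Boolean
        If s1 Is Nothing Then s1 = ""
        If s2 Is Nothing Then s2 = ""
        ' Cheap integer test first; the character-by-character ordinal
        ' comparison only runs when the lengths match.
        Return s1.Length = s2.Length AndAlso String.CompareOrdinal(s1, s2) = 0
    End Function

    Sub Main()
        Console.WriteLine(IsEqualFast("Smith", "Smith"))   ' True
        Console.WriteLine(IsEqualFast("Smith", "Smythe"))  ' False
        Console.WriteLine(IsEqualFast(Nothing, ""))        ' True
    End Sub

End Module
```

Whether the length test buys anything depends on how often the lengths differ in your data, so it is worth timing on real records before keeping it.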
 
Mike Labosh

If you're just comparing to the empty string, the fastest is...
If s.Length > 0 Then

Interestingly enough, I expanded my experiment and looked at the MSIL:

// 6 Instructions plus 1 Call
//000010: Dim f1 As Boolean = s1 = s2
IL_000f: ldloc.s s1
IL_0011: ldloc.s s2
IL_0013: ldc.i4.0
IL_0014: call int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string, string, bool)
IL_0019: ldc.i4.0
IL_001a: ceq
IL_001c: stloc.0

// 6 Instructions plus 1 Call
//000011: Dim f2 As Boolean = s1 > s2
IL_001d: ldloc.s s1
IL_001f: ldloc.s s2
IL_0021: ldc.i4.0
IL_0022: call int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string, string, bool)
IL_0027: ldc.i4.0
IL_0028: cgt
IL_002a: stloc.1

//8 Instructions plus 1 Call
//000012: Dim f3 As Boolean = s1 <> s2
IL_002b: ldloc.s s1
IL_002d: ldloc.s s2
IL_002f: ldc.i4.0
IL_0030: call int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string, string, bool)
IL_0035: ldc.i4.0
IL_0036: ceq
IL_0038: ldc.i4.0
IL_0039: ceq
IL_003b: stloc.2

// 4 Instructions plus 1 Virtual Call (whatever that means)
//000013: Dim f4 As Boolean = s2.Length > 0
IL_003c: ldloc.s s2
IL_003e: callvirt instance int32 [mscorlib]System.String::get_Length()
IL_0043: ldc.i4.0
IL_0044: cgt
IL_0046: stloc.3

// WE HAVE A WINNER I dug this out of CompilerServices Namespace
// 3 Instructions plus 1 Call
//000014: Dim f5 As Boolean = StringType.StrCmp(s1, s2, False) = -1
IL_0047: ldloc.s s1
IL_0049: ldloc.s s2
IL_004b: ldc.i4.0
IL_004c: call int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string, string, bool)

 
David Browne

Mike Labosh said:
If you're just comparing to the empty string, the fastest is...
If s.Length > 0 Then

Interestingly enough, I expanded my experiment and looked at the MSIL:

// WE HAVE A WINNER I dug this out of CompilerServices Namespace
// 3 Instructions plus 1 Call

You can't judge the relative performance this way. The costs of the invoked
methods could be vastly different.

[mscorlib]System.String::get_Length()

may be 10000x cheaper than

[Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string, string, bool)

David
 
david

Interestingly enough, I expanded my experiment and looked at the MSIL:
// 4 Instructions plus 1 Virtual Call (whatever that means)
//000013: Dim f4 As Boolean = s2.Length > 0
IL_003c: ldloc.s s2
IL_003e: callvirt instance int32 [mscorlib]System.String::get_Length()
IL_0043: ldc.i4.0
IL_0044: cgt
IL_0046: stloc.3

// WE HAVE A WINNER I dug this out of CompilerServices Namespace
// 3 Instructions plus 1 Call
//000014: Dim f5 As Boolean = StringType.StrCmp(s1, s2, False) = -1
IL_0047: ldloc.s s1
IL_0049: ldloc.s s2
IL_004b: ldc.i4.0
IL_004c: call int32 [Microsoft.VisualBasic]Microsoft.VisualBasic.CompilerServices.StringType::StrCmp(string, string, bool)

That's an awfully silly way to compare things. You don't really care
about the number of instructions in the statement; it's the function
call that's taking all the time.
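
Since the cost is in the calls rather than the instruction count, the practical way to settle it is to time the comparisons. Below is a rough sketch of such a timing harness. The module and function names are invented for illustration, the iteration count is arbitrary, and Stopwatch assumes .NET 2.0 or later. Treat the numbers as indicative only; a micro-benchmark like this says little about a real parser loop.

```vbnet
Imports System
Imports System.Diagnostics

Module CompareTiming

    ' Keeps each loop's result live so the JIT is less likely to drop the work.
    Private sink As Integer

    ' Times s > "" and returns elapsed milliseconds.
    Public Function TimeGreaterThan(ByVal s As String, ByVal iterations As Integer) As Long
        Dim hits As Integer = 0
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            If s > "" Then hits += 1
        Next
        sw.Stop()
        sink = hits
        Return sw.ElapsedMilliseconds
    End Function

    ' Times s <> "" and returns elapsed milliseconds.
    Public Function TimeNotEqual(ByVal s As String, ByVal iterations As Integer) As Long
        Dim hits As Integer = 0
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            If s <> "" Then hits += 1
        Next
        sw.Stop()
        sink = hits
        Return sw.ElapsedMilliseconds
    End Function

    ' Times s.Length > 0 and returns elapsed milliseconds.
    Public Function TimeLength(ByVal s As String, ByVal iterations As Integer) As Long
        Dim hits As Integer = 0
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            If s.Length > 0 Then hits += 1
        Next
        sw.Stop()
        sink = hits
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        Const Iterations As Integer = 10000000
        Dim s As String = "A"

        ' Warm-up pass so JIT compilation is not included in the timings.
        TimeGreaterThan(s, 1000)
        TimeNotEqual(s, 1000)
        TimeLength(s, 1000)

        Console.WriteLine("greater-than empty : {0} ms", TimeGreaterThan(s, Iterations))
        Console.WriteLine("not-equal empty    : {0} ms", TimeNotEqual(s, Iterations))
        Console.WriteLine("Length > 0         : {0} ms", TimeLength(s, Iterations))
    End Sub

End Module
```

Run it as a Release build outside the debugger, and repeat it a few times, before drawing any conclusion from the differences.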
 
