String instance count

Jon

I want to count the number of instances of a certain string (a delimiter) in
another string. I didn't see a function to do this in the framework (if
there is one, please point me to it). If not, could someone let me know whether
the method I've used below is efficient or if there is a better way to do it, as
these will be rather large strings I'm searching in. Thanks.

Public Shared Function CountDelimiter(ByVal strInput As String, _
        ByVal strDelimiter As String) As Int32
    Dim iStart As Int32, iCount As Int32, iResult As Int32

    'Set our vars to base values
    iStart = 1
    iCount = 0

    Do
        'iResult becomes the position where delimiter is found. If 0, not found.
        iResult = InStr(iStart, strInput, strDelimiter)
        If iResult = 0 Then Exit Do
        'Increment our count var for each time it is found
        iCount += 1
        'Increment our next start position to be the next char after the
        'currently found position
        iStart = iResult + 1
    Loop

    Return iCount
End Function
 
Jay B. Harlow [MVP - Outlook]

Jon,
I would use a variation of your function, something like:

Public Shared Function CountDelimiter(ByVal input As String, _
        ByVal delimiter As String) As Integer
    Dim count, index As Integer

    index = input.IndexOf(delimiter)
    Do Until index < 0
        count += 1
        index = input.IndexOf(delimiter, index + 1)
    Loop

    Return count
End Function
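
One thing to note about the loop above: because it restarts the search at index + 1, it counts overlapping occurrences, so "aa" is found three times in "aaaa". If the delimiter should never overlap itself, a minimal sketch along these lines might be used instead. The class name DelimiterCounter and the function name CountNonOverlapping are made up for illustration, and treating an empty delimiter as zero occurrences and using an ordinal comparison are both assumptions:

```vb
Imports System

Public Class DelimiterCounter
    Public Shared Function CountNonOverlapping(ByVal input As String, _
            ByVal delimiter As String) As Integer
        ' Assumption: a missing or empty input/delimiter means zero occurrences
        If String.IsNullOrEmpty(input) OrElse String.IsNullOrEmpty(delimiter) Then
            Return 0
        End If

        Dim count As Integer = 0
        ' Ordinal comparison is an assumption; it avoids culture-sensitive matching
        Dim index As Integer = input.IndexOf(delimiter, StringComparison.Ordinal)
        Do Until index < 0
            count += 1
            ' Jump past the whole match so the next search cannot overlap it
            index = input.IndexOf(delimiter, index + delimiter.Length, _
                                  StringComparison.Ordinal)
        Loop

        Return count
    End Function

    Public Shared Sub Main()
        Console.WriteLine(CountNonOverlapping("aaaa", "aa"))   ' prints 2
        Console.WriteLine(CountNonOverlapping("a,b,,c", ","))  ' prints 3
    End Sub
End Class
```

For a single-character delimiter the two approaches give the same count; the difference only shows up when the delimiter can overlap itself.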

Is the delimiter going to be a single character or two or more characters?
If it's going to be a single character, I would make the delimiter parameter a
Char instead.

Public Shared Function CountDelimiter(ByVal input As String, _
        ByVal delimiter As Char) As Integer
    Dim count, index As Integer

    index = input.IndexOf(delimiter)
    Do Until index < 0
        count += 1
        index = input.IndexOf(delimiter, index + 1)
    Loop

    Return count
End Function

The Char version should perform better, as it is looking for a single
character instead of one or more characters. In fact, you can define both
as overloads of the same function.

Another option if the delimiter is a single Char is to use a for loop and
check each character.

Public Shared Function CountDelimiter(ByVal input As String, _
        ByVal delimiter As Char) As Integer
    Dim count As Integer

    For Each ch As Char In input
        If ch = delimiter Then
            count += 1
        End If
    Next ch

    Return count
End Function

I would expect the performance of the two to be about the same.

Hope this helps
Jay
 
Stephen

You all seem to be going about this the hard way.
InStr returns the position of the first instance of a string within a string,
starting at a given position.
Just write a loop that starts at position 1 at first (InStr is 1-based), then
every time string1 is found, restart the search at its position + 1.

How simple is that?
 
Jay B. Harlow [MVP - Outlook]

Stephen,
> you all seem to be going about this as hard as you can
'you all' who?

What you stated is the way Jon asked, and what I showed a variation of.

The difference between the way you & Jon solved it and the way I solved it
is that I used Framework functions, where you & Jon used VB runtime functions.
No big deal, as the net result is the same.

I also included the For Each Character variation posted in the C# newsgroup.
I suspect the For Each Character method to be quicker than our Find Each
Occurrence method, but I could be wrong. Either way, I prefer the Find Each
Occurrence method.
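
Rather than guess, the two Char versions can be timed directly. The following is only a rough sketch: the module name, the helper names, and the test string are all made up for illustration, it assumes a framework version that provides System.Diagnostics.Stopwatch, and a single run like this gives an indication rather than a proper benchmark.

```vb
Imports System
Imports System.Diagnostics
Imports System.Text

Public Module CountTiming
    ' Find Each Occurrence: jump from match to match with IndexOf
    Public Function CountByIndexOf(ByVal input As String, ByVal delimiter As Char) As Integer
        Dim count As Integer = 0
        Dim index As Integer = input.IndexOf(delimiter)
        Do Until index < 0
            count += 1
            index = input.IndexOf(delimiter, index + 1)
        Loop
        Return count
    End Function

    ' For Each Character: look at every character once
    Public Function CountByForEach(ByVal input As String, ByVal delimiter As Char) As Integer
        Dim count As Integer = 0
        For Each ch As Char In input
            If ch = delimiter Then count += 1
        Next ch
        Return count
    End Function

    Public Sub Main()
        ' Hypothetical test data: "abc," repeated 1,000,000 times
        Dim input As String = New StringBuilder().Insert(0, "abc,", 1000000).ToString()

        Dim watch As Stopwatch = Stopwatch.StartNew()
        Dim byIndexOf As Integer = CountByIndexOf(input, ","c)
        watch.Stop()
        Console.WriteLine("IndexOf:  {0} found in {1} ms", byIndexOf, watch.ElapsedMilliseconds)

        watch = Stopwatch.StartNew()
        Dim byForEach As Integer = CountByForEach(input, ","c)
        watch.Stop()
        Console.WriteLine("For Each: {0} found in {1} ms", byForEach, watch.ElapsedMilliseconds)
    End Sub
End Module
```

Both calls should report the same count; only the elapsed times are of interest, and they will vary with the input and the machine.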

Just a thought
Jay
 
Jon

Thanks, that looks better. And, it would be better form to use framework
functions.
 
Chris Dunaway

> Jon,
> I would use a variation of your function, something like:

How about this method:


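'\\\\\\\
Imports System.Text.RegularExpressions

Public Shared Function CountDelimiter(ByVal input As String, _
        ByVal delimiter As String) As Integer

    ' Note: delimiter is interpreted as a regular expression pattern here.
    ' For a literal delimiter, pass Regex.Escape(delimiter) instead.
    Dim rx As New Regex(delimiter)
    Return rx.Matches(input).Count

End Function
'///////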
 
Jay B. Harlow [MVP - Outlook]

Chris,
It's about 100x slower than either of the methods I demonstrated (the Find Each
Occurrence & the For Each Character methods).

I'm not sure how the RegEx method compares to the Split method speed-wise;
both will consume more memory than the above two, as more objects are
involved. The RegEx method was shown to be slower in a thread in
the C# newsgroup from 25 September. I would expect the Split method to also
be slower.
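
The Split method referred to here is not shown in this thread; a sketch of what it presumably looks like follows. The class and function names are hypothetical, and the String overload assumes a framework version where String.Split accepts a String array together with StringSplitOptions:

```vb
Imports System

Public Class SplitCounter
    ' Char delimiter: Split yields one more piece than there are delimiters
    Public Shared Function CountBySplit(ByVal input As String, _
            ByVal delimiter As Char) As Integer
        Return input.Split(delimiter).Length - 1
    End Function

    ' String delimiter: same idea, using the String() overload of Split
    Public Shared Function CountBySplit(ByVal input As String, _
            ByVal delimiter As String) As Integer
        Return input.Split(New String() {delimiter}, StringSplitOptions.None).Length - 1
    End Function

    Public Shared Sub Main()
        Console.WriteLine(CountBySplit("a,b,,c", ","c))            ' prints 3
        Console.WriteLine(CountBySplit("one::two::three", "::"))   ' prints 2
    End Sub
End Class
```

Every piece between delimiters is allocated as a new String just to be counted, which is where the extra memory use comes from.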

Hope this helps
Jay
 
Jay B. Harlow [MVP - Outlook]

Chris,
I should add that, of course, there are times when you need to count the
occurrences of a pattern!

For example, suppose I need to know how many times the number 1 immediately
follows any lower case letter. In that case your code is required.

Dim i As Integer = CountDelimiterPattern(input, "[a-z]1")

As indicated by the different name above, I would implement both if needed:
one that uses String.IndexOf to find non-pattern delimiters, and one that
uses RegEx to find pattern delimiters.
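
CountDelimiterPattern is only named above, not defined. A minimal sketch of what such a function might look like, using Regex.Matches, is below; the class name PatternCounter and the sample inputs are hypothetical:

```vb
Imports System
Imports System.Text.RegularExpressions

Public Class PatternCounter
    ' Counts non-overlapping matches of a regular expression pattern
    Public Shared Function CountDelimiterPattern(ByVal input As String, _
            ByVal pattern As String) As Integer
        Return Regex.Matches(input, pattern).Count
    End Function

    Public Shared Sub Main()
        ' "a1" and "c1" match; "B1" does not because B is upper case
        Console.WriteLine(CountDelimiterPattern("a1B1c1", "[a-z]1"))          ' prints 2
        ' Regex.Escape makes "." a literal dot rather than "any character"
        Console.WriteLine(CountDelimiterPattern("1.2.3", Regex.Escape(".")))  ' prints 2
    End Sub
End Class
```

Keeping the two functions under different names makes it clear at the call site whether the argument is a literal delimiter or a pattern.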

Hope this helps
Jay
 
Cor

Jay B,
I did run the tests; they should be in this newsgroup as a new thread.
If you write a good regex, it is easy to add to the test.
The one from Chris that I tested did not give correct results with a
string, I think.
With a Char it was 15 times slower in my test,
which I ran just a few seconds ago.
Cor
 
Cor

Jon,
That Split method only came to mind last night (here in Europe),
because it needs only two lines of code.
Otherwise it is a really stupid method.
:)
Cor
 
