Working with strings and arrays

  • Thread starter Bryan Dickerson

Bryan Dickerson

I have a delimited string, e.g.
"SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD", and I need to strip
off the "SOD;" at the beginning and the ";EOD" at the end. I am familiar
with the Split and Join functions, which can split a string like this
out to an array and vice versa, but is there a way to split the
string into an array and then join it back into a string while telling it
to ignore the first and last elements? Or even some way to ReDim the "split"
array to ignore certain elements? I know I can do it the old-fashioned way
with the .IndexOf methods, but I was thinking that Split and Join
would be somewhat faster. Would a StringBuilder and a loop be much
faster yet (these strings can get to be a pretty good size--thousands of
bytes)? Just thinking "out loud" as I write.

Thanx in advance for your thoughts!
 
Chris

If you just want to drop the first and last items, then Split and Join
will not be fast by any means. Split goes through every character in
your string, finds each delimiter, and copies each item into an array;
Join then copies everything again. If you just do a String.IndexOf and a
String.LastIndexOf and then a Substring, you need only two short searches
(each stops at the first delimiter it meets) and a single copy. That's
the way you should do this.
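
A minimal sketch of the IndexOf/LastIndexOf/Substring approach described above. The helper name StripFirstAndLast and the module name are hypothetical, chosen only for illustration, and the sketch assumes that a string with fewer than two delimiters should be returned unchanged.

```vbnet
Imports System

Module StripDemo
    ' Hypothetical helper: returns the text between the first and the last delimiter.
    Function StripFirstAndLast(ByVal input As String, ByVal delimiter As Char) As String
        Dim first As Integer = input.IndexOf(delimiter)
        Dim last As Integer = input.LastIndexOf(delimiter)

        ' Assumption: with fewer than two delimiters there is nothing to strip,
        ' so the input is returned unchanged.
        If first < 0 OrElse last <= first Then
            Return input
        End If

        ' Start one past the first delimiter; stop just before the last one.
        Return input.Substring(first + 1, last - first - 1)
    End Function

    Sub Main()
        Dim s As String = "SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD"
        ' Prints NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP
        Console.WriteLine(StripFirstAndLast(s, ";"c))
    End Sub
End Module
```

Unlike a fixed-offset Substring, this version does not care how long the first and last fields are, only that they are separated by the delimiter.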

Chris
 
Jay B. Harlow [MVP - Outlook]

Bryan,
| I have a string that is delimited, e.g.
| "SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD", that I need to strip
| off the "SOD;" at the beginning and the ";EOD" at the end.
With "fixed" prefixes & suffixes I would use StartsWith & EndsWith coupled
with Substring.

Something like:

    Private Function TrimContent(ByVal input As String, ByVal prefix As String, ByVal suffix As String) As String
        Dim startIndex As Integer = 0
        Dim length As Integer = input.Length

        ' Skip the prefix only when it is actually present.
        If input.StartsWith(prefix) Then
            startIndex += prefix.Length
            length -= prefix.Length
        End If
        ' Drop the suffix only when it is present and does not overlap the prefix
        ' (e.g. "SOD;EOD"), which would otherwise make length negative.
        If length >= suffix.Length AndAlso input.EndsWith(suffix) Then
            length -= suffix.Length
        End If

        Return input.Substring(startIndex, length)
    End Function

    Public Sub Main()
        Dim input As String = "SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD"
        Dim output As String = TrimContent(input, "SOD;", ";EOD")
        Debug.WriteLine(input, "input")
        Debug.WriteLine(output, "output")
    End Sub

The advantage of TrimContent is that it creates no temporary string
objects beyond the result itself; using Split & Join you would be creating
an array plus a temporary string for every field, which may put undue
pressure on the GC.

The StartsWith & EndsWith are there more to ensure that valid data is not
trimmed accidentally.

You would need to profile whether StartsWith/EndsWith, IndexOf, or Split/Join
is faster; however, I would expect StartsWith/EndsWith or IndexOf to be
the faster routines and, more importantly, to make it more obvious what you
are attempting to do.
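
One rough way to do that profiling is sketched below. It assumes .NET 2.0 or later (for System.Diagnostics.Stopwatch) and fixed 4-character "SOD;" and ";EOD" markers; the names TimingSketch, TrimBySubstring, TrimBySplitJoin, and Iterations are hypothetical, and the repeat count is arbitrary. Treat the numbers it prints as indicative only, since they depend on the machine, the runtime, and the real string sizes.

```vbnet
Imports System
Imports System.Diagnostics

Module TimingSketch
    Const Iterations As Integer = 100000   ' arbitrary repeat count (assumption)

    ' Substring-based trim, assuming fixed 4-character markers at both ends.
    Function TrimBySubstring(ByVal input As String) As String
        Return input.Substring(4, input.Length - 8)
    End Function

    ' Split/Join-based trim: copy every element except the first and last, then re-join.
    Function TrimBySplitJoin(ByVal input As String) As String
        Dim parts() As String = input.Split(";"c)
        Dim middle(parts.Length - 3) As String   ' upper bound, so Length - 2 elements
        Array.Copy(parts, 1, middle, 0, parts.Length - 2)
        Return String.Join(";", middle)
    End Function

    Sub Main()
        Dim input As String = "SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD"
        Dim result As String = Nothing

        Dim watch As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            result = TrimBySubstring(input)
        Next
        watch.Stop()
        Console.WriteLine("Substring:  {0} ms", watch.ElapsedMilliseconds)

        watch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            result = TrimBySplitJoin(input)
        Next
        watch.Stop()
        Console.WriteLine("Split/Join: {0} ms", watch.ElapsedMilliseconds)
    End Sub
End Module
```

For a fairer comparison, run it against strings of the size you actually process (thousands of bytes rather than this short sample) and in a release build.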

--
Hope this helps
Jay [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net


|I have a string that is delimited, e.g.
| "SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD", that I need to
strip
| off the "SOD;" at the beginning and the ";EOD" at the end. I am well
| familiar with the Split and Join functions that can split a string like
this
| out to an array and vice versa, but is there a way that I can split the
| string into an array and then join it back to a string along with telling
it
| to ignore the first and last element? Or even some way to ReDim the
"split"
| array to ignore certain elements? I know I can do it the old fashioned
way
| with .IndexOf methods, but I was thinking that the Split and Join
functions
| would be somewhat faster. Would a stringbuilder array and a loop be much
| faster yet (these strings can get to be pretty good size--thousands of
| bytes)? Just thinking "out loud" as I wrote.
|
| Thanx in advance for your thoughts!
|
| --
| TFWBWY...A
|
|
 
Bryan Dickerson

Thanx! After I posted this, I thought, "I can just do some playing myself,"
but I figured the post would be a good way to see whether there were any new
ways to do this. I had fairly settled on a method very similar to
this after trying to use Split and re-Join, or "String-Build" it back
together, and seeing that I didn't gain much speed. I guess ultimately old
methods are not necessarily to be thrown out; I just need to be smart about
how and where I implement them. It's interesting, though, because in VB6 I had
some routines that used simple substringing ("Mid$" functions, etc.) to break
apart long strings, and when I changed them to use Split, Join, and dynamic
arrays, they were measurably faster. Any ideas as to why? And any
ideas as to why they don't bring the same measure of speed in .NET?


 
1388-2/HB

For the sake of conversation, if you want to solve this with Split/Join,
here's one way you could do it (assuming the string has at least two
elements):

    Dim wholeArray() As String = myString.Split(";"c)
    ' Room for every element except the first and the last.
    Dim trimmedArray(wholeArray.Length - 3) As String

    Array.Copy(wholeArray, 1, trimmedArray, 0, wholeArray.Length - 2)
    myString = String.Join(";", trimmedArray)

But if you mean to say that "SOD;" is literally at the beginning of the
string and ";EOD" is literally at the end, and you wish to obtain a new
string without them, this would probably be both easier and faster:

    myString = myString.Substring(4, myString.Length - 8)
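
On the original question of telling Join to ignore the first and last elements: String.Join has an overload that takes a start index and a count, so the intermediate array copy is not strictly needed. A minimal sketch, assuming the string splits into at least two elements; the names JoinRangeDemo and JoinMiddle are hypothetical.

```vbnet
Imports System

Module JoinRangeDemo
    ' Hypothetical helper: splits on ";" and re-joins everything except
    ' the first and last elements.
    Function JoinMiddle(ByVal input As String) As String
        Dim parts() As String = input.Split(";"c)

        ' String.Join(separator, value, startIndex, count):
        ' start at element 1 and take Length - 2 elements.
        Return String.Join(";", parts, 1, parts.Length - 2)
    End Function

    Sub Main()
        Dim myString As String = "SOD;NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP;EOD"
        ' Prints NAME;ADDRESS1;ADDRESS2;PHONE;CITY;STATE;ZIP
        Console.WriteLine(JoinMiddle(myString))
    End Sub
End Module
```

This still pays the cost of Split (one temporary string per field), so it is mainly useful when the array is needed for something else anyway.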
 
Cor Ligthert [MVP]

Bryan,

If I understand you correctly, then it looks very easy to me.

Find the start index of the first ";" with IndexOf
Find the last index of ";" with LastIndexOf
Create a new string using myOldString.Substring(startIndex + 1, lastIndex -
startIndex - 1)

I hope this helps,

Cor
 
