How to extract multiple occurrences of a substring


Guest

Hello.

Using VS.NET 2003 VB. If I have a string similar to the attached, how would
I extract the "TruckName=" data from it in a loop and stay in the loop until
the end of the string is reached? As you can see, the first truck name is
"284165". The next truck name is "284193".

Any help would be gratefully appreciated.

Thanks,
Tony

<TruckConduitDataObject><Truck XVIN="67112637" TruckName="284165"
OrganizationID="1214" OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true"
Odo="1874679.8" OdoAsOf="2006-04-19T20:43:00.0000000-05:00"
FormattedDateTime="04/19/06 04:43p" Axles="2" Berth="false"
HasOnBoardPlatform="true" HasDIU="true" /><Truck XVIN="67112638"
TruckName="284193" OrganizationID="1214" OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true"
Odo="1633058.2" OdoAsOf="2006-04-19T20:21:00.0000000-05:00"
FormattedDateTime="04/19/06 04:21p" Axles="3" Berth="false"
HasOnBoardPlatform="true" HasDIU="true" /><Truck XVIN="67112639"
TruckName="294934" OrganizationID="1214" OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true"
Odo="128325.5" OdoAsOf="2006-04-19T14:43:00.0000000-05:00"
FormattedDateTime="04/19/06 10:43a" Axles="3" Berth="false"
HasOnBoardPlatform="true" HasDIU="true" /><Truck XVIN="67112640"
TruckName="241486" OrganizationID="1214" OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true"
Odo="249414.5" OdoAsOf="2006-04-19T04:00:00.0000000-05:00"
FormattedDateTime="04/19/06 12:00a" Axles="3" Berth="false"
HasOnBoardPlatform="true" HasDIU="true" /><Truck XVIN="67112641"
TruckName="447859" OrganizationID="1214" OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true" Odo="283377"
OdoAsOf="2006-04-19T20:44:00.0000000-05:00" FormattedDateTime="04/19/06
04:44p" Axles="3" Berth="false" HasOnBoardPlatform="true" HasDIU="true"
/><Truck XVIN="67112642" TruckName="425218" OrganizationID="1214"
OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true" Odo="86711.9"
OdoAsOf="2006-04-19T20:40:00.0000000-05:00" FormattedDateTime="04/19/06
04:40p" Axles="3" Berth="false" HasOnBoardPlatform="true" HasDIU="true"
/><Truck XVIN="67112662" TruckName="211103" OrganizationID="1214"
OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true"
Odo="185236.2" OdoAsOf="2006-04-19T04:00:00.0000000-05:00"
FormattedDateTime="04/19/06 12:00a" Axles="3" Berth="false"
HasOnBoardPlatform="true" HasDIU="true" /><Truck XVIN="67112667"
TruckName="251638" OrganizationID="1214" OrganizationName="Croydon Dry"
DateOfQuery="2006-04-19T21:56:08.3700000-05:00" IsActive="true" Odo="0"
OdoAsOf="1990-01-01T00:00:00.0000000-06:00" FormattedDateTime="12/31/89
07:00p" Axles="3" Berth="false" HasOnBoardPlatform="false" HasDIU="false"
/></TruckConduitDataObject>
 
Homer J Simpson

Tony Girgenti said:
Hello.

Using VS.NET 2003 VB. If I have a string similar to the attached, how would
I extract the "TruckName=" data from it in a loop and stay in the loop until
the end of the string is reached? As you can see, the first truck name is
"284165". The next truck name is "284193".

Is this all in one line?
 
ShaneO

Tony said:
Hello.

Using VS.NET 2003 VB. If I have a string similar to the attached, how would
I extract the "TruckName=" data from it in a loop and stay in the loop until
the end of the string is reached? As you can see, the first truck name is
"284165". The next truck name is "284193".

Any help would be gratefully appreciated.

Tony, I have made a couple of assumptions with the following code:

1. You are reading data from a file
2. The TruckName data is always 6 chars long

I hope you'll be able to apply what's written here to your specific
requirements.

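' Note: File.ReadAllText and Debug.Print require .NET 2.0 or later.
Dim sFileName As String = "C:\Temp\TruckData.txt"
Dim sA As String = File.ReadAllText(sFileName)
Dim X As Integer = 0

Do
    ' Find the next occurrence, starting from the current position.
    X = sA.IndexOf("TruckName=", X)
    If X <> -1 Then
        ' Skip past TruckName=" (10 characters plus the opening quote).
        X += 11
        ' Assumes the value is always 6 characters long.
        Debug.Print(sA.Substring(X, 6))
    Else
        Exit Do
    End If
Loop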

I tested it on your supplied data and it works perfectly.
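
If the six-character assumption ever stops holding, the same IndexOf loop can read up to the closing quote instead of taking a fixed length. The following is only a sketch of that variation: the module name, the ExtractTruckNames helper and the shortened sample string are made up for illustration, and List(Of String) needs VB 2005 or later (an ArrayList would do on 2003).

```vb
Imports System
Imports System.Collections.Generic

Module TruckNameScan
    ' Hypothetical helper: returns every TruckName value found in text,
    ' whatever its length, by reading up to the closing quote.
    Function ExtractTruckNames(ByVal text As String) As List(Of String)
        Const marker As String = "TruckName="""   ' the literal TruckName="
        Dim names As New List(Of String)
        Dim pos As Integer = text.IndexOf(marker, StringComparison.Ordinal)

        Do While pos <> -1
            Dim valueStart As Integer = pos + marker.Length
            Dim valueEnd As Integer = text.IndexOf(""""c, valueStart)
            If valueEnd = -1 Then Exit Do   ' unterminated value: stop scanning
            names.Add(text.Substring(valueStart, valueEnd - valueStart))
            pos = text.IndexOf(marker, valueEnd + 1, StringComparison.Ordinal)
        Loop

        Return names
    End Function

    Sub Main()
        ' Shortened sample in the shape of the data from the question.
        Dim sample As String = "<Truck XVIN=""67112637"" TruckName=""284165"" />" & _
                               "<Truck XVIN=""67112638"" TruckName=""284193"" />"
        For Each truckName As String In ExtractTruckNames(sample)
            Console.WriteLine(truckName)   ' prints 284165, then 284193
        Next
    End Sub
End Module
```

The loop still stops as soon as IndexOf returns -1, so it ends at the last TruckName in the string.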

Trust this helps.

ShaneO

There are 10 kinds of people - Those who understand Binary and those who
don't.
 
ShaneO

ShaneO said:
Dim sFileName As String = "C:\Temp\TruckData.txt"
Dim sA As String = File.ReadAllText(sFileName)
Dim X As Integer = 0

Do
    X = sA.IndexOf("TruckName=", X)
    If X <> -1 Then
        X += 11
        Debug.Print(sA.Substring(X, 6))
    Else
        Exit Do
    End If
Loop

Oops! I forgot to tell you to import the namespace:

Imports System.IO

ShaneO

 
Guest

Shane.

Thanks a lot. That worked beautifully.
Excellent.

Thanks again,
Tony
 
JDMils

And everyone forgets the power of RegEx!!!

This is off the top of my head:

' Requires: Imports System.Text
Dim strHtml As String = "Your string here...................."

' Capture the TruckName.
Dim regTruckName As New RegularExpressions.Regex( _
    "TruckName\=\""(\d{6})\""", _
    Options:=RegularExpressions.RegexOptions.Singleline)

Dim m As RegularExpressions.Match

For Each m In regTruckName.Matches(strHtml)
    ' Groups(1) holds only the six digits matched inside the parentheses.
    Trace.WriteLine(m.Groups(1).Value)
Next

And gives the output of all truck names!
 
Cerebrus

JDMils said:
And everyone forgets the power of RegEx!!!

Yeah! This is a perfect candidate for the application of Regex!

Regards,

Cerebrus.
 
ShaneO

JDMils said:
And everyone forgets the power of RegEx!!!
Hmmm... For simplicity, speed and readability I personally don't believe
the RegEx Class was needed in this case, and besides, who among us has
really had the time (or inclination) to learn all the RegEx
Methods/Properties??

Just my opinion!

ShaneO

 
Cerebrus

ShaneO said:
For simplicity, speed and readability I personally don't believe
the RegEx Class was needed in this case

I don't know about the performance comparison between using String
methods and using the Regex engine. I haven't been able to find any
comparisons out there, so if you know of any, please let me know.

There aren't that many RegEx methods and properties, you know. And once
you do learn them, you will have a very powerful tool in your hands!

Regards,

Cerebrus.
 
Claes Bergefall

Looks like XML to me, so how about taking advantage of that:

Dim doc As New System.Xml.XmlDocument
doc.LoadXml(myString)
Dim baseNode As System.Xml.XmlNode = _
    doc.SelectSingleNode("TruckConduitDataObject")
Dim nodes As System.Xml.XmlNodeList = baseNode.SelectNodes("Truck")

For Each node As System.Xml.XmlNode In nodes
    Dim name As System.Xml.XmlAttribute = node.Attributes("TruckName")
    If name IsNot Nothing Then
        ' Do stuff with the name (name.Value holds the text)
    End If
Next

/claes
 
Claes Bergefall

Sorry, noticed you're using 2003. Replace the If statement with the
following:
If Not name Is Nothing Then
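
As a variation on the XML approach, an XPath expression can select the TruckName attributes directly, which removes the need for the Nothing check. This is a sketch only: the module name, the ReadTruckNames helper and the shortened sample document are invented for illustration, and List(Of String) needs VB 2005 or later.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module TruckNameXPath
    ' Hypothetical helper: loads the XML and collects every TruckName
    ' attribute value under the TruckConduitDataObject root.
    Function ReadTruckNames(ByVal truckXml As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(truckXml)

        Dim names As New List(Of String)
        ' The trailing @TruckName selects attribute nodes, so Truck elements
        ' without that attribute simply contribute nothing.
        For Each attr As XmlNode In doc.SelectNodes("/TruckConduitDataObject/Truck/@TruckName")
            names.Add(attr.Value)
        Next
        Return names
    End Function

    Sub Main()
        Dim sample As String = "<TruckConduitDataObject>" & _
            "<Truck XVIN=""67112637"" TruckName=""284165"" />" & _
            "<Truck XVIN=""67112638"" TruckName=""284193"" />" & _
            "</TruckConduitDataObject>"
        For Each truckName As String In ReadTruckNames(sample)
            Console.WriteLine(truckName)   ' prints 284165, then 284193
        Next
    End Sub
End Module
```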
 
ShaneO

Cerebrus said:
I don't know about the performance comparison between using String
methods and using the Regex engine. I haven't been able to find any
comparisons out there, so if you know of any, please let me know.
Like me, I'm sure you can Google to find any number of references to the
performance issues users face with using RegEx.

But as I wrote: "in this case", I feel using in-built String Functions
to achieve the desired result was better than loading an entire Class.
Then there's the time required for the system to construct the regular
expression due to parsing. Also, RegEx maintains explicit stacks for
backtracking which involves many more CPU instructions/cycles for every
processed character when compared to a simple function call stack.

On the point of Parsing - As the RegEx Class caches regular expressions
to try to improve speed there is some debate as to the potential for
memory leaks as it's not clear if/when the cache is purged.

So, in this case, I don't believe RegEx offers a better solution.
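
Since nobody in the thread has numbers, a rough timing harness is one way to settle the question for a given machine and data set. The sketch below is an assumption-laden illustration, not a rigorous benchmark: the module and function names, the sample string and the run count are all made up, and Stopwatch requires .NET 2.0 or later. No results are claimed here.

```vb
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module TruckNameTiming
    ' Counts TruckName= occurrences with plain String methods.
    Function CountWithIndexOf(ByVal text As String) As Integer
        Dim count As Integer = 0
        Dim pos As Integer = text.IndexOf("TruckName=", StringComparison.Ordinal)
        Do While pos <> -1
            count += 1
            pos = text.IndexOf("TruckName=", pos + 1, StringComparison.Ordinal)
        Loop
        Return count
    End Function

    ' Counts matches with an already constructed Regex.
    Function CountWithRegex(ByVal text As String, ByVal pattern As Regex) As Integer
        Return pattern.Matches(text).Count
    End Function

    Sub Main()
        Dim sample As String = "<Truck TruckName=""284165"" /><Truck TruckName=""284193"" />"
        Dim pattern As New Regex("TruckName=""\d{6}""")
        Const runs As Integer = 100000   ' arbitrary repeat count

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To runs
            CountWithIndexOf(sample)
        Next
        sw.Stop()
        Console.WriteLine("IndexOf: " & sw.ElapsedMilliseconds.ToString() & " ms")

        sw = Stopwatch.StartNew()
        For i As Integer = 1 To runs
            CountWithRegex(sample, pattern)
        Next
        sw.Stop()
        Console.WriteLine("Regex:   " & sw.ElapsedMilliseconds.ToString() & " ms")
    End Sub
End Module
```

Timings from a loop like this depend heavily on input size, runtime version and JIT warm-up, so they are best read as a rough indication only.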

ShaneO

 
ShaneO

Cerebrus said:
I don't know about the performance comparison between using String
methods and using the Regex engine. I haven't been able to find any
comparisons out there, so if you know of any, please let me know.
Another point I forgot to mention in my reply:

In your RegEx example, the returned string includes "TruckName=", which
would then require the use of additional String Functions (.Substring /
.Mid) to extract the required six digits.

I'm not in any way familiar with RegEx, but if the above is true (and I
haven't been able to find a way around it) then why not use String
Functions in the first place? Aren't you just using RegEx for the sake
of using RegEx in this case?

ShaneO

 
Cerebrus

Hi,

Thanks for your enlightening views on this topic.

OK, I agree with your first point. Since the data is rigid in this
case, it might be better to parse it using String functions, rather
than Regex. If, however, there were a chance of the data varying even
slightly, your program would break. Thus, your assumptions are valid in
this case, and I guess there would be a performance gain.

That example was posted by JDMils and not me. And to correct you, he
used a capturing group, by enclosing the \d{6} within parentheses.
Therefore the six digits would be queried by the Group / Capture
property of the Match object, and you wouldn't have to use additional
string functions like Substring etc. Also, he mentioned that he was
just posting a snippet off the top of his head, and therefore it would
not be fair to use it in a performance comparison test.
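
To make the capturing-group point concrete, here is a small sketch showing the difference between the whole match and the first group. The module name, the CapturedNames helper and the sample string are hypothetical, and List(Of String) needs VB 2005 or later.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module TruckNameRegex
    ' Hypothetical helper: returns only the digits captured by the
    ' parentheses in the pattern, one entry per match.
    Function CapturedNames(ByVal text As String) As List(Of String)
        Dim pattern As New Regex("TruckName=""(\d{6})""")
        Dim names As New List(Of String)
        For Each m As Match In pattern.Matches(text)
            ' m.Value would be the whole match, e.g. TruckName="284165";
            ' m.Groups(1).Value is just the captured six digits.
            names.Add(m.Groups(1).Value)
        Next
        Return names
    End Function

    Sub Main()
        Dim sample As String = "<Truck XVIN=""67112637"" TruckName=""284165"" />" & _
                               "<Truck XVIN=""67112638"" TruckName=""284193"" />"
        For Each truckName As String In CapturedNames(sample)
            Console.WriteLine(truckName)   ' prints 284165, then 284193
        Next
    End Sub
End Module
```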

Also, repeated runs of the same parsing on this data might be faster,
since the Regex object offers the "Compiled" option.
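
A sketch of what using that option might look like: the Regex is constructed once with RegexOptions.Compiled and then reused on every call. The module name, the CountTruckNames helper and the sample string are invented for illustration. Compiled patterns cost more to construct, so whether this is a net gain for a given workload is something to measure rather than assume.

```vb
Imports System
Imports System.Text.RegularExpressions

Module TruckNameCompiled
    ' Built once and shared by every call. RegexOptions.Compiled trades a
    ' slower construction for potentially faster matching afterwards.
    Private ReadOnly TruckNamePattern As New Regex( _
        "TruckName=""(\d{6})""", RegexOptions.Compiled)

    ' Hypothetical helper: counts the TruckName values in text.
    Function CountTruckNames(ByVal text As String) As Integer
        Return TruckNamePattern.Matches(text).Count
    End Function

    Sub Main()
        Dim sample As String = "<Truck TruckName=""284165"" /><Truck TruckName=""284193"" />"
        Console.WriteLine(CountTruckNames(sample))   ' prints 2
    End Sub
End Module
```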

Regards,

Cerebrus.
 
