Regular Expression Help!

Schorschi

Not having used regular expressions much, I need some help.

Given a string... "This\0Guy\0Needs\0Some\0Help\0\0\0\0\0"
Need result as array of strings... "This","Guy", "Needs", "Some",
"Help"

Where '\0' is a literal zero-byte.

I think I need two regular expressions? One to strip the multiple
instances of '\0' bytes, and another to split the string.

Regex.Split works fine for the split, but I am having trouble getting
a regular expression to find the multiple '\0' instances, but leave
the single '\0' references alone.

Thanks!
 
steve

bad advice.

| use the vb.net function split, not regex
| Frank
 
steve

this is a classic!

"((.)\2*)" ' any single character followed by the same character 0 or more times. (\2 is the inner group; \1 would refer to the outer group, which has not finished capturing yet.)

that's a very general approach...but just for your \0...

"((\0)\2*)" ' any \0 followed by 0 or more occurrences of \0

hth,

steve

btw, tons of regex tools out there for .net...google for ".net regex gui". i
use "the regulator" and "regex coach" quite a bit when testing regex
patterns.
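
For what it's worth, here is a minimal sketch of how a pattern like the \0 one above might be put to use. It assumes the goal is to collapse each run of null characters down to a single one and then split on it. The names CollapseNullRuns and CollapseNulls are made up for illustration, and "\0{2,}" is swapped in as a plainer way of saying "two or more nulls in a row".

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CollapseNullRuns
    ' Hypothetical helper: replaces every run of two or more null characters
    ' with a single null character, leaving lone nulls untouched.
    Function CollapseNulls(ByVal input As String) As String
        Return Regex.Replace(input, "\0{2,}", ChrW(0).ToString())
    End Function

    Sub Main()
        Dim nul As String = ChrW(0).ToString()
        Dim input As String = "This" & nul & "Guy" & nul & nul & nul & "Needs"
        Dim collapsed As String = CollapseNulls(input)
        ' Each word is now separated by exactly one null character.
        Dim words() As String = collapsed.Split(ChrW(0))
        For Each word As String In words
            Console.WriteLine(word)
        Next
    End Sub
End Module
```

Leading or trailing nulls would still leave an empty element after the split, so this only covers the "leave the single \0 alone" part of the question.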


 
Jens Christian Mikkelsen

Schorschi said:
Given a string... "This\0Guy\0Needs\0Some\0Help\0\0\0\0\0"
Need result as array of strings... "This","Guy", "Needs", "Some",
"Help"

This single regular expression + code will do the job:

' Requires at the top of the file: Imports System.Text.RegularExpressions
Dim oRegex As Regex
Dim sInput As String
Dim oCaptures As CaptureCollection
Dim asResult As String()

sInput = "This" & Chr(0) & "Guy" & Chr(0) & "Needs" & Chr(0) & _
    "Some" & Chr(0) & "Help" & Chr(0) & Chr(0) & Chr(0) & Chr(0) & Chr(0)

' Each run of non-null characters is recorded as one capture of "substring".
oRegex = New Regex("^\0*((?<substring>[^\0]+)\0*)*$")
oCaptures = oRegex.Match(sInput).Groups("substring").Captures
ReDim asResult(oCaptures.Count - 1)
For i As Integer = 0 To oCaptures.Count - 1
    asResult(i) = oCaptures(i).ToString
Next


/Jens
 
steve

| Isn't it precisely what Schorschi wants?

no, it's not precisely what he wants...this is:

| > | > Regex.Split works fine for the split, but I am having trouble getting
| > | > a regular expression to find the multiple '\0' instances, but leave
| > | > the single '\0' references alone.

you're making assumptions about the string he's processing. albeit, your
assumption is probably fairly accurate. however, not enough info is provided
in order to make a logically correct "leap". for instance, what if he knows
the string has exactly what should turn out to be three elements in his
final array? but these three elements could have the repetitive \0's any
number of times within the original string, at any location in the string.
were he to just use the split method, he'd have to loop through the array to
get all the elements and see which ones had something other than "". if he
can get his final array to constant dimensions, i.e. three, then he can
create constants by which he can refer to each element of the final
array...ex., NAME, DESCRIPTION, ID, etc. makes the code easier to follow and
maintain. i digress...as it is, you have to give him the ability to leave
only one instance of \0 where it is immediately followed by a continuous
repetition of \0...unless you want to give him something other than what he
"precisely" wants.
 
Jay B. Harlow [MVP - Outlook]

Schorschi,
In addition to the other comments:
| Regex.Split works fine for the split, but I am having trouble getting
| a regular expression to find the multiple '\0' instances, but leave
| the single '\0' references alone.
You can use "\0+" to match on one or more occurrences of a null char.


Remember there are three Split functions in .NET:

Use Microsoft.VisualBasic.Strings.Split if you need to split a string based
on a specific word (string). It is the Split function from VB6.

Use System.String.Split if you need to split a string based on a collection
of specific characters. Each individual character is its own delimiter.

Use System.Text.RegularExpressions.Regex.Split to split based
on matching patterns.

In your example I would consider using Regex.Split as it sounds like you do
not want duplicate empty elements.

Something like:

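' Requires at the top of the file: Imports System.Text.RegularExpressions
Const pattern As String = "\0+"
Dim rex As New Regex(pattern)
Dim input As String = "This\0Guy\0Needs\0Some\0Help\0\0\0\0\0"
' VB has no "\0" string escape, so swap the literal text \0 for a real null char.
input = input.Replace("\0", ChrW(0))

Dim words() As String = rex.Split(input)
For Each word As String In words
    Debug.WriteLine(word)
Next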

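The one caveat is that the trailing "\0" causes an empty element (which sounds
like it's understandable). I would use String.Trim(ChrW(0)) before the Split if
this is a problem. Note that Trim with no arguments only removes white space,
which does not include the null char.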

Note the above works for "This\0\0\0Guy\0Needs\0\0Some\0Help\0\0\0\0\0"
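
As a rough sketch of the trimming idea, assuming the input really does contain null characters: trimming them off both ends before calling Regex.Split avoids the empty elements. SplitOnNulls and SplitOnNullsNoRegex are hypothetical helper names, not anything from the framework. The second one assumes .NET 2.0 or later, where String.Split accepts StringSplitOptions.RemoveEmptyEntries.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitOnNullsDemo
    ' Hypothetical helper: trims null characters from both ends, then splits
    ' on runs of one or more nulls so no empty elements are produced.
    Function SplitOnNulls(ByVal input As String) As String()
        Dim trimmed As String = input.Trim(ChrW(0))
        If trimmed.Length = 0 Then
            ' Regex.Split on an empty string would return one empty element.
            Return New String() {}
        End If
        Return Regex.Split(trimmed, "\0+")
    End Function

    ' Alternative without a regex (assumes .NET 2.0 or later).
    Function SplitOnNullsNoRegex(ByVal input As String) As String()
        Return input.Split(New Char() {ChrW(0)}, StringSplitOptions.RemoveEmptyEntries)
    End Function

    Sub Main()
        Dim nul As String = ChrW(0).ToString()
        Dim input As String = "This" & nul & nul & "Guy" & nul & "Help" & nul & nul & nul
        For Each word As String In SplitOnNulls(input)
            Console.WriteLine(word)
        Next
    End Sub
End Module
```

Both helpers should return the same words for the same input; which one reads better is mostly a matter of taste.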

Hope this helps
Jay
 
Frank

Steve, you still have not told me why the vb.net split function is bad
advice. I'd like to know.
Thanks
Frank
 
Guest

why not just use replace

string.replace("\0","")
then use regex or normal split?

just my 2 cents

WStorey II
 
