Regex. Split or Split

lgbjr

Hi All,

I'm trying to split a string on every character. The string happens to be a
representation of a hex number. So, my regex expression is ([A-F,0-9]).
Seems simple, but for some reason, I'm not getting the results I expect.

Dim SA as string()
Dim S as string

S="FBE"
SA=RegularExpressions.Regex.Split(S,"([A-F,0-9])")

I expect that SA will contain 3 elements: SA(0)="F", SA(1)="B", SA(2)="E",
but, what I'm getting is 7 elements: SA(0)="", SA(1)="F", SA(2)="",
SA(3)="B", SA(4)="", SA(5)="E", SA(6)="".

If I change the expression to [A-F,0-9] (no parentheses), I get: SA(0)="",
SA(1)="", SA(2)="", SA(3)="".

Just for my own sanity, I've checked the pattern in Expresso and it returns
what I would expect.

I suppose I should mention I'm using VB.NET 2005 (just in case there's a
known issue with Regex in 2005).

TIA
Lee
 
Jay B. Harlow [MVP - Outlook]

Lee,
| I expect that SA will contain 3 elements: SA(0)="F", SA(1)="B", SA(2)="E",
| but, what I'm getting is 7 elements: SA(0)="", SA(1)="F", SA(2)="",
| SA(3)="B", SA(4)="", SA(5)="E", SA(6)="".
I would expect it to contain 4 elements, SA(0)="", SA(1)="", SA(2)="", and
SA(3)="", as your string only contains delimiters. Regex.Split returns the
strings between the delimiters, unless you use capturing groups (the
parentheses in your expression), in which case it returns both the strings
between the delimiters and the delimiters themselves.

The pattern "[A-F,0-9]" returns the 4 that I expect. The capturing group in
your expression is causing Regex to return the 4 strings between the
delimiters, plus the 3 delimiters, ergo 7 values.
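
To make the two counts concrete, here is a minimal sketch that runs both patterns from the question against "FBE". The module name SplitDemo is only for illustration, and it assumes .NET 2.0 or later (the version that ships with VB 2005).

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitDemo
    Sub Main()
        Dim input As String = "FBE"

        ' No capturing group: only the (empty) strings between the delimiters.
        Dim plain As String() = Regex.Split(input, "[A-F,0-9]")
        Console.WriteLine(plain.Length)                 ' 4

        ' Capturing group: the delimiters themselves are returned as well.
        Dim captured As String() = Regex.Split(input, "([A-F,0-9])")
        Console.WriteLine(captured.Length)              ' 7
        Console.WriteLine(String.Join("|", captured))   ' |F||B||E|
    End Sub
End Module
```

The patterns are kept exactly as posted so the output lines up with the counts discussed above.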


It sounds like you really want to return the list of matches, rather than
the stuff between the delimiters... Try Regex.Matches, something like:

' Requires: Imports System.Text.RegularExpressions
Dim input As String = "FBE"

Const pattern As String = "([A-F,0-9])"
Static parser As New Regex(pattern)

' Each match is one character of the input.
For Each match As Match In parser.Matches(input)
    Debug.WriteLine(match.Value)
Next
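
If the goal is the String() from the original question rather than debug output, the matches can be copied into an array. This is only a sketch under that assumption; the module name MatchesDemo is made up for the example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MatchesDemo
    Sub Main()
        Dim matches As MatchCollection = Regex.Matches("FBE", "[A-F0-9]")

        ' Copy each match into a String() of the same size.
        Dim SA(matches.Count - 1) As String
        For i As Integer = 0 To matches.Count - 1
            SA(i) = matches(i).Value
        Next

        Console.WriteLine(SA.Length)              ' 3
        Console.WriteLine(String.Join(",", SA))   ' F,B,E
    End Sub
End Module
```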

Hope this helps
Jay



 
lgbjr

Thanks Jay,

I'm an idiot!! An hour or so ago, I was working on a split (really a split),
and just continued with Split when I should have been using Matches. LOL! Of
course, Expresso was giving me the result I expected, because it wasn't
trying to split the string!

Thanks for pointing out what should have been an obvious mistake.

Lee
 
Larry Lard

By the way, the comma in your regex pattern is not part of the syntax
of [] (i.e. what you actually mean is [A-F0-9]; at the moment you would
match "," as a hex digit).
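
A quick way to check this is to test a lone comma against both character classes. The sketch below is illustrative only (CommaDemo is a made-up module name), and the ^ and $ anchors are added so the whole input has to match.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CommaDemo
    Sub Main()
        ' Inside [], a comma is a literal character, not a separator.
        Console.WriteLine(Regex.IsMatch(",", "^[A-F,0-9]$"))   ' True
        Console.WriteLine(Regex.IsMatch(",", "^[A-F0-9]$"))    ' False

        ' A real hex digit still matches the corrected class.
        Console.WriteLine(Regex.IsMatch("B", "^[A-F0-9]$"))    ' True
    End Sub
End Module
```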
 
Guest

Why can't you use the .ToCharArray method of Strings?

 
lgbjr

Dennis,

Previously, that is exactly what I was doing:

Dim SA as Array
Dim S as String
S="FBE"
SA=S.ToCharArray

This works fine, but I'm trying to use Option Strict now, which means I
can't use

Dim SA as Array

I have to use

Dim SA as String()

And a 1-dimensional array of Char cannot be converted to a 1-dimensional
array of String. So, I decided to use a Regex Match to convert the string to
a string array.

LOL! As I was typing this, I just realized I can do Dim SA as Char(), then
use .ToCharArray!!
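
For anyone following along, that idea might look roughly like the sketch below under Option Strict On. The module name CharArrayDemo and the extra String() conversion at the end are additions for illustration, not something from the thread.

```vbnet
Option Strict On
Imports System

Module CharArrayDemo
    Sub Main()
        Dim S As String = "FBE"

        ' ToCharArray returns Char(), so this declaration compiles under Option Strict.
        Dim SA As Char() = S.ToCharArray()
        Console.WriteLine(SA.Length)   ' 3

        ' If a String() is really needed, convert each Char explicitly.
        Dim parts(SA.Length - 1) As String
        For i As Integer = 0 To SA.Length - 1
            parts(i) = SA(i).ToString()
        Next
        Console.WriteLine(String.Join(",", parts))   ' F,B,E
    End Sub
End Module
```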

Thanks!!

Lee

 
