Regular expression: Grouping decimal values and double quotes

  • Thread starter Ahmad A. Rahman

Ahmad A. Rahman

Hi all,

I have a problem constructing a regular expression in .NET.

I have a comma-separated string and I want to capture each value as its own
group, but I can't get a number with a decimal part to stay in one group.

Example string : 1, 2.3, "two"," three"

So, I want to split this string into 4 groups: (1), (2.3), (two) and (three).

The best regular expression that I have so far is:
(?:^|\s*\,\s*)((?:"(?<SubString>(?:""|[^"])*)")+)|((?<SubString>(\d))+)

But this regex will return (1), (2), (3), (two) and (three).

So, what is the right regular expression to do this? Please help.

Thanks.
 
Ahmad A. Rahman

No. Consider this string:
string s = "1, 2.3, \"ab,\"\"c\", \"e.ff,;$\"";

I want to split it into (1), (2.3), (ab,""c) and (e.ff,;$).

Got my point? Please help.
 
Sherif ElMetainy

Hello

Try this expression:
(?:^|\s*\,\s*)(?:(?:"(?<SubString>(?:""|[^"])*)"\s*)|(?:\s*(?<SubString>(?:\s*[^\s,]+)*)\s*))

Best regards
Sherif
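
For readers who want to try the idea outside .NET, here is a minimal Python
sketch of the expression in this post. It is a port, not the original: Python's
re module does not accept two groups with the same name, so the two SubString
groups are renamed "quoted" and "bare" (both names are made up here), and the
helper split_fields is hypothetical.

```python
import re

# Python port of the expression above. The two .NET SubString groups become
# "quoted" and "bare" because Python's re rejects duplicate group names.
FIELD = re.compile(
    r'(?:^|\s*,\s*)'                      # start of input, or a comma separator
    r'(?:"(?P<quoted>(?:""|[^"])*)"\s*'   # a quoted field; "" is allowed inside
    r'|\s*(?P<bare>(?:\s*[^\s,]+)*)\s*)'  # or an unquoted run without commas
)

def split_fields(text):
    """Return the captured value of each field, in order of appearance."""
    fields = []
    for m in FIELD.finditer(text):
        quoted = m.group("quoted")
        # Exactly one of the two groups takes part in each match.
        fields.append(quoted if quoted is not None else m.group("bare"))
    return fields

print(split_fields('1, 2.3, "ab,""c", "e.ff,;$"'))
# ['1', '2.3', 'ab,""c', 'e.ff,;$']
```

The quoted values come back exactly as written, so a doubled quote is still
doubled in the result.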
 
Ahmad A. Rahman

Thanks a lot, Metainy!

It works, but that regex also matches invalid decimal values. For example, it
still matches a value such as 1.23aa. The same thing happens with quoted
values: I want each one to start and end with a double quote, with nothing
after the closing quote except a comma or the end of the string. Got my point?

Just to make it clear:
- 1.23, "abc"qwe" = valid
- 1.23x, "abc"qwe" = invalid
- 1.23, "abc"qwe"xx = invalid

One more thing: is there any good (but free) regex tutorial, as an ebook or
website?

Still hoping for assistance here.

Thank you.
 
Sherif ElMetainy

Hello Ahmad

Try this one:
^(?:(?:(?:^|,)\s*)(?:(?:"(?<SubString>(""|[^"])*)")|(?<SubString>\d+(?:\.\d+)?))(?=\s*(?:$|,))\s*)+$

Best regards,
Sherif
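
A rough Python sketch of the same approach follows: accept the line only if
every field is a plain decimal or a quoted string, then pull the values out. It
is an adaptation, not a literal port. In .NET every repetition of SubString is
available through the group's Captures collection, while Python keeps only the
last repetition, so this sketch validates with one pattern and extracts with a
second one. The names FIELD_SRC, LINE, TOKEN and parse_line are made up for
illustration.

```python
import re

# One field: a quoted string ("" stands for a quote inside) or a plain decimal.
FIELD_SRC = r'(?:"(?:""|[^"])*"|\d+(?:\.\d+)?)'

# The whole line: fields separated by commas, optional whitespace around them.
LINE = re.compile(r'\s*' + FIELD_SRC + r'\s*(?:,\s*' + FIELD_SRC + r'\s*)*')

# Same field shape, with capturing groups for the extraction pass.
TOKEN = re.compile(r'"((?:""|[^"])*)"|(\d+(?:\.\d+)?)')

def parse_line(text):
    """Return the list of field values, or None if the line is invalid."""
    if LINE.fullmatch(text) is None:
        return None
    fields = []
    for m in TOKEN.finditer(text):
        quoted, number = m.group(1), m.group(2)
        fields.append(quoted if quoted is not None else number)
    return fields

print(parse_line('1, 2.3, "two"," three"'))  # ['1', '2.3', 'two', ' three']
print(parse_line('1.23x, "abc"'))            # None
print(parse_line('1.23, "abc"xx'))           # None
```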
 
Ahmad A. Rahman

Hi ElMetainy,

That one does work, but I also need to allow a double-quote character between
the enclosing double quotes.
Like in my previous post:

1.23, "abc"qwe" = valid (and by using MatchCollection on <SubString>, this
will return [1.23] and [abc"qwe])
1.23, "abc"qwe"xx = invalid
1.23xx, "abc"qwe" = invalid

Can you help me...just a little bit more? :) You almost got it right.

P.S. Sorry, I haven't had time to learn regex yet, but I really need a quick
solution right now.
 
Sherif ElMetainy

Hello

This can get too complicated.

How should I treat the double quote and the comma? A ',' between double quotes
is considered part of the string, and a double quote between double quotes is
also considered part of the string.
Imagine this:

1.23,"aa,ddd",123 should match [1.23], [aa,ddd] and [123]

1.23,"abc"qwe,"ee",124: should it match [1.23], [abc"qwe,"ee] and [124], or be
considered invalid?
To make this decision you have to understand the nature of the data (for
example, being able to distinguish a contact's first name from his
nickname), which is not possible with regular expressions.

This is why it is difficult to match a single double quote between two double
quotes.

Here is where "" resolves the ambiguity:
1.23,"abc""qwe,""ee",124 should match [1.23], [abc"qwe,"ee] and [124],
meaning that two consecutive double quotes within double quotes are treated
as one double quote that is part of the string. The "" escape is standard in
formats like CSV.


Best regards,
Sherif
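
To see the doubled-quote convention described in this post in action, here is a
small Python sketch that uses the standard csv module instead of a hand-written
regex. The helper name parse_csv_line is made up, and passing
skipinitialspace=True is an assumption that spaces after a comma should be
ignored.

```python
import csv

def parse_csv_line(line):
    """Parse one CSV line; "" inside a quoted field yields one literal quote."""
    # csv.reader accepts any iterable of lines, so a one-item list is enough.
    reader = csv.reader([line], skipinitialspace=True)
    return next(reader)

print(parse_csv_line('1.23,"abc""qwe,""ee",124'))
# ['1.23', 'abc"qwe,"ee', '124']
```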


 
Ahmad A. Rahman

Hi,

I know it is complicated; that's why I'm here.

But I think I have a way out now. I can use MatchCollection to break the
string apart at the commas, and use another regex to validate each piece. :)

Anyway, you've been a great help ElMetainy. Thanks a lot.

Bye.
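
The split-then-validate plan in this post could look roughly like the Python
sketch below. It assumes, loudly, that a quoted value never contains a comma,
because the split happens on every comma. The names NUMBER, QUOTED and
parse_by_splitting are hypothetical, and the validation rules are taken from
the valid/invalid examples earlier in the thread.

```python
import re

NUMBER = re.compile(r'\d+(?:\.\d+)?')  # 1 or 1.23, but not 1.23x
QUOTED = re.compile(r'"(.*)"')         # must start and end with a double quote

def parse_by_splitting(text):
    """Split on every comma, then validate each piece on its own.

    Assumption for this sketch: quoted values never contain a comma.
    Returns the list of values, or None if any piece is invalid.
    """
    values = []
    for piece in text.split(","):
        piece = piece.strip()
        if NUMBER.fullmatch(piece):
            values.append(piece)
            continue
        m = QUOTED.fullmatch(piece)
        if m is None:
            return None
        values.append(m.group(1))  # text between the outer quotes
    return values

print(parse_by_splitting('1.23, "abc"qwe"'))    # ['1.23', 'abc"qwe']
print(parse_by_splitting('1.23x, "abc"qwe"'))   # None
print(parse_by_splitting('1.23, "abc"qwe"xx'))  # None
```

Because the outer quotes are checked per piece, a single double quote inside a
value is accepted here; the price is that commas inside quotes are not.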
