Regex help

Bob Dankert

I am trying to create a regular expression to split a string at each
newline, unless the newline is within a set of quotes. Unfortunately, I
have little experience with Regular Expressions and have been having a hard
time creating a pattern to match only newlines not within quotes. Any help
would be appreciated with this!

Thanks,

Bob Dankert
 
Mike Gage

Bob,

There are several options for attacking this. You can probably write an
efficient regex by taking advantage of particular requirements (possibly
involving whitespace, where quotes are allowed, or ...). One approach
would be to search for \", then check to see if the next 2 characters
are \n\" and otherwise continue to \n. Another, probably the most
straightforward approach, is to capture everything up to \n, and, once
\n is found, peek behind from the prior character to see if the sequence
is \"\n\", and otherwise capture the string.

There are a lot of things to be careful about in building your regex.
Not only are there subtle ways to get results that you don't want, but
there are subtle ways to cause the engine to include scans that you
don't need. Even though it seems like a lot of effort, I recommend that
you read a book like Mastering Regular Expressions (from O'Reilly). I
think if you read through that book you will have a clear understanding
of where to go with this.
 
Bob Dankert

Mike,

I have read a few of the books (I have not fully read Mastering Regular
Expressions yet, but I did get it over the weekend). I think this problem
is very similar to the whole CSV parsing issue, which I remember was quite a
difficult problem back in the Perl days. If anyone could give me a specific
working example of how I would create a regular expression for something
like this, I would greatly appreciate it, as I am still quite confused how
to handle this after my readings thus far. (O'Reilly pocket reference and
Sams Regular Expressions).

Thanks,

Bob Dankert
 
Jeffrey Tan[MSFT]

Hi Bob,

Thanks for your feedback.

Where is your key obstacle? Can you describe your problem in more detail? I
think Mike has given you two suggestions for using regular expressions to
handle your problem.

In .NET, we can use the classes in the System.Text.RegularExpressions namespace
to work with regular expressions. These classes are easy to use. I think you
only need to write the correct regular expression syntax, and then you can get
what you want.

Please feel free to follow up. Thanks

Best regards,
Jeffrey Tan
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
 
Bob Dankert

I am just trying to create a Regex to split a string at all newlines, unless
the newline is within a set of quotes (ignoring all escaped quotes). Take
the following:

This is a string\nSplit here\nBut "DO \"not split\nhere because\" it is in
the quotes"

would result in:

1. This is a string
2. Split here
3. But "DO \"not split\nhere because\" it is in the quotes"

I am having a hard time being able to determine whether the newline is
within a set of quotes or not, while ignoring the escaped quotes.
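
For reference, one way to express this requirement as a pattern is to match each record instead of splitting on the newline. What follows is only a minimal Python sketch under stated assumptions, not something posted in this thread: the name split_records is hypothetical, a quoted section is assumed to run from an unescaped " to the next unescaped ", and a backslash is assumed to escape the character after it. Empty lines are dropped by this approach, and an unbalanced quote is not handled gracefully.

```python
import re

# One record is a run of any of these, in order of preference:
#   \\[^\n]                  an escaped character outside quotes (never a newline)
#   "(?:\\[\s\S]|[^"\\])*"   a complete quoted section; newlines and escaped
#                            quotes are allowed inside it
#   [^"\n]                   any other character except a quote or a newline
RECORD = re.compile(r'(?:\\[^\n]|"(?:\\[\s\S]|[^"\\])*"|[^"\n])+')


def split_records(text):
    """Return the pieces of text separated by newlines that are outside quotes."""
    return RECORD.findall(text)


if __name__ == "__main__":
    sample = ('This is a string\nSplit here\n'
              'But "DO \\"not split\nhere because\\" it is in the quotes"')
    for number, record in enumerate(split_records(sample), start=1):
        print(number, repr(record))
```

On the sample string above this yields three records, with the newline inside the quotes left in place in the third one.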

Thanks for any help,

Bob Dankert
 
Yan-Hong Huang[MSFT]

Hi Bob,

I will notify Jeffrey, and we will follow up here as soon as possible.
Thanks.

Best regards,
Yanhong Huang
Microsoft Community Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.
 
Jeffrey Tan[MSFT]

Hi Bob,

Thanks for your feedback.

Oh, yes. Because your quotes do not sit directly next to the newline, I think
Mike's suggestion does not meet your need.

For your issue, I think this is not a regex problem; what you need is a good
algorithm to parse your string.
After some thinking, I have found an algorithm for you. It may not be the
best one, but it should work for you:

First, split the input string into a string array on "\n".

Then, loop through the array and count the quotes in each item (do not count
escaped quotes such as \"). The key point is to determine whether each item's
quote count is odd or even.

Finally, concatenate the array items according to the rules below:
1. If an item has an even number of quotes, leave it as it is and move on.
2. If it has an odd number of quotes, mark it as the start of a concatenation
and find the next item that has an odd number of quotes. Then concatenate
those two items and everything between them, putting "\n" back between the
pieces.
3. Repeat the two steps above for the rest of the array.
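
The steps above can be sketched roughly as follows. This is a minimal Python illustration, not code from the thread. The function names are hypothetical, and it assumes that a backslash escapes the character after it, so that \" is not counted as a quote. An unterminated quote is assumed to swallow the rest of the input as one record.

```python
import re


def count_unescaped_quotes(piece):
    """Count the double quotes in piece that are not preceded by an escaping backslash."""
    # Drop every backslash-escaped character first, then count what is left.
    return re.sub(r'\\.', '', piece).count('"')


def split_outside_quotes(text):
    """Split text on newlines, then rejoin the pieces that fall inside quotes."""
    records = []
    pending = None  # pieces collected while a quote is still open
    for piece in text.split("\n"):
        odd = count_unescaped_quotes(piece) % 2 == 1
        if pending is None:
            if odd:
                pending = [piece]       # a quote opens here and is not closed
            else:
                records.append(piece)   # balanced quotes: a record on its own
        else:
            pending.append(piece)
            if odd:                     # the open quote is closed in this piece
                records.append("\n".join(pending))
                pending = None
    if pending is not None:
        # Assumption: an unterminated quote keeps the remainder as one record.
        records.append("\n".join(pending))
    return records


if __name__ == "__main__":
    sample = ('This is a string\nSplit here\n'
              'But "DO \\"not split\nhere because\\" it is in the quotes"')
    for number, record in enumerate(split_outside_quotes(sample), start=1):
        print(number, repr(record))
```

Tracking only whether each count is odd or even is enough here, because an odd count always flips the state between "inside quotes" and "outside quotes", while an even count leaves it unchanged. Unlike a plain match-based approach, this keeps empty lines as empty records.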

I hope this explains the algorithm clearly.

=========================================
Please apply my suggestion above and let me know if it helps resolve your
problem.

Thank you for your patience and cooperation. If you have any questions or
concerns, please feel free to post them in the group. I am standing by to be
of assistance.

Best regards,
Jeffrey Tan
 
Bob Dankert

Thanks a lot Jeffrey!

This worked to resolve the problem. I guess without much regular expression
experience comes the problem that I do not necessarily know when it is time
to use regular expressions and when it is not time. I appreciate the help
from everyone!

Thanks,

Bob Dankert
 
Jeffrey Tan[MSFT]

Hi Bob,

Thanks very much for your feedback.

I am glad I could help. You are welcome!

If you need further help, please feel free to post. Thanks

Best regards,
Jeffrey Tan
 
