Also, I prepended a comment alternative to your pattern so that comments are tested first:
@"(/\*.*?\*/|//.*?(?=\r|\n))|(([""']).+\<2>)"
With the comment part prefixed, comments are picked up, but your
string-literal part is completely ignored. For example:
Nothing is matched (should have gotten the "C"):
String str = "extern \"C\"\r\n";
The whole line is correctly matched for a comment:
String str = "//extern \"C\"\r\n";
Strangely enough, the old pattern did work in this respect:
@"(/\*.*?\*/|//.*?(?=\r|\n))|(@?""""|@?"".*?(?!\\).""|''|'.*?(?!\\).')"
Unfortunately it fails to correctly end literal strings ending with a
back-slash (unlike yours, which does work).
Thanks
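A possible explanation, offered as a guess rather than a confirmed diagnosis: wrapping the comment alternative in its own parentheses shifts the group numbers. In the combined pattern, group 1 is the comment, group 2 is the whole string alternative, and the quote character becomes group 3, so `\<2>` refers to a group that is still open and cannot match. A minimal sketch, assuming the backreference only needs renumbering. The class and method names are made up for illustration, and the documented `\3` form is used in place of `\<3>`:

```csharp
using System;
using System.Text.RegularExpressions;

static class CombinedPatternSketch
{
    // Group 1: comment; group 2: whole string literal; group 3: the opening quote.
    // The backreference therefore has to point at group 3, not group 2.
    public static readonly Regex Combined = new Regex(
        @"(/\*.*?\*/|//.*?(?=\r|\n))|(([""']).+\3)");

    // Returns the text of the first match, or null when nothing matches.
    public static string FirstMatch(string input)
    {
        Match m = Combined.Match(input);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        Console.WriteLine(FirstMatch("extern \"C\"\r\n"));   // "C"
        Console.WriteLine(FirstMatch("//extern \"C\"\r\n")); // //extern "C"
    }
}
```

Naming the quote group, e.g. `(?<q>[""'])` with `\k<q>`, would avoid the renumbering problem whenever more groups are added in front.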
Bob said:
Your Regex works very well, Ken, thanks. Can you explain what exactly the
<2> does? It looks like a grouping construct, but it isn't in the format
of (?<group>.*?). I couldn't find any reference to this at
http://msdn.microsoft.com/library/en-us/cpgenref/html/cpconregularexpressionslanguageelements.asp.
Thanks again.
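For readers with the same question: in this pattern `\<2>` behaves as a backreference to capture group 2, meaning it must match the same text that group 2 captured (here, the opening quote character). The forms documented for .NET are `\2` and `\k<name>`. The sketch below uses those two forms, with hypothetical class and group names, to show that the closing quote has to be the same character as the opening one:

```csharp
using System;
using System.Text.RegularExpressions;

static class BackreferenceSketch
{
    // Numbered form: \2 repeats whatever group 2 (the quote character) captured.
    public static readonly Regex Numbered = new Regex(@"(([""']).+\2)");

    // Named form of the same idea: \k<q> repeats whatever group "q" captured.
    public static readonly Regex Named = new Regex(@"(?<q>[""']).+\k<q>");

    static void Main()
    {
        Console.WriteLine(Numbered.IsMatch("say \"hi\""));  // True: opens and closes with "
        Console.WriteLine(Numbered.IsMatch("say \"hi'"));   // False: the two quotes differ
        Console.WriteLine(Named.Match("it is 'ok'").Value); // 'ok'
    }
}
```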
Ken Arway said:
Bob wrote:
I need to create a Regex to extract all strings (including quotations)
from a C# or C++ source file. After being unsuccessful myself, I found
this sample on the internet:
@"@?""""|@?"".*?(?!\\).""|''|'.*?(?!\\).'"
I am inputting the entire source file string and using it with
RegexOptions.Singleline. This works OK, unless the string ends
with a back-slash. For example: "This is a test\\". Can anybody see
how to fix this sample so that back-slashes are considered?
Without examples of desired behaviour, here's what I came up with, using
backreferences:
Regex regex = new Regex(@"(([""']).+\<2>)", RegexOptions.None);
Sample input:
"This is a test\\"
This is also a test
Here's another "test"
'Now for another\\'
Using 'single quotes'
// Here 's a comment.
// And a "quoted" one.
Sample output:
Matching: "This is a test\\"
1 =»"This is a test\\"«=
2 =»"«=
Matching: This is also a test
No Match
Matching: Here's another "test"
1 =»"test"«=
2 =»"«=
Matching: 'Now for another\\'
1 =»'Now for another\\'«=
2 =»'«=
Matching: Using 'single quotes'
1 =»'single quotes'«=
2 =»'«=
Matching: // Here 's a comment.
No Match
Matching: // And a "quoted" one.
1 =»"quoted"«=
2 =»"«=
You'd want group 1 (the whole quoted string); group 2 is just the quote character.
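A short sketch of pulling out group 1 line by line, as in the sample run above. It assumes the input is tested one line at a time with no regex options, uses the documented `\2` backreference form, and the class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

static class ExtractStringsSketch
{
    static readonly Regex Quoted = new Regex(@"(([""']).+\2)", RegexOptions.None);

    // Returns group 1 (the quoted text including its quotes), or null if the line has none.
    public static string QuotedPart(string line)
    {
        Match m = Quoted.Match(line);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string[] lines =
        {
            @"""This is a test\\""",
            "This is also a test",
            @"Here's another ""test""",
            @"// Here 's a comment.",
        };
        foreach (string line in lines)
            Console.WriteLine("{0} => {1}", line, QuotedPart(line) ?? "No Match");
    }
}
```

Because `.+` is greedy, a line holding two separate literals, such as `"a" + "b"`, comes back as one match running from the first quote to the last. The pattern also does not skip quoted text inside comments, as the last line of the sample output shows.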