Problem with Regular Expressions in .NET

Gianluca

Hi,

I'm using regular expressions to extract some information from my
VB.NET source code files.

I have something like this:

1: '<class name="xyz" description="xxxxxx"/>
2: Class xyz
... other lines of code ...
y: End Class

I want to use a regular expression to extract the text of the file
from line 1 to line y. To do this I use the following regular
expression pattern:

'\s*<class.*End\s+Class

But it doesn't work: it returns zero matches. I tried setting the
IgnoreCase flag and the Multiline flag, but neither made a difference.
Note that if I use the following pattern:

'\s*<class.*/>

to extract only line 1 it works.
So it seems that the .* (which should match all characters between
'<class and End Class) doesn't work across multiple lines.

What can I do?

Thanks in advance for the help.

Bye

Gianluca
 
Niki Estner

1. You need the Singleline flag, or the '.' character won't match newlines.
2. Use something like "\bclass\b.*\bEnd\s+Class\b"; the "\b" ensures a word
boundary.
3. This won't work with nested classes, or with more than one class per file.
4. This won't recognize comments or strings: if your code contains a string
literal "end class" or a comment ' end class, the regex will think this is
the end of the class.
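
To illustrate point 1, here is a minimal sketch that runs the pattern from the question with and without RegexOptions.Singleline. The sample input is made up to resemble the snippet in the question, and the module name and the FindClassBlock helper are only illustrative. Note that RegexOptions.Multiline only changes what ^ and $ match, which is why setting it has no effect on '.'.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SinglelineDemo
    ' Pattern from the question, unchanged.
    Const Pattern As String = "'\s*<class.*End\s+Class"

    ' Hypothetical helper: returns the matched text, or Nothing if there is no match.
    Function FindClassBlock(ByVal source As String, ByVal options As RegexOptions) As String
        Dim m As Match = Regex.Match(source, Pattern, options)
        If m.Success Then Return m.Value
        Return Nothing
    End Function

    Sub Main()
        ' Made-up sample input modelled on the snippet in the question.
        Dim source As String = _
            "'<class name=""xyz"" description=""xxxxxx""/>" & vbLf & _
            "Class xyz" & vbLf & _
            "    ' ... other lines of code ..." & vbLf & _
            "End Class" & vbLf

        ' Without Singleline, "." stops at every newline, so there is no match.
        Console.WriteLine(FindClassBlock(source, RegexOptions.IgnoreCase) Is Nothing)

        ' With Singleline, "." also matches newlines and the whole block is returned.
        Console.WriteLine(FindClassBlock(source, RegexOptions.IgnoreCase Or RegexOptions.Singleline))
    End Sub
End Module
```

With several classes in one file, the greedy .* runs on to the last End Class. A lazy .*? stops at the first one instead, although that still does not cope with nested classes.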

Correct handling of these cases is possible, but quite complex. If you need
this, parsing the file line by line might be easier.
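
A rough sketch of what such a line-by-line scan could look like, assuming VB 2005 or later. Everything here (the module, the ExtractClassBlocks helper, the sample input) is hypothetical and deliberately naive: it strips string literals and ' comments before looking for Class / End Class and counts nesting depth, but it ignores REM comments, line continuations and other corner cases.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LineScanSketch
    ' Hypothetical helper: returns each top-level class block, including a
    ' '<class .../> comment line when one directly precedes the class.
    Function ExtractClassBlocks(ByVal source As String) As List(Of String)
        Dim blocks As New List(Of String)()
        Dim current As New StringBuilder()
        Dim pendingTag As String = Nothing
        Dim depth As Integer = 0

        For Each rawLine As String In source.Split(ControlChars.Lf)
            Dim line As String = rawLine.TrimEnd(ControlChars.Cr)

            ' Drop string literals, then cut off any trailing comment, so that
            ' "end class" inside either one is ignored by the checks below.
            Dim code As String = Regex.Replace(line, """[^""]*""", "")
            Dim commentAt As Integer = code.IndexOf("'"c)
            If commentAt >= 0 Then code = code.Substring(0, commentAt)

            Dim isEnd As Boolean = Regex.IsMatch(code, "^\s*End\s+Class\b", RegexOptions.IgnoreCase)
            Dim isStart As Boolean = Not isEnd AndAlso _
                Regex.IsMatch(code, "^\s*(\w+\s+)*Class\s+\w+", RegexOptions.IgnoreCase)

            If depth = 0 AndAlso Not isStart Then
                ' Remember a tag comment only while it is the most recent line.
                If Regex.IsMatch(line, "^\s*'\s*<class\b", RegexOptions.IgnoreCase) Then
                    pendingTag = line
                Else
                    pendingTag = Nothing
                End If
            End If

            If isStart Then
                If depth = 0 AndAlso pendingTag IsNot Nothing Then
                    current.Append(pendingTag).Append(ControlChars.Lf)
                End If
                pendingTag = Nothing
                depth += 1
            End If

            If depth > 0 Then current.Append(line).Append(ControlChars.Lf)

            If isEnd AndAlso depth > 0 Then
                depth -= 1
                If depth = 0 Then
                    ' The outermost class just closed: emit the collected block.
                    blocks.Add(current.ToString())
                    current.Length = 0
                End If
            End If
        Next
        Return blocks
    End Function

    Sub Main()
        ' Made-up input with a nested class, a misleading string and comment,
        ' and a second class in the same file.
        Dim source As String = _
            "'<class name=""xyz"" description=""xxxxxx""/>" & vbLf & _
            "Class xyz" & vbLf & _
            "    Dim s As String = ""end class""" & vbLf & _
            "    ' end class" & vbLf & _
            "    Class Inner" & vbLf & _
            "    End Class" & vbLf & _
            "End Class" & vbLf & _
            "Class Second" & vbLf & _
            "End Class" & vbLf

        For Each block As String In ExtractClassBlocks(source)
            Console.WriteLine("---")
            Console.Write(block)
        Next
    End Sub
End Module
```

For the sample input this should produce two blocks: the first runs from the '<class .../> line to the outer End Class (with the nested class inside it), the second is Class Second on its own.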

Niki
 
Cor Ligthert

Niki,

Ten minutes ago I wrote in the general group that I had not seen any messages
from you for some days. A mistake, I see.

Cor
 
Gianluca

You are right, with the Singleline flag set it works.
I thought this flag was on by default, which is why I
didn't try it before.

Thanks for your suggestions.

Bye

Gianluca
 
