some help with regex

  • Thread starter: ilPostino

ilPostino

hi,

I'm trying to write my own CodeDOM because I don't like the MS one ;-)

I want to look for this expression, ( using someword; ), but I can only
figure out basic things like searching for words that start with a-zA-Z etc.,
based on my regex books and online documentation.

I will also need to look for things like:

public class somename (....); this kind of line might be split over two lines,
so I guess it's some expression like

anyword CLASS anyname (some params);

can anyone help?

thanks
-c

(you can msg me via www.typemismatch.com)
 
"\b(public\s*|private\s*|internal\s*|)class\s+(\w)+\b([^{])*{" worked in
Expresso.

Just to make sure: you do know that you won't be able to parse C# using
regular expressions? It's a recursive language. MS regexes can do a little
nested-paren matching, but for C# you will almost certainly need a
full-blown parser.

Niki
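
For anyone who wants to try the idea outside Expresso, here is a minimal sketch in Python (not C#, purely for illustration). The class pattern is a lightly adapted version of the one above, not the exact original: it requires whitespace after the modifier and captures the whole class name. The names USING_RE, CLASS_RE, SAMPLE, find_usings and find_classes are made up for this example. It does not understand comments or string literals, so it will also match text inside them.

```python
import re

# A using directive such as "using System.Text;" at the start of a line.
# Aliases ("using X = Y;") and "using static ..." are deliberately not handled.
USING_RE = re.compile(r"^\s*using\s+([A-Za-z_][\w.]*)\s*;", re.MULTILINE)

# Optional access modifier, the keyword "class", the class name, then
# everything up to the opening brace (which may be on a later line).
CLASS_RE = re.compile(
    r"\b(?:(public|private|internal)\s+)?class\s+(\w+)\b[^{]*\{"
)

# Hypothetical input used only to exercise the two patterns.
SAMPLE = """\
using System;
using System.Text;

public class Greeter
    : IDisposable
{
    class Helper { }
}
"""


def find_usings(source):
    """Return the namespace named by each using directive."""
    return USING_RE.findall(source)


def find_classes(source):
    """Return (modifier, name) pairs; modifier is None when absent."""
    return [(m.group(1), m.group(2)) for m in CLASS_RE.finditer(source)]


if __name__ == "__main__":
    print(find_usings(SAMPLE))
    print(find_classes(SAMPLE))
```

Note that both Greeter and the nested Helper are reported, but nothing in the result says that Helper sits inside Greeter. That is the "flat" view a regex alone gives you.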
 
Thanks, can you recommend any parsers then?

I just need to parse down to the method level inside a class, or the code
inside an event etc. I never need to actually parse the code itself, just
the definitions.

That regex makes sense, thanks.

-c



 
I thought maybe MS's CodeDOM could parse it, but they didn't provide a
parser ... :(

-c

 
The problem is that classes/namespaces/structs can be nested: a class can
contain another class, which can contain another class, to any level.
Regexes aren't good tools for that kind of input. If you only want a
"flat" layout of a file, without nesting information, and you don't care
whether the file is syntactically correct, then regexes are probably fine.
Otherwise you'll have to build a parser.

I must admit I never did that in C#, but Yacc or ANTLR used to be good tools
in the C++ days; maybe you can find C# versions of these.

Niki
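
To make the nesting point concrete, here is a rough sketch (in Python, for illustration only) of a middle ground between a bare regex and a real parser: a regex finds the declarations, and a brace counter records how deep each one sits. TOKEN_RE, outline and SAMPLE are names invented for this example, and the approach assumes well-formed input with no braces or keywords inside comments or string literals. A generated parser (Yacc, ANTLR) would be needed to handle those properly.

```python
import re

# One pattern with three alternatives, tried in order at each position:
#   1. a namespace/class/struct declaration up to and including its "{"
#   2. any other opening brace (method bodies, blocks, ...)
#   3. a closing brace
TOKEN_RE = re.compile(
    r"\b(?P<kind>class|struct|namespace)\s+(?P<name>[\w.]+)[^{;]*\{"
    r"|(?P<open>\{)"
    r"|(?P<close>\})"
)

# Hypothetical input with three levels of nesting.
SAMPLE = """\
namespace Demo
{
    public class Outer
    {
        void Run() { if (true) { } }

        private class Inner
        {
            struct Point { }
        }
    }
}
"""


def outline(source):
    """Return (brace_depth, kind, name) for each declaration, in source order."""
    depth = 0
    found = []
    for m in TOKEN_RE.finditer(source):
        if m.group("kind"):
            found.append((depth, m.group("kind"), m.group("name")))
            depth += 1  # the declaration's own opening brace
        elif m.group("open"):
            depth += 1
        else:
            depth -= 1
    return found


if __name__ == "__main__":
    for depth, kind, name in outline(SAMPLE):
        print("    " * depth + kind + " " + name)
```

The depth counter is the piece a single regular expression cannot supply for arbitrary nesting levels, which is why the regex is only used here to recognise individual tokens.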
 