Regex to recognize math/string functions

Tim Conner

Hi,

Thanks to Peter, Chris and Steven, who answered my previous question about
using a regex to split a string. It turned out to be as easy as creating a
regex with the pattern "/*-+()," and most of my string was split.
I am fascinated by how powerful this Regex class is, so I wonder if it
could go a step further.

My question: can a regex be used to validate a set of different functions?
Example: suppose I have to verify the correctness of an input string, which
may contain one or more of the following functions:

Round ( NumericValue, Decimals)
Lower( StringValue )
Upper( StringValue )
Abs(NumericValue)

... there will be about 15 functions, but let's name just these four.

Note: I just want to validate the input. I don't intend to evaluate
these functions, just validate the input in terms of:
1.- Data type of parameters.
2.- Matching parentheses.
(The evaluation of the functions will be done by third-party code.)


So, if I receive:
Abs("VB is great")

I would reject that input because the argument between the parentheses is a
string, not a numeric value.

But if instead I receive:
Upper( "C# is the best thing since sliced bread")

I would accept the input because the parameter is of the proper type.

Also:
Round( 1234.56, 2

would be invalid, due to the missing closing parenthesis.

Finally, the functions can be nested.


So, the question is: can Regex handle this, or should I start looking at
parser libraries?


Thanks in advance,
 
Dmitriy Lapshin [C# / .NET MVP]

Hi Tim,

I think you COULD use RegExp to perform such a validation, but there are
more suitable tools for such tasks: lexical analyzers. These are state
machines controlled by so-called syntax graphs, which describe what is valid
for the grammar and what is not. I suppose RegExp uses a similar engine behind
the scenes, building a syntax graph from the regular expression you
provide, but the expression can grow enormously for complex
grammars.
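
To make the idea concrete, here is a minimal sketch of a lexical analyzer for the kind of input Tim describes. It is written in Python rather than .NET purely for brevity, and the token names (NUMBER, STRING, NAME and so on) and the `tokenize` helper are my own assumptions, not part of any library. A lexer like this only splits the input into tokens; checking nesting and parameter types would be a separate step on top of it.

```python
import re

# Hypothetical token set for the function language described in this thread.
# Each alternative is a named group, so the match tells us which kind it was.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<NUMBER>-?\d+(?:\.\d+)?)
  | (?P<STRING>"[^"]*")
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<COMMA>,)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise ValueError on bad input."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "WS":  # whitespace is skipped, not reported
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Note that parentheses inside a quoted string end up inside a single STRING token, so they never get confused with the parentheses of a function call.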
 
Chris R. Timmons

Tim,

Taken individually, each function's form could be validated by a
regular expression. For 15 functions, you would need to write 15
regexes.
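
As a rough illustration of the one-regex-per-function approach, here is a sketch in Python (not .NET, just to keep it short). The patterns, the `FLAT_CALLS` table and the `is_valid_flat_call` helper are assumptions of mine: I am guessing that numbers look like 1234.56, that Decimals is a whole number, and that strings are double-quoted with no escaped quotes inside.

```python
import re

# Assumed literal syntax; adjust to whatever the real input allows.
NUMBER = r"-?\d+(?:\.\d+)?"
STRING = r'"[^"]*"'

# One pattern per function. Only flat calls with literal arguments are covered.
FLAT_CALLS = {
    "Round": re.compile(rf"\s*Round\s*\(\s*{NUMBER}\s*,\s*\d+\s*\)\s*"),
    "Abs":   re.compile(rf"\s*Abs\s*\(\s*{NUMBER}\s*\)\s*"),
    "Lower": re.compile(rf"\s*Lower\s*\(\s*{STRING}\s*\)\s*"),
    "Upper": re.compile(rf"\s*Upper\s*\(\s*{STRING}\s*\)\s*"),
}


def is_valid_flat_call(text):
    """True if the whole input is exactly one non-nested call with literal arguments."""
    return any(pattern.fullmatch(text) for pattern in FLAT_CALLS.values())
```

This accepts Upper( "C# is the best thing since sliced bread") and rejects both Abs("VB is great") and the unclosed Round( 1234.56, 2, but it also rejects any nested call, which is where the trouble starts.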

Taken together, however, the complexity of matching arbitrarily
nested function calls will quickly turn any regex-based solution into
an unmaintainable mess. That is assuming it's even possible to do
with regexes. If the following is valid input in your
system, I have no idea how to write a generic regex to validate
it:

Upper(Lower(Upper(Lower("())()(((()()())"))))

I would suggest investigating lexers and parsers. They're not that
hard to write, and can handle the above input with ease (and much
more complex input as well). For a gentle introduction to writing a
parser from scratch, here's a good site:

"Let's Build a Compiler" by Jack Crenshaw:
http://compilers.iecc.com/crenshaw/

It's written in Pascal, but it shouldn't be too hard to port to C#.
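
To give a feel for how small such a parser can be, here is a sketch of a recursive-descent validator. It is in Python rather than C#, and it is not taken from Crenshaw's tutorial. The `SIGNATURES` table, the `Validator` class and `is_valid` are names I made up, and I am assuming Round takes two numbers, that strings are double-quoted without escaped quotes, and that function names are case-sensitive.

```python
import re

# Assumed signatures: function name -> (parameter types, return type).
SIGNATURES = {
    "Round": (("number", "number"), "number"),
    "Abs":   (("number",), "number"),
    "Lower": (("string",), "string"),
    "Upper": (("string",), "string"),
}

# Groups, in order: number, string, name, punctuation.
TOKEN_RE = re.compile(r'\s*(?:(-?\d+(?:\.\d+)?)|("[^"]*")|([A-Za-z_]\w*)|([(),]))')
KINDS = ("number", "string", "name", "punct")


class Validator:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def next_token(self):
        """Consume one token; return (kind, lexeme), with kind 'end' at end of input."""
        m = TOKEN_RE.match(self.text, self.pos)
        if m is None:
            if self.text[self.pos:].strip() == "":
                self.pos = len(self.text)
                return ("end", "")
            raise ValueError(f"unexpected input at position {self.pos}")
        self.pos = m.end()
        return (KINDS[m.lastindex - 1], m.group(m.lastindex))

    def expect(self, lexeme):
        kind, got = self.next_token()
        if kind != "punct" or got != lexeme:
            raise ValueError(f"expected {lexeme!r} before position {self.pos}")

    def expression(self):
        """Parse one expression and return its type: 'number' or 'string'."""
        kind, lexeme = self.next_token()
        if kind in ("number", "string"):
            return kind
        if kind != "name" or lexeme not in SIGNATURES:
            raise ValueError(f"unknown or misplaced token {lexeme!r}")
        params, result = SIGNATURES[lexeme]
        self.expect("(")
        for i, wanted in enumerate(params):
            if i > 0:
                self.expect(",")
            found = self.expression()  # recursion handles nested calls
            if found != wanted:
                raise ValueError(f"{lexeme}: argument {i + 1} is {found}, expected {wanted}")
        self.expect(")")
        return result


def is_valid(text):
    """True if the whole input is one well-typed, properly closed expression."""
    validator = Validator(text)
    try:
        validator.expression()
        kind, _ = validator.next_token()
        return kind == "end"
    except ValueError:
        return False
```

With this sketch, is_valid('Upper(Lower(Upper(Lower("())()(((()()())"))))') comes out true, because the parentheses inside the quotes are swallowed as part of one string token, while Abs("VB is great") and Round( 1234.56, 2 both come out false. Adding the other functions should only mean adding rows to the table.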

Chris.
 
