problem with regular expressions when parsing a sql statement

  • Thread starter: robert kurz

robert kurz

hello newsgroup,

i am trying to parse a sql statement with regular expressions.
my goal is to extract the parts of the statement. i thought the
grouping feature of regular expressions would do this.

my pattern looks like this:

A) string strPattern = "(select.+)(from.+)(where.+)?(order.+)?";

my search strings look like this:

1) string strText = "select * from x";
2) string strText = "select * from x where x.a=1";
3) string strText = "select * from x where x.a=1 order by x.a";
4) string strText = "select * from x order by x.a";

the resulting groups are:

3)
group 0: the statement
group 1: select *
group 2: from x where x.a=1 order by x.a

i don't understand the behaviour of the optional groups. i
expected the optional patterns, e.g. (where.+)?, to be captured
if they are present and to be left out if they are not.

where is my mistake?

without the ? in (where.+)? and (order.+)? i get the wanted
result for 3).

thanks for helping, robert
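
The behaviour described above can be reproduced with a few lines. This is only a sketch: it uses Python's `re` module rather than the .NET classes the post presumably targets, on the assumption that greedy `.+` and optional groups behave the same way for this pattern. The names `PATTERN_A`, `PATTERN_REQUIRED`, `TEXT_3` and `groups` are made up for the illustration.

```python
import re

# Pattern A from the post, unchanged.
PATTERN_A = r"(select.+)(from.+)(where.+)?(order.+)?"
# The same pattern without the two "?" quantifiers.
PATTERN_REQUIRED = r"(select.+)(from.+)(where.+)(order.+)"

# Search string 3) from the post.
TEXT_3 = "select * from x where x.a=1 order by x.a"


def groups(pattern, text):
    """Return the capture groups of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.groups() if match else None


# The greedy (from.+) runs to the end of the string, so the optional
# groups are left with nothing to match and do not participate.
print(groups(PATTERN_A, TEXT_3))
# ('select * ', 'from x where x.a=1 order by x.a', None, None)

# When the groups are required, (from.+) has to backtrack to make room for them.
print(groups(PATTERN_REQUIRED, TEXT_3))
# ('select * ', 'from x ', 'where x.a=1 ', 'order by x.a')
```

Python reports a group that did not participate as `None`; other regex engines report such a group in their own way.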
 
robert kurz wrote:

i don't understand the behaviour of the optional groups. i
expected the optional patterns, e.g. (where.+)?, to be captured
if they are present and to be left out if they are not.

where is my mistake?

Matching is greedy by default, meaning that as much as possible is matched.
If you use
(from.+?)
then the matching is non-greedy: after "from", at least one arbitrary
(".") character is matched, but as few as possible rather than as many as possible.
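
The difference between the greedy and the non-greedy quantifier can be seen in a small experiment. The sketch below again uses Python's `re` module as a stand-in, assuming the quantifiers behave the same as in the engine used in this thread; the variable names are invented for the example. It also suggests that the non-greedy quantifier stops as early as the rest of the pattern allows, so when everything after it is optional it stops after a single character.

```python
import re

# Search string 3) from the original post.
TEXT = "select * from x where x.a=1 order by x.a"

# Greedy: ".+" takes everything up to the end of the string.
greedy = re.search(r"(from.+)", TEXT).group(1)
# 'from x where x.a=1 order by x.a'

# Non-greedy with nothing after it: ".+?" is satisfied with one character.
lazy_alone = re.search(r"(from.+?)", TEXT).group(1)
# 'from '

# Non-greedy followed by a required group: ".+?" grows until "where" can match.
lazy_before_where = re.search(r"(from.+?)(where.+)", TEXT).group(1)
# 'from x '

# Non-greedy followed only by optional groups: they are allowed to match
# nothing, so ".+?" still stops after one character.
lazy_in_pattern = re.search(
    r"(select.+)(from.+?)(where.+)?(order.+)?", TEXT
).groups()
# ('select * ', 'from ', None, None)

print(greedy, lazy_alone, lazy_before_where, lazy_in_pattern, sep="\n")
```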
 
hello martin,

thank you for your answer.

in my opinion the problem has two parts. the first is that
non-optional groups take priority over optional ones. the second is
that .+ is greedy, so the optional group never gets to match.

am i thinking along the right lines, or am i wrong? if i'm right, i
have no idea how to solve my sql problem.

robert
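
One possible way to get the four parts is to make every `.+` non-greedy and to anchor the pattern at both ends, so that the optional groups get a chance to match before the end of the string is reached. The following is a minimal sketch under several assumptions: it is written with Python's `re` module, the names `SQL_PARTS` and `split_statement` are hypothetical, the keywords appear only once and only as clause starters (no subqueries, no keywords inside string literals), and the statement is on a single line. It is not a general sql parser.

```python
import re

# Non-greedy groups, separated by whitespace and anchored with ^ and $.
# The optional clauses sit in non-capturing groups so that the group
# numbers stay 1 = select, 2 = from, 3 = where, 4 = order by.
SQL_PARTS = re.compile(
    r"^(select\s.+?)"
    r"\s+(from\s.+?)"
    r"(?:\s+(where\s.+?))?"
    r"(?:\s+(order\s+by\s.+))?$",
    re.IGNORECASE,
)


def split_statement(text):
    """Return (select, from, where, order) parts; missing parts are None."""
    match = SQL_PARTS.match(text.strip())
    return match.groups() if match else None


# The four search strings from the original post.
statements = [
    "select * from x",
    "select * from x where x.a=1",
    "select * from x where x.a=1 order by x.a",
    "select * from x order by x.a",
]

for statement in statements:
    print(split_statement(statement))
# ('select *', 'from x', None, None)
# ('select *', 'from x', 'where x.a=1', None)
# ('select *', 'from x', 'where x.a=1', 'order by x.a')
# ('select *', 'from x', None, 'order by x.a')
```

The `$` anchor is what forces the non-greedy groups to grow: without it, the match could end as soon as the first character after "from" has been consumed.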
 
