Solution: Parsing a search string and creating an SQL where clause

J

John H.

I needed to parse a user search string and then create an SQL where-
clause by applying the user’s tokenized search criteria to a list of
fields from a table or view. I fully admit that I may have missed
something and reinvented a wheel here, but neither the string.split()
method or the RegEx() object seemed to be able to accomplish what I
wanted as far as search string parsing. If this can be accomplished
by functionality already in the .Net framework I’d be interested in
find out how.

My solution consists of a class with three methods
-------------------------------------------------------------------------
- string[] ParseSearchString(string p_InputString)
- bool IsSearchTermToken(string token)
- string CreateWhereClause(string[] fields, string[] tokens)

Basic rules of parsing search string
----------------------------------------------------
- A space between two search terms implies an ‘and’. For example,
‘John Smith’ would mean ‘John’ and ‘Smith’
- Use double quotes to create a search term with imbedded parentheses
or spaces. For example, “(c) John Smith” would be interpreted as a
single search term.
- Parentheses can be used to group search terms.
- Parentheses may or may not be separated from other tokens with
spaces.

================
The Code
================

static public class StringParser
{
public static string[] ParseSearchString(string p_InputString)
{
char[] delimiters = new char[] { ' ', '(', ')' };
ArrayList searchTokens = new ArrayList();
int parensOpen = 0;
int parensClose = 0;
string token = "";
int doubleQuoteChars = 0;
Char[] chars = p_InputString.ToCharArray();

for (int c = 0; c < chars.Count(); c++)
{ if (chars[c] == '\"')
{ if (doubleQuoteChars % 2 == 0) // 0 = opening
quote char
//
1 = closing quote char
{ // start tracking a new quoted search term
doubleQuoteChars++;
}
else
{ // mark end of quoted search term
doubleQuoteChars++;
searchTokens.Add(token);
token = "";
}
}
else if (doubleQuoteChars % 2 == 0 &&
delimiters.Contains(chars[c]))
{ if (token != "")
{ searchTokens.Add(token);
token = "";
}
if (chars[c] == '(')
{ searchTokens.Add("(");
parensOpen++;
}
if (chars[c] == ')')
{ searchTokens.Add(")");
parensClose++;
}
}
else
{ token += chars[c];
}
}
if (token != "")
{ searchTokens.Add(token); // add the trailing token
}
if (doubleQuoteChars % 2 == 1)
{ throw new Exception("Invalid search string: Unmatched
quotes. "
+ doubleQuoteChars.ToString() + " double quote
characters found in search string.");
}
if (parensOpen != parensClose)
{ throw new Exception("Invalid search string: Unmatched
parentheses"
+ Environment.NewLine + Environment.NewLine
+ "The search string has " + parensOpen.ToString()
+ " open parentheses, and "
+ parensClose.ToString() + " close parentheses.");
}
for (int t = 1; t < searchTokens.Count; t++)
{ /* Replace implied "and" with explicit "and"
*
* 1st-Token 2nd-Token Insert "and"
* --------- --------- ------------
* SrchTerm SrchTerm Yes
* ')' SrchTerm Yes
* SrchTerm '(' Yes
* '(' SrchTerm No
* SrchTerm ')' No
*/
bool IsToken1SearchTerm =
(IsSearchTermToken(searchTokens[t - 1].ToString()));
bool IsToken2SearchTerm =
(IsSearchTermToken(searchTokens[t].ToString()));
if ( (IsToken1SearchTerm == true &&
(IsToken2SearchTerm == true ||
searchTokens[t].ToString() == "(") ) ||
(searchTokens[t-1].ToString() == ")" &&
IsToken2SearchTerm == true ))
{
searchTokens.Insert(t, "and");
}
}

return (string[])searchTokens.ToArray(typeof(string));
}

public static bool IsSearchTermToken(string token)
{
if (token.ToLower() == "and" || token.ToLower() == "or"||
token == "(" || token == ")")
{ return false;
}
return true;
}

public static string CreateWhereClause(string[] fields,
string[] tokens)
{
string whereClause = "";

for (int t = 0; t < tokens.Count(); t++)
{ if
(JrccWinLib.StringParser.IsSearchTermToken(tokens[t]))
{ whereClause += " (";
for (int f = 0; f < fields.Count(); f++)
{ if (f > 0)
{ whereClause += " OR ";
}
whereClause += "LOWER(" + fields[f] + ") LIKE
'%" + tokens[t].ToLower() + "%'";
}
whereClause += ")";
}
else
{ // token is a '(', ')', "and", "or"
whereClause += " " + tokens[t];
}
}
if (whereClause != "")
{ whereClause = "WHERE" + whereClause;
}
return whereClause;
}
}
 
J

James A. Fortune

I needed to parse a user search string and then create an SQL where-
clause by applying the user’s tokenized search criteria to a list of
fields from a table or view.  I fully admit that I may have missed
something and reinvented a wheel here, but neither the string.split()
method or the RegEx() object seemed to be able to accomplish what I
wanted as far as search string parsing.  If this can be accomplished
by functionality already in the .Net framework I’d be interested in
find out how.

I have a comment. A common desirable feature for search forms is the
ability to search for a word. Using a SQLServer WHERE clause of the
fashion "LIKE '%searchword%'" will bring up 'chair' when looking for
'hair'.

http://www.regular-expressions.info/refflavors.html

shows that .NET RegEx supports the word boundary features \b (at the
beginning or end of a word) and \B (NOT at the beginning or end of a
word). Perhaps consider including those features instead of relying
only on LIKE, as a way to obtain more relevant search results. A
WHERE clause returns basically true or false, so even complex logical
expressions can be created provided enough parentheses are used to
prevent ambiguity with what you intend.

James A. Fortune
(e-mail address removed)
 
R

RayLopez99

I needed to parse a user search string and then create an SQL where-
clause by applying the user’s tokenized search criteria to a list of
fields from a table or view.  I fully admit that I may have missed
something and reinvented a wheel here, but neither the string.split()
method or the RegEx() object seemed to be able to accomplish what I
wanted as far as search string parsing.  If this can be accomplished
by functionality already in the .Net framework I’d be interested in
find out how.


Not clear what you want here. Apparently, if I understood, you have a
home grown method that works, but you want to know if RegEx or some
built-in NET method also does the trick, correct? If so, I have
advice: just use your home grown method. RegEx is very very slow,
being "interpreted", and even if compiled. I don't know why, some
experts here can tell you.

BTW, switch to the Entity Framework in Visual Studio or Linq-to-Sql
and you'll never have to play with the dinosaur SQL language again--
it's all handled behind the scenes.

RL
 
J

James A. Fortune

BTW, switch to the Entity Framework in Visual Studio or Linq-to-Sql
and you'll never have to play with the dinosaur SQL language again--
it's all handled behind the scenes.

The more joins you have, the better Entity Framework looks. If you
only interact with a few main tables at a time and the schemata for
the tables is relatively stable, then some ad hoc (one off) T-SQL
using Connection and Command objects should be adequate. Back in
2000, I did a large (over 186,000 products) e-commerce solution in
classic ASP that used ADODB with such dynamically generated T-SQL
strings to a SQL Server backend. Usually just the Parameters had to
be changed. The resulting data access was so fast that planned SQL
Server specific optimization was not even required. I like the fact
that the OP is trying to come up with something that is agnostic to
the kind of backend being used. LINQ is just a genetically enhanced
form of the same SQL dinosaur :), but Entity Framework helps hide the
fact that, underneath, it is still a dinosaur. Entity Framework
bridges the gap from relational to object representations instead of
operating on the object representation directly. The relational
representation of data will still be around for quite a while, so
Entity Framework can be though of as way to transition eventually away
from SQL and LINQ-to-SQL as better data representations are developed.

James A. Fortune
(e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top