String Substitution - Performance

Ben Dewey

Anyone,

I am trying to do a string replace of a custom HTML tag that is
case-insensitive and fast; I will be calling this function a bunch of times.

Any thoughts about using a StringBuilder or StringReader/Writer to
increase performance?

Please feel free to mark up my code and send it back. Thanks for any
help.

public string FormattedHtml
{
    get
    {
        string formattedHtml = _html;
        formattedHtml = ReplaceHtmlTag(formattedHtml, "heading",
            "h" + _headingLevel.ToString());
        return formattedHtml;
    }
}

private string ReplaceHtmlTag(string source, string tag, string newTag)
{
    string outHtml = source;
    string startTag = "<" + tag.ToLowerInvariant();
    string endTag = "</" + tag.ToLowerInvariant();

    int pos = outHtml.ToLowerInvariant().IndexOf(startTag);
    while (pos >= 0)
    {
        string begin = outHtml.Substring(0, pos);
        outHtml = begin + "<" + newTag
            + outHtml.Substring(pos + startTag.Length,
                outHtml.Length - pos - startTag.Length);
        pos = outHtml.ToLowerInvariant().IndexOf(startTag, pos);
    }

    pos = outHtml.ToLowerInvariant().IndexOf(endTag);
    while (pos >= 0)
    {
        string begin = outHtml.Substring(0, pos);
        outHtml = begin + "</" + newTag
            + outHtml.Substring(pos + endTag.Length,
                outHtml.Length - pos - endTag.Length);
        pos = outHtml.ToLowerInvariant().IndexOf(endTag, pos);
    }

    return outHtml;
}
 
Ben Dewey said:
I am trying to do a string replace of a custom HTML tag that is
case-insensitive and fast; I will be calling this function a bunch of times.

Any thoughts about using a StringBuilder or StringReader/Writer to
increase performance?

Define "a bunch of times". Have you benchmarked it with real-world data
and found it to be a significant bottleneck? If not, go with the
simplest, most readable code.
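
One way to answer the benchmarking question is to time the routine over representative input. The following is only a minimal sketch using System.Diagnostics.Stopwatch; the class name Bench, the helper name TimeIt, and the iteration count in the usage note are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Diagnostics;

public static class Bench
{
    // Hypothetical helper: runs 'action' the given number of times and
    // returns the total elapsed wall-clock time.
    public static TimeSpan TimeIt(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not measured

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

A call such as Bench.TimeIt(() => ReplaceHtmlTag(html, "heading", "h2"), 100000), run against real pages, should show whether the replacement is a measurable cost at all.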
 
Well, in that case, is there an easy, readable String.Replace that is
case-insensitive?
 
Ben Dewey said:
Well, in that case, is there an easy, readable String.Replace that is
case-insensitive?

Not that I can see, but you could use IndexOf with an appropriate
StringComparison (for example, StringComparison.OrdinalIgnoreCase) to do
a case-insensitive lookup, rather than lower-casing the whole string.
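
As a rough illustration of that lookup, the sketch below uses the String.IndexOf overload that takes a StringComparison. The class and method names (TagSearch, FindStartTag) are made up for this example.

```csharp
using System;

public static class TagSearch
{
    // Returns the index of the first case-insensitive occurrence of
    // "<tag" in source at or after startIndex, or -1 if there is none.
    // No lower-cased copy of source is created.
    public static int FindStartTag(string source, string tag, int startIndex)
    {
        return source.IndexOf("<" + tag, startIndex,
            StringComparison.OrdinalIgnoreCase);
    }
}
```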

I suspect you may well find it just as easy to read a version which
uses StringBuilder as the version which doesn't, however. It's just a
case of finding the next start of the tag, appending the section from
the end of the last tag to that next index to the StringBuilder, then
appending the new tag, rinse and repeat. Just be careful with the edge
cases (i.e. make sure it works if there's a tag at the start, if
there's a tag at the end, and if there are two tags together).
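
The loop described above might look roughly like the following. This is a sketch under assumptions, not a drop-in replacement: the class name TagReplacer and the helper ReplaceIgnoreCase are hypothetical, and, like the original code, it matches on the prefix "<tag" only, so a tag named "headings" would also be rewritten.

```csharp
using System;
using System.Text;

public static class TagReplacer
{
    // Replaces every case-insensitive occurrence of 'find' in 'source'
    // with 'replacement', copying each untouched section exactly once.
    public static string ReplaceIgnoreCase(string source, string find, string replacement)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(find))
            return source;

        int pos = source.IndexOf(find, StringComparison.OrdinalIgnoreCase);
        if (pos < 0)
            return source; // nothing to replace, so no allocation is needed

        StringBuilder sb = new StringBuilder(source.Length);
        int last = 0; // index just past the previous match
        while (pos >= 0)
        {
            sb.Append(source, last, pos - last); // text between matches
            sb.Append(replacement);
            last = pos + find.Length;
            pos = source.IndexOf(find, last, StringComparison.OrdinalIgnoreCase);
        }
        sb.Append(source, last, source.Length - last); // tail after the last match
        return sb.ToString();
    }

    public static string ReplaceHtmlTag(string source, string tag, string newTag)
    {
        string result = ReplaceIgnoreCase(source, "</" + tag, "</" + newTag);
        return ReplaceIgnoreCase(result, "<" + tag, "<" + newTag);
    }
}
```

Because the search always runs over the original string and resumes after each match, a tag at the start, a tag at the end, and two adjacent tags all go through the same path.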

Is there any chance you'll have a <tag /> kind of tag, rather than
<tag>stuff</tag>?
 
How about this:

newValue = String.Replace(source.ToLower(), tag.ToLower(), newTag);

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Numbskull

Hard work is a medication for which
there is no placebo.
 
If you plan to code against the XHTML 1.0 standard, then element and
attribute names are required to be lower case.
 
Ben Dewey said:
Anyone,

I am trying to do a string replace of a custom HTML tag that is
case-insensitive and fast; I will be calling this function a bunch of times.

Any thoughts about using a StringBuilder or StringReader/Writer to
increase performance?

Please feel free to mark up my code and send it back. Thanks for any
help.

An alternative is to lower-case the source once and use that lower-cased
copy to find the changes that you need to make in the original source
string. The code below is a quick hack of this concept: it applies each
edit to both the lower-cased copy and the source, so that the positions
stay in sync and Substring returns the correct value. I would also extract
a method for the start tag and end tag code, as it is highly duplicated.

SP
string lowerC = source.ToLowerInvariant();
string startTag = "<" + tag.ToLowerInvariant();
string endTag = "</" + tag.ToLowerInvariant();

int pos = lowerC.IndexOf(startTag);
while (pos >= 0)
{
    // Apply the same edit to both strings so their indexes stay in sync.
    // Each prefix comes from its own string; taking it from lowerC would
    // lower-case the text in the result.
    lowerC = lowerC.Substring(0, pos) + "<" + newTag
        + lowerC.Substring(pos + startTag.Length);
    source = source.Substring(0, pos) + "<" + newTag
        + source.Substring(pos + startTag.Length);
    pos = lowerC.IndexOf(startTag, pos);
}

pos = lowerC.IndexOf(endTag);
while (pos >= 0)
{
    lowerC = lowerC.Substring(0, pos) + "</" + newTag
        + lowerC.Substring(pos + endTag.Length);
    source = source.Substring(0, pos) + "</" + newTag
        + source.Substring(pos + endTag.Length);
    pos = lowerC.IndexOf(endTag, pos);
}

return source;
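
The suggestion to extract a method for the duplicated start-tag and end-tag loops could be sketched as follows. The names LowerOnceReplacer and ReplaceInBoth are hypothetical, and resuming the search after the inserted text is an added assumption, made so that a new tag beginning with the old tag name cannot be matched again.

```csharp
using System;

public static class LowerOnceReplacer
{
    // Hypothetical helper: finds each occurrence of 'find' (already lower
    // case) in the lower-cased copy and applies the same edit to both strings.
    private static void ReplaceInBoth(ref string lowerC, ref string source,
        string find, string replacement)
    {
        int pos = lowerC.IndexOf(find, StringComparison.Ordinal);
        while (pos >= 0)
        {
            lowerC = lowerC.Substring(0, pos) + replacement
                + lowerC.Substring(pos + find.Length);
            source = source.Substring(0, pos) + replacement
                + source.Substring(pos + find.Length);
            // Resume after the inserted text so the replacement itself
            // is never searched.
            pos = lowerC.IndexOf(find, pos + replacement.Length,
                StringComparison.Ordinal);
        }
    }

    public static string ReplaceHtmlTag(string source, string tag, string newTag)
    {
        string lowerC = source.ToLowerInvariant(); // lower-cased once, up front
        string lowerTag = tag.ToLowerInvariant();
        ReplaceInBoth(ref lowerC, ref source, "<" + lowerTag, "<" + newTag);
        ReplaceInBoth(ref lowerC, ref source, "</" + lowerTag, "</" + newTag);
        return source;
    }
}
```

This still builds a new pair of strings per match, so it removes the repeated lower-casing but not the repeated copying.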
 
Sorry, I meant:

newValue = source.ToLower().Replace(tag.ToLower(), newTag);
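
Note that lower-casing the source this way also lower-cases everything else in the result, including the text between the tags. One alternative that leaves the rest of the string alone, named plainly as a different technique from the one above, is Regex.Replace with RegexOptions.IgnoreCase. The sketch below is an assumption about how that could look; the class name RegexTagReplacer is made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexTagReplacer
{
    public static string ReplaceHtmlTag(string source, string tag, string newTag)
    {
        // Matches "<tag" or "</tag" in any letter case; group 1 captures
        // the optional "/" so that closing tags stay closing tags.
        string pattern = "<(/?)" + Regex.Escape(tag);

        // A MatchEvaluator inserts newTag literally, so characters such
        // as '$' in newTag are not treated as substitution patterns.
        return Regex.Replace(
            source,
            pattern,
            m => "<" + m.Groups[1].Value + newTag,
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }
}
```

Whether this is faster than the hand-written loops would need to be measured; its main appeal is that it is short and keeps the original casing of the surrounding text.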



--
HTH,

Kevin Spencer
 