String Substitution - Performance

Ben Dewey · Apr 17, 2006

Anyone,

I am trying to do a case-insensitive string replace of a custom HTML tag, and
it needs to be fast because I will be calling this function a bunch of times.

Any thoughts about using a StringBuilder or StringReader/Writer to
increase performance?

Please feel free to mark up my code and send it back. Thanks for any
help....

public string FormattedHtml
{
    get
    {
        string formattedHtml = _html;
        formattedHtml = ReplaceHtmlTag(formattedHtml, "heading", "h" + _headingLevel.ToString());
        return formattedHtml;
    }
}

private string ReplaceHtmlTag(string source, string tag, string newTag)
{
    string outHtml = source;
    string startTag = "<" + tag.ToLowerInvariant();
    string endTag = "</" + tag.ToLowerInvariant();

    int pos = outHtml.ToLowerInvariant().IndexOf(startTag);
    while (pos >= 0)
    {
        string begin = outHtml.Substring(0, pos);
        outHtml = begin + "<" + newTag + outHtml.Substring(pos + startTag.Length,
            outHtml.Length - pos - startTag.Length);
        pos = outHtml.ToLowerInvariant().IndexOf(startTag, pos);
    }

    pos = outHtml.ToLowerInvariant().IndexOf(endTag);
    while (pos >= 0)
    {
        string begin = outHtml.Substring(0, pos);
        outHtml = begin + "</" + newTag + outHtml.Substring(pos + endTag.Length,
            outHtml.Length - pos - endTag.Length);
        pos = outHtml.ToLowerInvariant().IndexOf(endTag, pos);
    }
    return outHtml;
}

Jon Skeet [C# MVP] · Apr 17, 2006

Ben Dewey said:
I am trying to do a case-insensitive string replace of a custom HTML tag, and
it needs to be fast because I will be calling this function a bunch of times.

Any thoughts about using a StringBuilder or StringReader/Writer to
increase performance?

Define "a bunch of times". Have you benchmarked it with real-world data
and found it to be a significant bottleneck? If not, go with the
simplest, most readable code.

Ben Dewey · Apr 17, 2006

Well, in that case, is there an easy, readable String.Replace that is
case-insensitive?

Jon Skeet [C# MVP] · Apr 17, 2006

Ben Dewey said:
Well, in that case, is there an easy, readable String.Replace that is
case-insensitive?

Not that I can see - but you could use IndexOf with an appropriate
StringComparison (e.g. StringComparison.OrdinalIgnoreCase) to do a
case-insensitive lookup, rather than lower-casing the whole string.

I suspect you may well find it just as easy to read a version which
uses StringBuilder as the version which doesn't, however. It's just a
case of finding the next start of the tag, appending the section from
the end of the last tag to that next index to the StringBuilder, then
appending the new tag, rinse and repeat. Just be careful with the edge
cases (i.e. make sure it works if there's a tag at the start, if
there's a tag at the end, and if there are two tags together).
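
The loop described above can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the class name TagReplacer and the helper ReplaceIgnoreCase are made up for the example, and it assumes the IndexOf overloads that take a StringComparison (available from .NET 2.0). Like the original code, it matches on the prefix "<tag", so a longer tag name that starts the same way (e.g. "<headings") would also be rewritten.

```csharp
using System;
using System.Text;

static class TagReplacer   // hypothetical name, for illustration only
{
    // Replaces every case-insensitive occurrence of oldText with newText,
    // leaving the case of all other text untouched.
    public static string ReplaceIgnoreCase(string source, string oldText, string newText)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(oldText))
            return source;

        StringBuilder result = new StringBuilder(source.Length);
        int last = 0; // index just past the previous match
        int pos = source.IndexOf(oldText, StringComparison.OrdinalIgnoreCase);
        while (pos >= 0)
        {
            result.Append(source, last, pos - last); // text between matches
            result.Append(newText);
            last = pos + oldText.Length;
            pos = source.IndexOf(oldText, last, StringComparison.OrdinalIgnoreCase);
        }
        result.Append(source, last, source.Length - last); // tail after the last match
        return result.ToString();
    }

    // Rewrites both the opening and the closing form of the tag.
    public static string ReplaceHtmlTag(string source, string tag, string newTag)
    {
        string s = ReplaceIgnoreCase(source, "</" + tag, "</" + newTag);
        return ReplaceIgnoreCase(s, "<" + tag, "<" + newTag);
    }
}
```

The edge cases mentioned above (a tag at the very start, a tag at the very end, two tags back to back) all go through the same Append calls with a zero-length segment, so they need no special handling in this sketch.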

Is there any chance you'll have a <tag /> kind of tag, rather than
<tag>stuff</tag>?

Kevin Spencer · Apr 17, 2006

How about this:

newValue = String.Replace(source.ToLower(), tag.ToLower(), newTag);

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Numbskull

Hard work is a medication for which
there is no placebo.

Ben Dewey · Apr 17, 2006

That's no good, because it returns the whole output as lowercase.
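
One way to get a short, case-insensitive replace that leaves the rest of the text alone is Regex.Replace with RegexOptions.IgnoreCase. Nobody in the thread proposed this, so treat it as an alternative sketch rather than the recommended answer; the class name RegexTagReplacer is invented for the example, and whether it is fast enough would need benchmarking.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexTagReplacer   // hypothetical name, for illustration only
{
    public static string ReplaceHtmlTag(string source, string tag, string newTag)
    {
        // Matches "<tag" or "</tag" in any case, but only when the tag name
        // is followed by whitespace, "/" or ">", so "<headings" is not touched.
        string pattern = "<(/?)" + Regex.Escape(tag) + @"(?=[\s/>])";

        // ${1} re-emits the optional "/"; any "$" in newTag is escaped so it
        // is not read as a substitution.
        string replacement = "<${1}" + newTag.Replace("$", "$$");

        return Regex.Replace(source, pattern, replacement, RegexOptions.IgnoreCase);
    }
}
```

For the input "<Heading>Title</HEADING> Other Text", this sketch returns "<h2>Title</h2> Other Text", whereas source.ToLower().Replace("heading", "h2") returns "<h2>title</h2> other text". Because the lookahead accepts "/", a self-closing "<heading/>" becomes "<h2/>".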

Chris Dunaway · Apr 17, 2006

If you plan to code against the XHTML 1.0 standard, then element and
attribute names are required to be lower case.

SP · Apr 17, 2006

Ben Dewey said:
Anyone,

I am trying to do a case-insensitive string replace of a custom HTML tag, and
it needs to be fast because I will be calling this function a bunch of times.

Any thoughts about using a StringBuilder or StringReader/Writer to
increase performance?

Please feel free to mark up my code and send it back. Thanks for any
help....

An alternative is to perform the lower-casing once and use the lower-cased
string to find the changes that you need to make to the original source
string. The code below is a quick hack of this concept: I do the string
construction on both the lower-cased copy and the source so that the indexes
passed to Substring stay in sync. I would also extract a method for the start
tag and end tag code, as it is highly duplicated.

SP
string lowerC = source.ToLowerInvariant();
string startTag = "<" + tag.ToLowerInvariant();
string endTag = "</" + tag.ToLowerInvariant();

int pos = lowerC.IndexOf(startTag);
while (pos >= 0)
{
    // Apply the same edit to both strings so their indexes stay aligned.
    // The prefix for source must come from source itself, otherwise the
    // text before the tag would be lower-cased in the output.
    lowerC = lowerC.Substring(0, pos) + "<" + newTag + lowerC.Substring(pos + startTag.Length);
    source = source.Substring(0, pos) + "<" + newTag + source.Substring(pos + startTag.Length);
    pos = lowerC.IndexOf(startTag, pos);
}

pos = lowerC.IndexOf(endTag);
while (pos >= 0)
{
    lowerC = lowerC.Substring(0, pos) + "</" + newTag + lowerC.Substring(pos + endTag.Length);
    source = source.Substring(0, pos) + "</" + newTag + source.Substring(pos + endTag.Length);
    pos = lowerC.IndexOf(endTag, pos);
}

return source;
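
As a rough sketch of where the "lower-case once" idea could go if the duplicated loops were folded together: the version below searches a single lower-cased copy and copies text only from the original, so only one StringBuilder is built and the copy is never modified. The class name LowerOnceReplacer is invented for the example, and the sketch assumes the tag name itself does not contain "<".

```csharp
using System;
using System.Text;

static class LowerOnceReplacer   // hypothetical name, for illustration only
{
    public static string ReplaceHtmlTag(string source, string tag, string newTag)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(tag))
            return source;

        string lower = source.ToLowerInvariant(); // used only for searching
        string lowerTag = tag.ToLowerInvariant();
        StringBuilder result = new StringBuilder(source.Length);

        int last = 0;                 // index just past the previous replaced name
        int pos = lower.IndexOf('<');
        while (pos >= 0)
        {
            int nameStart = pos + 1;  // skip "<", and "/" if this is a closing tag
            if (nameStart < lower.Length && lower[nameStart] == '/')
                nameStart++;

            int next = pos + 1;
            if (string.CompareOrdinal(lower, nameStart, lowerTag, 0, lowerTag.Length) == 0)
            {
                // Copy the original text up to and including "<" or "</",
                // then write the new tag name in place of the old one.
                result.Append(source, last, nameStart - last);
                result.Append(newTag);
                last = nameStart + lowerTag.Length;
                next = last;
            }
            pos = lower.IndexOf('<', next);
        }
        result.Append(source, last, source.Length - last);
        return result.ToString();
    }
}
```

Opening and closing tags are handled in the same pass, which is one way of removing the duplication between the two loops.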

Kevin Spencer · Apr 18, 2006

Sorry, I meant:

newValue = source.ToLower().Replace(tag.ToLower(), newTag);

