Hi Zoro,
I'm not familiar with Delphi, so I may be misinterpreting the meaning of the
functions here. Assuming that when you mention "pattern" you are talking
about a Regular Expression-type pattern, I can see how such functions might
indeed be useful. As I may need such functions in the future as well, I've
taken the liberty of writing a few .Net methods for doing what you're
talking about:
/// <summary>
/// Returns all Indices of Regex Matches in a string
/// </summary>
/// <param name="input">string to evaluate</param>
/// <param name="pattern">pattern to match</param>
/// <returns>Array of indices of all matches</returns>
/// <remarks>If no matches are found, zero-length array is
/// returned</remarks>
public static int[] IndicesOf(string input, string pattern)
{
int[] returnVal;
Regex rx = new Regex(pattern);
MatchCollection matches = rx.Matches(input);
returnVal = new int[matches.Count];
for (int i = 0; i < matches.Count; i++)
returnVal[i] = matches[i].Index;
return returnVal;
}
/// <summary>
/// Returns first index of a Regex match in a string
/// </summary>
/// <param name="input">string to evaluate</param>
/// <param name="pattern">pattern to match</param>
/// <returns>Index of first match in input string</returns>
/// <remarks>Returns -1 if no match is found</remarks>
public static int IndexOf(string input, string pattern)
{
Regex rx = new Regex(pattern);
// Match once and check Success, rather than scanning twice
Match m = rx.Match(input);
return m.Success ? m.Index : -1;
}
/// <summary>
/// Returns the index of the last match of a pattern in an input string
/// </summary>
/// <param name="input">string to evaluate</param>
/// <param name="pattern">pattern to match</param>
/// <returns>Index of the last match of a pattern in the input
/// string</returns>
public static int LastIndexOf(string input, string pattern)
{
int[] vals = IndicesOf(input, pattern);
if (vals.Length == 0) return -1;
return vals[vals.Length - 1];
}
/// <summary>
/// Returns a Substring of a string
/// before the first occurrence, or after the last occurrence,
/// of a pattern in the string
/// </summary>
/// <param name="input">string to evaluate</param>
/// <param name="pattern">pattern to match</param>
/// <param name="before">Get the Substring before the pattern?</param>
/// <returns>Substring of input string, starting from the beginning
/// of the string and ending before the first character of the first match,
/// or, if before is false, starting from the end of the last match and
/// ending at the end of the string.</returns>
/// <remarks>If there is no match, returns the input string.</remarks>
public static string Substring(string input, string pattern, bool before)
{
int i;
if (before)
{
i = IndexOf(input, pattern);
if (i > -1) return input.Substring(0, i);
}
else
{
Regex rx = new Regex(pattern);
MatchCollection matches = rx.Matches(input);
if (matches.Count > 0)
{
Match last = matches[matches.Count - 1];
i = last.Index + last.Length;
return input.Substring(i);
}
}
return input;
}
/// <summary>
/// Finds a substring of an input string between 2 pattern matches
/// </summary>
/// <param name="input">string to evaluate</param>
/// <param name="pattern1">first pattern</param>
/// <param name="pattern2">second pattern</param>
/// <returns>Substring of input string between the 2 patterns</returns>
/// <remarks><para>The order of the patterns is only important if both
/// patterns are found, and are not identical patterns.
/// If the patterns are different, and both patterns are found,
/// the substring returned will be the substring between them
/// regardless of the order in which they appear in the input text</para>
/// <para>If both patterns are found, but their matches overlap, there
/// is nothing between them, and a blank string is returned</para>
/// <para>If both patterns are found, and they are the same pattern,
/// The method will look for a second occurrence of the pattern, and
/// attempt to return the substring between the first and second match
/// of the pattern used. If there is not a second match, the patterns
/// overlap, as they occupy the same space,
/// and there is nothing between.</para>
/// <para>If the first pattern is found, but the second pattern
/// is not found, the substring will be either the substring
/// of the input string after the first match of pattern1,
/// or if the first pattern matches the end of the string,
/// the substring of the string before the first match of pattern1</para>
/// <para>If the second pattern is found, but the first pattern is not,
/// the substring will be either the substring of the input string
/// before the beginning of the first match of the second pattern,
/// or if the second pattern is the beginning of the string, the
/// substring of the string after the end of the first match of the
/// second pattern.</para>
/// <para>If neither pattern is found, the entire input string will
/// be returned.</para>
/// </remarks>
public static string SubstringBetween(string input,
string pattern1, string pattern2)
{
// indices of 2 matches matching 2 patterns
int index1 = -2, index2 = -2;
// 2 Matches to use in calculation
Match m1 = null, m2 = null;
int len1, len2;
// Calculate first match
if (!Regex.IsMatch(input, pattern1)) index1 = -1;
else
{
m1 = Regex.Match(input, pattern1);
index1 = m1.Index;
}
// Calculate second match
if (!Regex.IsMatch(input, pattern2)) index2 = -1;
else
{
m2 = Regex.Match(input, pattern2);
index2 = m2.Index;
}
// if neither is found, return input
if (index1 == -1 && index2 == -1) return input;
// Otherwise, at least 1 is found. Return a substring
// pattern1 not found.
if (index1 == -1)
{
if (index2 > 0)
return input.Substring(0, index2); // treat as second
else
return input.Substring(index2 + m2.Length); // treat as first
}
// Used for no pattern2, identical patterns, and overlaps
// Length of input to end of m1
len1 = index1 + m1.Length;
//pattern2 not found.
if (index2 == -1)
{
if (len1 < input.Length)
return input.Substring(len1); // treat as first
else
return input.Substring(0, index1); // treat as second
}
// Length of input to end of m2
len2 = index2 + m2.Length;
//Test for identical patterns
if (pattern1 == pattern2)
{
int[] indices = IndicesOf(input, pattern1);
// overlap, as both are the same
if (indices.Length == 1) return "";
return input.Substring(len1, indices[1] - len1);
}
// Not identical patterns. Test for overlap
// Test for overlap (index2 falls inside m1)
if (index2 >= index1 && index2 <= len1) return "";
// Test for overlap (index1 falls inside m2)
if (index1 >= index2 && index1 <= len2) return "";
// No overlap. See which one is first, and get value between
// m2 is first match
if (index2 < index1)
return input.Substring(len2, index1 - len2);
// m1 is first match (len1 was already computed above)
return input.Substring(len1, index2 - len1);
}
/// <summary>
/// Returns a Substring of a string
/// before the first occurrence of a pattern in the string
/// </summary>
/// <param name="input">string to evaluate</param>
/// <param name="pattern">pattern to match</param>
/// <returns>Substring of input string, starting from the beginning
/// of the string and ending before the first character of the
/// pattern</returns>
/// <remarks>If there is no match, returns the input string.</remarks>
public static string Substring(string input, string pattern)
{
return Substring(input, pattern, true);
}
A couple of notes: You will need to reference the
System.Text.RegularExpressions namespace to use these. You may want to
change the names of the methods for clarity. I have them in a class for
doing Regular Expression functions, so the class name is sufficient for my
needs. Also, carefully examine the SubstringBetween method in particular. The
rules for it are fairly complex, and may not conform to the same rules in
Delphi. I have commented it quite a bit for clarity. It is not primarily
concerned about the order of the 2 patterns, unless one of them is not found.
It returns the entire string if neither of them is found. If only one pattern
is found, it first tries to honor that pattern's position in the argument
list (pattern1 as the start, pattern2 as the end), but the rule changes if,
for example, the first pattern matches, but at the end of the string, or the
second pattern matches, but at the beginning of the string. In essence, it
treats a non-match as if it were a blank string.
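To make the simplest of those rules concrete, here is a minimal, self-contained sketch of the "both patterns found" case only. The names BetweenSketch and Between are hypothetical and are not part of the class above. As a simplifying assumption, this sketch returns null when either pattern is missing, rather than applying the fallback rules described above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BetweenSketch
{
    // Hypothetical helper (not one of the methods above): returns the text
    // between the first match of each pattern, regardless of argument order.
    // Returns "" when the matches touch or overlap, and null when either
    // pattern is not found (an assumption made to keep the sketch short).
    public static string Between(string input, string pattern1, string pattern2)
    {
        Match m1 = Regex.Match(input, pattern1);
        Match m2 = Regex.Match(input, pattern2);
        if (!m1.Success || !m2.Success) return null;

        // Order the two matches by position in the input.
        bool m1First = m1.Index <= m2.Index;
        Match first = m1First ? m1 : m2;
        Match second = m1First ? m2 : m1;

        // Index of the first character after the earlier match.
        int start = first.Index + first.Length;
        if (second.Index <= start) return ""; // nothing between them
        return input.Substring(start, second.Index - start);
    }

    public static void Main()
    {
        Console.WriteLine(Between("key=[value];", @"\[", @"\]")); // prints: value
        Console.WriteLine(Between("key=[value];", @"\]", @"\[")); // prints: value
        Console.WriteLine(Between("key=value", @"\[", @"\]") == null); // prints: True
    }
}
```

Checking Match.Success on a single Regex.Match call avoids scanning the input twice, once with IsMatch and once with Match.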
Of course, these methods could be expanded and extended quite a bit. Some of
them only look for a single match. But they should give you (or anyone) a
good starting point for your own class library.
--
HTH,
Kevin Spencer
Microsoft MVP
.Net Developer
Ambiguity has a certain quality to it.