My first attempt at this and I'm searching formulas like so
RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan','-',TEXT
(A15,'yy'))+1
I want to extract the row / col coordinates (A15 in above)
so I'm using this [a-z]+\d+\d*
however I want unique ones and not 3 copies of A15 as the above
returns.
How do I do that? I was reading that (?!) is used for uniqueness
but I cannot seem to get it to work here
Dave,
(?!) is the negative lookahead construct. I don't think it would be
applicable here.
What might work best is to get all of the matches, then put them into
a hash table to get just the unique ones.
The regex "[a-z]+\d+\d*" probably matches too much (also the last \d*
isn't needed). If this is for an Excel spreadsheet, note that Excel
has a limit of 256 columns (A thru IV) and 65,536 rows per worksheet.
So a regular expression to match Excel cell references should take
this into account so it doesn't match things that can't be valid cell
references.
For example:
string regex = @"
#
# This regex matches strings like A15, IV3491, C9 or DP138.
#
\b # Match a word boundary.
(?:[a-z]|[a-i][a-z]?) # Match cell IDs A thru IV.
\d{1,5} # Match between one and five numbers.
\b # Match a word boundary.
";
string s =
"RIGHT(TEXT(A15,'yy'),1)*1000+A15-CONCATENATE(1,'-','jan'," +
"'-',TEXT(A15,'yy'))+1";
// Get all of the matches.
MatchCollection mc = Regex.Matches(s, regex,
RegexOptions.Singleline |
RegexOptions.IgnorePatternWhitespace |
RegexOptions.IgnoreCase);
Console.WriteLine("All cell IDs = {0}", mc.Count);
// Put all of the non-unique matches into the keys of
// a hash table.
Hashtable uniqueCells = new Hashtable(mc.Count);
foreach (Match m in mc)
uniqueCells[m.Value.ToUpper()] = null;
Console.WriteLine("Unique cell IDs = {0}", uniqueCells.Count);
// Print the unique cell IDs.
foreach (DictionaryEntry de in uniqueCells)
Console.WriteLine(de.Key.ToString());