		Pavils Jurjans
Hello,
Here's an excerpt from the MSDN online documentation:
An index is the position of a Char, not a Unicode character, in a String. An
index is a zero-based, nonnegative number starting from the first position
in the string, which is index position zero. Consecutive index values might
not correspond to consecutive Unicode characters because a Unicode character
might be encoded as more than one Char. To work with each Unicode character
instead of each Char, use the System.Globalization.StringInfo class.
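To illustrate the case this excerpt seems to describe, here is a minimal sketch. It assumes a character outside the Basic Multilingual Plane (U+1D11E, chosen only as an example), which UTF-16 stores as a surrogate pair, i.e. two Chars. The class name SurrogateDemo is made up for illustration, and the StringInfo constructor used here assumes .NET 2.0 or later.

```csharp
using System;
using System.Globalization;

class SurrogateDemo
{
    static void Main()
    {
        // U+1D11E lies outside the BMP, so it occupies two Chars in the string.
        string s = "\U0001D11E" + "BC";

        // Length counts Chars, not Unicode characters.
        Console.WriteLine(s.Length);                      // 4

        // 'B' is the second Unicode character, but its Char index is 2.
        Console.WriteLine(s.IndexOf('B'));                // 2

        // The first two Chars are the halves of one surrogate pair.
        Console.WriteLine(char.IsHighSurrogate(s[0]));    // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));     // True

        // StringInfo counts text elements instead of Chars.
        Console.WriteLine(new StringInfo(s).LengthInTextElements);  // 3
    }
}
```

A character with a code below 65536 fits in a single Char, so for such characters the Char index and the character position coincide.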
I did some testing:
string test = "%BC";
Console.WriteLine((long) test[0]);
Console.WriteLine((long) test[1]);
Console.WriteLine((long) test[2]);
Console.WriteLine(test.IndexOf("%"));
Console.WriteLine(test.IndexOf("B"));
Console.WriteLine(test.IndexOf("C"));
// Where "%" is actually a Japanese character (code=38283); the .cs file is
// saved with UTF-8 encoding.
So, from the doc, I expected to get something like 0, 3, 4. But I actually get
the normal 0, 1, 2.
So, where's the explanation? Why is the documentation warning about a difference
between indexes and Unicode characters when I can't detect one?
-- Pavils
				
