Encoding.GetEncode Problem

Tamir Khason

I have a Windows Forms application that receives data from the clipboard and
converts its encoding based on some rules. So I am doing the following:
// from source [source] to multiple targets [targ1, targ2]
// sourceText is the string received from the clipboard
System.Text.Encoding targ1 = Encoding.GetEncoding("Target_1_code_name");
System.Text.Encoding targ2 = Encoding.GetEncoding("Target_2_code_name");
System.Text.Encoding source = Encoding.GetEncoding("Source_code_name");
byte[] sourceBytes = source.GetBytes(sourceText);
byte[] targ1Bytes = Encoding.Convert(source, targ1, sourceBytes, 0, sourceBytes.Length);
byte[] targ2Bytes = Encoding.Convert(source, targ2, sourceBytes, 0, sourceBytes.Length);
string targ1String = source.GetString(targ1Bytes, 0, targ1Bytes.Length);
string targ2String = source.GetString(targ2Bytes, 0, targ2Bytes.Length);

BUT nothing happens... (the resulting strings still look the same as the
source string, regardless of the target), i.e. targ1String == targ2String ==
the source string.

What can be the problem?
 
Dmitriy Lapshin [C# / .NET MVP]

Hi,

Strings (I mean System.String) are always Unicode - hence targ1String ==
targ2String == source.
But if you compare sourceBytes, targ1Bytes and targ2Bytes, these should be
different.
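
To make the point about the byte arrays concrete, here is a minimal sketch. It assumes modern .NET (.NET 5 or later), where legacy code pages must be registered through CodePagesEncodingProvider; the class name, the helper method and the sample word are made up for illustration and are not from the original post.

```csharp
using System;
using System.Text;

static class SameStringDifferentBytes
{
    // Encodes a .NET string into the bytes of the given code page.
    public static byte[] Encode(string text, int codePage)
    {
        // Needed on .NET Core / .NET 5+ to make legacy code pages available.
        // On .NET Framework they are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetBytes(text);
    }

    static void Main()
    {
        string text = "интеллигентка"; // sample word, chosen for illustration

        byte[] koi8Bytes = Encode(text, 20866);   // KOI8-R
        byte[] win1251Bytes = Encode(text, 1251); // Windows-1251

        // Same text, different byte sequences.
        Console.WriteLine(BitConverter.ToString(koi8Bytes));
        Console.WriteLine(BitConverter.ToString(win1251Bytes));

        // Decoding each array with its own encoding yields the same string again.
        string fromKoi8 = Encoding.GetEncoding(20866).GetString(koi8Bytes);
        string from1251 = Encoding.GetEncoding(1251).GetString(win1251Bytes);
        Console.WriteLine(fromKoi8 == from1251); // True
    }
}
```

The two hex dumps differ, while the strings decoded from them compare equal: the encoding lives in the bytes, not in the string.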
 
Jon Skeet [C# MVP]

Tamir Khason said:
I have a Windows Forms application that receives data from the clipboard and
converts its encoding based on some rules.

If you've received the data as a string, you don't need to do any
encoding conversion. It's already in Unicode.
 
Tamir Khason

So how do I do it?

--
Tamir Khason
You want dot.NET? Just ask:
"Please, www.dotnet.us "

 
Tamir Khason

I tried ALL of the following options - nothing works.
HOW TO DO IT?
------------------------------------BIG CHUNK---------------------------------
static string ConvertFormat(string source, int sourceFormat, int targetFormat)
{
    if (source == null)
        return source;

    System.Text.Encoding sourceEnc = Encoding.GetEncoding(sourceFormat);
    System.Text.Encoding targetEnc = Encoding.GetEncoding(targetFormat);
    byte[] sourceBytes = sourceEnc.GetBytes(source);
    byte[] targetBytes = Encoding.Convert(sourceEnc, targetEnc, sourceBytes);

    char[] targetChars = new char[targetEnc.GetCharCount(targetBytes)];
    targetEnc.GetChars(targetBytes, 0, targetBytes.Length, targetChars, 0);
    return new string(targetChars);

    //return sourceEnc.GetString(targetBytes);
}

static string ConvertEncoding(string source, int sourceFormat, int targetFormat)
{
    if (source == null)
        return source;

    System.Text.Encoding sourceEnc = Encoding.GetEncoding(sourceFormat);
    System.Text.Encoding targetEnc = Encoding.GetEncoding(targetFormat);
    byte[] sourceBytes = sourceEnc.GetBytes(source);
    return targetEnc.GetString(sourceBytes);
}

public static string ConvertEncode(string source, int targetFormat)
{
    if (source == null)
        return source;

    System.Text.Encoding targetEnc = Encoding.GetEncoding(targetFormat);
    byte[] bytes = targetEnc.GetBytes(source);
    char[] chars = new char[bytes.Length];
    for (int index = 0; index < bytes.Length; index++)
    {
        chars[index] = Convert.ToChar(bytes[index]);
    }

    string s = new string(chars);
    return s;
}
------------------------------------END CHUNK---------------------------------
 
Jon Skeet [C# MVP]

Tamir Khason said:
I tried ALL of the following options - nothing works.
HOW TO DO IT?

It would help if you'd say exactly what you were trying to do.

The string is already in Unicode - you shouldn't be trying to convert
it to anything else, as strings are *always* in Unicode in .NET.
 
Tamir Khason

I have to convert between different formats.
Just for example: from 20866 to 1251.

The following is the source string:

ÉÎÔÅÌÌÉÇÅÎÔËÁ
 
Tamir Khason

But I need to convert from one code page to another. Does that mean it's
impossible?

 
Dmitriy Lapshin [C# / .NET MVP]

Wow, the source string looks like it is in Russian :)
And it looks like you need to convert it from code page 20866 (KOI8-R) to
Windows-1251 (Windows Cyrillic). Well, let me explain. The targ1Bytes and targ2Bytes
*ARE* the results of the conversion - they contain the actual bytes
representing the string in the target encoding. For example, you can write
these bytes to a file and you'll have the file with the string in the target
encoding.
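
As a sketch of that idea: the code below re-encodes KOI8-R bytes to Windows-1251 with Encoding.Convert and writes the result to a file. It assumes modern .NET (.NET 5 or later); the class name, the Recode helper, the sample bytes and the file name are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class ConvertAndSave
{
    // Re-encodes raw bytes from one code page to another.
    public static byte[] Recode(byte[] input, int fromCodePage, int toCodePage)
    {
        // Needed on .NET Core / .NET 5+; can be dropped on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding from = Encoding.GetEncoding(fromCodePage);
        Encoding to = Encoding.GetEncoding(toCodePage);
        return Encoding.Convert(from, to, input);
    }

    static void Main()
    {
        // "интеллигентка" as KOI8-R bytes, e.g. as read from a KOI8-R file.
        byte[] koi8Bytes =
        {
            0xC9, 0xCE, 0xD4, 0xC5, 0xCC, 0xCC, 0xC9,
            0xC7, 0xC5, 0xCE, 0xD4, 0xCB, 0xC1
        };

        byte[] win1251Bytes = Recode(koi8Bytes, 20866, 1251);

        // Hypothetical file name in the temp directory.
        string path = Path.Combine(Path.GetTempPath(), "sample-1251.txt");
        File.WriteAllBytes(path, win1251Bytes);

        // Reading the file back requires naming the encoding it was written in.
        string text = File.ReadAllText(path, Encoding.GetEncoding(1251));
        Console.WriteLine(text == "интеллигентка"); // True
    }
}
```

The conversion result only becomes visible at a byte boundary such as a file or a network stream. Turning the bytes back into a string with the matching encoding gives the original text again.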

So the question is - what are you going to do with the results?

--
Sincerely,
Dmitriy Lapshin [C# / .NET MVP]
Bring the power of unit testing to the VS .NET IDE today!
http://www.x-unity.net/teststudio.aspx

 
Jon Skeet [C# MVP]

Tamir Khason said:
I have to convert between different formats.
Just for example: from 20866 to 1251.

The following is the source string:

ÉÎÔÅÌÌÉÇÅÎÔËÁ

But a string effectively doesn't *have* an encoding - it's just in
Unicode. While you have the data in a string, you don't need to worry
about an encoding. You only need to worry about an encoding if you need
to convert from/to a byte array.
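
One possible reading of the sample string above, and it is only an assumption, is that KOI8-R bytes were decoded as Latin-1 somewhere before the text reached the application. If so, the damage happened at a bytes-to-string boundary and can be undone at one. The sketch below assumes modern .NET (.NET 5 or later); the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class MojibakeRepair
{
    // Undoes a wrong decode: turns the string back into the original bytes
    // using the encoding that was wrongly applied, then decodes them properly.
    public static string Repair(string garbled, int wrongCodePage, int actualCodePage)
    {
        // Needed on .NET Core / .NET 5+; can be dropped on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] originalBytes = Encoding.GetEncoding(wrongCodePage).GetBytes(garbled);
        return Encoding.GetEncoding(actualCodePage).GetString(originalBytes);
    }

    static void Main()
    {
        string garbled = "ÉÎÔÅÌÌÉÇÅÎÔËÁ";

        // Assumption: the text is KOI8-R (20866) that was read as Latin-1 (28591).
        string repaired = Repair(garbled, 28591, 20866);
        Console.WriteLine(repaired == "интеллигентка"); // True
    }
}
```

This only works when the wrong decode lost no information, which holds for Latin-1 because it maps every byte value to a character. The repaired value is an ordinary .NET string, so it can be shown in a textbox or placed on the clipboard without any further conversion.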
 
Tamir Khason

I'm Russian in this case ;), but the project should work with more than
600 code pages ;)
Anyway, I'm going to display it in a Windows form or put it on the clipboard.
The source text for sourceEnc likewise comes from a textbox or the clipboard.

 
Tamir Khason

Let me try to explain:
I have some text in a textbox/clipboard in some encoding (code page) and should
display it in another textbox, or replace the clipboard buffer with the same
string, but in another encoding.

So I have to convert to bytes, and from bytes to the destination string.

 
Jon Skeet [C# MVP]

Tamir Khason said:
Let me try to explain:
I have some text in a textbox/clipboard in some encoding (code page) and should
display it in another textbox, or replace the clipboard buffer with the same
string, but in another encoding.

So I have to convert to bytes, and from bytes to the destination string.

No, you really don't. .NET does all the conversion for you - if the
text has been correctly placed on the clipboard, you should just be
able to get it as a string and display it. There's no such concept as
"the same string but in other encoding".
 