C# XOR equals ( ^= ) bug?

Jason

Could someone please tell me if this is a bug or is it by
design?

I am using the triple XOR swap trick for two integers. I
show three "techniques", but only the first one works.
All three work in C++, but I get weird results in C#.

technique 1: (works in C#)
x ^= y;
y ^= x;
x ^= y;

technique 2 (doesn't work correctly in C#, does in C++)
x ^= y ^= x ^= y;

technique 3 (doesn't work correctly in C#, does in C++)
x ^= (y ^= (x ^= y));


Here is some code you can cut and paste into VS.NET.
Uncomment the parts in the Swap() function to try each
technique and see that only the first one works.


using System;

class Class1
{
    [STAThread]
    static void Main(string[] args)
    {
        int x = 1;
        int y = 2;
        Console.WriteLine("before swap");
        Console.WriteLine("x: {0}", x);
        Console.WriteLine("y: {0}", y);

        Swap(ref x, ref y);

        Console.WriteLine("\nafter swap");
        Console.WriteLine("x: {0}", x);
        Console.WriteLine("y: {0}", y);
    }

    public static void Swap(ref int x, ref int y)
    {
        //Technique 1
        // this works perfectly, just as in C++
        x ^= y;
        y ^= x;
        x ^= y;

        //Technique 2
        // Put all on a single line.
        // However, this produces incorrect results in
        // C#, but works correctly in C++
        //
        //x ^= y ^= x ^= y;

        //Technique 3
        // This also doesn't work. C# should evaluate
        // from the right (like C++), but something
        // isn't working.
        //
        //x ^= (y ^= (x ^= y));
    }
}
 
Jason

Two more things:

1. I am using VS.Net 2003

2. There seems to be a bug in the debugger.
Highlighting an expression like "x ^= y" with the mouse
and then hovering over the operator will display a
tooltip with the answer. However the original values
change and from that point on the values may not be
correct!

To reproduce in the code from my previous post:

1. Load the code from my previous post. Uncomment the
first technique and comment the others out.

2. Put a breakpoint on the first of the three lines of
the triple XOR swap.

public static void Swap(ref int x, ref int y)
{
breakPoint-->   x ^= y;
                y ^= x;
                x ^= y;
}

3. Run the code in debug mode. The program will stop on
the breakpoint.

4. Now, without advancing (no F10), highlight one of
the expressions with the mouse and then hover over the
operator (^=) in that line. The tooltip will show the
result of the evaluation, but now the values in x and y
are changed permanently.

NOTE: when highlighting, make sure not to highlight the
semicolon at the end of the line or the tooltip won't
show.

Jason P.
 
Jason

Michael,
Thanks for answering back so soon.
In C# it looks like what is happening is that x is being xored with its
original value in the last step, because the result I
get is 0.

I too get the 0 (zero) for the result, but as you know
it's wrong.
And what we have here is basically the old difference between
call-by-reference and call-by-value. In C++, variables are memory
locations. In C#, they are values. So in C#, the compiler first grabs the
current values of x and y, filling them in for the appropriate rvalue
variables, and then does the execution.

Yes, you seem to be on to something; however, all three
techniques are functionally equivalent, so they should all
work. C# is supposed to evaluate from the right in this
situation (just as C++ would).


Also, take a look at my other post about a bug in the
debugger. See what you think.
Jason P.
 
Jason

Thanks Mr. Culley.

The thread made sense, especially this post:

-------------------------------
I believe there is a difference in how C++ and C# consider
compound assignment operators. In C++ the left operand is
evaluated as an lvalue and stored; after that the right
operand is evaluated as an rvalue, and then the compound
assignment operator is applied to the lvalue and rvalue.

On the other hand, C# expands the compound assignment
operator (A^=B => A=A^B) and evaluates the left operand (A)
first as an lvalue, then the right operand (A^B) as an
rvalue, and then performs the assignment. In C#,
A^=B^=A^=B is equal to A=A^(B=B^(A=A^B)), where arguments
are evaluated from left to right.
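
The expansion described above can be traced by hand. The following is only a minimal sketch, not code from the thread: the class name XorExpansionDemo and the methods OneLine and Expanded are made up for illustration. Expanded spells out the order in which C# reads and writes the variables, and should give the same result as the one-line form.

```csharp
using System;

static class XorExpansionDemo
{
    // The one-line form from the original post (technique 2).
    public static void OneLine(ref int x, ref int y)
    {
        x ^= y ^= x ^= y;
    }

    // Hand-written expansion of A = A ^ (B = B ^ (A = A ^ B)),
    // with operands evaluated left to right as C# does.
    // (Illustrative helper, not part of any library.)
    public static void Expanded(ref int x, ref int y)
    {
        int oldX = x;   // outermost left operand: x is read FIRST
        int oldY = y;   // middle left operand: y is read next
        x = x ^ y;      // innermost x ^= y
        y = oldY ^ x;   // middle y ^= ... ; y now holds the original x
        x = oldX ^ y;   // outermost uses the OLD x, so x = oldX ^ oldX = 0
    }

    static void Main()
    {
        int a = 1, b = 2;
        OneLine(ref a, ref b);
        Console.WriteLine("one-line: x={0}, y={1}", a, b);   // one-line: x=0, y=1

        int c = 1, d = 2;
        Expanded(ref c, ref d);
        Console.WriteLine("expanded: x={0}, y={1}", c, d);   // expanded: x=0, y=1
    }
}
```

So y does end up with the original x, but x is XORed with its own saved value and becomes 0, which matches the result reported earlier in the thread.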
 
Adam Benson

Hi,

This is only a general observation, for what it's worth.
But if 1 line of code is this difficult to figure out, then surely it's not
worth it : turn it into several statements that are obvious.

Yes, it's less efficient (or maybe not after the optimizer has done its job)
but the time saved in coding the thing up, and the time saved if anyone else
has to pick up your code is worth it.

Only a suggestion.

Regards,

Adam.
 
Michael A. Covington

In general, it appears that in C#, if you evaluate


x + blah


where + is any operator, and blah contains something that changes the value
of x,

then C# will use the *old* value of x for the first argument of +, and C++
will use the *new* value.
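
A small C# sketch of the observation above, using + as the operator. The class and method names are hypothetical and chosen only for this illustration. It covers only the C# side; what a C++ compiler does with the same expression is not shown here.

```csharp
using System;

static class OperandOrderDemo
{
    // The left operand x is read (old value, 1) before the
    // right operand assigns 5 to x.
    public static int OldValueFirst()
    {
        int x = 1;
        return x + (x = 5);   // 1 + 5
    }

    // Here the assignment is the left operand, so the right
    // operand already sees the new value of x.
    public static int NewValueSecond()
    {
        int x = 1;
        return (x = 5) + x;   // 5 + 5
    }

    static void Main()
    {
        Console.WriteLine(OldValueFirst());    // prints 6
        Console.WriteLine(NewValueSecond());   // prints 10
    }
}
```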


I still don't know if this is intentional, or wise. As an old Pascal
programmer, I tend to think that assignments ought not to be mixed up with
evaluations anyhow. And I note that C# is the work of another old Pascal
programmer (Anders Hejlsberg, of Turbo Pascal fame).
 
Michael A. Covington

This is only a general observation, for what it's worth.
But if 1 line of code is this difficult to figure out, then surely it's not
worth it : turn it into several statements that are obvious.

I agree wholeheartedly, but at the same time, we've unearthed a fairly deep
peculiarity of C# semantics. It's bothersome, because things that look like
C ought to act like C.
 
Michael A. Covington

Iain Simpson said:
This all depends on what you interpret to be "correct" behaviour.

In general I would recommend against using more than one ?= type operator on
the same line.

The C# specification may well say that these expressions are evaluated right
to left, but someone who doesn't know this, and is reading your code might get
confused as to the final result of your expression.
Regardless of the specification, the semantics of this are unclear, and almost
certainly inconsistent when you move between different (C/C++) compilers.

Can we find C++ compilers that go different ways on this? That would be
very interesting.
I don't think that technique 2 would be more efficient than technique 1, as
the compiler is probably going to split your line up into 3 separate
operations anyway.

Yes, there's no difference in the amount of work done by the computer.

Actually, I don't like tricks, so I'd see if C# has a built-in "swap"
function (it may well), and if not, do:

t = x;
x = y;
y = t;

But this was interesting anyhow!
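
For reference, the temporary-variable swap above can be wrapped in a reusable method. This is only a sketch under a few assumptions: the names SwapDemo and Swap are invented here, generics need C# 2.0 or later (so not VS.NET 2003), and the tuple form on the last lines needs C# 7.0 or later.

```csharp
using System;

static class SwapDemo
{
    // Plain three-step swap with a temporary; works for any type T.
    // (Hypothetical helper, not a framework method.)
    public static void Swap<T>(ref T a, ref T b)
    {
        T t = a;
        a = b;
        b = t;
    }

    static void Main()
    {
        int x = 1, y = 2;

        Swap(ref x, ref y);
        Console.WriteLine("x: {0}, y: {1}", x, y);   // x: 2, y: 1

        // Tuple assignment (C# 7.0+) swaps back without a helper.
        (x, y) = (y, x);
        Console.WriteLine("x: {0}, y: {1}", x, y);   // x: 1, y: 2
    }
}
```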
 
Jon Skeet

Michael A. Covington said:
In general, it appears that in C#, if you evaluate

x + blah

where + is any operator, and blah contains something that changes the value
of x,

then C# will use the *old* value of x for the first argument of +, and C++
will use the *new* value.

Yes. As per the spec:

<quote>
The order in which operands in an expression are evaluated, is left to
right.
</quote>
I still don't know if this is intentional, or wise.

I think it's simple and reasonably natural, even if it's not the way
that C does it (assuming it's even well defined in C - I haven't
checked).
I agree wholeheartedly, but at the same time, we've unearthed a fairly
deep peculiarity of C# semantics. It's bothersome, because things that
look like C ought to act like C.

I would say it's more that we've unearthed a case where C# has
corrected a fairly deep peculiarity of C semantics. There's no reason why
it *should* evaluate the second operand first, and from a human
"reading left to right" perspective, it makes more sense to evaluate
the first operand first.

There are various ways in which C# doesn't act like C despite looking
like it (conditionals having to be boolean, for instance) and I for one
am grateful for that :)
 
Hong Hao

Another way to look at the issue has to do with the order in which
parameters passed into a function call are evaluated. If

int xor(int, int)

is a function call that takes two integers, performs an XOR operation on them
and returns the result, the expression

x ^= y ^= x ^= y;

is then equivalent to, for variable x,

x = xor(x, xor(y, xor(x, y)));

It seems that C# evaluates the parameters from left to right, while C++ does
the reverse.

These behaviors are best demonstrated with the following examples. First is a
simple C++ program:


#include <iostream>
using namespace std;

int func1()
{
    cout << "func1 called." << endl;
    return 1;
}

int func2()
{
    cout << "func2 called." << endl;
    return 2;
}

void func3(int i1, int i2)
{
    cout << "func3 called with " << i1 << " and " << i2 << endl;
}

int main()
{
    // Note: C++ leaves the evaluation order of function
    // arguments unspecified, so this can vary by compiler.
    func3(func1(), func2());
    return 0;
}

This code produces the following output

func2 called.
func1 called.
func3 called with 1 and 2

The equivalent C# console application is

using System;

namespace ConsoleApplication1
{
    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            func3(func1(), func2());
        }

        static int func1()
        {
            Console.WriteLine("func1 called.");
            return 1;
        }

        static int func2()
        {
            Console.WriteLine("func2 called.");
            return 2;
        }

        static void func3(int i1, int i2)
        {
            Console.Write("func3 called with ");
            Console.Write(i1);
            Console.Write(" and ");
            Console.WriteLine(i2);
        }
    }
}

which produces the following output

func1 called.
func2 called.
func3 called with 1 and 2

The outputs clearly show the difference in ordering.

Hong Hao
 
Harry Bosch

Jon Skeet said:
There are various ways in which C# doesn't act like C despite looking
like it (conditionals having to be boolean, for instance) and I for
one am grateful for that :)

Absolutely!
 
Michael A. Covington

I would say it's more that we've unearthed a case where C# has
corrected a fairly deep peculiarity of C semantics. There's no reason why
it *should* evaluate the second operand first, and from a human
"reading left to right" perspective, it makes more sense to evaluate
the first operand first.

There are various ways in which C# doesn't act like C despite looking
like it (conditionals having to be boolean, for instance) and I for one
am grateful for that :)

Actually you make very good points.

In C, it has become customary to use various constructs in peculiar and
confusing ways.

Someone else indicated that this particular point of semantics is actually
undefined in C and C++, and it's just happenstance that most compilers go
the opposite way than C#. I wonder if that is correct.
 
Harry Bosch

Michael A. Covington said:
At one time my name for C was "UPL" ("uninitialized pointer language")
and a lot of those GPF's in Windows, which the public was blaming on
Windows, were of course due to klutzy application programming.
Programmers have gotten a little better. But I never liked the
penchant for obscurity that seemed to be a characteristic, not of the
C language, but of the C community.

UPL -- That's funny! :) I call it "C Programmer's Disease" when a person
actually *likes* coding that way :) Another symptom of this disease is an
obsession with optimizing everything, even things which may be executed
rarely, or only once in the entire app. It's basically an inability to
evaluate the difference between critical and non-critical code. They don't
need profilers, because they spend all their time optimizing EVERYTHING :)
 
Jason

Of course, I'm not a noob. I just wanted to see what
others thought.

I also don't like to use "trick" programming (all on one
line stuff), but I just wanted to investigate this
difference between C++ and C#.
 
Michael A. Covington

Harry Bosch said:
UPL -- That's funny! :) I call it "C Programmer's Disease" when a person
actually *likes* coding that way :) Another symptom of this disease is an
obsession with optimizing everything, even things which may be executed
rarely, or only once in the entire app. It's basically an inability to
evaluate the difference between critical and non-critical code. They don't
need profilers, because they spend all their time optimizing EVERYTHING
:)

Right, and they write C but think in assembly language, pretending that a
very simple, non-optimizing compiler is doing the translating.

When I think in assembly language, I write in assembly language!
 
Michael A. Covington

Jason said:
Of course, I'm not a noob. I just wanted to see what
others thought.

I also don't like to use "trick" programming (all on one
line stuff), but I just wanted to investigate this
difference between C++ and C#.

Understood. It was an interesting adventure!
 
Michael A. Covington

Jason said:
Stu,

Then why can I do this (in C++):


int x, y, z;
x = y = z = 0;

They must evaluate from the right in order to make sense!

Interesting point. All three variables are lvalues here, so they aren't
*evaluated* at all. It's only rvalues that get evaluated.

BTW, responding to another message, I understand entirely that you aren't a
newbie and that the triple XOR compound assignment statement is not good
style. And I also agree wholeheartedly with your point, which is that if
something works at all, it ought to work the way we expect!
 
Harry Bosch

Jason said:
Stu,

Then why can I do this (in C++):


int x, y, z;
x = y = z = 0;

They must evaluate from the right in order to make sense!

That's because assignment is right-associative. See your Stroustrup,
section 6.2.
 
Harry Bosch

Michael A. Covington said:
Right, and they write C but think in assembly language, pretending
that a very simple, non-optimizing compiler is doing the translating.

I can make fun of C/C++ programmers, because I spent enough years with each
language to have earned the right to do so :)
When I think in assembly language, I write in assembly language!

It's a good practice with each language you work with, but I find it
difficult sometimes. Once I was working on two projects each day, one in
Java, the other in C++. It was rough switching mental gears so frequently.
And oddly enough, each time I switched there were things I missed from the
other language, but also things I liked over the other. I think it depended
on the kinds of bugs I was fighting on each project :)
 
Michael A. Covington

Harry Bosch said:
That's because assignment is right-associative. See your Stroustrup,
section 6.2.

More importantly, nothing in this requires evaluating x, y, or z, only
determining their memory locations, which are unambiguous. The value
previously in x, y, or z is not used here as it was in our example.
 
