krw said:
Multiplies usually take more than one cycle.
Usually, but not necessarily. Some processors do a fused multiply-add
too, saving another cycle ( [(A+B)] * [C-D] > 0). That wasn't
the issue though.
Sure. I was just pointing out that the example will take a lot longer than you
might think. There are a lot of unstated assumptions here (register-based
architecture, etc.)...
Nope. That's in the compare.
Few architectures combine a compare and branch; MIPS is one of the few
that can do that, but even MIPS cannot do the above comparison and branch
in one instruction.
I don't see F, G, H, and I in the problem. Who said anything about
total cycles being zero?
If you claim that E = 1 takes zero cycles then it follows that any similar
assignment takes zero cycles and thus my example would take zero
cycles. Instructions do not take zero cycles.
They certainly can. A processor with register renaming can certainly
do a move in zero cycles. In this case, all the MOVE does is change
pointers in the register file. BTW, some processors even do
exchanges in zero cycles.
Yes it is possible for a move to be implemented using register renaming
in an advanced CPU rather than using the ALU. However this only applies
to register-register moves, not constant moves. And it still doesn't mean
the move takes zero cycles. An advanced implementation could make
moves have zero cycle *latency*, but no CPU I know of does that.
WTF do I need a "constant decoder"?
Constants need to be extracted from the instructions, the bits grouped
together if not adjacent already and zero- or sign-extended to the register
width. On some architectures constants can include a shift or expand into
a pattern like AB00AB00. So you always need to decode the constant
before it can be used.
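To make that concrete, here is a rough sketch of what decoding a constant involves. The instruction layout is made up for illustration (a 12-bit immediate split between bits 31..28 and bits 7..0 of a 32-bit word); it is not the encoding of any real architecture, and the function names are my own.

```python
def sign_extend(value, bits, width=32):
    """Sign-extend a `bits`-wide field to `width` bits (result kept unsigned)."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1                 # keep only the field itself
    signed = (value ^ sign_bit) - sign_bit   # standard xor/subtract trick
    return signed & ((1 << width) - 1)


def decode_split_imm(instr):
    """Hypothetical layout: imm[11:8] in bits 31..28, imm[7:0] in bits 7..0."""
    hi = (instr >> 28) & 0xF                 # extract the upper 4 immediate bits
    lo = instr & 0xFF                        # extract the lower 8 immediate bits
    return sign_extend((hi << 8) | lo, 12)   # group the bits, then extend


def expand_pattern(byte):
    """Replicate an 8-bit constant into the pattern AB00AB00."""
    return (byte << 24) | (byte << 8)


print(hex(decode_split_imm(0x10000023)))     # positive immediate
print(hex(decode_split_imm(0xF00000FF)))     # negative immediate
print(hex(expand_pattern(0xAB)))
```

In hardware this is just wiring and a few muxes, but it is still work that has to happen before the constant can reach a register.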
Nope. Just change the register name(s).
Change the name to what exactly? Register renaming can only rename
registers to other registers, not constants to registers...
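A toy model of a rename table shows the difference. This is only a sketch under simplifying assumptions (no freeing or reference counting of physical registers, no pipeline timing), and the class and method names are hypothetical rather than taken from any real design.

```python
class RenameMap:
    """Toy rename table: architectural register -> physical register."""

    def __init__(self, num_arch, num_phys):
        self.map = list(range(num_arch))             # arch reg i starts in phys reg i
        self.phys = [0] * num_phys                   # physical register file contents
        self.free = list(range(num_arch, num_phys))  # unused physical registers
        self.writes = 0                              # count of actual data writes

    def read(self, arch):
        return self.phys[self.map[arch]]

    def move_reg(self, dst, src):
        # Register-register move: just alias dst to src's physical register.
        # No data is copied and no physical register is written.
        self.map[dst] = self.map[src]

    def move_const(self, dst, value):
        # Constant move: there is no existing register to point at, so a
        # physical register must be allocated and the value written to it.
        p = self.free.pop()
        self.phys[p] = value
        self.map[dst] = p
        self.writes += 1


rm = RenameMap(num_arch=4, num_phys=8)
rm.move_const(1, 7)    # needs a real write
rm.move_reg(2, 1)      # pure rename
print(rm.read(2), rm.writes)
```

The register-register move changes only the mapping, while the constant move still has to put the value somewhere first.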
Nope. They can be hidden since they need not use the ALU at all,
just a change to the rename file. Note that this move isn't an
independent instruction, so it's even easier to hide the move (a
rename is required at the completion of every arithmetic instruction
anyway).
Which particular CPU are you talking about? An advanced CPU would
have executed E = 1 long before it had done any of the other instructions,
without using any register renaming tricks.
Branches can be predicted or speculatively executed with the
rename/completion contingent on the results. There are all sorts of
games that can be played to save cycles.
Sure. But the issue was how long a simple CPU like the OP proposed would take.
You're thinking of Pentium-class CPUs, but we're talking about far, far simpler CPUs
executing at best 1 instruction per cycle (or 1 every N cycles for N-bit serial).
Wilco