Stephan Rose
I am currently working on an EDA app and working hard on squeezing
the last bits of performance out of it. I am going as far as sending
batches of geometry to the video card while still processing geometry
to get some parallelization going. At least on my hardware, though,
this does not buy me much. I luv my two 7800 GTs in SLI =) But for
users with a lower-spec video card, this may actually be of help.
I am also running two render threads, each processing half the
geometry, to make use of hyperthreading or dual core if available.
That bought me a few ms of rendering speedup!
The next thing I did: instead of using a generic List<Vertex> to store
my created triangles, I now use a static Vertex[] array. Once it
fills up, the data is committed to the video hardware and the next
batch is processed. I was halfway expecting a speedup here already,
since the overhead of calling .Add on the List is now gone, but it
did not make any measurable difference.
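The batching described above could be sketched roughly as follows. This is only an illustration under assumptions: the fields of Vertex, the class name VertexBatch, and the commit callback (a stand-in for the upload to the video hardware) are all hypothetical, and the buffer is an instance field here rather than a static one.

```csharp
using System;

// Hypothetical vertex layout; the real struct's fields are not shown in the post.
public struct Vertex
{
    public float x, y;
    public Vertex(float x, float y) { this.x = x; this.y = y; }
}

// Fixed-size vertex buffer that hands its contents to a callback when full.
public sealed class VertexBatch
{
    private readonly Vertex[] polys;                 // allocated once, reused for every batch
    private readonly Action<Vertex[], int> commit;   // assumed stand-in for the hardware upload
    private int currentPoly;                         // number of vertices currently buffered

    public VertexBatch(int capacity, Action<Vertex[], int> commit)
    {
        polys = new Vertex[capacity];
        this.commit = commit;
    }

    // Stores one vertex and commits the batch as soon as the array is full.
    public void Add(Vertex v)
    {
        polys[currentPoly++] = v;
        if (currentPoly == polys.Length) Flush();
    }

    // Commits whatever is buffered (possibly a partial batch) and resets the index.
    public void Flush()
    {
        if (currentPoly == 0) return;
        commit(polys, currentPoly);
        currentPoly = 0;
    }
}
```

Flush would also need to be called once at the end of a frame so that a partially filled batch is not lost.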
So, to finally get to my question about the new operator: the next
thing I am looking at is how I assign data to my vertex list.
Currently this looks as follows:
polys[currentPoly++] = new Vertex(...);
polys[currentPoly++] = new Vertex(...);
polys[currentPoly++] = new Vertex(...);
and so on...there are quite a few cases where I have multiple lines of
assignments like that, nearly always in sets of 3. Triangles are
just weird that way =)
Now, out of all my drawing functions, the one that gets called most
often is my function to render a triangulated line with
round caps. So I took this function apart and did the following:
polys[currentPoly].x = coordinate;
polys[currentPoly].y = coordinate;
polys[currentPoly].remaining parameters = values...;
currentPoly++;
polys[currentPoly].x = coordinate;
polys[currentPoly].y = coordinate;
polys[currentPoly].remaining parameters = values...;
currentPoly++;
polys[currentPoly].x = coordinate;
polys[currentPoly].y = coordinate;
polys[currentPoly].remaining parameters = values...;
currentPoly++;
Repeat as necessary for all the assignments. I was expecting that
eliminating the new operator and the subsequent copy of the vertex
structure, by assigning the fields directly, would give me a
speedup.
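To make the comparison concrete, the two variants can be written out side by side. This is a sketch, not the actual drawing code: the color field is a hypothetical stand-in for the "remaining parameters", and the FillStyles class and its method names are made up for illustration.

```csharp
using System;

// Hypothetical vertex layout; only x and y are named in the post.
public struct Vertex
{
    public float x, y;
    public uint color;   // assumed stand-in for the "remaining parameters"
    public Vertex(float x, float y, uint color) { this.x = x; this.y = y; this.color = color; }
}

public static class FillStyles
{
    // Variant 1: construct a Vertex value and store it into the array element.
    public static int FillWithNew(Vertex[] polys, int currentPoly, float x, float y, uint color)
    {
        polys[currentPoly++] = new Vertex(x, y, color);
        return currentPoly;
    }

    // Variant 2: write each field of the array element in place.
    public static int FillInPlace(Vertex[] polys, int currentPoly, float x, float y, uint color)
    {
        polys[currentPoly].x = x;
        polys[currentPoly].y = y;
        polys[currentPoly].color = color;
        currentPoly++;
        return currentPoly;
    }
}
```

Both variants leave the array with identical contents. Because Vertex is a value type, neither one allocates on the managed heap; whether the JIT ends up emitting the same machine code for both is not something this sketch shows. The in-place variant also depends on polys being an array: with a List<Vertex>, assigning to polys[i].x does not compile, because the indexer returns a copy of the struct.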
I was rather surprised, both pleasantly and not, to see that it made
no difference. It's nice because the new Vertex() way is more readable
code-wise. But...I would have really liked to have seen a performance
improvement.
So what exactly does the new operator do in this case? Does the
compiler somehow optimize the new operator away and generate code
that assigns the values directly, the way I did by hand, so that no
temporary struct is created and copied? Technically it is a
possibility, since it is assigning identical value types to each
other...so it does know what's ultimately going to happen.
Just curious...
And damnit..I now need to find something else to do to get more
speed!! =)
--
Stephan
2003 Yamaha R6
kimi no koto omoidasu hi
nante nai no wa
kimi no koto wasureta toki ga nai kara