Neil Roberts 2ded18933e cogl-matrix: Get rid of the *_packed variants
cogl_matrix_project_points and cogl_matrix_transform_points had an
optimization for the common case where the stride parameters exactly
match the size of the corresponding structures. The code for both when
generated by gcc with -O2 on x86-64 use two registers to hold the
addresses of the input and output arrays. In the strided version these
pointers are incremented by adding the value of a register and in the
packed version they are incremented by adding an immediate value. I
think the difference in cost here would be negligible and it may even
be faster to add a register.

Also GCC appears to retain the loop counter in a register for the
strided version but in the packed version it can optimize it out and
directly use the input pointer as the counter. I think it would be
possible to reorder the code a bit to explicitly use the input pointer
as the counter if this were a problem.

Getting rid of the packed versions tidies up the code a bit and it
could potentially be faster if the code differences are small and we
get to avoid an extra conditional in cogl_matrix_transform_points.
2011-02-01 13:18:43 +00:00
..