This re-designs the matrix stack so we now keep track of each separate
operation such as rotating, scaling, translating and multiplying as
immutable, ref-counted nodes in a graph.
Being a "graph" here means that different transformations composed of
a sequence of linked operation nodes may share nodes.
The first node in a matrix-stack is always a LOAD_IDENTITY operation.
As an example, consider an application that draws three rectangles
A, B and C, something like this:

    cogl_framebuffer_scale (fb, 2, 2, 2);
    cogl_framebuffer_push_matrix (fb);

      cogl_framebuffer_translate (fb, 10, 0, 0);

      cogl_framebuffer_push_matrix (fb);

        cogl_framebuffer_rotate (fb, 45, 0, 0, 1);
        cogl_framebuffer_draw_rectangle (...); /* A */

      cogl_framebuffer_pop_matrix (fb);

      cogl_framebuffer_draw_rectangle (...); /* B */

    cogl_framebuffer_pop_matrix (fb);

    cogl_framebuffer_push_matrix (fb);
      cogl_framebuffer_set_modelview_matrix (fb, &mv);
      cogl_framebuffer_draw_rectangle (...); /* C */
    cogl_framebuffer_pop_matrix (fb);

That would result in a graph of nodes like this:

             LOAD_IDENTITY
                   |
                 SCALE
                /     \
            SAVE       LOAD
              |          |
          TRANSLATE   RECTANGLE(C)
              |     \
            SAVE    RECTANGLE(B)
              |
            ROTATE
              |
          RECTANGLE(A)

Each push adds a SAVE operation, which serves as a marker to rewind to
when the corresponding pop is issued. Each SAVE node may also store a
cached matrix representing the composition of all its ancestor nodes.
This means that if we repeatedly need to resolve a real CoglMatrix
for a given node then we don't need to repeat the composition.
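
The push/pop behaviour described above can be sketched roughly as
follows. This is only an illustration, not the code in this patch: the
Node, Stack and OP_* names are hypothetical, the per-operation
parameters are left out, and the cached matrix on SAVE nodes is omitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not Cogl's real types. */
typedef enum {
  OP_LOAD_IDENTITY, OP_SCALE, OP_TRANSLATE, OP_ROTATE, OP_SAVE, OP_LOAD
} OpType;

typedef struct Node {
  struct Node *parent; /* each node holds one reference on its parent */
  OpType op;
  int ref_count;
} Node;

static Node *node_ref (Node *n) { n->ref_count++; return n; }

/* Drop one reference; free the node and walk up while counts hit zero. */
static void node_unref (Node *n)
{
  while (n && --n->ref_count == 0)
    {
      Node *parent = n->parent;
      free (n);
      n = parent;
    }
}

/* Create a child of parent. The caller owns the returned reference. */
static Node *node_new (Node *parent, OpType op)
{
  Node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->parent = parent ? node_ref (parent) : NULL;
  n->op = op;
  n->ref_count = 1;
  return n;
}

typedef struct { Node *top; } Stack;

static void stack_init (Stack *s) { s->top = node_new (NULL, OP_LOAD_IDENTITY); }

/* Append an operation: existing nodes are never modified. */
static void stack_append (Stack *s, OpType op)
{
  Node *n = node_new (s->top, op);
  node_unref (s->top); /* the new node now keeps the old top alive */
  s->top = n;
}

static void stack_push (Stack *s) { stack_append (s, OP_SAVE); }

/* Rewind to the parent of the nearest SAVE marker. */
static void stack_pop (Stack *s)
{
  Node *n = s->top;
  while (n->op != OP_SAVE)
    {
      n = n->parent;
      assert (n != NULL); /* pop without a matching push */
    }
  Node *new_top = node_ref (n->parent);
  node_unref (s->top);
  s->top = new_top;
}
```

Anything that took its own reference on a node before the pop (a
journal entry, for instance) keeps that node and its ancestors alive, so
the branch stays readable after the stack has rewound.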
Some advantages of this design are:
- A single pointer to any node in the graph can now represent a
complete, immutable transformation that can be logged for example
into a journal. Previously we were storing a full CoglMatrix in
each journal entry which is 16 floats for the matrix itself as well
as space for flags and another 16 floats for possibly storing a
cache of the inverse. This means that we significantly reduce
the size of the journal when drawing lots of primitives and we also
avoid copying over 128 bytes per entry.
- It becomes much cheaper to check for equality. In cases where some
(unlikely) false negatives are allowed, simply comparing the pointers
of two matrix stack graph entries is enough. Previously we would use
memcmp() to compare matrices.
- It becomes easier to do comparisons of transformations. By looking
for the common ancestry between nodes we can determine the operations
that differentiate the transforms and use those to gain a high level
understanding of the differences. For example we use this in the
journal to be able to efficiently determine when two rectangle
transforms only differ by some translation so that we can perform
software clipping.
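
As a rough sketch of the last two points, the following compares two
entries by pointer first and otherwise walks up to their common ancestor
to see whether only translations separate them. The Entry type and the
function names are hypothetical and simplified; this is not the
implementation in this patch. The resulting delta is expressed in the
coordinate space of the common ancestor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified entry type for illustration. */
typedef enum { OP_LOAD_IDENTITY, OP_TRANSLATE, OP_ROTATE, OP_SCALE } OpType;

typedef struct Entry {
  const struct Entry *parent;
  OpType op;
  float x, y, z; /* operation arguments; summed only for OP_TRANSLATE */
} Entry;

static int entry_depth (const Entry *e)
{
  int depth = 0;
  for (; e; e = e->parent)
    depth++;
  return depth;
}

/* Walk both entries up to the same depth, then up together until they meet. */
static const Entry *common_ancestor (const Entry *a, const Entry *b)
{
  int da = entry_depth (a), db = entry_depth (b);
  while (da > db) { a = a->parent; da--; }
  while (db > da) { b = b->parent; db--; }
  while (a != b) { a = a->parent; b = b->parent; }
  return a; /* NULL if the entries share no ancestor */
}

/* Sum the translations between e and ancestor (exclusive).
 * Returns false if any other kind of operation is found on the way. */
static bool sum_translations (const Entry *e, const Entry *ancestor,
                              float out[3])
{
  out[0] = out[1] = out[2] = 0.0f;
  for (; e != ancestor; e = e->parent)
    {
      if (e->op != OP_TRANSLATE)
        return false;
      out[0] += e->x;
      out[1] += e->y;
      out[2] += e->z;
    }
  return true;
}

/* True if a and b differ only by a translation; delta receives b - a. */
static bool translation_between (const Entry *a, const Entry *b,
                                 float delta[3])
{
  float ta[3], tb[3];
  const Entry *ancestor;

  assert (a != NULL && b != NULL);
  if (a == b) /* cheap pointer comparison: identical transform */
    {
      delta[0] = delta[1] = delta[2] = 0.0f;
      return true;
    }
  ancestor = common_ancestor (a, b);
  if (!sum_translations (a, ancestor, ta) ||
      !sum_translations (b, ancestor, tb))
    return false;
  for (int i = 0; i < 3; i++)
    delta[i] = tb[i] - ta[i];
  return true;
}
```

The pointer check can report two equal transforms as different when
they were built through separate nodes, which is the false negative
mentioned above; it never reports different transforms as equal.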
Reviewed-by: Neil Roberts <neil@linux.intel.com>
(cherry picked from commit f75aee93f6b293ca7a7babbd8fcc326ee6bf7aef)
Previously, flushing the matrices was performed as part of the
framebuffer state. On GLES2 this matrix flushing was actually
diverted so that it only kept a reference to the intended matrix
stack. This is necessary because on GLES2 there are no builtin
uniforms, so the matrices can't actually be flushed until the program
for the pipeline is generated. When the matrices were flushed, the age
of modifications on the matrix stack was stored so that an unchanged
matrix could be detected and the flush skipped.
This patch changes it so that the pipeline is responsible for flushing
the matrices even when we are using the GL builtins. The same
mechanism for detecting unmodified matrix stacks is used in all
cases. There is a new CoglMatrixStackCache type which is used to store
a reference to the intended matrix stack along with its last flushed
age. There are now two of these attached to the CoglContext to track
the flushed state for the global matrix builtins and also two for each
GLSL progend program state to track the flushed state for a
program. The framebuffer matrix flush now just updates the intended
matrix stacks without actually trying to flush.
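
The age check can be illustrated with a small sketch. The MatrixStack
and StackCache types below are hypothetical stand-ins rather than the
real CoglMatrixStack and CoglMatrixStackCache, and the sketch stores a
plain pointer where real code would hold a proper reference on the
stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: age is bumped on every modification. */
typedef struct { unsigned int age; } MatrixStack;

/* Hypothetical stand-in for the cache: which stack was flushed last,
 * and what its age was at that time. */
typedef struct {
  const MatrixStack *stack;
  unsigned int last_flushed_age;
} StackCache;

static void stack_cache_init (StackCache *cache)
{
  cache->stack = NULL; /* nothing flushed yet: the first check always flushes */
  cache->last_flushed_age = 0;
}

/* Returns true if the matrix has to be flushed, and records the new
 * state; returns false if the same unmodified stack was flushed before. */
static bool stack_cache_check_and_update (StackCache *cache,
                                          const MatrixStack *stack)
{
  assert (stack != NULL);
  if (cache->stack == stack && cache->last_flushed_age == stack->age)
    return false;
  cache->stack = stack;
  cache->last_flushed_age = stack->age;
  return true;
}
```

A caller would only upload the matrix (to the GL builtins or to a
program uniform) when the function returns true.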
When a vertex snippet is attached to the pipeline, the GLSL vertend
will now avoid using the projection matrix to flip the rendering. This
is necessary because any vertex snippet may cause the projection
matrix not to be used. Instead the flip is done as a forced final step
by multiplying cogl_position_out by a vec4 uniform. This uniform is
updated as part of the progend pre_paint depending on whether the
framebuffer is offscreen or not.
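
The arithmetic of the flip can be sketched on the CPU side as below.
This is an illustration only: the Vec4 type and both functions are
hypothetical, and the assumption that the offscreen case is the one that
negates y is made for the example rather than taken from the patch. In
the real vertend the multiplication happens in the generated GLSL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type for illustration. */
typedef struct { float x, y, z, w; } Vec4;

/* Choose the flip vector. Assumption: offscreen rendering negates y. */
static Vec4 flip_vector_for (bool is_offscreen)
{
  if (is_offscreen)
    return (Vec4) { 1.0f, -1.0f, 1.0f, 1.0f };
  return (Vec4) { 1.0f, 1.0f, 1.0f, 1.0f };
}

/* Component-wise multiply, mirroring a final
 * "cogl_position_out *= flip" step in the vertex shader. */
static Vec4 apply_flip (Vec4 position, Vec4 flip)
{
  assert (flip.y == 1.0f || flip.y == -1.0f);
  return (Vec4) { position.x * flip.x, position.y * flip.y,
                  position.z * flip.z, position.w * flip.w };
}
```

Because the multiply is applied after whatever the snippet computed, it
does not depend on the projection matrix having been used.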
Reviewed-by: Robert Bragg <robert@linux.intel.com>