This re-designs the matrix stack so we now keep track of each separate
operation such as rotating, scaling, translating and multiplying as
immutable, ref-counted nodes in a graph.
Being a "graph" here means that different transformations composed of
a sequence of linked operation nodes may share nodes.
The first node in a matrix-stack is always a LOAD_IDENTITY operation.
As an example consider if an application where to draw three rectangles
A, B and C something like this:
cogl_framebuffer_scale (fb, 2, 2, 2);
cogl_framebuffer_push_matrix(fb);
cogl_framebuffer_translate (fb, 10, 0, 0);
cogl_framebuffer_push_matrix(fb);
cogl_framebuffer_rotate (fb, 45, 0, 0, 1);
cogl_framebuffer_draw_rectangle (...); /* A */
cogl_framebuffer_pop_matrix(fb);
cogl_framebuffer_draw_rectangle (...); /* B */
cogl_framebuffer_pop_matrix(fb);
cogl_framebuffer_push_matrix(fb);
cogl_framebuffer_set_modelview_matrix (fb, &mv);
cogl_framebuffer_draw_rectangle (...); /* C */
cogl_framebuffer_pop_matrix(fb);
That would result in a graph of nodes like this:
LOAD_IDENTITY
|
SCALE
/ \
SAVE LOAD
| |
TRANSLATE RECTANGLE(C)
| \
SAVE RECTANGLE(B)
|
ROTATE
|
RECTANGLE(A)
Each push adds a SAVE operation which serves as a marker to rewind too
when a corresponding pop is issued and also each SAVE node may also
store a cached matrix representing the composition of all its ancestor
nodes. This means if we repeatedly need to resolve a real CoglMatrix
for a given node then we don't need to repeat the composition.
Some advantages of this design are:
- A single pointer to any node in the graph can now represent a
complete, immutable transformation that can be logged for example
into a journal. Previously we were storing a full CoglMatrix in
each journal entry which is 16 floats for the matrix itself as well
as space for flags and another 16 floats for possibly storing a
cache of the inverse. This means that we significantly reduce
the size of the journal when drawing lots of primitives and we also
avoid copying over 128 bytes per entry.
- It becomes much cheaper to check for equality. In cases where some
(unlikely) false negatives are allowed simply comparing the pointers
of two matrix stack graph entries is enough. Previously we would use
memcmp() to compare matrices.
- It becomes easier to do comparisons of transformations. By looking
for the common ancestry between nodes we can determine the operations
that differentiate the transforms and use those to gain a high level
understanding of the differences. For example we use this in the
journal to be able to efficiently determine when two rectangle
transforms only differ by some translation so that we can perform
software clipping.
Reviewed-by: Neil Roberts <neil@linux.intel.com>
(cherry picked from commit f75aee93f6b293ca7a7babbd8fcc326ee6bf7aef)
Removing CoglHandle has been an on going goal for quite a long time now
and finally this patch removes the last remaining uses of the CoglHandle
type and the cogl_handle_ apis.
Since the big remaining users of CoglHandle were the cogl_program_ and
cogl_shader_ apis which have replaced with the CoglSnippets api this
patch removes both of these apis.
Reviewed-by: Neil Roberts <neil@linux.intel.com>
(cherry picked from commit 6ed3aaf4be21d605a1ed3176b3ea825933f85cf0)
Since the original patch was done after removing deprecated API
this back ported patch doesn't affect deprecated API and so
actually this cherry-pick doesn't remove all remaining use of
CoglHandle as it did for the master branch of Cogl.
This adds a COGL_DEBUG=clipping option that reports how the clip is
being flushed. This is needed to determine whether the scissor,
stencil clip planes or software clipping is being used.
COGL_DEBUG=disable-fast-read-pixel can be used to disable the
optimization for reading a single pixel colour back by looking at the
geometry in the journal and not involving the GPU. With this disabled we
will always flush the journal, rendering to the framebuffer and then use
glReadPixels to get the result.
This adds a cache (A GHashTable) of ARBfp programs and before ever
starting to code-generate a new program we will always first try and
find an existing program in the cache. This uses _cogl_pipeline_hash and
_cogl_pipeline_equal to hash and compare the keys for the cache.
There is a new COGL_DEBUG=disable-program-caches option that can disable
the cache for debugging purposes.
This adds a debug option called disable-software-clipping which causes
the journal to always log the clip stack state rather than trying to
manually clip rectangles.
This adds COGL_DEBUG=disable-fixed to disable the fixed function
pipeline backend. This is needed to test the GLSL shader generation
because otherwise the fixed function backend would always override it.
This adds a COGL_DEBUG=wireframe option to visualize the underlying
geometry of the primitives being drawn via Cogl. This works for triangle
list, triangle fan, triangle strip and quad (internal only) primitives.
It also works for indexed vertex arrays.
Previously in the tests/tools directory we build a disable-npots
library which was used as an LD_PRELOAD to trick Cogl in to thinking
there is no NPOT texture extension. This is a little awkward to use so
it seems much simpler to just define a COGL_DEBUG option to disable
npot textures.
When building with --enable-profile we now depend on the uprof-0.3
developer release which brings a few improvements:
» It lets us "fix" how we initialize uprof so that instead of using a shared
object constructor/destructor (which was a hack used when first adding
uprof support to Clutter) we can now initialize as part of clutter's
normal initialization code. As a side note though, I found that the way
Clutter initializes has some quite serious problems whenever it
involves GOptionGroups. It is not able to guarantee the initialization
of dependencies like uprof and Cogl. For this reason we still use the
contructor/destructor approach to initialize uprof in Cogl.
» uprof-0.3 provides a better API for adding custom columns when reporting
timer and counter statistics which lets us remove quite a lot of manual
report generation code in clutter-profile.c.
» uprof-0.3 provides a shared context for tracking mainloop timer
statistics. This means any mainloop based library following the same
"Mainloop" timer naming convention can use the shared context and no
matter who ends up owning the final mainloop the statistics will always
be in the same place. This allows profiling of Clutter with an
external mainloop such as with the Mutter compositor.
» uprof-0.3 can export statistics over dbus and comes with an ncurses
based ui to vizualize timer and counter stats live.
The latest version of uprof can be cloned from:
git://github.com/rib/UProf.git