The profiling support was broken - probably during the restructuring of
the build environment, but I'm too lazy to bisect that.
The fix is trivial, and everything works as it should.
When converting the virtual coordinates of the underlying texture for
a slice to virtual coordinates for the whole texture it was using the
size and offset of the intersection as the size of the child
texture. This would be incorrect if the texture contains waste or the
texture coordinates are not the default. Instead the sliced foreach
function now passes the CoglSpan to the callback instead of the
intersection.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2398
The size of the texture used for test-cogl-npot-texture was only using
1 pixel of waste and the texture was scaled down so it would be quite
likely that the test would still pass if only the top left slice was
rendered. It also didn't test using non-default texture
coordinates. These problems made it fail to pick up bug 2398. The
texture is now using the maximum amount of waste and rendered in four
parts at 1:1 scale.
Previously in the tests/tools directory we build a disable-npots
library which was used as an LD_PRELOAD to trick Cogl in to thinking
there is no NPOT texture extension. This is a little awkward to use so
it seems much simpler to just define a COGL_DEBUG option to disable
npot textures.
When converting the virtual coordinates of the underlying texture for
a slice to virtual coordinates for the whole texture it was using the
size and offset of the intersection as the size of the child
texture. This would be incorrect if the texture contains waste or the
texture coordinates are not the default. Instead the sliced foreach
function now passes the CoglSpan to the callback instead of the
intersection.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2398
Previously in the tests/tools directory we build a disable-npots
library which was used as an LD_PRELOAD to trick Cogl in to thinking
there is no NPOT texture extension. This is a little awkward to use so
it seems much simpler to just define a COGL_DEBUG option to disable
npot textures.
Instead of waiting until clutter_actor_paint to check if there are any
handlers connected to the "paint" signal, we now do the check whenever
the paint-volume is requested in _actor_get_paint_volume_mutable().
Previously we checked in clutter_actor_paint(), but at that time we may
already be using a stage clip that could be derived from an invalid
paint-volume. We used to try and handle that by queuing a follow up,
unclipped, redraw but anyway there was an additional problem with the
previous approach because the checking wasn't enough to always catch
invalid volumes involved in culling (considering that containers may
derive their volume from children that haven't yet been painted)
By moving the check to _get_paint_volume time not only do we now
correctly check children in cases where a container derives its volume
from its children's volumes but we no longer need to queue follow up
redraws to cover up artefacts.
Since we now never queue follow up redraws, this in turn means we should
no longer clobber redraws queued with an explicit clip which was
something affecting gnome-shell since it connects a handler to the paint
signal of the stage.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2388
In some micro-benchmarks testing journal throughput the list
manipulation jumps pretty high in the profile. This replaces the GSList
usage with a GArray instead which is effectively a grow only allocation
that means we avoid ongoing allocations while manipulating the stack
mid-scene.
In some micro-benchmarks testing journal throughput the list
manipulation jumps pretty high in the profile. This replaces the GSList
usage with a GArray instead which is effectively a grow only allocation
that means we avoid ongoing allocations while manipulating the stack
mid-scene.
During _cogl_pipeline_needs_blending_enabled we were always checking the
current lighting properties (ambient,diffuse,specular,emission) which
had a notable impact during micro-benchmarks that exercise journal
throughput of simple colored rectangles. This #if 0's the offending code
considering that Cogl doesn't actually support lighting currently and
when it actually does then we will be able to optimize this by avoiding
the checks when lighting is disabled.
During _cogl_pipeline_needs_blending_enabled we were always checking the
current lighting properties (ambient,diffuse,specular,emission) which
had a notable impact during micro-benchmarks that exercise journal
throughput of simple colored rectangles. This #if 0's the offending code
considering that Cogl doesn't actually support lighting currently and
when it actually does then we will be able to optimize this by avoiding
the checks when lighting is disabled.
When using cogl_set_source_color4ub there is a notable difference
between colors that require blending and those that dont. When trying to
modify the color of pipeline referenced by the journal we don't force a
flush of the journal unless the color change will also change the
blending state. By using two separate pipeline objects for handing
opaque or transparent colors we can avoid ever flushing the journal when
repeatedly using cogl_set_source_color and jumping between opaque and
transparent colors.
When using cogl_set_source_color4ub there is a notable difference
between colors that require blending and those that dont. When trying to
modify the color of pipeline referenced by the journal we don't force a
flush of the journal unless the color change will also change the
blending state. By using two separate pipeline objects for handing
opaque or transparent colors we can avoid ever flushing the journal when
repeatedly using cogl_set_source_color and jumping between opaque and
transparent colors.
This reworks _cogl_texture_quad_multiple_primitives so instead of using
the CoglPipelineWrapModeOverrides mechanism to force the clamp to edge
repeat mode we now derive an override pipeline using cogl_pipeline_copy
instead. This avoids a relatively large, unconditional, memset.
This reworks _cogl_texture_quad_multiple_primitives so instead of using
the CoglPipelineWrapModeOverrides mechanism to force the clamp to edge
repeat mode we now derive an override pipeline using cogl_pipeline_copy
instead. This avoids a relatively large, unconditional, memset.
This avoids using the wrap mode overrides mechanism to implement
_cogl_multitexture_quad_single_primitive which requires memsetting a
fairly large array. This updates it to use cogl_pipeline_foreach_layer()
and we now derive an override_material to handle changes to the wrap
modes instead of using the CoglPipelineWrapModeOverrides.
This avoids using the wrap mode overrides mechanism to implement
_cogl_multitexture_quad_single_primitive which requires memsetting a
fairly large array. This updates it to use cogl_pipeline_foreach_layer()
and we now derive an override_material to handle changes to the wrap
modes instead of using the CoglPipelineWrapModeOverrides.
Previously there was a check to avoid filling the path if there are
zero nodes. However the tesselator also won't generate any triangles
if there are less than 3 nodes so we might as well bail out in that
case too. If we don't emit any triangles then we would end up trying
to create an empty VBO. Although I don't think this should necessarily
be a problem, this seems to cause Mesa to segfault in version 7.8.1
when calling glBufferSubData (although not in
master). test-cogl-primitives tries to fill a path with only two
points so it's convenient to be able to avoid the crash in this case.
Previously there was a check to avoid filling the path if there are
zero nodes. However the tesselator also won't generate any triangles
if there are less than 3 nodes so we might as well bail out in that
case too. If we don't emit any triangles then we would end up trying
to create an empty VBO. Although I don't think this should necessarily
be a problem, this seems to cause Mesa to segfault in version 7.8.1
when calling glBufferSubData (although not in
master). test-cogl-primitives tries to fill a path with only two
points so it's convenient to be able to avoid the crash in this case.
When adding a new entry to the journal a reference is now taken on the
current clip stack. Modifying the current clip state no longer causes
a journal flush. The journal flushing code now has an extra stage to
compare the clip state of each entry. The comparison can simply be
done by comparing the pointers. Although different clip states will
still end up with multiple draw calls this at leasts allows a scene
comprising of multiple different clips to be upload with one vbo. It
also lays the groundwork to do certain tricks when drawing clipped
rectangles such as modifying the geometry instead of setting a clip
state.
When adding a new entry to the journal a reference is now taken on the
current clip stack. Modifying the current clip state no longer causes
a journal flush. The journal flushing code now has an extra stage to
compare the clip state of each entry. The comparison can simply be
done by comparing the pointers. Although different clip states will
still end up with multiple draw calls this at leasts allows a scene
comprising of multiple different clips to be upload with one vbo. It
also lays the groundwork to do certain tricks when drawing clipped
rectangles such as modifying the geometry instead of setting a clip
state.
This adds a flag to avoid flushing the clip state when flushing the
framebuffer state. This will be used by the journal to manage its own
clip state flushing.
This adds a flag to avoid flushing the clip state when flushing the
framebuffer state. This will be used by the journal to manage its own
clip state flushing.
Flushing the clip state no longer does anything that would cause the
journal to flush. The clip state is only flushed when flushing the
framebuffer state and in all cases this ends up flushing the journal
in one way or another anyway. Avoiding flushing the journal will make
it easier to log the clip state in the journal.
Previously when trying to set up a rectangle clip that can't be
scissored or when using a path clip the code would use cogl_rectangle
as part of the process to fill the stencil buffer. This is now changed
to use a new internal _cogl_rectangle_immediate function which
directly uses the vertex array API to draw a triangle strip without
affecting the journal. This should be just as efficient as the
previous journalled code because these places would end up flushing
the journal immediately before and after submitting the single
rectangle anyway and flushing the journal always creates a new vbo so
it would effectively do the same thing.
Similarly there is also a new internal _cogl_clear function that does
not flush the journal.
Flushing the clip state no longer does anything that would cause the
journal to flush. The clip state is only flushed when flushing the
framebuffer state and in all cases this ends up flushing the journal
in one way or another anyway. Avoiding flushing the journal will make
it easier to log the clip state in the journal.
Previously when trying to set up a rectangle clip that can't be
scissored or when using a path clip the code would use cogl_rectangle
as part of the process to fill the stencil buffer. This is now changed
to use a new internal _cogl_rectangle_immediate function which
directly uses the vertex array API to draw a triangle strip without
affecting the journal. This should be just as efficient as the
previous journalled code because these places would end up flushing
the journal immediately before and after submitting the single
rectangle anyway and flushing the journal always creates a new vbo so
it would effectively do the same thing.
Similarly there is also a new internal _cogl_clear function that does
not flush the journal.
Previously we tracked whether the clip stack needs flushing as part of
the CoglClipState which is part of the CoglFramebuffer state. This is
a bit odd because most of the clipping state (such as the clip planes
and the scissor) are part of the GL context's state rather than the
framebuffer. We were marking the clip state on the framebuffer dirty
every time we change the framebuffer anyway so it seems to make more
sense to have the dirtiness be part of the global context.
Instead of a just a single boolean to record whether the state needs
flushing, the CoglContext now holds a reference to the clip stack that
was flushed. That way we can flush arbitrary stack states and if it
happens to be the same as the state already flushed then Cogl will do
nothing. This will be useful if we log the clip stack in the journal
because then we will need to flush unrelated clip stack states for
each batch.
Previously we tracked whether the clip stack needs flushing as part of
the CoglClipState which is part of the CoglFramebuffer state. This is
a bit odd because most of the clipping state (such as the clip planes
and the scissor) are part of the GL context's state rather than the
framebuffer. We were marking the clip state on the framebuffer dirty
every time we change the framebuffer anyway so it seems to make more
sense to have the dirtiness be part of the global context.
Instead of a just a single boolean to record whether the state needs
flushing, the CoglContext now holds a reference to the clip stack that
was flushed. That way we can flush arbitrary stack states and if it
happens to be the same as the state already flushed then Cogl will do
nothing. This will be useful if we log the clip stack in the journal
because then we will need to flush unrelated clip stack states for
each batch.
Instead of having a separate CoglHandle for CoglClipStack the code is
now expected to directly hold a pointer to the top entry on the
stack. The empty stack is then the NULL pointer. This saves an
allocation when we want to copy the stack because we can just take a
reference on a stack entry. The idea is that this will make it
possible to store the clip stack in the journal without any extra
allocations.
The _cogl_get_clip_stack and set functions now take a CoglClipStack
pointer instead of a handle so it would no longer make sense to make
them public. However I think the only reason we would have wanted that
in the first place would be to save the clip state between switching
FBOs and that is no longer necessary.
Instead of having a separate CoglHandle for CoglClipStack the code is
now expected to directly hold a pointer to the top entry on the
stack. The empty stack is then the NULL pointer. This saves an
allocation when we want to copy the stack because we can just take a
reference on a stack entry. The idea is that this will make it
possible to store the clip stack in the journal without any extra
allocations.
The _cogl_get_clip_stack and set functions now take a CoglClipStack
pointer instead of a handle so it would no longer make sense to make
them public. However I think the only reason we would have wanted that
in the first place would be to save the clip state between switching
FBOs and that is no longer necessary.
CoglVertexAttribute has an internal draw function that is used by the
CoglJournal to avoid the call to cogl_journal_flush which would
otherwise end up recursively flushing the journal forever. The
enable_gl_state function called by this was previously also calling
_cogl_flush_framebuffer_state. However the journal code tries to
handle this function specially by calling it with a flag to disable
flushing the modelview matrix. This is useful because the journal
handles flushing the modelview itself. Without this patch the journal
state ends up getting flushed twice. This isn't a particularly big
problem currently because the matrix stack has caching to recognise
when it would push the same state twice and bails out. However if we
later want to use the framebuffer flush flags to override a particular
state of the framebuffer (such as the clip state) then we need to make
sure the flush isn't called twice.
CoglVertexAttribute has an internal draw function that is used by the
CoglJournal to avoid the call to cogl_journal_flush which would
otherwise end up recursively flushing the journal forever. The
enable_gl_state function called by this was previously also calling
_cogl_flush_framebuffer_state. However the journal code tries to
handle this function specially by calling it with a flag to disable
flushing the modelview matrix. This is useful because the journal
handles flushing the modelview itself. Without this patch the journal
state ends up getting flushed twice. This isn't a particularly big
problem currently because the matrix stack has caching to recognise
when it would push the same state twice and bails out. However if we
later want to use the framebuffer flush flags to override a particular
state of the framebuffer (such as the clip state) then we need to make
sure the flush isn't called twice.
Unless the CoglBuffer is being used for texture data then it's
relatively unlikely that the data will contain an array of bytes. For
example if it's used as a vertex array then it's more likely to be
floats or some vertex struct. In that case it's much more convenient
if set_data and map use void* pointers so that we can avoid a cast.
Unless the CoglBuffer is being used for texture data then it's
relatively unlikely that the data will contain an array of bytes. For
example if it's used as a vertex array then it's more likely to be
floats or some vertex struct. In that case it's much more convenient
if set_data and map use void* pointers so that we can avoid a cast.