We need to guard the usage of symbols related to the
GLX_INTEL_swap_event extension, to avoid breaking on platforms and/or
versions of Mesa that do not expose that extension.
It's generally useful to be able to query the width and height of a
framebuffer and we expect to need this in Clutter when we move the
eglnative backend code into Cogl since Clutter will need to read back
the fixed size of the framebuffer when realizing the stage.
This backend hasn't been used for years now and so because it is
untested code and almost certainly doesn't work any more it would be a
burdon to continue trying to maintain it. Considering that we are now
looking at moving OpenGL window system integration code down from
Clutter backends into Cogl that will be easier if we don't have to
consider this backend.
This adds an autogen.sh, configure.ac and build/autotool files etc under
clutter/cogl and makes some corresponding Makefile.am changes that make
it possible to build and install Cogl as a standalone library.
Some notable things about this are:
A standalone installation of Cogl installs 3 pkg-config files;
cogl-1.0.pc, cogl-gl-1.0.pc and cogl-2.0.pc. The second is only for
compatibility with what clutter installed though I'm not sure that
anything uses it so maybe we could remove it. cogl-1.0.pc is what
Clutter would use if it were updated to build against a standalone cogl
library. cogl-2.0.pc is what you would use if you were writing a
standalone Cogl application.
A standalone installation results in two libraries currently, libcogl.so
and libcogl-pango.so. Notably we don't include a major number in the
sonames because libcogl supports two major API versions; 1.x as used by
Clutter and the experimental 2.x API for standalone applications.
Parallel installation of later versions e.g. 3.x and beyond will be
supportable either with new sonames or if we can maintain ABI then we'll
continue to share libcogl.so.
The headers are similarly not installed into a directory with a major
version number since the same headers are shared to export the 1.x and
2.x APIs (The only difference is that cogl-2.0.pc ensures that
-DCOGL_ENABLE_EXPERIMENTAL_2_0_API is used). Parallel installation of
later versions is not precluded though since we can either continue
sharing or later add a major version suffix.
This migrates all the GLX window system code down from the Clutter
backend code into a Cogl winsys. Moving OpenGL window system binding
code down from Clutter into Cogl is the biggest blocker to having Cogl
become a standalone 3D graphics library, so this is an important step in
that direction.
As part of the process of splitting Cogl out as a standalone graphics
API we need to introduce some API concepts that will allow us to
initialize a new CoglContext when Clutter isn't there to handle that for
us...
The new objects roughly in the order that they are (optionally) involved
in constructing a context are: CoglRenderer, CoglOnscreenTemplate,
CoglSwapChain and CoglDisplay.
Conceptually a CoglRenderer represents a means for rendering. Cogl
supports rendering via OpenGL or OpenGL ES 1/2.0 and those APIs are
accessed through a number of different windowing APIs such as GLX, EGL,
SDL or WGL and more. Potentially in the future Cogl could render using
D3D or even by using libdrm and directly banging the hardware. All these
choices are wrapped up in the configuration of a CoglRenderer.
Conceptually a CoglDisplay represents a display pipeline for a renderer.
Although Cogl doesn't aim to provide a detailed abstraction of display
hardware, on some platforms we can give control over multiple display
planes (On TV platforms for instance video content may be on one plane
and 3D would be on another so a CoglDisplay lets you select the plane
up-front.)
Another aspect of CoglDisplay is that it lets us negotiate a display
pipeline that best supports the type of CoglOnscreen framebuffers we are
planning to create. For instance if you want transparent CoglOnscreen
framebuffers then we have to be sure the display pipeline wont discard
the alpha component of your framebuffers. Or if you want to use
double/tripple buffering that requires support from the display
pipeline.
CoglOnscreenTemplate and CoglSwapChain are how we describe our default
CoglOnscreen framebuffer configuration which can affect the
configuration of the display pipeline.
The default/simple way we expect most CoglContexts to be constructed
will be via something like:
if (!cogl_context_new (NULL, &error))
g_error ("Failed to construct a CoglContext: %s", error->message);
Where that NULL is for an optional "display" parameter and NULL says to
Cogl "please just try to do something sensible".
If you want some more control though you can manually construct a
CoglDisplay something like:
display = cogl_display_new (NULL, NULL);
cogl_gdl_display_set_plane (display, plane);
if (!cogl_display_setup (display, &error))
g_error ("Failed to setup a CoglDisplay: %s", error->message);
And in a similar fashion to cogl_context_new() you can optionally pass
a NULL "renderer" and/or a NULL "onscreen template" so Cogl will try to
just do something sensible.
If you need to change the CoglOnscreen defaults you can provide a
template something like:
chain = cogl_swap_chain_new ();
cogl_swap_chain_set_has_alpha (chain, TRUE);
cogl_swap_chain_set_length (chain, 3);
onscreen_template = cogl_onscreen_template_new (chain);
cogl_onscreen_template_set_pixel_format (onscreen_template,
COGL_PIXEL_FORMAT_RGB565);
display = cogl_display_new (NULL, onscreen_template);
if (!cogl_display_setup (display, &error))
g_error ("Failed to setup a CoglDisplay: %s", error->message);
This tries to make the naming style of files under cogl/winsys/
consistent with other cogl source files. In particular private header
files didn't have a '-private' infix.
This gives us a way to clearly track the internal Cogl API that Clutter
depends on. The aim is to split Cogl out from Clutter into a standalone
3D graphics API and eventually we want to get rid of any private
interfaces for Clutter so its useful to have a handle on that task.
Actually it's not as bad as I was expecting though.
This renames the two internal functions _cogl_get_draw/read_buffer
as cogl_get_draw_framebuffer and _cogl_get_read_framebuffer. The
former is now also exposed as experimental API.
The long term goal with the Cogl API is that we will get rid of the
default global context. As a step towards this, this patch tracks a
reference back to the context in each CoglFramebuffer so in a lot of
cases we can avoid using the _COGL_GET_CONTEXT macro.
There is no corresponding implementation of _cogl_features_init any more
so it was simply an oversight that the prototype wasn't removed when the
implementation was removed.
Recently _cogl_swap_buffers_notify was added (in 142b229c5c) so that
Cogl would be notified when Clutter performs a swap buffers request for
a given onscreen framebuffer. It was expected this would be required for
the recent cogl_read_pixel optimization that was implemented (ref
1bdb0e6e98) but in the end it wasn't used.
Since it wasn't used in the end this patch removes the API.
This moves the functionality of _cogl_create_context_driver from
driver/{gl,gles}/cogl-context-driver-{gl,gles}.c into
driver/{gl,gles}/cogl-{gl,gles}.c as a static function called
initialize_context_driver.
cogl-context-driver-{gl,gles}.[ch] have now been removed.
This adds a new experimental function (you need to define
COGL_ENABLE_EXPERIMENTAL_API to access it) which takes us towards being
able to have a standalone Cogl API. This is really a minor aesthetic
change for now since all the GL context creation code still lives in
Clutter but it's a step forward none the less.
Since our current designs introduce a CoglDisplay object as something
that would be passed to the context constructor this provides a stub
cogl-display.h with CoglDisplay typedef.
_cogl_context_get_default() which Clutter uses to access the Cogl
context has been modified to use cogl_context_new() to initialize
the default context.
There is one rather nasty hack used in this patch which is that the
implementation of cogl_context_new() has to forcibly make the allocated
context become the default context because currently all the code in
Cogl assumes it can access the context using _COGL_GET_CONTEXT including
code used to initialize the context.
In 047227fb cogl_atlas_new was changed so that it can take a flags
parameter to specify whether to clear the new atlases and whether to
copy images to the new atlas after reorganisation. This was done so
that the atlas code could be shared with the glyph cache. At some
point during the development of this patch the flag was just a single
boolean instead and this is accidentally how it is used from the glyph
cache. The glyph cache therefore passes 'TRUE' as the set of flags
which means it will only get the 'clear' flag and not the
'disable-migration' flag. When the glyph cache gets full it will
therefore try to copy the texture to the new atlas as well as
redrawing them with cairo. This causes problems because the glyph
cache needs to work in situations where there is no FBO support.
In _cogl_pipeline_prune_empty_layer_difference if the layer's parent
has no owner then it just takes ownership of it. However this could
theoretically end up taking ownership of the root layer because
according to the comment above in the same function that should never
have an owner. This patch just adds an extra check to ensure that the
unowned layer has a parent.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2588
In _cogl_pipeline_prune_empty_layer_difference if we are reverting to
the immediate parent of an empty/redundant layer then it is not enough
to simply add a reference to the pipeline's ->layer_differences list
without also updating parent_layer->owner to point back to its new
owner.
This oversight was leading us to break the invariable that all layers
referenced in layer_differences have an owner and was also causing us to
break another invariable whereby after calling
_cogl_pipeline_layer_pre_change_notify the returned layer must always be
owned by the given 'required_owner'.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2588
glib already has a data type to manage a list of callbacks called a
GHookList so we might as well use it instead of maintaining Cogl's own
type. The glib version may be slightly more efficient because it
avoids using a GList and instead encodes the prev and next pointers
directly in the GHook structure. It also has more features than
CoglCallbackList.
Previously we were applying the culling optimization to any actor
painted without considering that we may be painting to an offscreen
framebuffer where the stage clip isn't applicable.
For now we simply expose a getter for the current draw framebuffer
and we can assume that a return value of NULL corresponds to the
stage.
Note: This will need to be updated as stages start to be backed by real
CoglFramebuffer objects and so we won't get NULL in those cases.
Drawing and clipping to paths is generally quite expensive because the
geometry has to be tessellated into triangles in a single VBO which
breaks up the journal batching. If we can detect when the path
contains just a single rectangle then we can instead divert to calling
cogl_rectangle which will take advantage of the journal, or by pushing
a rectangle clip which usually ends up just using the scissor.
This patch adds a boolean to each path to mark when it is a
rectangle. It gets cleared whenever a node is added or gets set to
TRUE whenever cogl2_path_rectangle is called. This doesn't try to
catch cases where a rectangle is composed by cogl_path_line_to and
cogl_path_move_to commands.
In 9ff04e8a99 the builtin uniforms were moved to the common shader
boilerplate. However the common boilerplate is positioned before the
default precision specifier on GLES2 so it would fail to compile
because the uniforms end up with no precision in the fragment
shader. This patch just moves the precision specifier to above the
common boilerplate.
Instead of unconditionally combining the modelview and projection
matrices and then iterating each of the vertices to call
cogl_matrix_transform_point for each one in turn we now only combine the
matrices if there are more than 4 vertices (with less than 4 vertices
its less work to transform them separately) and we use the new
cogl_vertex_{transform,project}_points APIs which can hopefully
vectorize the transformations.
Finally the perspective divide and viewport scale is done in a separate
loop at the end and we don't do the spurious perspective divide and
viewport scale for the z component.
This adds two new experimental functions to cogl-matrix.c:
cogl_matrix_view_2d_in_perspective and cogl_matrix_view_2d_in_frustum
which can be used to setup a view transform that maps a 2D coordinate
system (0,0) top left and (width,height) bottom right to the current
viewport.
Toolkits such as Clutter that want to mix 2D and 3D drawing can use
these APIs to position a 2D coordinate system at an arbitrary depth
inside a 3D perspective projected viewing frustum.
OpenGL < 4.0 only supports integer based viewports and internally we
have a mixture of code using floats and integers for viewports. This
patch switches all viewports throughout clutter and cogl to be
represented using floats considering that in the future we may want to
take advantage of floating point viewports with modern hardware/drivers.
This makes a change to the original point_in_poly algorithm from:
http://www.ecse.rpi.edu/Homepages/wrf/Research/Short_Notes/pnpoly.html
The aim was to tune the test so that tests against screen aligned
rectangles are more resilient to some in-precision in how we transformed
that rectangle into screen coordinates. In particular gnome-shell was
finding that for some stage sizes then row 0 of the stage would become a
dead zone when going through the software picking fast-path and this was
because the y position of screen aligned rectangles could end up as
something like 0.00024 and the way the algorithm works it doesn't have
any epsilon/fuz factor to consider that in-precision.
We've avoided introducing an epsilon factor to the comparisons since we
feel there's a risk of changing some semantics in ways that might not be
desirable. One of those is that if you transform two polygons which
share an edge and test a point close to that edge then this algorithm
will currently give a positive result for only one polygon.
Another concern is the way this algorithm resolves the corner case where
the horizontal ray being cast to count edge crossings may cross directly
through a vertex. The solution is based on the "idea of Simulation of
Simplicity" and "pretends to shift the ray infinitesimally down so that
it either clearly intersects, or clearly doesn't touch". I'm not
familiar with the idea myself so I expect a misplaced epsilon is likely
to break that aspect of the algorithm.
The simple solution this patch applies is to pixel align the polygon
vertices which should eradicate most noise due to in-precision.
https://bugzilla.gnome.org/show_bug.cgi?id=641197
When using a pipeline and the journal to blit images between
framebuffers, it should disable blending. Otherwise it will end up
blending the source texture with uninitialised garbage in the
destination texture.
If an atlas texture's last reference is held by the journal or by the
last flushed pipeline then if an atlas migration is started it can
cause a crash. This is because the atlas migration will cause a
journal flush and can sometimes change the current pipeline which
means that the texture would be destroyed during migration.
This patch adds an extra 'post_reorganize' callback to the existing
'reorganize' callback (which is now renamed to 'pre_reorganize'). The
pre_reorganize callback is now called before the atlas grabs a list of
the current textures instead of after so that it doesn't matter if the
journal flush destroys some of those textures. The pre_reorganize
callback for CoglAtlasTexture grabs a reference to all of the textures
so that they can not be destroyed when the migration changes the
pipeline. In the post_reorganize callback the reference is removed
again.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2538
When Cogl debugging is disabled then the 'waste' variable is not used
so it throws a compiler warning. This patch removes the variable and
the value is calculated directly as the parameter to COGL_NOTE.
Some code was doing pointer arithmetic on the return value from
cogl_buffer_map which is void* pointer. This is a GCC extension so we
should try to avoid it. This patch adds casts to guint8* where
appropriate.
Based on a patch by Fan, Chun-wei.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2561
Instead of directly banging GL to migrate textures the atlas now uses
the CoglFramebuffer API. It will use one of four approaches; it can
set up two FBOs and use _cogl_blit_framebuffer to copy between them;
it can use a single target fbo and then render the source texture to
the FBO using a Cogl draw call; it can use a single FBO and call
glCopyTexSubImage2D; or it can fallback to reading all of the texture
data back to system memory and uploading it again with a sub texture
update.
Previously GL calls were used directly because Cogl wasn't able to
create a framebuffer without a stencil and depth buffer. However there
is now an internal version of cogl_offscreen_new_to_texture which
takes a set of flags to disable the two buffers.
The code for blitting has now been moved into a separate file called
cogl-blit.c because it has become quite long and it may be useful
outside of the atlas at some point.
The 4 different methods have a fixed order of preference which is:
* Texture render between two FBOs
* glBlitFramebuffer
* glCopyTexSubImage2D
* glGetTexImage + glTexSubImage2D
Once a method is succesfully used it is tried first for all subsequent
blits. The default default can be overridden by setting the
environment variable COGL_ATLAS_DEFAULT_BLIT_MODE to one of the
following values:
* texture-render
* framebuffer
* copy-tex-sub-image
* get-tex-data
This adds a declaration for _cogl_is_texture_2d to the private header
so that it can be used in cogl-blit.c to determine if the target
texture is a simple 2D texture.
This adds a function called _cogl_texture_2d_copy_from_framebuffer
which is a simple wrapper around glCopyTexSubImage2D. It is currently
specific to the texture 2D backend.
This adds the _cogl_blit_framebuffer internal function which is a
wrapper around glBlitFramebuffer. The API is changed from the GL
version of the function to reflect the limitations provided by the
GL_ANGLE_framebuffer_blit extension (eg, no scaling or mirroring).
This extension is the GLES equivalent of the GL_EXT_framebuffer_blit
extension except that it has some extra restrictions. We need to check
for some extension that provides glBlitFramebuffer so that we can
unconditionally use ctx->drv.pf_glBlitFramebuffer in both GL and GLES
code. Even with the restrictions, the extension provides enough
features for what Cogl needs.
Previously when _cogl_atlas_texture_migrate_out_of_atlas is called it
would unreference the atlas texture's sub-texture before calling
_cogl_atlas_copy_rectangle. This would leave the atlas texture in an
inconsistent state during the copy. This doesn't normally matter but
if the copy ends up doing a render then the atlas texture may end up
being referenced. In particular it would cause problems if the texture
is left in a texture unit because then Cogl may try to call
get_gl_texture even though the texture isn't actually being used for
rendering. To fix this the sub texture is now unrefed after the copy
call instead.
The current framebuffer is now internally separated so that there can
be a different draw and read buffer. This is required to use the
GL_EXT_framebuffer_blit extension. The current draw and read buffers
are stored as a pair in a single stack so that pushing the draw and
read buffer is done simultaneously with the new
_cogl_push_framebuffers internal function. Calling
cogl_pop_framebuffer will restore both the draw and read buffer to the
previous state. The public cogl_push_framebuffer function is layered
on top of the new function so that it just pushes the same buffer for
both drawing and reading.
When flushing the framebuffer state, the cogl_framebuffer_flush_state
function now tackes a pointer to both the draw and the read
buffer. Anywhere that was just flushing the state for the current
framebuffer with _cogl_get_framebuffer now needs to call both
_cogl_get_draw_buffer and _cogl_get_read_buffer.
When pushing a framebuffer it would previously push
COGL_INVALID_HANDLE to the top of the framebuffer stack so that when
it later calls cogl_set_framebuffer it will recognise that the
framebuffer is different and replace the top with the new
pointer. This isn't ideal because it breaks the code to flush the
journal because _cogl_framebuffer_flush_journal is called with the
value of the old pointer which is NULL. That function was checking for
a NULL pointer so it wouldn't actually flush. It also would mean that
if you pushed the same framebuffer twice we would end up dirtying
state unnecessarily. To fix this cogl_push_framebuffer now pushes a
reference to the current framebuffer instead.
After a dependent framebuffer is added to a framebuffer it was never
getting removed. Once the journal for a framebuffer is flushed we no
longer depend on any framebuffers so the list should be cleared. This
was causing leaks of offscreens and textures.
This adds a note to clarify that cogl_matrix_multiply allows you to
multiply the @a matrix in-place, so @a can equal @result but @b can't
equal @result.
When uploading the layer matrix to GL it wasn't first calling
glActiveTextureMatrix to set the right texture unit for the
layer. This would end up setting the texture matrix on whatever layer
happened to be previously active. This happened to work for
test-cogl-multitexture presumably because it was coincidentally
setting the layer matrix on the last used layer.
The pipeline private data is accessed both from the private data set
on a CoglPipeline and the destroy notify function of a weak material
that the vertex buffer creates when it needs to override the wrap
mode. However when a CoglPipeline is destroyed, the CoglObject code
first removes all of the private data set on the object and then the
CoglPipeline code gets invoked to destroy all of the weak children. At
this point the vertex buffer's weak override destroy notify function
will get invoked and try to use the private data which has already
been freed causing a crash.
This patch instead adds a reference count to the pipeline private data
stuct so that we can avoid freeing it until both the private data on
the pipeline has been destroyed and all of the weak materials are
destroyed.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2544
In cogl_pipeline_set_layer_combine_constant it was comparing whether
the new color is the same as the old color using a memcmp on the
constant_color parameter. However the combine constant is stored in
the layer data as an array of four floats but the passed in color is a
CoglColor (which is currently an array of four guint8s). This was
causing valgrind errors and presumably also the check for setting the
same color twice would always fail.
This patch makes it do the conversion to a float array upfront before
the comparison.
cogl_matrix_project_points and cogl_matrix_transform_points had an
optimization for the common case where the stride parameters exactly
match the size of the corresponding structures. The code for both when
generated by gcc with -O2 on x86-64 use two registers to hold the
addresses of the input and output arrays. In the strided version these
pointers are incremented by adding the value of a register and in the
packed version they are incremented by adding an immediate value. I
think the difference in cost here would be negligible and it may even
be faster to add a register.
Also GCC appears to retain the loop counter in a register for the
strided version but in the packed version it can optimize it out and
directly use the input pointer as the counter. I think it would be
possible to reorder the code a bit to explicitly use the input pointer
as the counter if this were a problem.
Getting rid of the packed versions tidies up the code a bit and it
could potentially be faster if the code differences are small and we
get to avoid an extra conditional in cogl_matrix_transform_points.
When copying COMBINE state in
_cogl_pipeline_layer_init_multi_property_sparse_state we would read some
state from the destination layer (invalid data potentially), then
redundantly set the value back on the destination. This was picked up by
valgrind, and the code is now more careful about how it references the
src layer vs the destination layer.
There is currently a problem with per-framebuffer journals in that it's
possible to create a framebuffer from a texture which then gets rendered
too but the framebuffer (and corresponding journal) can be freed before
the texture gets used to draw with.
Conceptually we want to make sure when freeing a framebuffer that - if
it is associated with a texture - we flush the journal as the last thing
before really freeing the framebuffer's meta data. Technically though
this is awkward to implement since the obvious mechanism for us to be
notified about the framebuffer's destruction (by setting some user data
internally with a callback) notifies when the framebuffer has a
ref-count of 0. This means we'd have to be careful what we do with the
framebuffer to consider e.g. recursive destruction; anything that would
set more user data on the framebuffer while it is being destroyed and
ensuring nothing else gets notified of the framebuffer's destruction
before the journal has been flushed.
For simplicity, for now, this patch provides another solution which is
to flush framebuffer journals whenever we switch away from a given
framebuffer via cogl_set_framebuffer or cogl_push/pop_framebuffer. The
disadvantage of this approach is that we can't batch all the geometry of
a scene that involves intermediate renders to offscreen framebufers.
Clutter is doing this more and more with applications that use the
ClutterEffect APIs so this is a shame. Hopefully this will only be a
stop-gap solution while we consider how to reliably support journal
logging across framebuffer changes.
When flushing a clip stack that contains more than one rectangle which
needs to use the stencil buffer the code takes a different path so
that it can combine the new rectangle with the existing contents of
the stencil buffer. However it was not correctly flushing the
modelview and projection matrices so that rectangle would be in the
wrong place.
This adds a COGL_DEBUG=clipping option that reports how the clip is
being flushed. This is needed to determine whether the scissor,
stencil clip planes or software clipping is being used.
The CoglDebugFlags are now stored in an array of unsigned ints rather
than a single variable. The flags are accessed using macros instead of
directly peeking at the cogl_debug_flags variable. The index values
are stored in the enum rather than the actual mask values so that the
enum doesn't need to be more than 32 bits wide. The hope is that the
code to determine the index into the array can be optimized out by the
compiler so it should have exactly the same performance as the old
code.
The lighting parameters such as the diffuse and ambient colors were
previously only flushed in the fixed vertend. This meant that if a
vertex shader was used then they would not be set. The lighting
parameters are uniforms which are just as useful in a fragment shader
so it doesn't really make sense to set them in the vertend. They are
now flushed in the common cogl-pipeline-opengl code but the code is
#ifdef'd for GLES2 because they need to be part of the progend in that
case.
The uniforms for the alpha test reference value and point size on
GLES2 are updating using similar code. This generalizes the code so
that there is a static array of predefined builtin uniforms which
contains the uniform name, a pointer to a function to get the value
from the pipeline, a pointer to a function to update the uniform and a
flag representing which CoglPipelineState change affects the
uniform. The uniforms are then updated in a loop. This should simplify
adding more builtin uniforms.
The builtin uniforms are accessible from either the vertex shader or
the fragment shader so we should define them in the common
section. This doesn't really matter for the current list of uniforms
because it's pretty unlikely that you'd want to access the matrices
from the fragment shader, but for other builtins such as the lighting
material properties it makes sense.
When we added the texture->framebuffers member a _cogl_texture_init
funciton was added to initialize the list of framebuffers associated
with a texture to NULL. All the backends were updated except the
x11 tfp backend. This was causing crashes in test-pixmap.
This is part of a broader cleanup of some of the experimental Cogl API.
One of the reasons for this particular rename is to reduce the verbosity
of using the API. Another reason is that CoglVertexArray is going to be
renamed CoglAttributeBuffer and we want to help emphasize the
relationship between CoglAttributes and CoglAttributeBuffers.
We have a bunch of experimental convenience functions like
cogl_primitive_p2/p2t2 that have corresponding vertex structures but it
seemed a bit odd to have the vertex annotation e.g. "P2T2" be an infix
of the type like CoglP2T2Vertex instead of be a postfix like
CoglVertexP2T2. This switches them all to follow the postfix naming
style.
COGL_DEBUG=disable-fast-read-pixel can be used to disable the
optimization for reading a single pixel colour back by looking at the
geometry in the journal and not involving the GPU. With this disabled we
will always flush the journal, rendering to the framebuffer and then use
glReadPixels to get the result.
This adds a transparent optimization to cogl_read_pixels for when a
single pixel is being read back and it happens that all the geometry of
the current frame is still available in the framebuffer's associated
journal.
The intention is to indirectly optimize Clutter's render based picking
mechanism in such a way that the 99% of cases where scenes are comprised
of trivial quad primitives that can easily be intersected we can avoid
the latency of kicking a GPU render and blocking for the result when we
know we can calculate the result manually on the CPU probably faster
than we could even kick a render.
A nice property of this solution is that it maintains all the
flexibility of the render based picking provided by Clutter and it can
gracefully fall back to GPU rendering if actors are drawn using anything
more complex than a quad for their geometry.
It seems worth noting that there is a limitation to the extensibility of
this approach in that it can only optimize picking a against geometry
that passes through Cogl's journal which isn't something Clutter
directly controls. For now though this really doesn't matter since
basically all apps should end up hitting this fast-path. The current
idea to address this longer term would be a pick2 vfunc for ClutterActor
that can support geometry and render based input regions of actors and
move this optimization up into Clutter instead.
Note: currently we don't have a primitive count threshold to consider
that there could be scenes with enough geometry for us to compensate for
the cost of kicking a render and determine a result more efficiently by
utilizing the GPU. We don't currently expect this to be common though.
Note: in the future it could still be interesting to revive something
like the wip/async-pbo-picking branch to provide an asynchronous
read-pixels based optimization for Clutter picking in cases where more
complex input regions that necessitate rendering are in use or if we do
add a threshold for rendering as mentioned above.
Both cogl_matrix_transform_points and _project_points take points_in and
points_out arguments and explicitly allow pointing to the same array
(i.e. to transform in-place) The implementation of the various internal
transform functions though were not handling this possability and so it
was possible the reference partially transformed vertex values as if
they were original input values leading to incorrect results. This patch
ensures we take a temporary copy of the current input point when
transforming.
This adds a utility function that can determine if a given point
intersects an arbitrary polygon, by counting how many edges a
"semi-infinite" horizontal ray crosses from that point. The plan is to
use this for a software based read-pixel fast path that avoids using the
GPU to rasterize journaled primitives and can instead intersect a point
being read with quads in the journal to determine the correct color.
This adds a stop-gap mechanism for Cogl to know when the window system
is requested to present the current backbuffer to the frontbuffer by
adding a _cogl_swap_buffers_notify function that backends are now
expected to call right after issuing the equivalent request to OpenGL
vie the platforms OpenGL binding layer. This (blindly) updates all the
backends to call this new function.
For now Cogl doesn't do anything with the notification but the intention
is to use it as part of a planned read-pixel optimization which will
need to reset some state at the start of each new frame.
Instead of having _cogl_get/set_clip stack which reference the global
CoglContext this instead makes those into CoglClipState method functions
named _cogl_clip_state_get/set_stack that take an explicit pointer to a
CoglClipState.
This also adds _cogl_framebuffer_get/set_clip_stack convenience
functions that avoid having to first get the ClipState from a
framebuffer then the stack from that - so we can maintain the
convenience of _cogl_get_clip_stack.
This adds an internal function to be able to query the screen space
bounding box of the current clip entries contained in a given
CoglClipStack.
This bounding box which is cheap to determine can be useful to know the
largest extents that might be updated while drawing with this clip
stack.
For example the plan is to use this as part of an optimized read-pixel
path handled on the CPU which will need to track the currently valid
extents of the last call to cogl_clear()
Instead of having a single journal per context, we now have a
CoglJournal object for each CoglFramebuffer. This means we now don't
have to flush the journal when switching/pushing/popping between
different framebuffers so for example a Clutter scene that involves some
ClutterEffect actors that transiently redirect to an FBO can still be
batched.
This also allows us to track state in the journal that relates to the
current frame of its associated framebuffer which we'll need for our
optimization for using the CPU to handle reading a single pixel back
from a framebuffer when we know the whole scene is currently comprised
of simple rectangles in a journal.
This adds an internal alternative to cogl_object_set_user_data that also
passes an instance pointer to destroy notify callbacks.
When setting private data on a CoglObject it's often desirable to know
the instance being destroyed when we are being notified to free the
private data due to the object being freed. The typical solution to this
is to track a pointer to the instance in the private data itself so it
can be identified but that usually requires an extra micro allocation
for the private data that could have been avoided if only the callback
were given an instance pointer.
The new internal _cogl_object_set_user_data passes the instance pointer
as a second argument which means it is ABI compatible for us to layer
the public version on top of this internal function.
This moves the implementation of cogl_clear into cogl-framebuffer.c as
two new internal functions _cogl_framebuffer_clear and
_cogl_framebuffer_clear4f. It's not clear if this is what the API will
look like as we make more of the CoglFramebuffer API public due to the
limitations of using flags to identify buffers when framebuffers may
contain any number of ancillary buffers but conceptually it makes some
sense to tie the operation of clearing a color buffer to a framebuffer.
The short term intention is to enable tracking the current clear color
as a property of the framebuffer as part of an optimization for reading
back single pixels when the geometry is simple enough that we can
compute the result quickly on the CPU. (If the point doesn't intersect
any geometry we'll need to return the last clear color.)
Previously most of the code for cogl-program and cogl-shader was
ifdef'd out for GLES 1.1 and alternate stub definitions were
defined. This patch removes those and instead puts #ifdef's directly
in the functions that need it. This should make it a little bit easier
to maintain.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2516
When determining whether to hash the combine constant Cogl checks the
arguments to the combine funcs to determine whether the combine
constant is used. However is was using the GLenums GL_CONSTANT_COLOR
and GL_CONSTANT_ALPHA but these are not valid values for the
CoglPipelineCombineSource enum so presumably the constant would never
get hashed. This patch makes it use Cogl's enum of
COGL_PIPELINE_COMBINE_SOURCE_CONSTANT instead.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2516
GLES has an extension called GL_OES_mapbuffer to support mapping
buffer objects but only for writing. Cogl now has two new feature
flags to advertise whether mapping for reading and writing is
supported. Under OpenGL, these features are always set if the VBO
extension is advertised and under GLES only the write flag is set if
the GL_OES_mapbuffer extension is advertised.
In the journal code and when generating the stroke path the vertices
are generated on the fly and stored in a CoglBuffer using
cogl_buffer_map. However cogl_buffer_map is allowed to fail but it
wasn't checking for a NULL return value. In particular on GLES it will
always fail because glMapBuffer is only provided by an extension. This
adds a new pair of internal functions called
_cogl_buffer_{un,}map_for_fill_or_fallback which wrap
cogl_buffer_map. If the map fails then it will instead return a
pointer into a GByteArray attached to the context. When the buffer is
unmapped the array is copied into the buffer using
cogl_buffer_set_data.
On GLES2 there's no builtin mechanism to replace texture coordinates
with point sprite coordinates so calling glEnable(GL_POINT_SPRITE)
isn't valid. Instead the point sprite coords are implemented by using
a special builtin varying variable in GLSL.
There are several places where we need to compare the texture state of a
pipeline and sometimes we need to take into consideration if the
underlying texture has changed but other times we may only care to know
if the texture target has changed.
For example the fragends typically generate programs that they want to
share with all pipelines with equivalent fragment processing state, and
in this case when comparing pipelines we only care about the texture
targets since changes to the underlying texture won't affect the
programs generated.
Prior to this we had tried to handle this by passing around some special
flags to various functions that evaluate pipeline state to say when we
do/don't care about the texture data, but this wasn't working in all
cases and was more awkward to manage than the new approach.
Now we simply have two state bits:
COGL_PIPELINE_LAYER_STATE_TEXTURE_TARGET and
COGL_PIPELINE_LAYER_STATE_TEXTURE_DATA and CoglPipelineLayer has an
additional target member. Since all the appropriate code takes masks of
these state bits to determine what to evaluate we don't need any extra
magic flags.
When notifying that a pipeline property is going to change, then at
times a pipeline will take over being the authority of the corresponding
state group. Some state groups can contain multiple properties and so to
maintain the integrity of all of the properties we have to initialize
all the property values in the new authority. For state groups with only
one property we don't have to initialize anything during the
pre_change_notify() because we can assume the value will be initialized
as part of the change being notified.
This patch optimizes how we handle this initialization of state groups
in a couple of ways; firstly we no longer do anything to initialize
state-groups with only one property, secondly we no longer use
_cogl_pipeline_copy_differences - (we have a new
_cogl_pipeline_init_multi_property_sparse_state() func) so we can avoid
lots calls to handle_automatic_blend_enable() which is sometimes seen
high in sysprof profiles.
Previously atlasing would be disabled if the GL driver does not
support reading back texture data. This meant that atlasing would not
happen on GLES. However we also require that the driver support FBOs
and the texture data is only read back as a fallback if the FBO
fails. Therefore the atlas should be ok on GLES 2 which has FBO
support in core.
We try and bail out of flushing pipeline state asap if we can see the
pipeline has already been flushed and hasn't changed but we weren't
checking to see if the skip_gl_color flag is the same as when it was
last flush too and so we'd sometimes bail out without updating the
glColor correctly.
When an item is added to the journal the current pipeline immediately
gets the legacy state applied to it and the modified pipeline is
logged instead of the original. However the actual drawing from the
journal is done using the vertex attribute API which was also applying
the legacy state. This meant that the legacy state used would be a
combination of the state set when the journal entry was added as well
as the state set when the journal is flushed. To fix this there is now
an extra CoglDrawFlag to avoid applying the legacy state when setting
up the GL state for the vertex attributes. The journal uses this flag
when flushing.
The vertex attribute API assumes that if there is a color array
enabled then we can't determine if the colors are opaque so we have to
enable blending. The journal always uses a color array to avoid
switching color state between rectangles. Since the journal switched
to using vertex attributes this means we effectively always enable
blending from the journal. To fix this there is now a new flag for
_cogl_draw_vertex_attributes to specify that the color array is known
to only contain opaque colors which causes the draw function not to
copy the pipeline. If the pipeline has blending disabled then the
journal passes this flag.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2481
There is an internal version of cogl_draw_vertex_attributes_array
which previously just bypassed the framebuffer flushing, journal
flushing and pipeline validation so that it could be used to draw the
journal. This patch generalises the function so that it takes a set of
flags to specify which parts to flush. The public version of the
function now just calls the internal version with the flags set to
0. The '_real' version of the function has now been merged into the
internal version of the function because it was only called in one
place. This simplifies the code somewhat. The common code which
flushed the various state has been moved to a separate function. The
indexed versions of the functions have had a similar treatment.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2481
Cogl no longer has any code that assumes the buffer in a CoglBitmap is
allocated to the full size of height*rowstride. We should comment that
this is the case so that we remember to keep it that way. This is
important for cogl_texture_new_from_data because the application may
have created the data from a sub-region of a larger image and in that
case it's not safe to read the full rowstride of the last row when the
sub region contains the last row of the larger image.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2491
When uploading data for GLES we need to deal with cases where the
rowstride is too large to be described only by GL_UNPACK_ALIGNMENT
because there is no GL_UNPACK_ROW_LENGTH. Previously for the
sub-region uploading code it would always copy the bitmap and for the
code to upload the whole image it would copy the bitmap unless the
rowstride == bpp*width. Neither paths took into account that we don't
need to copy if the rowstride is just an alignment of bpp*width. This
moves the bitmap copying code to a separate function that is used by
both upload methods. It only copies the bitmap if the rowstride is not
just an alignment of bpp*width.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2491
The ffs function is defined in C99 so if we want to use it in Cogl we
need to provide a fallback for MSVC. This adds a configure check for
the function and then a fallback using a while loop if it is not
available.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2491
If we have to copy the bitmap to do the premultiplication then we were
previously using the rowstride of the source image as the rowstride
for the new image. This is wasteful if the source image is a subregion
of a larger image which would make it use a large rowstride. If we
have to copy the data anyway we might as well compact it to the
smallest rowstride. This also prevents the copy from reading past the
end of the last row of pixels.
An internal function called _cogl_bitmap_copy has been added to do the
copy. It creates a new bitmap with the smallest possible rowstride
rounded up the nearest multiple of 4 bytes. There may be other places
in Cogl that are currently assuming we can read height*rowstride of
the source buffer so they may want to take advantage of this function
too.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2491
The builtin vertex attribute for the normals was incorrectly checked
for as 'cogl_normal' however it is defined as cogl_normal_in in the
shader boilerplate and for the name generated by CoglVertexBuffer.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2499
The ARBfp fragend was bypassing generating a shader if the pipeline
contains a user program. However it shouldn't do this if the pipeline
only contains a vertex shader. This was breaking
test-cogl-just-vertex-shader.
Previously Cogl would only ever use one atlas for textures and if it
reached the maximum texture size then all other new textures would get
their own GL texture. This patch makes it so that we create as many
atlases as needed. This should avoid breaking up some batches and it
will be particularly good if we switch to always using multi-texturing
with a default shader that selects between multiple atlases using a
vertex attribute.
Whenever a new atlas is created it is stored in a GSList on the
context. A weak weference is taken on the atlas using
cogl_object_set_user_data so that it can be removed from the list when
the atlas is destroyed. The atlas textures themselves take a reference
to the atlas and this is the only thing that keeps the atlas
alive. This means that once the atlas becomes empty it will
automatically be destroyed.
All of the COGL_NOTEs pertaining to atlases are now prefixed with the
atlas pointer to make it clearer which atlas is changing.
All of the drawing needed in _cogl_add_path_to_stencil_buffer is done
with the vertex attribute API so there should be no need to flush the
enable flags to enable the vertex array. This was causing problems on
GLES2 where the vertex array isn't available.
The GLES2 wrapper is no longer needed because the shader generation is
done within the GLSL fragend and vertend and any functions that are
different for GLES2 are now guarded by #ifdefs.
Once the GLES2 wrapper is removed then we won't have the GLenums
needed for setting up the layer combine state. This adds Cogl enums
instead which have the same values as the corresponding GLenums. The
enums are:
CoglPipelineCombineFunc
CoglPipelineCombineSource
and
CoglPipelineCombineOp
Once the GLES2 wrapper is removed we won't be able to upload the
matrices with the fixed function API any more. The fixed function API
gives a global state for setting the matrix but if a custom shader
uniform is used for the matrices then the state is per
program. _cogl_matrix_stack_flush_to_gl is called in a few places and
it is assumed the current pipeline doesn't need to be flushed before
it is called. To allow these semantics to continue to work, on GLES2
the matrix flush now just stores a reference to the matrix stack in
the CoglContext. A pre_paint virtual is added to the progend which is
called whenever a pipeline is flushed, even if the same pipeline was
flushed already. This gives the GLSL progend a chance to upload the
matrices to the uniforms. The combined modelview/projection matrix is
only calculated if it is used. The generated programs end up never
using the modelview or projection matrix so it usually only has to
upload the combined matrix. When a matrix stack is flushed a reference
is taked to it by the pipeline progend and the age is stored so that
if the same state is used with the same program again then we don't
need to reupload the uniform.
Sometimes it would be useful if we could efficiently track when a matrix
stack has been modified. For example on GLES2 we have to upload the
modelview as a uniform to our glsl programs but because the modelview
state is part of the framebuffer state it becomes a bit more tricky to
know when to re-sync the value of the uniform with the framebuffer
state. This adds an "age" counter to CoglMatrixStack which is
incremented for any operation that effectively modifies the top of the
stack so now we can save the age of the stack inside the pipeline
whenever we update modelview uniform and later compare that with the
stack to determine if it has changed.
This returns the layer matrix given a pipeline and a layer index. The
API is kept as internal because it directly returns a pointer into the
layer private data to avoid a copy into an out-param. We might also
want to add a public function which does the copy.
When the GLES2 wrapper is removed we can't use the fixed function API
such as glColorPointer to set the builtin attributes. Instead the GLSL
progend now maintains a cache of attribute locations that are queried
with glGetAttribLocation. The code that previously maintained a cache
of the enabled texture coord arrays has been modified to also cache
the enabled vertex attributes under GLES2. The vertex attribute API is
now the only place that is using this cache so it has been moved into
cogl-vertex-attribute.c
Previously when stroking a path it was flushing a pipeline and then
directly calling glDrawArrays to draw the line strip from the path
nodes array. This patch changes it to build a CoglVertexArray and a
series of attributes to paint with instead. The vertex array and
attributes are attached to the CoglPath so it can be reused later. The
old vertex array for filling has been renamed to fill_vbo.
The code to display the source when the show-source debug option is
given has been moved to _cogl_shader_set_source_with_boilerplate so
that it will show both user shaders and generated shaders. It also
shows the code with the full boilerplate. To make it the same for
ARBfp, cogl_shader_compile_real now also dumps user ARBfp shaders.
The GLSL vertend is mostly only useful for GLES2. The fixed function
vertend is kept at higher priority than the GLSL vertend so it is
unlikely to be used in any other circumstances.
Due to Mesa bug 28585 calling glVertexAttrib with attrib location 0
doesn't appear to work. This patch just reorders the vertex and color
attributes in the shader in the hope that Mesa will assign the color
attribute to a different location.
Some builtin attributes such as the matrix uniforms and some varyings
were missing from the boilerplate for GLES2. This also moves the
texture matrix and texture coord attribute declarations to
cogl-shader.c so that they can be dynamically defined depending on the
number of texture coord arrays enabled.
The vertends are intended to flush state that would be represented in
a vertex program. Code to handle the layer matrix, lighting and
point size has now been moved from the common cogl-pipeline-opengl
backend to the fixed vertend.
'progend' is short for 'program backend'. The progend is intended to
operate on combined state from a fragment backend and a vertex
backend. The progend has an 'end' function which is run whenever the
pipeline is flushed and the two pipeline change notification
functions. All of the progends are run whenever the pipeline is
flushed instead of selecting a single one because it is possible that
multiple progends may be in use for example if the vertends and
fragends are different. The GLSL progend will take the shaders
generated by the fragend and vertend and link them into a single
program. The fragend code has been changed to only generate the shader
and not the program. The idea is that pipelines can share fragment
shader objects even if their vertex state is different. The authority
for the progend needs to be the combined authority on the vertend and
fragend state.
This adds two internal functions:
gboolean
_cogl_program_has_fragment_shader (CoglHandle handle);
gboolean
_cogl_program_has_vertex_shader (CoglHandle handle);
They just check whether any of the contained shaders are of that type.
The pipeline function _cogl_pipeline_find_codegen_authority has been
renamed to _cogl_pipeline_find_equivalent_parent and it now takes a
set of flags for the pipeline and layer state that affects the
authority. This is needed so that we can reuse the same code in the
vertend and progends.
Previously enabling and disabling textures was done whatever the
backend in cogl-pipeline-opengl. However enabling and disabling
texture targets only has any meaning if no fragment shaders are being
used so this patch moves the code to cogl-pipeline-fragend-fixed.
The GLES2 wrapper has also been changed to ignore enabledness when
deciding whether to update texture coordinate attribute pointers.
The current Cogl pipeline backends are entirely concerned with the
fragment processing state. We also want to eventually have separate
backends to generate shaders for the vertex processing state so we
need to rename the fragment backends. 'Fragend' is a somewhat weird
name but we wanted to avoid ending up with illegible symbols like
CoglPipelineFragmentBackendGlslPrivate.
We are currently using a pipeline as a key into our arbfp program cache
but because we weren't making a copy of the pipelines used as keys there
were times when doing a lookup in the cache would end up trying to
compare a lookup key with an entry key that would point to invalid
memory.
Note: the current approach isn't ideal from the pov that that key
pipeline may reference some arbitrarily large user textures will now be
kept alive indefinitely. The plan to improve on this is that we will
have a mechanism to create a special "key pipeline" which will derive
from the default Cogl pipeline (to avoid affecting the lifetime of
other pipelines) and only copy state from the original pipeline that
affects the arbfp program and will reference small dummy textures
instead of potentially large user textures.
In the arbfp backend there is a seqential approach to finding a suitable
arbfp program to use for a given pipeline; first we see if there's
already a program associated with the pipeline, 2nd we try and find a
program associated with the "arbfp-authority" 3rd we try and lookup a
program in a cache and finally we resort to starting code-generation for
a new program. This patch slightly reworks the code of these steps to
hopefully make them a bit clearer.
_cogl_pipeline_needs_blending_enabled tries to determine whether each
layer is using the default combine state. However it was using
argument 0 for both checks so the if-statement would never be true.
There are a set of "EvalFlags" that get passed to _cogl_pipeline_hash
that can tweak the semantics of what state is evaluated for hashing but
these flags weren't getting passed via the HashState state structure
so it would be undefined if you would get the correct semantics.
According to 9cc9033347 the windows headers #define near as nothing,
and presumable the same is true for 'far' too. Apparently this define is
to improve compatibility with code written for Windows 3.1, so it's good
that people will be able to incorporate such code into their Clutter
applications.
We were trying to declare and initializing an arbfp program cache for
GLES but since the prototypes for the _hash and _equal functions were
only available for GL this broke the GLES builds. By #ifdefing the code
to conditionally declare/initialize for GL only this should hopefully
fix GLES builds.
The constant 'True' is defined by Xlib which isn't used for all clutter
builds so this replaces occurrences of True with TRUE which is defined
by glib. This should hopefully fix the win32 builds.
This adds a cache (A GHashTable) of ARBfp programs and before ever
starting to code-generate a new program we will always first try and
find an existing program in the cache. This uses _cogl_pipeline_hash and
_cogl_pipeline_equal to hash and compare the keys for the cache.
There is a new COGL_DEBUG=disable-program-caches option that can disable
the cache for debugging purposes.
This allows us to get a hash for a set of state groups for a given
pipeline. This can be used for example to get a hash of the fragment
processing state of a pipeline so we can implement a cache for compiled
arbfp/glsl programs.
_cogl_pipeline_equal now accepts a mask of pipeline differences and layer
differences to constrain what state will be compared. In addition a set
of flags are passed that can tweak the comparison semantics for some
state groups. For example when comparing layer textures we sometimes
only need to compare the texture target and can ignore the data itself.
In updating the code this patch also changes it so all required pipeline
authorities are resolved in one step up-front instead of resolving the
authority for each state group in turn and repeatedly having to traverse
the pipeline's ancestry. This adds two new functions
_cogl_pipeline_resolve_authorities and
_cogl_pipeline_layer_resolve_authorities to handle resolving a set of
authorities.
This removes the unused array of per-packend priv data pointers
associated with every CoglPipelineLayer. This reduces the size of all
layer allocations and avoids having to zero an array for each
_cogl_pipeline_layer_copy.
A non-static function named cogl_object_get_type was inadvertently added
during the addition of the CoglObject base type, but there is no public
prototype in the headers and it's only referenced inside cogl-object.c
to implement cogl_handle_get_type() for compatibility. This removes the
function since we don't want to commit to CoglObject always simply being
a boxed type. In the future we may want to register hierarchical
GTypeInstance based types.
To allow us to have gobject properties that accept a CoglMatrix value we
need to register a GType. This adds a cogl_gtype_matrix_get_type function
that will register a static boxed type called "CoglMatrix".
This adds a new section to the reference manual for GType integration
functions.
As a pre-requisite for being able to register a boxed GType for
CoglMatrix (enabling us to define gobject properties that accept a
CoglMatrix) this adds cogl_matrix_copy and _free functions.
In _cogl_pipeline_needs_blending_enabled after first checking whether
the property most recently changed requires blending we would then
resort to checking all other properties too in case some other state
also requires blending. We now avoid checking all other properties in
the case that blending was previously disabled and checking the property
recently changed doesn't require blending.
Note: the plan is to improve this further by explicitly keeping track
of the properties that currently cause blending to be enabled so that we
never have to resort to checking all other properties we can constrain
the checks to those masked properties.
This moves _cogl_pipeline_get_parent and _cogl_pipeline_get_authority
into cogl-pipeline-private.h so they can be inlined since they have been
seen to get quite high in profiles. Given that they both contain such
small amounts of code the function call overhead is significant.
This adds a debug option called disable-software-clipping which causes
the journal to always log the clip stack state rather than trying to
manually clip rectangles.
Before flushing the journal there is now a separate iteration that
will try to determine if the matrix of the clip stack and the matrix
of the rectangle in each entry are on the same plane. If they are it
can completely avoid the clip stack and instead manually modify the
vertex and texture coordinates to implement the clip. The has the
advantage that it won't break up batching if a single clipped
rectangle is used in a scene.
The software clip is only used if there is no user program and no
texture matrices. There is a threshold to the size of the batch where
it is assumed that it is worth the cost to break up a batch and
program the GPU to do the clipping. Currently this is set to 8
although this figure is plucked out of thin air.
To check whether the two matrices are on the same plane it tries to
determine if one of the matrices is just a simple translation of the
other. In the process of this it also works out what the translation
would be. These values can be used to translate the clip rectangle
into the coordinate space of the rectangle to be logged. Then we can
do the clip directly in the rectangle's coordinate space.
Previously in cogl-clip-state.c when it detected that the current
modelview matrix is screen-aligned it would convert the clip entry to
a window clip. Instead of doing this cogl-clip-stack.c now contains
the detection and keeps the entry as a rectangle clip but marks that
it is entirely described by its scissor rect. When flusing the clip
stack it doesn't do anything extra for entries that have this mark
(because the clip will already been setup by the scissor). This is
needed so that we can still track the original rectangle coordinates
and modelview matrix to help detect when it would be faster to modify
the rectangle when adding it to the journal rather than having to
break up the batch to set the clip state.
When logging a quad we now only store the 2 vertices representing the
top left and bottom right of the quad. The color is only stored once
per entry. Once we come to upload the data we expand the 2 vertices
into four and copy the color to each vertex. We do this by mapping the
buffer and directly expanding into it. We have to copy the data before
we can render it anyway so it doesn't make much sense to expand the
vertices before uploading and this way should save some space in the
size of the journal. It also makes it slightly easier if we later want
to do pre-processing on the journal entries before uploading such as
doing software clipping.
The modelview matrix is now always copied to the journal entry whereas
before it would only be copied if we aren't doing software
transform. The journal entry struct always has the space for the
modelview matrix so hopefully it's only a small cost to copy the
matrix.
The transform for the four entries is now done using
cogl_matrix_transform_points which may be slightly faster than
transforming them each individually with a call to
cogl_matrix_transfom.
This reverts commit 4cfe90bde2.
GLSL 1.00 on GLES doesn't support unsized arrays so the whole idea
can't work.
Conflicts:
clutter/cogl/cogl/cogl-pipeline-glsl.c
The check for whether we can reuse a program we've already generated
was only being done if the pipeline already had a
glsl_program_state. When there is no glsl_program_state it then looks
for the nearest ancestor it can share the program with. It then
wasn't checking whether that ancestor already had a GL program so it
would start generating the source again. It wouldn't however compile
that source again because _cogl_pipeline_backend_glsl_end does check
whether there is already a program. This patch moves the check until
after it has found the glsl_program_state, whether or not it was found
from an ancestor or as its own state.
Under GLES2 we were defining the cogl_tex_coord_in varying as an array
with a size determined by the number of texture coordinate arrays
enabled whenever the program is used. This meant that we may have to
regenerate the shader with a different size if the shader is used with
more texture coord arrays later. However in OpenGL the equivalent
builtin varying gl_TexCoord is simply defined as:
varying vec4 gl_TexCoord[]; /* <-- no size */
GLSL is documented that if you declare an array with no size then you
can only access it with a constant index and the size of the array
will be determined by the highest index used. If you want to access it
with a non-constant expression you need to redeclare the array
yourself with a size.
We can replicate the same behaviour in our Cogl shaders by instead
declaring the cogl_tex_coord_in with no size. That way we don't have
to pass around the number of tex coord attributes enabled when we
flush a material. It also means that CoglShader can go back to
directly uploading the source string to GL when cogl_shader_source is
called so that we don't have to keep a copy of it around.
If the user wants to access cogl_tex_coord_in with a non-constant
index then they can simply redeclare the array themself. Hopefully
developers will expect to have to do this if they are accustomed to
the gl_TexCoord array.
When compiling for GLES2, the codegen is affected by state other than
the layers. That means when we find an authority for the codegen state
we can't directly look at authority->n_layers to determine the number
of layers because it isn't necessarily the layer state authority. This
patch changes it to use cogl_pipeline_get_n_layers instead. Once we
have two authorities that differ in codegen state we then compare all
of the layers to decide if they would affect codegen. However it was
ignoring the fact that the authorities might also differ by the other
codegen state. This path also adds an extra check for whether
_cogl_pipeline_compare_differences contains any codegen bits other
than COGL_PIPELINE_STATE_LAYERS.