Instead of flushing the journal whenever the current framebuffer on a
context is changed it is now flushed whenever the framebuffer is about
to be destroyed instead. To do this it implements a custom unref
function which detects when there is going to be exactly one reference
on the framebuffer and then flushes its journal. The journal now
always has a reference on the framebuffer whenever it is non-empty.
That means the unref will only cause a flush if the only thing keeping
the framebuffer alive is the entries in the journal.
Reviewed-by: Robert Bragg <robert@linux.intel.com>
As part of the on going, incremental effort to purge the non type safe
CoglHandle type from the Cogl API this patch tackles most of the
CoglHandle uses relating to textures.
We'd postponed making this change for quite a while because we wanted to
have a clearer understanding of how we wanted to evolve the texture APIs
towards Cogl 2.0 before exposing type safety here which would be
difficult to change later since it would imply breaking APIs.
The basic idea that we are steering towards now is that CoglTexture
can be considered to be the most primitive interface we have for any
object representing a texture. The texture interface would provide
roughly these methods:
cogl_texture_get_width
cogl_texture_get_height
cogl_texture_can_repeat
cogl_texture_can_mipmap
cogl_texture_generate_mipmap;
cogl_texture_get_format
cogl_texture_set_region
cogl_texture_get_region
Besides the texture interface we will then start to expose types
corresponding to specific texture types: CoglTexture2D,
CoglTexture3D, CoglTexture2DSliced, CoglSubTexture, CoglAtlasTexture and
CoglTexturePixmapX11.
We will then also expose an interface for the high-level texture types
we have (such as CoglTexture2DSlice, CoglSubTexture and
CoglAtlasTexture) called CoglMetaTexture. CoglMetaTexture is an
additional interface that lets you iterate a virtual region of a meta
texture and get mappings of primitive textures to sub-regions of that
virtual region. Internally we already have this kind of abstraction for
dealing with sliced texture, sub-textures and atlas textures in a
consistent way, so this will just make that abstraction public. The aim
here is to clarify that there is a difference between primitive textures
(CoglTexture2D/3D) and some of the other high-level textures, and also
enable developers to implement primitives that can support meta textures
since they can only be used with the cogl_rectangle API currently.
The thing that's not so clean-cut with this are the texture constructors
we have currently; such as cogl_texture_new_from_file which no longer
make sense when CoglTexture is considered to be an interface. These
will basically just become convenient factory functions and it's just a
bit unusual that they are within the cogl_texture namespace. It's worth
noting here that all the texture type APIs will also have their own type
specific constructors so these functions will only be used for the
convenience of being able to create a texture without really wanting to
know the details of what type of texture you need. Longer term for 2.0
we may come up with replacement names for these factory functions or the
other thing we are considering is designing some asynchronous factory
functions instead since it's so often detrimental to application
performance to be blocked waiting for a texture to be uploaded to the
GPU.
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Previously whenever the journal is flushed a new vertex array would be
created to contain the vertices. To avoid the overhead of reallocating
a buffer every time, this patch makes it use a pool of 8 buffers which
are cycled in turn. The buffers are never destroyed but instead the
data is replaced. The journal should only ever be using one buffer at
a time but we cache more than one buffer anyway in case the GL driver
is internally using the buffer in which case mapping the buffer may
cause it to create a new buffer anyway.
This migrates all the GLX window system code down from the Clutter
backend code into a Cogl winsys. Moving OpenGL window system binding
code down from Clutter into Cogl is the biggest blocker to having Cogl
become a standalone 3D graphics library, so this is an important step in
that direction.
This adds a transparent optimization to cogl_read_pixels for when a
single pixel is being read back and it happens that all the geometry of
the current frame is still available in the framebuffer's associated
journal.
The intention is to indirectly optimize Clutter's render based picking
mechanism in such a way that the 99% of cases where scenes are comprised
of trivial quad primitives that can easily be intersected we can avoid
the latency of kicking a GPU render and blocking for the result when we
know we can calculate the result manually on the CPU probably faster
than we could even kick a render.
A nice property of this solution is that it maintains all the
flexibility of the render based picking provided by Clutter and it can
gracefully fall back to GPU rendering if actors are drawn using anything
more complex than a quad for their geometry.
It seems worth noting that there is a limitation to the extensibility of
this approach in that it can only optimize picking a against geometry
that passes through Cogl's journal which isn't something Clutter
directly controls. For now though this really doesn't matter since
basically all apps should end up hitting this fast-path. The current
idea to address this longer term would be a pick2 vfunc for ClutterActor
that can support geometry and render based input regions of actors and
move this optimization up into Clutter instead.
Note: currently we don't have a primitive count threshold to consider
that there could be scenes with enough geometry for us to compensate for
the cost of kicking a render and determine a result more efficiently by
utilizing the GPU. We don't currently expect this to be common though.
Note: in the future it could still be interesting to revive something
like the wip/async-pbo-picking branch to provide an asynchronous
read-pixels based optimization for Clutter picking in cases where more
complex input regions that necessitate rendering are in use or if we do
add a threshold for rendering as mentioned above.
Instead of having a single journal per context, we now have a
CoglJournal object for each CoglFramebuffer. This means we now don't
have to flush the journal when switching/pushing/popping between
different framebuffers so for example a Clutter scene that involves some
ClutterEffect actors that transiently redirect to an FBO can still be
batched.
This also allows us to track state in the journal that relates to the
current frame of its associated framebuffer which we'll need for our
optimization for using the CPU to handle reading a single pixel back
from a framebuffer when we know the whole scene is currently comprised
of simple rectangles in a journal.
Before flushing the journal there is now a separate iteration that
will try to determine if the matrix of the clip stack and the matrix
of the rectangle in each entry are on the same plane. If they are it
can completely avoid the clip stack and instead manually modify the
vertex and texture coordinates to implement the clip. The has the
advantage that it won't break up batching if a single clipped
rectangle is used in a scene.
The software clip is only used if there is no user program and no
texture matrices. There is a threshold to the size of the batch where
it is assumed that it is worth the cost to break up a batch and
program the GPU to do the clipping. Currently this is set to 8
although this figure is plucked out of thin air.
To check whether the two matrices are on the same plane it tries to
determine if one of the matrices is just a simple translation of the
other. In the process of this it also works out what the translation
would be. These values can be used to translate the clip rectangle
into the coordinate space of the rectangle to be logged. Then we can
do the clip directly in the rectangle's coordinate space.
When logging quads in the journal it used to be possible to specify a
mask of fallback layers (layers where a default white texture should be
used in-place of the corresponding texture in the current source
pipeline). Since we now handle fallbacks for cogl_rectangle* primitives
when validating the pipeline up-front before logging in the journal we
no longer need the ability for the journal to apply fallbacks too.
This removes the possibility to specify wrap mode overrides within a
CoglPipelineFlushOptions struct since the right way to handle these
overrides is by copying the user's material and making the changes to
that copy before flushing. All primitives code has already switched away
from using these wrap mode overrides so this patch just removes unused
code and types. It also remove the wrap_mode_overrides argument for
_cogl_journal_log_quad.
Since d5634e37 the sliced texture backend now works in terms of
CoglTexture2Ds so there's no need to have special casing for
overriding the texture of a pipeline layer with a GL handle. Instead
we can just use cogl_pipeline_set_layer_texture with the
CoglHandle. The special _cogl_pipeline_set_layer_gl_texture_slice
function has now been removed and parts of the code for comparing
materials have been simplified.
When adding a new entry to the journal a reference is now taken on the
current clip stack. Modifying the current clip state no longer causes
a journal flush. The journal flushing code now has an extra stage to
compare the clip state of each entry. The comparison can simply be
done by comparing the pointers. Although different clip states will
still end up with multiple draw calls this at leasts allows a scene
comprising of multiple different clips to be upload with one vbo. It
also lays the groundwork to do certain tricks when drawing clipped
rectangles such as modifying the geometry instead of setting a clip
state.
This applies an API naming change that's been deliberated over for a
while now which is to rename CoglMaterial to CoglPipeline.
For now the new pipeline API is marked as experimental and public
headers continue to talk about materials not pipelines. The CoglMaterial
API is now maintained in terms of the cogl_pipeline API internally.
Currently this API is targeting Cogl 2.0 so we will have time to
integrate it properly with other upcoming Cogl 2.0 work.
The basic reasons for the rename are:
- That the term "material" implies to many people that they are
constrained to fragment processing; perhaps as some kind of high-level
texture abstraction.
- In Clutter they get exposed by ClutterTexture actors which may be
re-inforcing this misconception.
- When comparing how other frameworks use the term material, a material
sometimes describes a multi-pass fragment processing technique which
isn't the case in Cogl.
- In code, "CoglPipeline" will hopefully be a much more self documenting
summary of what these objects represent; a full GPU pipeline
configuration including, for example, vertex processing, fragment
processing and blending.
- When considering the API documentation story, at some point we need a
document introducing developers to how the "GPU pipeline" works so it
should become intuitive that CoglPipeline maps back to that
description of the GPU pipeline.
- This is consistent in terminology and concept to OpenGL 4's new
pipeline object which is a container for program objects.
Note: The cogl-material.[ch] files have been renamed to
cogl-material-compat.[ch] because otherwise git doesn't seem to treat
the change as a moving the old cogl-material.c->cogl-pipeline.c and so
we loose all our git-blame history.
Since cogl_material_copy should now be cheap to use we can simplify
how we handle fallbacks and wrap mode overrides etc by simply copying
the original material and making our override changes on the new
material. This avoids the need for a sideband state structure that has
been growing in size and makes flushing material state more complex.
Note the plan is to eventually use weak materials for these override
materials and attach these as private data to the original materials so
we aren't making so many one-shot materials.
Previously, Cogl's texture coordinate system was effectively always
GL_REPEAT so that if an application specifies coordinates outside the
range 0→1 it would get repeated copies of the texture. It would
however change the mode to GL_CLAMP_TO_EDGE if all of the coordinates
are in the range 0→1 so that in the common case that the whole texture
is being drawn with linear filtering it will not blend in edge pixels
from the opposite sides.
This patch adds the option for applications to change the wrap mode
per layer. There are now three wrap modes: 'repeat', 'clamp-to-edge'
and 'automatic'. The automatic map mode is the default and it
implements the previous behaviour. The wrap mode can be changed for
the s and t coordinates independently. I've tried to make the
internals support setting the r coordinate but as we don't support 3D
textures yet I haven't exposed any public API for it.
The texture backends still have a set_wrap_mode virtual but this value
is intended to be transitory and it will be changed whenever the
material is flushed (although the backends are expected to cache it so
that it won't use too many GL calls). In my understanding this value
was always meant to be transitory and all primitives were meant to set
the value before drawing. However there were comments suggesting that
this is not the expected behaviour. In particular the vertex buffer
drawing code never set a wrap mode so it would end up with whatever
the texture was previously used for. These issues are now fixed
because the material will always set the wrap modes.
There is code to manually implement clamp-to-edge for textures that
can't be hardware repeated. However this doesn't fully work because it
relies on being able to draw the stretched parts using quads with the
same values for tx1 and tx2. The texture iteration code doesn't
support this so it breaks. This is a separate bug and it isn't
trivially solved.
When flushing a material there are now extra options to set wrap mode
overrides. The overrides are an array of values for each layer that
specifies an override for the s, t or r coordinates. The primitives
use this to implement the automatic wrap mode. cogl_polygon also uses
it to set GL_CLAMP_TO_BORDER mode for its trick to render sliced
textures. Although this code has been added it looks like the sliced
trick has been broken for a while and I haven't attempted to fix it
here.
I've added a constant to represent the maximum number of layers that a
material supports so that I can size the overrides array. I've set it
to 32 because as far as I can tell we have that limit imposed anyway
because the other flush options use a guint32 to store a flag about
each layer. The overrides array ends up adding 32 bytes to each flush
options struct which may be a concern.
http://bugzilla.openedhand.com/show_bug.cgi?id=2063
Since using addresses that might change is something that finally
the FSF acknowledge as a plausible scenario (after changing address
twice), the license blurb in the source files should use the URI
for getting the license in case the library did not come with it.
Not that URIs cannot possibly change, but at least it's easier to
set up a redirection at the same place.
As a side note: this commit closes the oldes bug in Clutter's bug
report tool.
http://bugzilla.openedhand.com/show_bug.cgi?id=521
All the cogl_rectangle* APIs normalize their input into into an array of
_CoglMutiTexturedRect rectangles and pass these on to our work horse;
_cogl_rectangles_with_multitexture_coords. The definition of
_CoglMutiTexturedRect had 4 separate float members, x_1, y_1, x_2 and
y_2 which meant for some common cases we were having to copy out from an
array into these members. We are now able to simply point into the users
array avoiding a copy which seems desirable when submiting lots of
rectangles.
The CoglTextureSliceCallback function pointer now takes const pointers
for the texture coordinates. This makes it clearer that the callback
should not modify the array and therefore the backend can use the same
array for both sets of coords.
The Journal can be considered a standalone component, so even though
it's currently only used to log quads, it seems better to split it
out into its own file.