This adds a COGL_DEBUG=clipping option that reports how the clip is
being flushed. This is needed to determine whether the scissor,
stencil clip planes or software clipping is being used.
The CoglDebugFlags are now stored in an array of unsigned ints rather
than a single variable. The flags are accessed using macros instead of
directly peeking at the cogl_debug_flags variable. The index values
are stored in the enum rather than the actual mask values so that the
enum doesn't need to be more than 32 bits wide. The hope is that the
code to determine the index into the array can be optimized out by the
compiler so it should have exactly the same performance as the old
code.
This is part of a broader cleanup of some of the experimental Cogl API.
One of the reasons for this particular rename is to reduce the verbosity
of using the API. Another reason is that CoglVertexArray is going to be
renamed CoglAttributeBuffer and we want to help emphasize the
relationship between CoglAttributes and CoglAttributeBuffers.
COGL_DEBUG=disable-fast-read-pixel can be used to disable the
optimization for reading a single pixel colour back by looking at the
geometry in the journal and not involving the GPU. With this disabled we
will always flush the journal, rendering to the framebuffer and then use
glReadPixels to get the result.
This adds a transparent optimization to cogl_read_pixels for when a
single pixel is being read back and it happens that all the geometry of
the current frame is still available in the framebuffer's associated
journal.
The intention is to indirectly optimize Clutter's render based picking
mechanism in such a way that the 99% of cases where scenes are comprised
of trivial quad primitives that can easily be intersected we can avoid
the latency of kicking a GPU render and blocking for the result when we
know we can calculate the result manually on the CPU probably faster
than we could even kick a render.
A nice property of this solution is that it maintains all the
flexibility of the render based picking provided by Clutter and it can
gracefully fall back to GPU rendering if actors are drawn using anything
more complex than a quad for their geometry.
It seems worth noting that there is a limitation to the extensibility of
this approach in that it can only optimize picking a against geometry
that passes through Cogl's journal which isn't something Clutter
directly controls. For now though this really doesn't matter since
basically all apps should end up hitting this fast-path. The current
idea to address this longer term would be a pick2 vfunc for ClutterActor
that can support geometry and render based input regions of actors and
move this optimization up into Clutter instead.
Note: currently we don't have a primitive count threshold to consider
that there could be scenes with enough geometry for us to compensate for
the cost of kicking a render and determine a result more efficiently by
utilizing the GPU. We don't currently expect this to be common though.
Note: in the future it could still be interesting to revive something
like the wip/async-pbo-picking branch to provide an asynchronous
read-pixels based optimization for Clutter picking in cases where more
complex input regions that necessitate rendering are in use or if we do
add a threshold for rendering as mentioned above.
Instead of having _cogl_get/set_clip stack which reference the global
CoglContext this instead makes those into CoglClipState method functions
named _cogl_clip_state_get/set_stack that take an explicit pointer to a
CoglClipState.
This also adds _cogl_framebuffer_get/set_clip_stack convenience
functions that avoid having to first get the ClipState from a
framebuffer then the stack from that - so we can maintain the
convenience of _cogl_get_clip_stack.
Instead of having a single journal per context, we now have a
CoglJournal object for each CoglFramebuffer. This means we now don't
have to flush the journal when switching/pushing/popping between
different framebuffers so for example a Clutter scene that involves some
ClutterEffect actors that transiently redirect to an FBO can still be
batched.
This also allows us to track state in the journal that relates to the
current frame of its associated framebuffer which we'll need for our
optimization for using the CPU to handle reading a single pixel back
from a framebuffer when we know the whole scene is currently comprised
of simple rectangles in a journal.
In the journal code and when generating the stroke path the vertices
are generated on the fly and stored in a CoglBuffer using
cogl_buffer_map. However cogl_buffer_map is allowed to fail but it
wasn't checking for a NULL return value. In particular on GLES it will
always fail because glMapBuffer is only provided by an extension. This
adds a new pair of internal functions called
_cogl_buffer_{un,}map_for_fill_or_fallback which wrap
cogl_buffer_map. If the map fails then it will instead return a
pointer into a GByteArray attached to the context. When the buffer is
unmapped the array is copied into the buffer using
cogl_buffer_set_data.
When an item is added to the journal the current pipeline immediately
gets the legacy state applied to it and the modified pipeline is
logged instead of the original. However the actual drawing from the
journal is done using the vertex attribute API which was also applying
the legacy state. This meant that the legacy state used would be a
combination of the state set when the journal entry was added as well
as the state set when the journal is flushed. To fix this there is now
an extra CoglDrawFlag to avoid applying the legacy state when setting
up the GL state for the vertex attributes. The journal uses this flag
when flushing.
The vertex attribute API assumes that if there is a color array
enabled then we can't determine if the colors are opaque so we have to
enable blending. The journal always uses a color array to avoid
switching color state between rectangles. Since the journal switched
to using vertex attributes this means we effectively always enable
blending from the journal. To fix this there is now a new flag for
_cogl_draw_vertex_attributes to specify that the color array is known
to only contain opaque colors which causes the draw function not to
copy the pipeline. If the pipeline has blending disabled then the
journal passes this flag.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2481
There is an internal version of cogl_draw_vertex_attributes_array
which previously just bypassed the framebuffer flushing, journal
flushing and pipeline validation so that it could be used to draw the
journal. This patch generalises the function so that it takes a set of
flags to specify which parts to flush. The public version of the
function now just calls the internal version with the flags set to
0. The '_real' version of the function has now been merged into the
internal version of the function because it was only called in one
place. This simplifies the code somewhat. The common code which
flushed the various state has been moved to a separate function. The
indexed versions of the functions have had a similar treatment.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2481
_cogl_pipeline_equal now accepts a mask of pipeline differences and layer
differences to constrain what state will be compared. In addition a set
of flags are passed that can tweak the comparison semantics for some
state groups. For example when comparing layer textures we sometimes
only need to compare the texture target and can ignore the data itself.
In updating the code this patch also changes it so all required pipeline
authorities are resolved in one step up-front instead of resolving the
authority for each state group in turn and repeatedly having to traverse
the pipeline's ancestry. This adds two new functions
_cogl_pipeline_resolve_authorities and
_cogl_pipeline_layer_resolve_authorities to handle resolving a set of
authorities.
This adds a debug option called disable-software-clipping which causes
the journal to always log the clip stack state rather than trying to
manually clip rectangles.
Before flushing the journal there is now a separate iteration that
will try to determine if the matrix of the clip stack and the matrix
of the rectangle in each entry are on the same plane. If they are it
can completely avoid the clip stack and instead manually modify the
vertex and texture coordinates to implement the clip. The has the
advantage that it won't break up batching if a single clipped
rectangle is used in a scene.
The software clip is only used if there is no user program and no
texture matrices. There is a threshold to the size of the batch where
it is assumed that it is worth the cost to break up a batch and
program the GPU to do the clipping. Currently this is set to 8
although this figure is plucked out of thin air.
To check whether the two matrices are on the same plane it tries to
determine if one of the matrices is just a simple translation of the
other. In the process of this it also works out what the translation
would be. These values can be used to translate the clip rectangle
into the coordinate space of the rectangle to be logged. Then we can
do the clip directly in the rectangle's coordinate space.
When logging a quad we now only store the 2 vertices representing the
top left and bottom right of the quad. The color is only stored once
per entry. Once we come to upload the data we expand the 2 vertices
into four and copy the color to each vertex. We do this by mapping the
buffer and directly expanding into it. We have to copy the data before
we can render it anyway so it doesn't make much sense to expand the
vertices before uploading and this way should save some space in the
size of the journal. It also makes it slightly easier if we later want
to do pre-processing on the journal entries before uploading such as
doing software clipping.
The modelview matrix is now always copied to the journal entry whereas
before it would only be copied if we aren't doing software
transform. The journal entry struct always has the space for the
modelview matrix so hopefully it's only a small cost to copy the
matrix.
The transform for the four entries is now done using
cogl_matrix_transform_points which may be slightly faster than
transforming them each individually with a call to
cogl_matrix_transfom.
When logging quads in the journal it used to be possible to specify a
mask of fallback layers (layers where a default white texture should be
used in-place of the corresponding texture in the current source
pipeline). Since we now handle fallbacks for cogl_rectangle* primitives
when validating the pipeline up-front before logging in the journal we
no longer need the ability for the journal to apply fallbacks too.
This removes the possibility to specify wrap mode overrides within a
CoglPipelineFlushOptions struct since the right way to handle these
overrides is by copying the user's material and making the changes to
that copy before flushing. All primitives code has already switched away
from using these wrap mode overrides so this patch just removes unused
code and types. It also remove the wrap_mode_overrides argument for
_cogl_journal_log_quad.
This adds an optional data argument for cogl_vertex_array_new() since it
seems that mostly every case where we use this API we follow up with a
cogl_buffer_set_data() matching the size of the new array. This
simplifies all those cases and whenever we want to delay uploading of
data then NULL can simply be passed.
Since d5634e37 the sliced texture backend now works in terms of
CoglTexture2Ds so there's no need to have special casing for
overriding the texture of a pipeline layer with a GL handle. Instead
we can just use cogl_pipeline_set_layer_texture with the
CoglHandle. The special _cogl_pipeline_set_layer_gl_texture_slice
function has now been removed and parts of the code for comparing
materials have been simplified.
When adding a new entry to the journal a reference is now taken on the
current clip stack. Modifying the current clip state no longer causes
a journal flush. The journal flushing code now has an extra stage to
compare the clip state of each entry. The comparison can simply be
done by comparing the pointers. Although different clip states will
still end up with multiple draw calls this at leasts allows a scene
comprising of multiple different clips to be upload with one vbo. It
also lays the groundwork to do certain tricks when drawing clipped
rectangles such as modifying the geometry instead of setting a clip
state.
Unless the CoglBuffer is being used for texture data then it's
relatively unlikely that the data will contain an array of bytes. For
example if it's used as a vertex array then it's more likely to be
floats or some vertex struct. In that case it's much more convenient
if set_data and map use void* pointers so that we can avoid a cast.
This applies an API naming change that's been deliberated over for a
while now which is to rename CoglMaterial to CoglPipeline.
For now the new pipeline API is marked as experimental and public
headers continue to talk about materials not pipelines. The CoglMaterial
API is now maintained in terms of the cogl_pipeline API internally.
Currently this API is targeting Cogl 2.0 so we will have time to
integrate it properly with other upcoming Cogl 2.0 work.
The basic reasons for the rename are:
- That the term "material" implies to many people that they are
constrained to fragment processing; perhaps as some kind of high-level
texture abstraction.
- In Clutter they get exposed by ClutterTexture actors which may be
re-inforcing this misconception.
- When comparing how other frameworks use the term material, a material
sometimes describes a multi-pass fragment processing technique which
isn't the case in Cogl.
- In code, "CoglPipeline" will hopefully be a much more self documenting
summary of what these objects represent; a full GPU pipeline
configuration including, for example, vertex processing, fragment
processing and blending.
- When considering the API documentation story, at some point we need a
document introducing developers to how the "GPU pipeline" works so it
should become intuitive that CoglPipeline maps back to that
description of the GPU pipeline.
- This is consistent in terminology and concept to OpenGL 4's new
pipeline object which is a container for program objects.
Note: The cogl-material.[ch] files have been renamed to
cogl-material-compat.[ch] because otherwise git doesn't seem to treat
the change as a moving the old cogl-material.c->cogl-pipeline.c and so
we loose all our git-blame history.
Instead of using raw OpenGL in the journal we now use the vertex
attributes API instead. This is part of an ongoing effort to reduce the
number of drawing paths we maintain in Cogl.
This moves the code supporting _cogl_material_flush_gl_state into
cogl-material-opengl.c as part of an effort to reduce the size of
cogl-material.c to keep it manageable.
As a follow on to using cogl_material_copy instead of flush options this
patch now removes the ability to pass flush options to
_cogl_material_equal which is the final reference to the
CoglMaterialFlushOptions mechanism.
Since cogl_material_copy should now be cheap to use we can simplify
how we handle fallbacks and wrap mode overrides etc by simply copying
the original material and making our override changes on the new
material. This avoids the need for a sideband state structure that has
been growing in size and makes flushing material state more complex.
Note the plan is to eventually use weak materials for these override
materials and attach these as private data to the original materials so
we aren't making so many one-shot materials.
This is a complete overhaul of the data structures used to manage
CoglMaterial state.
We have these requirements that were aiming to meet:
(Note: the references to "renderlists" correspond to the effort to
support scenegraph level shuffling of Clutter actor primitives so we can
minimize GPU state changes)
Sparse State:
We wanted a design that allows sparse descriptions of state so it scales
well as we make CoglMaterial responsible for more and more state. It
needs to scale well in terms of memory usage and the cost of operations
we need to apply to materials such as comparing, copying and flushing
their state. I.e. we would rather have these things scale by the number
of real changes a material represents not by how much overall state
CoglMaterial becomes responsible for.
Cheap Copies:
As we add support for renderlists in Clutter we will need to be able to
get an immutable handle for a given material's current state so that we
can retain a record of a primitive with its associated material without
worrying that changes to the original material will invalidate that
record.
No more flush override options:
We want to get rid of the flush overrides mechanism we currently use to
deal with texture fallbacks, wrap mode changes and to handle the use of
highlevel CoglTextures that need to be resolved into lowlevel textures
before flushing the material state.
The flush options structure has been expanding in size and the structure
is logged with every journal entry so it is not an approach that scales
well at all. It also makes flushing material state that much more
complex.
Weak Materials:
Again for renderlists we need a way to create materials derived from
other materials but without the strict requirement that modifications to
the original material wont affect the derived ("weak") material. The
only requirement is that its possible to later check if the original
material has been changed.
A summary of the new design:
A CoglMaterial now basically represents a diff against its parent.
Each material has a single parent and a mask of state that it changes.
Each group of state (such as the blending state) has an "authority"
which is found by walking up from a given material through its ancestors
checking the difference mask until a match for that group is found.
There is only one root node to the graph of all materials, which is the
default material first created when Cogl is being initialized.
All the groups of state are divided into two types, such that
infrequently changed state belongs in a separate "BigState" structure
that is only allocated and attached to a material when necessary.
CoglMaterialLayers are another sparse structure. Like CoglMaterials they
represent a diff against their parent and all the layers are part of
another graph with the "default_layer_0" layer being the root node that
Cogl creates during initialization.
Copying a material is now basically just a case of slice allocating a
CoglMaterial, setting the parent to be the source being copied and
zeroing the mask of changes.
Flush overrides should now be handled by simply relying on the cheapness
of copying a material and making changes to it. (This will be done in a
follow on commit)
Weak material support will be added in a follow on commit.
As part of an effort to improve the architecture of CoglMaterial
internally this overhauls how we flush layer state to OpenGL by adding a
formal backend abstraction for fragment processing and further
formalizing the CoglTextureUnit abstraction.
There are three backends: "glsl", "arbfp" and "fixed". The fixed backend
uses the OpenGL fixed function APIs to setup the fragment processing,
the arbfp backend uses code generation to handle fragment processing
using an ARBfp program, and the GLSL backend is currently only there as
a formality to handle user programs associated with a material. (i.e.
the glsl backend doesn't yet support code generation)
The GLSL backend has highest precedence, then arbfp and finally the
fixed. If a backend can't support some particular CoglMaterial feature
then it will fallback to the next backend.
This adds three new COGL_DEBUG options:
* "disable-texturing" as expected should disable all texturing
* "disable-arbfp" always make the arbfp backend fallback
* "disable-glsl" always make the glsl backend fallback
* "show-source" show code generated by the arbfp/glsl backends
While this is totally fine (0 in the pointer context will be converted
in the right internal NULL representation, which could be a value with
some bits to 1), I believe it's clearer to use NULL in the pointer
context.
It seems that, in most case, it's more an overlook than a deliberate
choice to use FALSE/0 as NULL, eg. copying a _COGL_GET_CONTEXT (ctx, 0)
or a g_return_val_if_fail (cond, 0) from a function returning a
gboolean.
Instead of directly using a guint32 to store a bitmask for each used
texcoord array, it now stores them in a CoglBitmask. This removes the
limitation of 32 layers (although there are still other places in Cogl
that imply this restriction). To disable texcoord arrays code should
call _cogl_disable_other_texcoord_arrays which takes a bitmask of
texcoord arrays that should not be disabled. There are two extra
bitmasks stored in the CoglContext which are used temporarily for this
function to avoid allocating a new bitmask each time.
http://bugzilla.openedhand.com/show_bug.cgi?id=2132
It should be quite acceptable to use a texture without defining any
texture coords. For example a shader may be in use that is doing
texture lookups without referencing the texture coordinates. Also it
should be possible to replace the vertex colors using a texture layer
without a texture but with a constant layer color.
enable_state_for_drawing_buffer no longer sets any disabled layers in
the overrides. Instead of counting the number of units with texture
coordinates it now keeps them in a mask. This means there can now be
gaps in the list of enabled texture coordinate arrays. To cope with
this, the Cogl context now also stores a mask to track the enabled
arrays. Instead of code manually iterating each enabled array to
disable them, there is now an internal function called
_cogl_disable_texcoord_arrays which disables a given mask.
I think this could also fix potential bugs when a vertex buffer has
gaps in the texture coordinate attributes that it provides. For
example if the vertex buffer only had texture coordinates for layer 2
then the disabling code would not disable the coordinates for layers 0
and 1 even though they are not used. This could cause a crash if the
previous data for those arrays is no longer valid.
http://bugzilla.openedhand.com/show_bug.cgi?id=2132
Previously, Cogl's texture coordinate system was effectively always
GL_REPEAT so that if an application specifies coordinates outside the
range 0→1 it would get repeated copies of the texture. It would
however change the mode to GL_CLAMP_TO_EDGE if all of the coordinates
are in the range 0→1 so that in the common case that the whole texture
is being drawn with linear filtering it will not blend in edge pixels
from the opposite sides.
This patch adds the option for applications to change the wrap mode
per layer. There are now three wrap modes: 'repeat', 'clamp-to-edge'
and 'automatic'. The automatic map mode is the default and it
implements the previous behaviour. The wrap mode can be changed for
the s and t coordinates independently. I've tried to make the
internals support setting the r coordinate but as we don't support 3D
textures yet I haven't exposed any public API for it.
The texture backends still have a set_wrap_mode virtual but this value
is intended to be transitory and it will be changed whenever the
material is flushed (although the backends are expected to cache it so
that it won't use too many GL calls). In my understanding this value
was always meant to be transitory and all primitives were meant to set
the value before drawing. However there were comments suggesting that
this is not the expected behaviour. In particular the vertex buffer
drawing code never set a wrap mode so it would end up with whatever
the texture was previously used for. These issues are now fixed
because the material will always set the wrap modes.
There is code to manually implement clamp-to-edge for textures that
can't be hardware repeated. However this doesn't fully work because it
relies on being able to draw the stretched parts using quads with the
same values for tx1 and tx2. The texture iteration code doesn't
support this so it breaks. This is a separate bug and it isn't
trivially solved.
When flushing a material there are now extra options to set wrap mode
overrides. The overrides are an array of values for each layer that
specifies an override for the s, t or r coordinates. The primitives
use this to implement the automatic wrap mode. cogl_polygon also uses
it to set GL_CLAMP_TO_BORDER mode for its trick to render sliced
textures. Although this code has been added it looks like the sliced
trick has been broken for a while and I haven't attempted to fix it
here.
I've added a constant to represent the maximum number of layers that a
material supports so that I can size the overrides array. I've set it
to 32 because as far as I can tell we have that limit imposed anyway
because the other flush options use a guint32 to store a flag about
each layer. The overrides array ends up adding 32 bytes to each flush
options struct which may be a concern.
http://bugzilla.openedhand.com/show_bug.cgi?id=2063
Every now and then someone sees the cogl_enable API and gets confused,
thinking its public API so this renames the symbol to be clear that it's
is an internal only API.
Since using addresses that might change is something that finally
the FSF acknowledge as a plausible scenario (after changing address
twice), the license blurb in the source files should use the URI
for getting the license in case the library did not come with it.
Not that URIs cannot possibly change, but at least it's easier to
set up a redirection at the same place.
As a side note: this commit closes the oldes bug in Clutter's bug
report tool.
http://bugzilla.openedhand.com/show_bug.cgi?id=521
Most Cogl debugging code conditions are marked as G_UNLIKELY with the
intention of having the CPU branch prediction always assume the
path is disabled so having debugging support in release binaries has
negligible overhead.
This patch simply fixes a few cases where we weren't using G_UNLIKELY.
All the cogl_rectangle* APIs normalize their input into into an array of
_CoglMutiTexturedRect rectangles and pass these on to our work horse;
_cogl_rectangles_with_multitexture_coords. The definition of
_CoglMutiTexturedRect had 4 separate float members, x_1, y_1, x_2 and
y_2 which meant for some common cases we were having to copy out from an
array into these members. We are now able to simply point into the users
array avoiding a copy which seems desirable when submiting lots of
rectangles.
We've had complaints that our Cogl code/headers are a bit "special" so
this is a first pass at tidying things up by giving them some
consistency. These changes are all consistent with how new code in Cogl
is being written, but the style isn't consistently applied across all
code yet.
There are two parts to this patch; but since each one required a large
amount of effort to maintain tidy indenting it made sense to combine the
changes to reduce the time spent re indenting the same lines.
The first change is to use a consistent style for declaring function
prototypes in headers. Cogl headers now consistently use this style for
prototypes:
return_type
cogl_function_name (CoglType arg0,
CoglType arg1);
Not everyone likes this style, but it seems that most of the currently
active Cogl developers agree on it.
The second change is to constrain the use of redundant glib data types
in Cogl. Uses of gint, guint, gfloat, glong, gulong and gchar have all
been replaced with int, unsigned int, float, long, unsigned long and char
respectively. When talking about pixel data; use of guchar has been
replaced with guint8, otherwise unsigned char can be used.
The glib types that we continue to use for portability are gboolean,
gint{8,16,32,64}, guint{8,16,32,64} and gsize.
The general intention is that Cogl should look palatable to the widest
range of C programmers including those outside the Gnome community so
- especially for the public API - we want to minimize the number of
foreign looking typedefs.
This adds three new texture backends.
- CoglTexture2D: This is a trimmed down version of CoglTexture2DSliced
which only supports a single texture and only works with the
GL_TEXTURE_2D target. The code is a lot simpler so it has a less
overheads than dealing with slices. Cogl will use this wherever
possible.
- CoglSubTexture: This is used to get a CoglHandle to represent a
subregion of another texture. The texture can be used as if it was a
standalone texture but it does not need to copy the resources.
- CoglAtlasTexture: This collects RGB and RGBA textures into a single
GL texture with the aim of reducing texture state changes and
increasing batching. The backend will try to manage the atlas and
may move the textures around to close gaps in the texture. By
default all textures will be placed in the atlas.
Instead of assigning a new colour to each quad of a batch, the
rectangle debugging code now assigns a new colour to each batch so
that it can be used to visually see what is being batched. The colour
is stored in a global variable that is reset during cogl_clear. This
improves the chances that the same colour will be used for a batch in
the next frames to avoid flickering.
If a user supplied multiple groups of texture coordinates with
cogl_rectangle_with_multitexture_coords() then we would repeatedly log only
the first group in the journal. This fixes that bug and adds a conformance
test to verify the fix.
Thanks to Gord Allott for reporting this bug.
This adds gives Cogl a dedicated UProf context which will be linked together
with Clutter's context during clutter_init_real().
Initial timers cover _cogl_journal_flush and _cogl_journal_log_quad
You can explicitly ask for a report of Cogl statistics by exporting
COGL_PROFILE_OUTPUT_REPORT=1 but since the context is linked with Clutter's
the statisitcs will also be shown in the automatic Clutter reports.
The CoglTextureSliceCallback function pointer now takes const pointers
for the texture coordinates. This makes it clearer that the callback
should not modify the array and therefore the backend can use the same
array for both sets of coords.