Because Cogl defines the origin for texture as top left and offscreen draw
buffers can be used to render to textures, we (internally) force all
offscreen rendering to be upside down. (because OpenGL defines the origin
to be bottom left)
By forcing the users scene to be rendered upside down though we also reverse
the winding order of all the drawn triangles which may interfere with the
users use of backface culling. This patch ensures that we reverse the
winding order for a front face (if culling is in use) while rendering
offscreen so we don't conflict with the users back face culling.
Technically this change shouldn't make a difference since we are
calling glReadPixels with GL_RGBA GL_UNSIGNED_BYTE which is a 4
byte format and it should always result in the same value according
to how OpenGL calculates the location of sequential rows.
i.e. k = a/s * ceil(snl/a) where:
a = alignment
s = component size (1)
n = number of components per pixel (4)
l = number of pixels in a row
gives:
k = 4/1 * ceil(4l/4) and k = 1/1 * ceil(4l/1) which are equivalent
I'm changing it because I've seen i915 driver code that bails out of
hardware accelerated paths if the alignment isn't 1, and because
conceptually we have no alignment constraints here so even if the current
value has no effect, when we start reading back other formats it may upset
things.
We were previously calling cogl_flush() after setting up the glPixelStore
state for calling glReadPixels, but flushing the journal could itself
change the glPixelStore state.
Since offscreen rendering is forced to be upside down we don't need to do
any conversion of the users coordinates to go from Cogl window coordinates
to OpenGL window coordinates.
Since we do all offscreen rendering upside down (so that we can have the
origin for texture coordinates be the top left of textures for the cases
where offscreen draw buffers are bound to textures) we don't need to flip
data read back from an offscreen framebuffer before we we return it to the
user.
I was originally expecting the code not to handle offset viewports or
viewports with a different size to the framebuffer, but it turns out the
code worked fine. In the process though I think I made the code slightly
more readable.
cogl_viewport only accepted a viewport width and height, but there are times
when it's also desireable to have a viewport offset so that a scene can be
translated after projection but before hitting the framebuffer.
Because Cogl defines the origin of viewport and window coordinates to be
top-left it always needs to know the size of the current window so that Cogl
window/viewport coordinates can be transformed into OpenGL coordinates.
This also fixes cogl_read_pixels to use the current draw buffer height
instead of the viewport height to determine the OpenGL y coordinate to use
for glReadPixels.
First a few notes about Cogl coordinate systems:
- Cogl defines the window origin, viewport origin and texture coordinates
origin to be top left unlike OpenGL which defines them as bottom left.
- Cogl defines the modelview and projection identity matrices in exactly the
same way as OpenGL.
- I.e. we believe that for 2D centric constructs: windows/framebuffers,
viewports and textures developers are more used to dealing with a top left
origin, but when modeling objects in 3D; an origin at the center with y
going up is quite natural.
The way Cogl handles textures is by uploading data upside down in OpenGL
terms so that bottom left becomes top left. (Note: This also has the
benefit that we don't need to flip the data we get from image decoding
libraries since they typically also consider top left to be the image
origin.)
The viewport and window coords are mostly handled with various y =
height - y tweaks before we pass y coordinates to OpenGL.
Generally speaking though the handling of coordinate spaces in Cogl is a bit
fragile. I guess partly because none of it was design to be, it just
evolved from how Clutter defines its coordinates without much consideration
or testing. I hope to improve this over a number of commits; starting here.
This commit deals with the fact that offscreen draw buffers may be bound to
textures but we don't "upload" the texture data upside down, and so if you
texture from an offscreen draw buffer you need to manually flip the texture
coordinates to get it the right way around. We now force offscreen
rendering to be flipped upside down by tweaking the projection matrix right
before we submit it to OpenGL to scale y by -1. The tweak is entirely
hidden from the user such that if you call cogl_get_projection you will not
see this scale.
We were ignoring the possibility that the current modelview matrix may flip
the incoming rectangle in which case we didn't calculate a valid scissor
rectangle for clipping.
This fixes: http://bugzilla.o-hand.com/show_bug.cgi?id=1809
(Clipping doesn't work within an FBO)
Cogl's support for offscreen rendering was originally written just to support
the clutter_texture_new_from_actor API and due to lack of documentation and
several confusing - non orthogonal - side effects of using the API it wasn't
really possible to use directly.
This commit does a number of things:
- It removes {gl,gles}/cogl-fbo.{c,h} and adds shared cogl-draw-buffer.{c,h}
files instead which should be easier to maintain.
- internally CoglFbo objects are now called CoglDrawBuffers. A
CoglDrawBuffer is an abstract base class that is inherited from to
implement CoglOnscreen and CoglOffscreen draw buffers. CoglOffscreen draw
buffers will initially be used to support the
cogl_offscreen_new_to_texture API, and CoglOnscreen draw buffers will
start to be used internally to represent windows as we aim to migrate some
of Clutter's backend code to Cogl.
- It makes draw buffer objects the owners of the following state:
- viewport
- projection matrix stack
- modelview matrix stack
- clip state
(This means when you switch between draw buffers you will automatically be
switching to their associated viewport, matrix and clip state)
Aside from hopefully making cogl_offscreen_new_to_texture be more useful
short term by having simpler and well defined semantics for
cogl_set_draw_buffer, as mentioned above this is the first step for a couple
of other things:
- Its a step toward moving ownership for windows down from Clutter backends
into Cogl, by (internally at least) introducing the CoglOnscreen draw
buffer. Note: the plan is that cogl_set_draw_buffer will accept on or
offscreen draw buffer handles, and the "target" argument will become
redundant since we will instead query the type of the given draw buffer
handle.
- Because we have a common type for on and offscreen framebuffers we can
provide a unified API for framebuffer management. Things like:
- blitting between buffers
- managing ancillary buffers (e.g. attaching depth and stencil buffers)
- size requisition
- clearing
Over time the two cogl-fbo.c files have needlessly diverged as bug fixes or
cleanups went into one version but not the other. This tries to bring them
back in line with each other. It should actually be simple enough to move
cogl-fbo.c to be a common file, and simply not build it for GLES 1.1, so
maybe I'll follow up with such a patch soon.
The comment just said: "Some implementation require a clear before drawing
to an fbo. Luckily it is affected by scissor test." and did a scissored
clear, which is clearly a driver bug workaround, but for what driver? The
fact that it was copied into the gles backend (or vica versa is also
suspicious since it seems unlikely that the workaround is necessary for both
backends.)
We can easily restore the workaround with a better comment if this problem
really still exists on current drivers, but for now I'd rather minimize
hand-wavey workaround code that can't be tested.
Otherwise you can't use the alpha channel of the vertex colors unless
the material has a texture with alpha or the material's color has
alpha less than 255.
Some changes to make COGL pass distcheck with Automake 1.11 and
anal-retentiveness turned up to 11.
The "major" change is the flattening of the winsys/ part of COGL,
which is built directly inside libclutter-cogl.la instead of an
intermediate libclutter-cogl-winsys.la object.
Ideally, the whole COGL should be flattened out using a
quasi-non-recursive Automake layout; unfortunately, the driver/
sub-section ships with identical targets and Automake cannot
distinguish GL and GLES objects.
Since we no longer depend on the GL matrix API in Cogl we can remove a lot
of wrapper code from the GLES 2 backend. This is particularly nice given
that there was no code shared between the cogl-matrix-stack API and gles2
wrappers so we had a lot of duplicated logic.
The indirection through this API isn't necessary since we no longer
arbitrate between the OpenGL matrix API and Cogl's client side API. Also it
doesn't help to maintain an OpenGL style matrix mode API for internal use
since it's awkward to keep restoring the MODELVIEW mode and easy enough to
directly work with the matrix stacks of interest.
This replaces use of the _cogl_current_matrix API with direct use of the
_cogl_matrix_stack API. All the unused cogl_current_matrix API is removed
and the matrix utility code left in cogl-current-matrix.c was moved to
cogl.c.
This cache of the gl matrix mode lets us avoid repeat calls to glMatrixMode
in _cogl_matrix_stack_flush_to_gl when we have lots of sequential modelview
matrix modifications.
This goes a bit further than the previous patch, and as a special case
we now simply represent identity matrices using a boolean, and only
lazily initialize them when they need to be modified.
The journal always uses an identity matrix since it uses software
transformation. Currently it manually uses glLoadMatrix since previous
experimentation showed that the cogl-matrix-stack gave bad performance, but
it would be nice to fix performance so we only have to care about one path
for loading matrices.
For the common case where we do:
cogl_matrix_stack_push()
cogl_matrix_stack_load_identity()
we were effectively initializing the matrix 3 times. Once due to use of
g_slice_new0, then we had a cogl_matrix_init_identity in
_cogl_matrix_state_new for good measure, and then finally in
cogl_matrix_stack_load_identity we did another cogl_matrix_init_identity.
We don't use g_slice_new0 anymore, _cogl_matrix_state_new is documented as
not initializing the matrix (instead _cogl_matrix_stack_top_mutable now
takes a boolean to choose if new stack entries should be initialised) and so
we now only initialize once in cogl_matrix_stack_load_identity.
This relates back to an earlier commitment to stop using the OpenGL matrix
API which is considered deprecated. (ref 54159f5a1d)
The new texture matrix stacks are hung from a list of (internal only)
CoglTextureUnit structures which the CoglMaterial code internally references
via _cogl_get_texure_unit ().
So we would be left with only the cogl-matrix-stack code being responsible
for glMatrixMode, glLoadMatrix and glLoadIdentity this commit updates the
journal code so it now uses the matrix-stack API instead of GL directly.
The Journal can be considered a standalone component, so even though
it's currently only used to log quads, it seems better to split it
out into its own file.
When we implement atlas textures we will probably want to use the spans API
to handle texture repeating so it doesn't make sense to leave the code in
cogl-texture-2d-sliced.c. Since it's a standalone set of data structures
and algorithms it also seems reasonable to split out from cogl-texture.
cogl-texture-2d-sliced provides an implementation of CoglTexture and this
seperation lays the foundation for potentially supporting atlas textures,
pixmap textures (as in GLX_EXT_texture_from_pixmap) and fast-path
GL_TEXTURE_{1D,2D,3D,RECTANGLE} textures in a maintainable fashion.
cogl-primitives.c was previously digging right into CoglTextures so it could
manually iterate the texture slices for texturing quads and polygons and
because we were missing some state getters we were lazily just poking into
the structures directly.
This adds some extra state getter functions, and adds a higher level
_cogl_texture_foreach_slice () API that hopefully simplifies the way in
which sliced textures may be used to render primitives. This lets you
specify a rectangle in "virtual" texture coords and it will call a given
callback for each slice that intersects that rectangle giving the virtual
coords of the current slice and corresponding "real" texture coordinates for
the underlying gl texture.
At the same time a noteable bug in how we previously iterated sliced
textures was fixed, whereby we weren't correctly handling inverted texture
coordinates. E.g. with the previous code if you supplied texture coords of
tx1=100,ty1=0,tx2=0,ty2=100 (inverted along y axis) that would result in a
back-facing quad, which could be discarded if using back-face culling.
The descriptions for gl_handle and gl_target were inverted.
Thanks to Young-Ho Cha for spotting that.
Signed-off-by: Robert Bragg <robert@linux.intel.com>
As part of an incremental process to have Cogl be a standalone project we
want to re-consider how we organise the Cogl source code.
Currently this is the structure I'm aiming for:
cogl/
cogl/
<put common source here>
winsys/
cogl-glx.c
cogl-wgl.c
driver/
gl/
gles/
os/ ?
utils/
cogl-fixed
cogl-matrix-stack?
cogl-journal?
cogl-primitives?
pango/
The new winsys component is a starting point for migrating window system
code (i.e. x11,glx,wgl,osx,egl etc) from Clutter to Cogl.
The utils/ and pango/ directories aren't added by this commit, but they are
noted because I plan to add them soon.
Overview of the planned structure:
* The winsys/ API is the API that binds OpenGL to a specific window system,
be that X11 or win32 etc. Example are glx, wgl and egl. Much of the logic
under clutter/{glx,osx,win32 etc} should migrate here.
* Note there is also the idea of a winsys-base that may represent a window
system for which there are multiple winsys APIs. An example of this is
x11, since glx and egl may both be used with x11. (currently only Clutter
has the idea of a winsys-base)
* The driver/ represents a specific varient of OpenGL. Currently we have "gl"
representing OpenGL 1.4-2.1 (mostly fixed function) and "gles" representing
GLES 1.1 (fixed funciton) and 2.0 (fully shader based)
* Everything under cogl/ should fundamentally be supporting access to the
GPU. Essentially Cogl's most basic requirement is to provide a nice GPU
Graphics API and drawing a line between this and the utility functionality
we add to support Clutter should help keep this lean and maintainable.
* Code under utils/ as suggested builds on cogl/ adding more convenient
APIs or mechanism to optimize special cases. Broadly speaking you can
compare cogl/ to OpenGL and utils/ to GLU.
* clutter/pango will be moved to clutter/cogl/pango
How some of the internal configure.ac/pkg-config terminology has changed:
backendextra -> CLUTTER_WINSYS_BASE # e.g. "x11"
backendextralib -> CLUTTER_WINSYS_BASE_LIB # e.g. "x11/libclutter-x11.la"
clutterbackend -> {CLUTTER,COGL}_WINSYS # e.g. "glx"
CLUTTER_FLAVOUR -> {CLUTTER,COGL}_WINSYS
clutterbackendlib -> CLUTTER_WINSYS_LIB
CLUTTER_COGL -> COGL_DRIVER # e.g. "gl"
Note: The CLUTTER_FLAVOUR and CLUTTER_COGL defines are kept for apps
As the first thing to take advantage of the new winsys component in Cogl;
cogl_get_proc_address() has been moved from cogl/{gl,gles}/cogl.c into
cogl/common/cogl.c and this common implementation first trys
_cogl_winsys_get_proc_address() but if that fails then it falls back to
gmodule.
This moves most of cogl-context.{c.h} to cogl/common with some driver
specific members now living in a CoglContextDriver struct. Driver specific
context initialization and typedefs now live in
cogl/{gl,gles}/cogl-context-driver.{c,h}
Driver specific members can be found under ctx->drv.stuff
This splits the limited components that differed between
cogl/{gl,gles}/cogl-texture.c into new {gl,gles}/cogl-texture-driver.c files
and the rest that can now be shared into cogl/common/cogl-texture.c
When not building a debug build the compiler was warning about empty
else clauses with no braces due to code like:
if (blah)
do_foo();
else
COGL_NOTE (DRAW, "a-wibble");
This simply ensures that even for non debug builds COGL_NOTE will expand to
a single statement.
glVertexPointer expects positions with 2, 3 or 4 components, glColorPointer
expects colors with 3 or 4 components and glNormalPointer expects normals
with three components so when adding vertex buffer atributes with the names
"gl_Vertex", "gl_Color" or "gl_Normal" we assert these constraints and print
an explanation to the developer if not met.
This also fixes the previosly incorrect constraint that gl_Normal attributes
must have n_components == 1; thanks to Cat Sidhe for reporting this:
Bug: http://bugzilla.openedhand.com/show_bug.cgi?id=1819
By default, float * is considered as an out argument by gobject
introspection which is wrong for quite a few Cogl symbols. Start adding
annotations to fix that for the ones in the "Primitives" gtk-doc
section.
The lifetime of the journal VBO is entirely within the scope of the
cogl_journal_flush function so there is no need to store it globally
in the Cogl context. Instead, upload_vertices_to_vbo just returns the
new VBO. cogl_journal_flush stores this in a local variable and
destroys it before returning.
This also fixes an assertion when using the GLES backend which was
caused by nothing initialising the journal_vbo variable.
The framebuffer_object spec isn't clear in defining whether attaching a
texture as a renderbuffer with mipmap filtering enabled while the mipmaps
have not been uploaded should result in an incomplete framebuffer object.
(different drivers make different decisions)
To avoid an error with drivers that do consider this a problem we explicitly
set non mipmapped filters before calling glCheckFramebufferStatusEXT. The
filters will later be reset when the texture is actually used for rendering
according to the filters set on the corresponding CoglMaterial.
The blend string compiler checks that the syntax of a function name is
[A-Za-z_]*, preventing the use of DOT3_RGB[A].
Signed-off-by: Emmanuele Bassi <ebassi@linux.intel.com>
This reverts commit 3c47a3beb5.
Of course I remembered just after pushing the patch why we hadn't done
this before :-) If you look in the glsl spec:
http://www.khronos.org/registry/gles/specs/2.0/es_full_spec_2.0.24.pdf
Section 3.7.10 Texture Completeness and Non-Power-Of-Two Textures
you can see GLES 2.0 doesn't support mipmaps for npot textures.
There is possibly some way we could support this in Cogl but at least
it's not as simple as or-ing in the feature flag, sadly.
The core GLES2 API supports NPOT textures, i.e. there is no extension as for
OpenGL, so we now add COGL_FEATURE_TEXTURE_NPOT to the feature flags in
_cogl_features_init.
Thanks to Gordon Williams for spotting this.
Don't let stringify.sh write to the $srcdir + use the BUILT_SOURCES var in
Makefile.am so as to ensure all .c. and .h files get generated from their
corresponding .glsl files before building other targets.
The wrong part of an expression was bracketed in the test to determine
when a new texture matrix needed to be loaded which resulted in the
first pass through _cogl_material_layer_flush_gl_sampler_state
not uploading any user matrix.
Following bug #1762, the syntax of g-ir-scanner was changed in
gobject-introspection, so Clutter does not build anymore with 0.6.4.
See the bugzilla bug:
http://bugzilla.gnome.org/show_bug.cgi?id=591669
GObject-Introspection now uses a different mechanism to extract the
SONAME when building the gir file and it needs the libtool archive as
option.
Signed-off-by: Emmanuele Bassi <ebassi@linux.intel.com>
Keep the CoglContext in sync between GL and GLES backends. We ought
to find a way to have a generic context, though, and have backend
specific sections.
Fixes bug:
http://bugzilla.openedhand.com/show_bug.cgi?id=1698
On some platforms (anything but Linux, and on obscure Linux
architectures) dolt isn't used, so $(top_builddir)/doltlibtool
won't exist. $(top_builddir)/libtool will always be generated
even if dolt is used, so just use that unconditionally. We don't
need the extra speed when linking the single program for
introspection.
http://bugzilla.openedhand.com/show_bug.cgi?id=1699
Signed-off-by: Emmanuele Bassi <ebassi@linux.intel.com>
commit e2c4a2a9f8 fixed one thing but broke many others things :-/
hopfully this fixes that.
It turned out that the journal was mistakenly setting the OVERRIDE_LAYER0
flush option for all entries, but some other logic errors were also
uncovered in _cogl_material_equal.
To help us handle sliced textures; When flushing materials there is an
override option that can be given to replace the texture name for layer0
so we may iterate the slices without needing to modify the material
in use.
Since improving the journal's ability to batch state changes we added a
_cogl_material_equals function that is used by the journal to compare
materials and identify when a state change is required, but this wasn't
correctly considering the layer0 override resulting in false positives that
meant the journal wouldn't update the GL state and the first texture name
was used for all slices.
The cost of glGetFloatv with Mesa is still representing a majority of our
time in OpenGL for some applications, and the last thing left using this is
the current-matrix API when getting the projection matrix.
This adds a matrix stack for the projection matrix, so all getting, setting
and modification of the projection matrix is now managed by Cogl and it's only
when we come to draw that we flush changes to the matrix to OpenGL.
This also brings us closer to being able to drop internal use of the
deprecated OpenGL matrix functions, re: commit 54159f5a1d
Scanners like gtk-doc and g-ir-scanner get confused by:
typedef struct _Foo {
...
} Foo;
And expect instead:
typedef struct _Foo Foo;
struct _Foo {
...
};
CoglMatrix definition should be changed to avoid the former type.
In order to validate the sequence of:
XResizeWindow
ConfigureNotify
glViewport
that should happen on X11 we need to add debug annotations to the
calls to glViewport() done through COGL.
This avoids some calls to glGetFloatv, which have at least proven to be very
in-efficient in mesa at this point in time, since it always updates all derived
state even when it may not relate to the state being requested.
Fixes and adds a unit test for creating and drawing using materials with
COGL_INVALID_HANDLE texture layers.
This may be valid if for example the user has set a texture combine string
that only references a constant color.
_cogl_material_flush_layers_gl_state will bind the fallback texture for any
COGL_INVALID_HANDLE layer, later though we could explicitly check when the
current blend mode does't actually reference a texture source in which case
binding the fallback texture is redundant.
This tests drawing using cogl_rectangle, cogl_polygon and
cogl_vertex_buffer_draw.
Although we wouldn't recommend developers try and interleve OpenGL drawing
with Cogl drawing - we would prefer patches that improve Cogl to avoid this
if possible - we are providing a simple mechanism that will at least give
developers a fighting chance if they find it necissary.
Note: we aren't helping developers change OpenGL state to modify the
behaviour of Cogl drawing functions - it's unlikley that can ever be
reliably supported - but if they are trying to do something like:
- setup some OpenGL state.
- draw using OpenGL (e.g. glDrawArrays() )
- reset modified OpenGL state.
- continue using Cogl to draw
They should surround their blocks of raw OpenGL with cogl_begin_gl() and
cogl_end_gl():
cogl_begin_gl ();
- setup some OpenGL state.
- draw using OpenGL (e.g. glDrawArrays() )
- reset modified OpenGL state.
cogl_end_gl ();
- continue using Cogl to draw
Again; we aren't supporting code like this:
- setup some OpenGL state.
- use Cogl to draw
- reset modified OpenGL state.
When the internals of Cogl evolves, this is very liable to break.
cogl_begin_gl() will flush all internally batched Cogl primitives, and emit
all internal Cogl state to OpenGL as if it were going to draw something
itself.
The result is that the OpenGL modelview matrix will be setup; the state
corresponding to the current source material will be setup and other world
state such as backface culling, depth and fogging enabledness will be also
be sent to OpenGL.
Note: no special material state is flushed, so if developers want Cogl to setup
a simplified material state it is the their responsibility to set a simple
source material before calling cogl_begin_gl. E.g. by calling
cogl_set_source_color4ub().
Note: It is the developers responsibility to restore any OpenGL state that they
modify to how it was after calling cogl_begin_gl() if they don't do this then
the result of further Cogl calls is undefined.
This function should only need to be called in exceptional circumstances
since Cogl can normally determine internally when a flush is necessary.
As an optimization Cogl drawing functions may batch up primitives
internally, so if you are trying to use raw GL outside of Cogl you stand a
better chance of being successful if you ask Cogl to flush any batched
geometry before making your state changes.
cogl_flush() ensures that the underlying driver is issued all the commands
necessary to draw the batched primitives. It provides no guarantees about
when the driver will complete the rendering.
This provides no guarantees about the GL state upon returning and to avoid
confusing Cogl you should aim to restore any changes you make before
resuming use of Cogl.
If you are making state changes with the intention of affecting Cogl drawing
primitives you are 100% on your own since you stand a good chance of
conflicting with Cogl internals. For example clutter-gst which currently
uses direct GL calls to bind ARBfp programs will very likely break when Cogl
starts to use ARBfb programs internally for the material API, but for now it
can use cogl_flush() to at least ensure that the ARBfp program isn't applied
to additional primitives.
This does not provide a robust generalized solution supporting safe use of
raw GL, its use is very much discouraged.
Previously we would call _cogl_material_pre_change_notify unconditionally, but
now we wait until we really know we are removing a layer before notifying the
change, which will require a journal flush.
Since the convenience functions cogl_set_source_color4ub and
cogl_set_source_texture share a single material, cogl_set_source_color4ub
always calls cogl_material_remove_layer. Often this is a NOP though and
shouldn't require a journal flush.
This gets performance back to where it was before reverting the per-actor
material commits.
Before any cogl vertex buffer drawing we call
enable_state_for_drawing_buffer which sets up the GL state, but we weren't
disabling unsed client texture coord arrays.
This simplifies the vertex data uploading in the journal, and could improve
performance. Modifying a VBO mid-scene could reqire synchronizing with the
GPU or some form of shadowing/copying to avoid modifying data that the GPU
is currently processing; the buffer was also being marked as GL_STATIC_DRAW
which could have made things worse.
Now we simply create a GL_STATIC_DRAW VBO for each flush and and delete it
when we are finished.
Using cogl_rectangle (and thus the journal) in
_cogl_add_path_to_stencil_buffer means we have to consider all the state
that the journal may change in case it may interfer with the direct GL calls
used. This has proven to be error prone and in this case the journal is an
unnecissary overhead. We now simply call glRectf instead of using
cogl_rectangle.
We were missing the simplest test of all: are the two CoglHandles equal and
are the flush option flags for each material equal? This should improve
batching for some common cases.
Whenever we modify a material we call _cogl_material_pre_change_notify which
checks to see if the material is referenced by the journal and if so flushes
if before we modify the material.
Since the journal logs material colors directly into a vertex array (to
avoid us repeatedly calling glColor) then we know we never need to flush
the journal when material colors change.
Since most Clutter actors aren't much more than textured quads; flushing the
journal typically involves lots of 'change modelview; draw quad' sequences.
The amount of overhead involved in uploading a new modelview and queuing
that primitive is huge in comparison to simply transforming 4 vertices by
the current modelview when logging quads. (Note if your GPU supports HW
vertex transform, then it still does the projective and viewport transforms)
At the same time a --cogl-debug=disable-software-transform option has been
added for comparison and debugging.
This change allows typical pick scenes to be batched into a single draw call
and I'm seeing test-pick run over 200% faster with this. (i965 + Mesa
7.6-devel)
Enabling this option makes Cogl trace how the journal is managing to batch
your rectangles. The journal staggers how it emmits state to the GL driver
and the batches will normally get smaller for each stage, but ideally you
don't want to be in a situation where Cogl is only able to draw one quad per
modelview change and draw call.
E.g. this is a fairly ideal example:
BATCHING: journal len = 101
BATCHING: vbo offset batch len = 101
BATCHING: material batch len = 101
BATCHING: modelview batch len = 101
This isn't:
BATCHING: journal len = 1
BATCHING: vbo offset batch len = 1
BATCHING: material batch len = 1
BATCHING: modelview batch len = 1
BATCHING: journal len = 1
BATCHING: vbo offset batch len = 1
BATCHING: material batch len = 1
BATCHING: modelview batch len = 1
<repeat>
When this option is used Cogl will print a trace of all quads that get
logged into the journal, and a trace of quads as they get flushed.
If you are seeing a bug with the geometry being drawn by Cogl this may give
some clues by letting you sanity check the numbers being logged vs the
numbers being emitted.
For testing the VBO fallback paths it helps to be able to disable the
COGL_FEATURE_VBOS feature flag. When VBOs aren't available Cogl should use
client side malloc()'d buffers instead.
Previously we only used the Cogl matrix stack API for indirect contexts, but
it's too costly to keep on requesting modelview matrices from GL (for
logging in the journal) even for direct rendering.
I also experimented with a patch for mesa to improve performance and
discussed this with upstream, but we agreed to consider the GL matrix API
essentially deprecated. (For reference the GLES 2 and GL 3 specs have
removed the matrix APIs)
CoglColors shouldn't be compared using memcmp since they may contain
uninitialized padding bytes.
The prototype is also suitable for passing to g_hash_table_new as the
key_equal_func.
_cogl_pango_display_list_add_texture now uses this instead of memcmp.
We now put the color of materials into the vertex array used by the journal
instead of calling glColor() but the number of requests for the material
color were quite expensive so we have changed the material color to
internally be byte components instead of floats to avoid repeat conversions
and added _cogl_material_get_colorubv as a fast-path for the journal to
copy data into the vertex array.
The number of material layers enabled when logging a quad in the journal
determines the stride of the corresponding vertex data (since we need a set
of texture coordinates for each layer.) By padding data in the case where we
have only one layer we can avoid a change in stride if we are mixing single
and double layer primitives in a scene (e.g. relevent for a composite
manager that may use 2 layers for all shaped windows) Avoiding stride
changes means we can minimize calls to gl{Vertex,Color}Pointer when flushing
the journal.
Since we need to update the texcoord pointers when the actual number of
layers changes, this adds another batch_and_call() stage to deal with
glTexCoordPointer and enabling/disabling the client arrays.
Previously the journal was always flushed at the end of
_cogl_rectangles_with_multitexture_coords, (i.e. the end of any
cogl_rectangle* calls) but now we have broadened the potential for batching
geometry. In ideal circumstances we will only flush once per scene.
In summary the journal works like this:
When you use any of the cogl_rectangle* APIs then nothing is emitted to the
GPU at this point, we just log one or more quads into the journal. A
journal entry consists of the quad coordinates, an associated material
reference, and a modelview matrix. Ideally the journal only gets flushed
once at the end of a scene, but in fact there are things to consider that
may cause unwanted flushing, including:
- modifying materials mid-scene
This is because each quad in the journal has an associated material
reference (i.e. not copy), so if you try and modify a material that is
already referenced in the journal we force a flush first)
NOTE: For now this means you should avoid using cogl_set_source_color()
since that currently uses a single shared material. Later we
should change it to use a pool of materials that is recycled
when the journal is flushed.
- modifying any state that isn't currently logged, such as depth, fog and
backface culling enables.
The first thing that happens when flushing, is to upload all the vertex data
associated with the journal into a single VBO.
We then go through a process of splitting up the journal into batches that
have compatible state so they can be emitted to the GPU together. This is
currently broken up into 3 levels so we can stagger the state changes:
1) we break the journal up according to changes in the number of material layers
associated with logged quads. The number of layers in a material determines
the stride of the associated vertices, so we have to update our vertex
array offsets at this level. (i.e. calling gl{Vertex,Color},Pointer etc)
2) we further split batches up according to material compatability. (e.g.
materials with different textures) We flush material state at this level.
3) Finally we split batches up according to modelview changes. At this level
we update the modelview matrix and actually emit the actual draw command.
This commit is largely about putting the initial design in-place; this will be
followed by other changes that take advantage of the extended batching.