cogl_matrix_project_points and cogl_matrix_transform_points had an
optimization for the common case where the stride parameters exactly
match the size of the corresponding structures. The code for both when
generated by gcc with -O2 on x86-64 use two registers to hold the
addresses of the input and output arrays. In the strided version these
pointers are incremented by adding the value of a register and in the
packed version they are incremented by adding an immediate value. I
think the difference in cost here would be negligible and it may even
be faster to add a register.
Also GCC appears to retain the loop counter in a register for the
strided version but in the packed version it can optimize it out and
directly use the input pointer as the counter. I think it would be
possible to reorder the code a bit to explicitly use the input pointer
as the counter if this were a problem.
Getting rid of the packed versions tidies up the code a bit and it
could potentially be faster if the code differences are small and we
get to avoid an extra conditional in cogl_matrix_transform_points.
Use a DeviceManager sub-class similar to the Win32 backend one, which
creates two InputDevices: a core pointer and a core keyboard.
The event translation code then uses these two devices to fill out the
.device field of the events.
Throw in enter/leave tracking, given that we need to update the device's
state.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2490
Implementation of event loop which works with GLib events, native OS X
events and Clutter events.
The event loop source code comes from the equivalent code in the Quartz
GDK backend from GTK+ 2.22.1, which is LGPL v2.1+ and thus compatible
with Clutter's licensing terms.
The code has been tested with libsoup, which did not work before together
with Clutter.
Signed-off-by: Emmanuele Bassi <ebassi@linux.intel.com>
http://bugzilla.clutter-project.org/show_bug.cgi?id=2490
Wayland visuals refer to a pixel's bytes in order from
most significant to least significant, while the
one-byte-per-component Cogl formats refer to the order
of increasing memory addresses, so converting between
the two depends on the system's endianness.
The height was being set from the ClutterGeometry in some parts
and from the stage in others. And since both callers of this
function pass &stage_wayland->allocation as the geometry anyway,
the stage argument isn't really even needed.
Since we need to find the stage from the X11 Window, it's better to use
a static hashmap that gets updated every time the ClutterStageX11:xwin
member is changed, instead of iterating over every stage handled by the
global ClutterStageManager singleton.
Clutter should just require that the windowing system used by a backend
adds a device to the stage when the device enters, and removes it from
the stage when the device leaves; with this information, we can
synthesize every crossing event and update the device state without
other intervention from the backend-specific code.
The generation of additional crossing events for actors that are
covering the stage at the coordinates of the crossing event should be
delegated to the event processing code.
The x11 and win32 backends need to be modified to relay the enter and
leave events from the windowing system.
When synthesizing events coming from input devices it should be
possible to just call a setter function, to avoid a huge switch
on the type of the event.
Clutter should also store the device pointer inside the private
data, for faster access of the pointer in allocated events.
Finally, the get_device_id() and get_device_type() accessors should
just be wrappers around clutter_event_get_device(), to reduce the
amount of code duplication.
Since we access it in order to get the X11 Display pointer, it makes
sense to have the ClutterBackendX11 already available inside the
ClutterStageX11 structure, and avoid the pattern:
ClutterBackend *backend = clutter_get_default_backend ();
ClutterBackendX11 *backend_x11 = CLUTTER_BACKEND_X11 (backend);
which costs us a function call, a type cast and an unused variable.
Adapt to changes from this Wayland commit:
"Update surface.attach and change surface.map to surface.map_toplevel"
(82da52b15b49da3f3c7b4bd85d334ddfaa375ebc)
When we receive a ConfigureNotify event that doesn't affect the size
of the window (only the position) then we were still calling
clutter_stage_ensure_viewport which ends up queueing a full stage
redraw. This patch makes it so that it only ensures the viewport when
the size changes as it already did for avoiding queueing a relayout.
It now also avoids setting the clipped redraws cool off period when
the window only moves under the assumption that it's only necessary
for size changes.
Since the XI2 device manager code is going to be compiled only on
POSIX compliant systems, we can safely assume the presence of stdint.h
and include it unconditionally.
CLUTTER_BIND_POSITION and CLUTTER_BIND_SIZE are two convenience
enumeration values for binding x and y, and width and height
respectively, using a single ClutterBindConstraint.
When copying COMBINE state in
_cogl_pipeline_layer_init_multi_property_sparse_state we would read some
state from the destination layer (invalid data potentially), then
redundantly set the value back on the destination. This was picked up by
valgrind, and the code is now more careful about how it references the
src layer vs the destination layer.
There is currently a problem with per-framebuffer journals in that it's
possible to create a framebuffer from a texture which then gets rendered
too but the framebuffer (and corresponding journal) can be freed before
the texture gets used to draw with.
Conceptually we want to make sure when freeing a framebuffer that - if
it is associated with a texture - we flush the journal as the last thing
before really freeing the framebuffer's meta data. Technically though
this is awkward to implement since the obvious mechanism for us to be
notified about the framebuffer's destruction (by setting some user data
internally with a callback) notifies when the framebuffer has a
ref-count of 0. This means we'd have to be careful what we do with the
framebuffer to consider e.g. recursive destruction; anything that would
set more user data on the framebuffer while it is being destroyed and
ensuring nothing else gets notified of the framebuffer's destruction
before the journal has been flushed.
For simplicity, for now, this patch provides another solution which is
to flush framebuffer journals whenever we switch away from a given
framebuffer via cogl_set_framebuffer or cogl_push/pop_framebuffer. The
disadvantage of this approach is that we can't batch all the geometry of
a scene that involves intermediate renders to offscreen framebufers.
Clutter is doing this more and more with applications that use the
ClutterEffect APIs so this is a shame. Hopefully this will only be a
stop-gap solution while we consider how to reliably support journal
logging across framebuffer changes.
When flushing a clip stack that contains more than one rectangle which
needs to use the stencil buffer the code takes a different path so
that it can combine the new rectangle with the existing contents of
the stencil buffer. However it was not correctly flushing the
modelview and projection matrices so that rectangle would be in the
wrong place.
This adds a COGL_DEBUG=clipping option that reports how the clip is
being flushed. This is needed to determine whether the scissor,
stencil clip planes or software clipping is being used.
The CoglDebugFlags are now stored in an array of unsigned ints rather
than a single variable. The flags are accessed using macros instead of
directly peeking at the cogl_debug_flags variable. The index values
are stored in the enum rather than the actual mask values so that the
enum doesn't need to be more than 32 bits wide. The hope is that the
code to determine the index into the array can be optimized out by the
compiler so it should have exactly the same performance as the old
code.
The lighting parameters such as the diffuse and ambient colors were
previously only flushed in the fixed vertend. This meant that if a
vertex shader was used then they would not be set. The lighting
parameters are uniforms which are just as useful in a fragment shader
so it doesn't really make sense to set them in the vertend. They are
now flushed in the common cogl-pipeline-opengl code but the code is
#ifdef'd for GLES2 because they need to be part of the progend in that
case.
The uniforms for the alpha test reference value and point size on
GLES2 are updating using similar code. This generalizes the code so
that there is a static array of predefined builtin uniforms which
contains the uniform name, a pointer to a function to get the value
from the pipeline, a pointer to a function to update the uniform and a
flag representing which CoglPipelineState change affects the
uniform. The uniforms are then updated in a loop. This should simplify
adding more builtin uniforms.
The builtin uniforms are accessible from either the vertex shader or
the fragment shader so we should define them in the common
section. This doesn't really matter for the current list of uniforms
because it's pretty unlikely that you'd want to access the matrices
from the fragment shader, but for other builtins such as the lighting
material properties it makes sense.
Between Clutter 0.8 and 1.0, the new-frame signal of ClutterTimeline
changed the second parameter to be an elapsed time in milliseconds
rather than the frame number. However a few places in clutter were
still calling the parameter 'frame_num' which is a bit
misleading. Notably the signature for the signal class closure in the
header was using the wrong name. This changes them to use 'msecs'.
ClutterTimeline has special handling for the first time do_tick is
called which was not emitting a new-frame signal. This meant that an
application which directly uses the timeline would have to manually
setup the initial state of an animation after starting a timeline to
avoid painting a single frame with the wrong state. It seems to make
more sense to instead emit the new-frame signal so that the
application always sees a new-frame when the progress changes before a
paint.
This adds a custom "rows" property, that allows to define the rows of a
ClutterModel. A single row can either an array of all columns or an
object with column-name : column-value pairs.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2528
Allow to 'abuse' the clutter_script_parse_node function by calling it
with an initialized GValue instead of a valid GParamSpec argument to
obtain the correct typed value from the json node.
http://bugzilla.clutter-project.org/show_bug.cgi?id=2528
* xi2: (41 commits)
test-devices: Actually print the axis data
device-manager/xi2: Sync the stage of source devices
event: Clean up clutter_event_copy()
device: unset the axes array pointer when resetting
device-manager/xi2: Fix device hotplugging
glx: Clean up GLX implementation
device/x11: Store min/max keycode in the XI device class
x11: Hide all private symbols
docs: More documentation fixes for InputDevice
*/event: Never manipulate the event queue directly
win32: Update DeviceManager device creation
device: Allow enabling/disabling non-master devices
backend/eglx: Add newly created stages to the translators
device: Add more doc annotations
device: Use a double for translate_axis() argument
test-devices: Clean up and show axes data
event: Fix up clutter_event_copy()
device/xi2: Translate the axis data after setting devices
device: Add more accessors for properties
docs: Update API reference
...
When we added the texture->framebuffers member a _cogl_texture_init
funciton was added to initialize the list of framebuffers associated
with a texture to NULL. All the backends were updated except the
x11 tfp backend. This was causing crashes in test-pixmap.
This is part of a broader cleanup of some of the experimental Cogl API.
One of the reasons for this particular rename is to reduce the verbosity
of using the API. Another reason is that CoglVertexArray is going to be
renamed CoglAttributeBuffer and we want to help emphasize the
relationship between CoglAttributes and CoglAttributeBuffers.
We have a bunch of experimental convenience functions like
cogl_primitive_p2/p2t2 that have corresponding vertex structures but it
seemed a bit odd to have the vertex annotation e.g. "P2T2" be an infix
of the type like CoglP2T2Vertex instead of be a postfix like
CoglVertexP2T2. This switches them all to follow the postfix naming
style.
COGL_DEBUG=disable-fast-read-pixel can be used to disable the
optimization for reading a single pixel colour back by looking at the
geometry in the journal and not involving the GPU. With this disabled we
will always flush the journal, rendering to the framebuffer and then use
glReadPixels to get the result.
This adds a transparent optimization to cogl_read_pixels for when a
single pixel is being read back and it happens that all the geometry of
the current frame is still available in the framebuffer's associated
journal.
The intention is to indirectly optimize Clutter's render based picking
mechanism in such a way that the 99% of cases where scenes are comprised
of trivial quad primitives that can easily be intersected we can avoid
the latency of kicking a GPU render and blocking for the result when we
know we can calculate the result manually on the CPU probably faster
than we could even kick a render.
A nice property of this solution is that it maintains all the
flexibility of the render based picking provided by Clutter and it can
gracefully fall back to GPU rendering if actors are drawn using anything
more complex than a quad for their geometry.
It seems worth noting that there is a limitation to the extensibility of
this approach in that it can only optimize picking a against geometry
that passes through Cogl's journal which isn't something Clutter
directly controls. For now though this really doesn't matter since
basically all apps should end up hitting this fast-path. The current
idea to address this longer term would be a pick2 vfunc for ClutterActor
that can support geometry and render based input regions of actors and
move this optimization up into Clutter instead.
Note: currently we don't have a primitive count threshold to consider
that there could be scenes with enough geometry for us to compensate for
the cost of kicking a render and determine a result more efficiently by
utilizing the GPU. We don't currently expect this to be common though.
Note: in the future it could still be interesting to revive something
like the wip/async-pbo-picking branch to provide an asynchronous
read-pixels based optimization for Clutter picking in cases where more
complex input regions that necessitate rendering are in use or if we do
add a threshold for rendering as mentioned above.
Both cogl_matrix_transform_points and _project_points take points_in and
points_out arguments and explicitly allow pointing to the same array
(i.e. to transform in-place) The implementation of the various internal
transform functions though were not handling this possability and so it
was possible the reference partially transformed vertex values as if
they were original input values leading to incorrect results. This patch
ensures we take a temporary copy of the current input point when
transforming.
This adds a utility function that can determine if a given point
intersects an arbitrary polygon, by counting how many edges a
"semi-infinite" horizontal ray crosses from that point. The plan is to
use this for a software based read-pixel fast path that avoids using the
GPU to rasterize journaled primitives and can instead intersect a point
being read with quads in the journal to determine the correct color.
This adds a stop-gap mechanism for Cogl to know when the window system
is requested to present the current backbuffer to the frontbuffer by
adding a _cogl_swap_buffers_notify function that backends are now
expected to call right after issuing the equivalent request to OpenGL
vie the platforms OpenGL binding layer. This (blindly) updates all the
backends to call this new function.
For now Cogl doesn't do anything with the notification but the intention
is to use it as part of a planned read-pixel optimization which will
need to reset some state at the start of each new frame.
Instead of having _cogl_get/set_clip stack which reference the global
CoglContext this instead makes those into CoglClipState method functions
named _cogl_clip_state_get/set_stack that take an explicit pointer to a
CoglClipState.
This also adds _cogl_framebuffer_get/set_clip_stack convenience
functions that avoid having to first get the ClipState from a
framebuffer then the stack from that - so we can maintain the
convenience of _cogl_get_clip_stack.
This adds an internal function to be able to query the screen space
bounding box of the current clip entries contained in a given
CoglClipStack.
This bounding box which is cheap to determine can be useful to know the
largest extents that might be updated while drawing with this clip
stack.
For example the plan is to use this as part of an optimized read-pixel
path handled on the CPU which will need to track the currently valid
extents of the last call to cogl_clear()
Instead of having a single journal per context, we now have a
CoglJournal object for each CoglFramebuffer. This means we now don't
have to flush the journal when switching/pushing/popping between
different framebuffers so for example a Clutter scene that involves some
ClutterEffect actors that transiently redirect to an FBO can still be
batched.
This also allows us to track state in the journal that relates to the
current frame of its associated framebuffer which we'll need for our
optimization for using the CPU to handle reading a single pixel back
from a framebuffer when we know the whole scene is currently comprised
of simple rectangles in a journal.
This adds an internal alternative to cogl_object_set_user_data that also
passes an instance pointer to destroy notify callbacks.
When setting private data on a CoglObject it's often desirable to know
the instance being destroyed when we are being notified to free the
private data due to the object being freed. The typical solution to this
is to track a pointer to the instance in the private data itself so it
can be identified but that usually requires an extra micro allocation
for the private data that could have been avoided if only the callback
were given an instance pointer.
The new internal _cogl_object_set_user_data passes the instance pointer
as a second argument which means it is ABI compatible for us to layer
the public version on top of this internal function.
This moves the implementation of cogl_clear into cogl-framebuffer.c as
two new internal functions _cogl_framebuffer_clear and
_cogl_framebuffer_clear4f. It's not clear if this is what the API will
look like as we make more of the CoglFramebuffer API public due to the
limitations of using flags to identify buffers when framebuffers may
contain any number of ancillary buffers but conceptually it makes some
sense to tie the operation of clearing a color buffer to a framebuffer.
The short term intention is to enable tracking the current clear color
as a property of the framebuffer as part of an optimization for reading
back single pixels when the geometry is simple enough that we can
compute the result quickly on the CPU. (If the point doesn't intersect
any geometry we'll need to return the last clear color.)
Hierarchy and Device changed events come through with the X window set
to be the root window, not the stage window. We need to whitelist them
so that we can actually support hotplugging and device changes.
The x11 backend exposes a lot of symbols that are meant to only be used
when implementing a subclassed backend, like the glx and eglx ones.
The uninstalled headers are also filled with cruft declarations of
functions long since removed.
Let's try to clean up this mess.
Slave and floating devices should always be disabled, and not deliver
events to the scene. It is up to the user to enable non-master devices
and handle events coming from them.
ClutterInputDevice gets a new :enabled property, defaulting to FALSE;
when a device manager creates a new device it has to set it to TRUE if
the :device-mode property is set to CLUTTER_INPUT_MODE_MASTER.
The main event queue entry point, _clutter_event_push(), will
automatically discard events coming from disabled devices.
CLUTTER_BUTTON_* and CLUTTER_MOTION event types have axes data attached
to them, so we want to expose a common ClutterEvent method for
extracting that data.
The ClutterStageX11 implementation does most of the heavy lifting, so
subclasses like ClutterStageGLX and ClutterStageEGL do not need to
handle things like creating the stage Window and selecting events; just
chaining up and using the internal API will suffice.
Undeprecate the XInput-related X11 API: since we don't enable XI support
by default we still need to ask for it, and see if we have it after the
backend initialization sequence.
Event translation is now done where it belongs: we don't need a massive
switch in a file with direct access to private structure members.
So long, event_translate(); and thanks for all the fish.
We ask XI2 to get the client pointer for CLUTTER_POINTER_DEVICE, and
we use the attached keyboard device for CLUTTER_KEYBOARD_DEVICE. For
everything else, we return NULL.
We keep the symbol in the public header, but the definition is now
private. You could not sub-class InputDevice anyway, without the
instance structure, and the lack of padding in the class made actually
implementing devices in backends really hard.
This is a lump commit that is fairly difficult to break down without
either breaking bisecting or breaking the test cases.
The new design for handling X11 event translation works this way:
- ClutterBackend::translate_event() has been added as the central
point used by a ClutterBackend implementation to translate a
native event into a ClutterEvent;
- ClutterEventTranslator is a private interface that should be
implemented by backend-specific objects, like stage
implementations and ClutterDeviceManager sub-classes, and
allows dealing with class-specific event translation;
- ClutterStageX11 implements EventTranslator, and deals with the
stage-relative X11 events coming from the X11 event source;
- ClutterStageGLX overrides EventTranslator, in order to
deal with the INTEL_GLX_swap_event extension, and it chains up
to the X11 default implementation;
- ClutterDeviceManagerX11 has been split into two separate classes,
one that deals with core and (optionally) XI1 events, and the
other that deals with XI2 events; the selection is done at run-time,
since the core+XI1 and XI2 mechanisms are mutually exclusive.
All the other backends we officially support still use their own
custom event source and translation function, but the end goal is to
migrate them to the translate_event() virtual function, and have the
event source be a shared part of Clutter core.
Don't use ugly "#undef CLUTTER_DISABLE_DEPRECATED" inside source code
using deprecated symbols; we have the handy CLUTTER_COMPILATION define
that we can use as part of the "disable deprecated" conditional.