Since mutter has two X connections and does damage handling on the
frontend while fence triggering is done on the backend, we have a race
between XDamageSubtract() and XSyncFenceTrigger() causing missed
redraws in the GL_EXT_X11_sync_object path.
If the fence trigger gets processed first by the server, any client
drawing that happens between that and the damage subtract being
processed and is completely contained in the last damage event box
that mutter got, won't be included in the current frame nor will it
cause a new damage event.
A simple fix for this would be XSync()ing on the frontend connection
after doing all the damage subtracts but that would add a round trip
on every frame again which defeats the asynchronous design of X
fences.
Instead, if we move fence handling to the frontend we automatically
get the right ordering between damage subtracts and fence triggers.
https://bugzilla.gnome.org/show_bug.cgi?id=728464
The compositor maintains a ring of shared fences with the X server in order to
properly synchronize rendering between the X server and the compositor's GPU
channel. When all of the fences have been used, the compositor needs to reset
one so that it can be reused. It does this by first waiting on the CPU for the
fence to become triggered, and then sending a request to the X server to reset
the fence.
If the compositor's GPU channel is busy processing other work (e.g. the desktop
switcher animation), then the X server may process the reset request before the
GPU has consumed the fence. This causes the GPU channel to hang.
Fix the problem by having the compositor's GPU channel trigger its own fence
after waiting for the X server's fence. Wait for that fence on the CPU before
sending the reset request to the X server. This ensures that the GPU has
consumed the X11 fence before the server resets it.
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
https://bugzilla.gnome.org/show_bug.cgi?id=728464
If GL advertises this extension we'll use it to synchronize X with GL
rendering instead of relying on the XSync() behavior with open source
drivers.
Some driver bugs were uncovered while working on this so if we have
had to reboot the ring a few times, something is probably wrong and
we're likely to just make things worse by continuing to try. Let's
err on the side of caution, disable ourselves and fallback to the
XSync() path in the compositor.
https://bugzilla.gnome.org/show_bug.cgi?id=728464